updated usage and modifications rst

Chanumask
2023-12-08 13:05:35 +01:00
parent cb44c7ea5d
commit d8ae71bf69
2 changed files with 79 additions and 35 deletions


@@ -3,14 +3,16 @@ How to modify the environment or write modules
Modifying levels Modifying levels
---------------- ----------------
Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see [levels](marl_factory_grid/levels) for examples). Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see `levels`_ for examples).
Define which *level* to use in your *config file* as: Define which *level* to use in your *config file* as:
>>> General: .. _levels: marl_factory_grid/levels
level_name: rooms # 'double', 'large', 'simple', ...
... or create your own , maybe with the help of `asciiflow.com <https://asciiflow.com/#/>`_. >>> General:
Make sure to use `#` as `Walls`_ , `-` as free (walkable) floor, `D` for `Doors`_. level_name: rooms # 'simple', 'narrow_corridor', 'eight_puzzle',...
... or create your own, for example with the help of `asciiflow.com <https://asciiflow.com/#/>`_.
Make sure to use `#` as `Walls`_ , `-` as free (walkable) floor and `D` for `Doors`_.
Other Entities (define your own) may bring their own `Symbols`. Other Entities (define your own) may bring their own `Symbols`.
.. _Walls: marl_factory_grid/environment/entity/wall.py .. _Walls: marl_factory_grid/environment/entity/wall.py
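As a rough illustration of the symbol convention described in this hunk, the sketch below parses a level given as text and checks that only `#`, `-` and `D` occur. The function name `parse_level`, the sample layout, and the requirement that levels be rectangular are assumptions made for this example and are not taken from marl_factory_grid; entities that bring their own symbols would have to be added to the allowed set.

```python
from collections import Counter

# Symbols named in the documentation; custom entities may define more.
LEVEL_SYMBOLS = {"#": "wall", "-": "floor", "D": "door"}


def parse_level(text):
    """Map (row, col) -> tile kind for a rectangular level (hypothetical helper)."""
    rows = [line for line in text.splitlines() if line.strip()]
    width = len(rows[0]) if rows else 0
    tiles = {}
    for r, row in enumerate(rows):
        # Assumption of this sketch: every row has the same length.
        if len(row) != width:
            raise ValueError(f"row {r} has length {len(row)}, expected {width}")
        for c, symbol in enumerate(row):
            if symbol not in LEVEL_SYMBOLS:
                raise ValueError(f"unknown symbol {symbol!r} at ({r}, {c})")
            tiles[(r, c)] = LEVEL_SYMBOLS[symbol]
    return tiles


# Made-up 5x5 level: a walled room with one inner wall and one door.
SAMPLE_LEVEL = """\
#####
#---#
#-#D#
#---#
#####
"""

tiles = parse_level(SAMPLE_LEVEL)
counts = Counter(tiles.values())
```

Running a check like this before pointing `level_name` at a new file can catch stray characters early; how the package itself reads level files may differ.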
@@ -19,26 +21,32 @@ Other Entities (define your own) may bring their own `Symbols`.
Modifying Entities Modifying Entities
------------------ ------------------
Entites are `Objects`_ that can additionally be assigned a position. Entities are `Objects`_ that can additionally be assigned a position.
Abstract Entities are provided. Abstract Entities are provided.
If you wish to introduce new entities to the enviroment just create a new module, ...
If you wish to introduce new entities to the environment, just create a new module that implements the entity class. If
necessary, provide additional classes such as custom actions or rewards and load the entity into the environment using
the config file.
.. _Objects: marl_factory_grid/environment/entity/object.py .. _Objects: marl_factory_grid/environment/entity/object.py
Modifying Groups Modifying Groups
---------------- ----------------
`Groups`_ are entity Sets that provide administrative access to all group members. `Groups`_ are entity Sets that provide administrative access to all group members.
All `Entities`_ are available at runtime as EnvState property. All `Entity Collections`_ are available at runtime as a property of the env state.
If you add an entity, you probably also want a collection of that entity.
.. _Groups: marl_factory_grid/environment/groups/objects.py .. _Groups: marl_factory_grid/environment/groups/objects.py
.. _Entities: marl_factory_grid/environment/entity/global_entities.py .. _Entity Collections: marl_factory_grid/environment/entity/global_entities.py
Modifying Rules Modifying Rules
---------------- ----------------
`Rules`_ define how the environment behaves on microscale. `Rules`_ define how the environment behaves on a micro scale.
Each of the hookes (`on_init`, `pre_step`, `on_step`, '`post_step`', `on_done`) Each of the hooks (`on_init`, `pre_step`, `on_step`, `post_step`, `on_done`) provides env-access to implement custom
provide env-access to implement customn logic, calculate rewards, or gather information. logic, calculate rewards, or gather information.
If you wish to introduce new rules to the environment....
If you wish to introduce new rules to the environment, make sure they implement the Rule class and override its hooks
to implement your own rule logic.
.. _Rules: marl_factory_grid/environment/entity/object.py .. _Rules: marl_factory_grid/environment/entity/object.py
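The hook mechanism can be pictured with the following self-contained sketch. The `Rule` base class here is a stand-in written for this example, not the class shipped with marl_factory_grid. Only the hook names come from the text; the hook signatures, the `state` dictionary, the `StepCounterRule`, the `run_episode` driver, and the idea of polling `on_done` after every step are all assumptions.

```python
class Rule:
    """Stand-in base class (hypothetical): every hook is a no-op by default."""

    def on_init(self, state):
        return []

    def pre_step(self, state):
        return []

    def on_step(self, state):
        return []

    def post_step(self, state):
        return []

    def on_done(self, state):
        return []


class StepCounterRule(Rule):
    """Hypothetical rule: counts steps and signals the end after a limit."""

    def __init__(self, max_steps):
        self.max_steps = max_steps

    def on_init(self, state):
        state["steps"] = 0
        return []

    def post_step(self, state):
        state["steps"] += 1
        return []

    def on_done(self, state):
        # Report a (name, flag) pair; the result format is an assumption.
        return [("done", state["steps"] >= self.max_steps)]


def run_episode(rules, max_iterations=100):
    """Toy driver that calls the hooks in the order listed in the docs."""
    state = {}
    for rule in rules:
        rule.on_init(state)
    for _ in range(max_iterations):
        for hook in ("pre_step", "on_step", "post_step"):
            for rule in rules:
                getattr(rule, hook)(state)
        results = [res for rule in rules for res in rule.on_done(state)]
        if any(flag for _, flag in results):
            break
    return state


final_state = run_episode([StepCounterRule(max_steps=3)])
```

Overriding only the hooks a rule needs keeps each rule small; the real base class and its expected return types should be checked in the package source before adapting this pattern.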
@@ -47,8 +55,7 @@ If you wish to introduce new rules to the environment....
Modifying Results Modifying Results
---------------- ----------------
`Results`_ provide a way to return `rule` evaluations such as rewards and state reports `Results`_ provide a way to return `rule` evaluations such as rewards and state reports back to the environment.
back to the environment.
.. _Results: marl_factory_grid/utils/results.py .. _Results: marl_factory_grid/utils/results.py


@@ -5,24 +5,61 @@ Environment objects, including agents, entities and rules, that are specified in
Using ``quickstart_use`` creates a default config-file and another one that lists all possible options of the environment. Using ``quickstart_use`` creates a default config-file and another one that lists all possible options of the environment.
Also, it generates an initial script where an agent is executed in the environment specified by the config-file. Also, it generates an initial script where an agent is executed in the environment specified by the config-file.
The script initializes the environment, monitoring and recording of the environment, and includes the reinforcement learning loop: After initializing the environment using the specified configuration file, the script enters a reinforcement learning loop.
The loop consists of episodes, where each episode involves resetting the environment, executing actions, and receiving feedback.
>>> path = Path('marl_factory_grid/configs/default_config.yaml') Here's a breakdown of the key components in the provided script. Feel free to customize it based on your specific requirements:
factory = Factory(path)
factory = EnvMonitor(factory) 1. **Initialization:**
factory = EnvRecorder(factory)
for episode in trange(10): >>> path = Path('marl_factory_grid/configs/default_config.yaml')
_ = factory.reset() factory = Factory(path)
done = False factory = EnvMonitor(factory)
if render: factory = EnvRecorder(factory)
factory.render()
action_spaces = factory.action_space - The `path` variable points to the location of your configuration file. Ensure it corresponds to the correct path.
agents = [] - `Factory` initializes the environment based on the provided configuration.
while not done: - `EnvMonitor` and `EnvRecorder` are optional components. They add monitoring and recording functionalities to the environment, respectively.
a = [randint(0, x.n - 1) for x in action_spaces]
obs_type, _, reward, done, info = factory.step(a) 2. **Reinforcement Learning Loop:**
if render:
factory.render() >>> for episode in trange(10):
if done: _ = factory.reset()
print(f'Episode {episode} done...') done = False
break if render:
factory.render()
action_spaces = factory.action_space
agents = []
- The loop iterates over a specified number of episodes (in this case, 10).
- `factory.reset()` resets the environment for a new episode.
- `factory.render()` is used for visualization if rendering is enabled.
- `action_spaces` stores the action spaces available for the agents.
- `agents` will store agent-specific information during the episode.
3. **Taking Actions:**
>>> while not done:
a = [randint(0, x.n - 1) for x in action_spaces]
obs_type, _, reward, done, info = factory.step(a)
if render:
factory.render()
- Within each episode, the loop continues until the environment signals completion (`done`).
- `a` represents a list of random actions for each agent based on their action space.
- `factory.step(a)` executes the actions, returning observation types, rewards, completion status, and additional information.
4. **Handling Episode Completion:**
>>> if done:
print(f'Episode {episode} done...')
- After each episode, a message is printed indicating its completion.
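To experiment with the loop structure without the package installed, the four steps can be exercised against a stand-in environment. `StubFactory` and `_Space` below are hypothetical classes written for this sketch; they mirror only the calls used in the script (`reset()`, `step()`, `action_space`, and the five-value return of `step`) and do not reproduce the behaviour of the real `Factory`. Rendering, monitoring and recording are left out.

```python
from random import randint, seed


class _Space:
    """Hypothetical discrete action space exposing only `n`."""

    def __init__(self, n):
        self.n = n


class StubFactory:
    """Stand-in environment that ends every episode after `max_steps` steps."""

    def __init__(self, n_agents=2, n_actions=4, max_steps=5):
        self.action_space = [_Space(n_actions) for _ in range(n_agents)]
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return None

    def step(self, actions):
        self.steps += 1
        done = self.steps >= self.max_steps
        # Same tuple shape as in the script: five values, reward fixed to 0.
        return None, None, 0, done, {}


def run(factory, n_episodes=3):
    """Run random actions for a few episodes and record their lengths."""
    finished = []
    for episode in range(n_episodes):
        _ = factory.reset()
        done = False
        action_spaces = factory.action_space
        while not done:
            # One random action index per agent, as in the generated script.
            a = [randint(0, x.n - 1) for x in action_spaces]
            obs_type, _, reward, done, info = factory.step(a)
        finished.append((episode, factory.steps))
    return finished


seed(0)
episodes = run(StubFactory())
```

Swapping `StubFactory()` for the wrapped `factory` object from the initialization step should leave the loop itself unchanged, provided the real environment returns the same tuple shape.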
Evaluating the run
------------------
If monitoring and recording are enabled, the environment states will be traced and recorded automatically.
Plotting: at the moment, a plot of the evaluation score across episodes is generated automatically.