mirror of
https://github.com/illiumst/marl-factory-grid.git
synced 2025-06-18 18:52:52 +02:00
updated usage and modifications rst
@ -3,14 +3,16 @@ How to modify the environment or write modules
Modifying levels
----------------
Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see [levels](marl_factory_grid/levels) for examples).
Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see `levels`_ for examples).
Define which *level* to use in your *config file* as:

>>> General:
        level_name: rooms # 'double', 'large', 'simple', ...
.. _levels: marl_factory_grid/levels

... or create your own , maybe with the help of `asciiflow.com <https://asciiflow.com/#/>`_.
Make sure to use `#` as `Walls`_ , `-` as free (walkable) floor, `D` for `Doors`_.
>>> General:
        level_name: rooms # 'simple', 'narrow_corridor', 'eight_puzzle',...

... or create your own. Maybe with the help of `asciiflow.com <https://asciiflow.com/#/>`_.
Make sure to use `#` as `Walls`_ , `-` as free (walkable) floor and `D` for `Doors`_.
Other Entities (define your own) may bring their own `Symbols`.

.. _Walls: marl_factory_grid/environment/entity/wall.py

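Before pointing ``level_name`` at a hand-drawn level, it can help to check the layout. The following is a minimal sketch and not part of ``marl_factory_grid``: ``parse_level`` and ``ALLOWED_SYMBOLS`` are hypothetical names, and the requirement that all rows have the same width is an assumption made for this example. Custom entities that bring their own symbols would need to be added to the allowed set.

```python
# Hypothetical helper, not part of marl_factory_grid: checks a level layout
# that uses the symbols named above ('#' wall, '-' floor, 'D' door).
ALLOWED_SYMBOLS = {"#", "-", "D"}


def parse_level(text: str) -> list[list[str]]:
    """Return the layout as a grid of symbols, or raise ValueError."""
    rows = [line for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("level is empty")
    width = len(rows[0])
    for y, row in enumerate(rows):
        # Assumption for this sketch: levels are rectangular.
        if len(row) != width:
            raise ValueError(f"row {y} has width {len(row)}, expected {width}")
        unknown = set(row) - ALLOWED_SYMBOLS
        if unknown:
            raise ValueError(f"row {y} uses unknown symbols: {sorted(unknown)}")
    return [list(row) for row in rows]


EXAMPLE_LEVEL = """\
#######
#--D--#
#--#--#
#######
"""

grid = parse_level(EXAMPLE_LEVEL)
# Collect (x, y) coordinates of all door symbols.
doors = [(x, y) for y, row in enumerate(grid) for x, s in enumerate(row) if s == "D"]
print(len(grid), "rows,", len(grid[0]), "columns, doors at", doors)
```

The same text could be saved as a *.txt*-file next to the shipped levels and then selected through ``level_name``.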
@ -19,26 +21,32 @@ Other Entities (define your own) may bring their own `Symbols`.
Modifying Entities
------------------
Entites are `Objects`_ that can additionally be assigned a position.
Entities are `Objects`_ that can additionally be assigned a position.
Abstract Entities are provided.
If you wish to introduce new entities to the enviroment just create a new module, ...

If you wish to introduce new entities to the environment just create a new module that implements the entity class. If
necessary, provide additional classes such as custom actions or rewards and load the entity into the environment using
the config file.

.. _Objects: marl_factory_grid/environment/entity/object.py

Modifying Groups
----------------
`Groups`_ are entity Sets that provide administrative access to all group members.
All `Entities`_ are available at runtime as EnvState property.
All `Entity Collections`_ are available at runtime as a property of the env state.
If you add an entity, you probably also want a collection of that entity.

.. _Groups: marl_factory_grid/environment/groups/objects.py
.. _Entities: marl_factory_grid/environment/entity/global_entities.py
.. _Entity Collections: marl_factory_grid/environment/entity/global_entities.py

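The relation between an entity and its collection can be sketched as follows. This is an illustration only: ``Battery`` and ``Batteries`` are hypothetical names and do not use the library's base classes, whose actual interfaces are defined in the modules linked above.

```python
from dataclasses import dataclass


@dataclass
class Battery:
    """Hypothetical entity: an object that additionally has a position."""
    name: str
    pos: tuple[int, int]
    charge: float = 1.0


class Batteries:
    """Hypothetical collection giving access to all entities of one kind."""

    def __init__(self) -> None:
        self._members: dict[str, Battery] = {}

    def add(self, entity: Battery) -> None:
        self._members[entity.name] = entity

    def by_pos(self, pos: tuple[int, int]) -> list[Battery]:
        # Look up every member located at the given grid position.
        return [e for e in self._members.values() if e.pos == pos]

    def __iter__(self):
        return iter(self._members.values())

    def __len__(self) -> int:
        return len(self._members)


batteries = Batteries()
batteries.add(Battery("battery_0", (1, 1)))
batteries.add(Battery("battery_1", (3, 2), charge=0.5))

names_at_pos = [b.name for b in batteries.by_pos((3, 2))]
mean_charge = sum(b.charge for b in batteries) / len(batteries)
print(names_at_pos, mean_charge)
```

Keeping the lookup logic in the collection means rules can query all members of one kind without knowing how they are stored.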
Modifying Rules
----------------
`Rules`_ define how the environment behaves on microscale.
Each of the hookes (`on_init`, `pre_step`, `on_step`, '`post_step`', `on_done`)
provide env-access to implement customn logic, calculate rewards, or gather information.
If you wish to introduce new rules to the environment....
`Rules`_ define how the environment behaves on a micro scale.
Each of the hooks (`on_init`, `pre_step`, `on_step`, `post_step`, `on_done`) provides env-access to implement custom
logic, calculate rewards, or gather information.

If you wish to introduce new rules to the environment, make sure each one implements the Rule class and overrides its hooks
to implement your own rule logic.

.. _Rules: marl_factory_grid/environment/entity/object.py

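The hook mechanism can be illustrated with a self-contained sketch. Everything here is a stand-in written for this example: ``Rule``, ``Result``, ``StepPenalty`` and ``run_episode`` are hypothetical and do not reproduce the library's signatures, and the order in which the toy driver calls the hooks is only inferred from their names.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical result record: which rule reported what reward."""
    identifier: str
    reward: float


class Rule:
    """Stand-in base class exposing the five hooks; every hook is a no-op."""

    def on_init(self, state):
        return []

    def pre_step(self, state):
        return []

    def on_step(self, state):
        return []

    def post_step(self, state):
        return []

    def on_done(self, state):
        return []


class StepPenalty(Rule):
    """Example rule: a small negative reward per step, a bonus at the end."""

    def on_step(self, state):
        return [Result("step_penalty", -0.01)]

    def on_done(self, state):
        return [Result("episode_finished", 1.0)]


def run_episode(rules, n_steps):
    """Toy driver that calls the hooks in the order their names suggest."""
    state = {"step": 0}  # placeholder for the real environment state
    results = [r for rule in rules for r in rule.on_init(state)]
    for step in range(n_steps):
        state["step"] = step
        for hook in ("pre_step", "on_step", "post_step"):
            for rule in rules:
                results.extend(getattr(rule, hook)(state))
    results.extend(r for rule in rules for r in rule.on_done(state))
    return results


results = run_episode([StepPenalty()], n_steps=3)
total = sum(r.reward for r in results)
print(len(results), round(total, 2))
```

Only the hooks a rule overrides contribute results; the inherited no-op hooks keep the driver free of special cases.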
@ -47,8 +55,7 @@ If you wish to introduce new rules to the environment....
Modifying Results
-----------------
`Results`_ provide a way to return `rule` evaluations such as rewards and state reports
back to the environment.
`Results`_ provide a way to return `rule` evaluations such as rewards and state reports back to the environment.

.. _Results: marl_factory_grid/utils/results.py

@ -5,24 +5,61 @@ Environment objects, including agents, entities and rules, that are specified in
Using ``quickstart_use`` creates a default config-file and another one that lists all possible options of the environment.
Also, it generates an initial script where an agent is executed in the environment specified by the config-file.

The script initializes the environment, monitoring and recording of the environment, and includes the reinforcement learning loop:
After initializing the environment using the specified configuration file, the script enters a reinforcement learning loop.
The loop consists of episodes, where each episode involves resetting the environment, executing actions, and receiving feedback.

>>> path = Path('marl_factory_grid/configs/default_config.yaml')
    factory = Factory(path)
    factory = EnvMonitor(factory)
    factory = EnvRecorder(factory)
    for episode in trange(10):
        _ = factory.reset()
        done = False
        if render:
            factory.render()
        action_spaces = factory.action_space
        agents = []
        while not done:
            a = [randint(0, x.n - 1) for x in action_spaces]
            obs_type, _, reward, done, info = factory.step(a)
            if render:
                factory.render()
            if done:
                print(f'Episode {episode} done...')
                break
Here's a breakdown of the key components in the provided script. Feel free to customize it based on your specific requirements:

1. **Initialization:**

    >>> path = Path('marl_factory_grid/configs/default_config.yaml')
        factory = Factory(path)
        factory = EnvMonitor(factory)
        factory = EnvRecorder(factory)

    - The `path` variable points to the location of your configuration file. Ensure it corresponds to the correct path.
    - `Factory` initializes the environment based on the provided configuration.
    - `EnvMonitor` and `EnvRecorder` are optional components. They add monitoring and recording functionalities to the environment, respectively.

2. **Reinforcement Learning Loop:**

    >>> for episode in trange(10):
            _ = factory.reset()
            done = False
            if render:
                factory.render()
            action_spaces = factory.action_space
            agents = []

    - The loop iterates over a specified number of episodes (in this case, 10).
    - `factory.reset()` resets the environment for a new episode.
    - `factory.render()` is used for visualization if rendering is enabled.
    - `action_spaces` stores the action spaces available for the agents.
    - `agents` will store agent-specific information during the episode.

3. **Taking Actions:**

    >>> while not done:
            a = [randint(0, x.n - 1) for x in action_spaces]
            obs_type, _, reward, done, info = factory.step(a)
            if render:
                factory.render()

    - Within each episode, the loop continues until the environment signals completion (`done`).
    - `a` is a list with one random action per agent, drawn from that agent's action space.
    - `factory.step(a)` executes the actions, returning observation types, rewards, completion status, and additional information.

4. **Handling Episode Completion:**

    >>> if done:
            print(f'Episode {episode} done...')

    - After each episode, a message is printed indicating its completion.

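To try the loop structure without installing the environment, the four steps above can be run against a stub. This is a sketch under stated assumptions: ``StubFactory`` and ``ActionSpace`` are hypothetical stand-ins that only mimic the calls used in the script (``reset``, ``step``, ``action_space``, ``n``), and the fixed episode length is invented for the example. Rendering, monitoring and recording are left out.

```python
from random import Random


class ActionSpace:
    """Minimal discrete action space exposing ``n`` like the spaces in the loop."""

    def __init__(self, n: int) -> None:
        self.n = n


class StubFactory:
    """Hypothetical stand-in for the environment; ends an episode after max_steps."""

    def __init__(self, n_agents: int = 2, n_actions: int = 5, max_steps: int = 10) -> None:
        self.action_space = [ActionSpace(n_actions) for _ in range(n_agents)]
        self.max_steps = max_steps
        self._steps = 0

    def reset(self):
        self._steps = 0
        return None

    def step(self, actions):
        if len(actions) != len(self.action_space):
            raise ValueError("expected exactly one action per agent")
        self._steps += 1
        done = self._steps >= self.max_steps
        reward = 0.0
        # Same five-element shape as the tuple unpacked in the script.
        return None, None, reward, done, {"steps": self._steps}


rng = Random(0)  # seeded so repeated runs pick the same actions
factory = StubFactory()
episode_lengths = []
for episode in range(3):
    _ = factory.reset()
    done = False
    action_spaces = factory.action_space
    steps = 0
    while not done:
        # One random action index per agent, drawn from that agent's space.
        a = [rng.randint(0, x.n - 1) for x in action_spaces]
        obs_type, _, reward, done, info = factory.step(a)
        steps += 1
    episode_lengths.append(steps)
    print(f'Episode {episode} done...')
```

Replacing the random choice with a policy that maps observations to actions is the natural next step once the real environment is in place.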
Evaluating the run
------------------

If monitoring and recording are enabled, the environment states will be traced and recorded automatically.

Plotting: at the moment, a plot of the evaluation score across the episodes is generated automatically.