Merge remote-tracking branch 'origin/documentation' into documentation

Steffen Illium 2023-12-12 08:52:13 +01:00
commit ca01dc6d3b
3 changed files with 187 additions and 170 deletions

228
README.md

@ -1,176 +1,68 @@
# EDYS
# About EDYS
### Tackling emergent dysfunctions (EDYs) in cooperation with Fraunhofer-IKS.
Collaborating with Fraunhofer-IKS, this project is dedicated to investigating Emergent Dysfunctions (EDYs)
within multi-agent environments.
### Project Objectives:
- Create an environment that provokes emerging dysfunctions.
- This is achieved by creating a high level of background noise in the domain, where various entities perform diverse tasks,
resulting in a deliberately chaotic dynamic.
- The goal is to observe and analyze naturally occurring emergent dysfunctions within the complexity generated in this dynamic environment.
- Observational Framework:
- The project introduces an environment that is designed to capture dysfunctions as they naturally occur.
- The environment allows for continuous monitoring of agent behaviors, actions, and interactions.
- Tracking emergent dysfunctions in real-time provides valuable data for analysis and understanding.
- Compatibility
- The Framework allows learning entities from different manufacturers and projects with varying representations
of actions and observations to interact seamlessly within the environment.
- Placeholders
- One can provide an agent with a placeholder observation that contains no information and offers no meaningful insights.
- Later, when the environment expands and introduces additional entities available for observation, these new observations can be provided to the agent.
- This allows for processes such as retraining on an already initialized policy and fine-tuning to enhance the agent's performance based on the enriched information.
Tackling emergent dysfunctions (EDYs) in cooperation with Fraunhofer-IKS
## Setup
Install this environment using `pip install marl-factory-grid`.
Install this environment using `pip install marl-factory-grid`. For more information click [here](docs/source/installation.rst).
Refer to [quickstart](_quickstart) for specific scenarios.
## First Steps
## Usage
### Quickstart
Most of the env. objects (entities, rules and assets) can be loaded automatically.
Just define what your environment needs in a *yaml*-configfile like:
The majority of environment objects, including entities, rules, and assets, can be loaded automatically.
Simply specify the requirements of your environment in a [*yaml*-configfile](marl_factory_grid/configs/default_config.yaml).
<details><summary>Example ConfigFile</summary>
# Default Configuration File
General:
# RNG-seed to sample the same "random" numbers every time, to make the different runs comparable.
env_seed: 69
# Individual vs global rewards
individual_rewards: true
# The level.txt file to load from marl_factory_grid/levels
level_name: large
# View Radius; 0 = full observability
pomdp_r: 3
# Print all messages and events
verbose: false
# Run tests
tests: false
# Agents section defines the characteristics of different agents in the environment.
# An Agent requires a list of actions and observations.
# Possible actions: Noop, Charge, Clean, DestAction, DoorUse, ItemAction, MachineAction, Move8, Move4, North, NorthEast, ...
# Possible observations: All, Combined, GlobalPosition, Battery, ChargePods, DirtPiles, Destinations, Doors, Items, Inventory, DropOffLocations, Maintainers, ...
# You can use 'clone' as the agent name to have multiple instances with either a list of names or an int specifying the number of clones.
Agents:
Wolfgang:
Actions:
- Noop
- Charge
- Clean
- DestAction
- DoorUse
- ItemAction
- Move8
Observations:
- Combined:
- Other
- Walls
- GlobalPosition
- Battery
- ChargePods
- DirtPiles
- Destinations
- Doors
- Items
- Inventory
- DropOffLocations
- Maintainers
# Entities section defines the initial parameters and behaviors of different entities in the environment.
# Entities all spawn using coords_or_quantity, a number of entities or coordinates to place them.
Entities:
# Batteries: Entities representing power sources for agents.
Batteries:
initial_charge: 0.8
per_action_costs: 0.02
# ChargePods: Entities representing charging stations for Batteries.
ChargePods:
coords_or_quantity: 2
# Destinations: Entities representing target locations for agents.
# - spawn_mode: GROUPED or SINGLE. Determines how destinations are spawned.
Destinations:
coords_or_quantity: 1
spawn_mode: GROUPED
# DirtPiles: Entities representing piles of dirt.
# - initial_amount: Initial amount of dirt in each pile.
# - clean_amount: Amount of dirt cleaned in each cleaning action.
# - dirt_spawn_r_var: Random variation in dirt spawn amounts.
# - max_global_amount: Maximum total amount of dirt allowed in the environment.
# - max_local_amount: Maximum amount of dirt allowed in one position.
DirtPiles:
coords_or_quantity: 10
initial_amount: 2
clean_amount: 1
dirt_spawn_r_var: 0.1
max_global_amount: 20
max_local_amount: 5
# Doors are spawned using the level map.
Doors:
# DropOffLocations: Entities representing locations where agents can drop off items.
# - max_dropoff_storage_size: Maximum storage capacity at each drop-off location.
DropOffLocations:
coords_or_quantity: 1
max_dropoff_storage_size: 0
# GlobalPositions.
GlobalPositions: { }
# Inventories: Entities representing inventories for agents.
Inventories: { }
# Items: Entities representing items in the environment.
Items:
coords_or_quantity: 5
# Machines: Entities representing machines in the environment.
Machines:
coords_or_quantity: 2
# Maintainers: Entities representing maintainers that aim to maintain machines.
Maintainers:
coords_or_quantity: 1
# Zones: Entities representing zones in the environment.
Zones: { }
# Rules section specifies the rules governing the dynamics of the environment.
Rules:
# Environment Dynamics
# When stepping over a dirt pile, entities carry a ratio of the dirt to their next position
EntitiesSmearDirtOnMove:
smear_ratio: 0.2
# Doors automatically close after a certain number of time steps
DoorAutoClose:
close_frequency: 10
# Maintainers move at every time step
MoveMaintainers:
# Respawn Stuff
# Define how dirt should respawn after the initial spawn
RespawnDirt:
respawn_freq: 15
# Define how items should respawn after the initial spawn
RespawnItems:
respawn_freq: 15
# Utilities
# This rule defines the collision mechanic, introduces a related DoneCondition and lets you specify rewards.
# Can be omitted/ignored if you do not want to take care of collisions at all.
WatchCollisions:
done_at_collisions: false
# Done Conditions
# Define the conditions for the environment to stop, either success or failure conditions.
# The environment stops when an agent reaches a destination
DoneAtDestinationReach:
# The environment stops when all dirt is cleaned
DoneOnAllDirtCleaned:
# The environment stops when a battery is discharged
DoneAtBatteryDischarge:
# The environment stops when a maintainer reports a collision
DoneAtMaintainerCollision:
# The environment stops after max steps
DoneAtMaxStepsReached:
max_steps: 500
If you only plan on using the environment without making any modifications, use ``quickstart_use``.
This creates a default config-file and another one that lists all possible options of the environment.
Also, it generates an initial script where an agent is executed in the specified environment.
For further details on utilizing the environment, refer to the documentation [here](docs/source/usage.rst).
</details>
Existing modules include a variety of functionalities within the environment:
- [Agents](marl_factory_grid/algorithms) implement either static strategies or learning algorithms based on the specific configuration.
- Their action set includes opening [doors](marl_factory_grid/modules/doors/entitites.py), cleaning
[dirt](marl_factory_grid/modules/clean_up/entitites.py), picking up [items](marl_factory_grid/modules/items/entitites.py) and
delivering them to designated drop-off locations.
- Agents are equipped with a [battery](marl_factory_grid/modules/batteries/entitites.py) that gradually depletes over time if not charged at a chargepod.
- The [maintainer](marl_factory_grid/modules/maintenance/entities.py) aims to repair [machines](marl_factory_grid/modules/machines/entitites.py) that lose health over time.
Have a look in [\quickstart](./quickstart) for further configuration examples.
## Customization
### Make it your own
If you plan on modifying the environment, for example by adding entities or rules, use ``quickstart_modify``.
This creates a template module and a script that runs an agent, incorporating the generated module.
More information on how to modify the levels, entities, groups, rules and assets can be found [here](docs/source/modifications.rst).
#### Levels
Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see [./environment/levels](./environment/levels) for examples).
### Levels
Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see [levels](marl_factory_grid/levels) for examples).
Define which *level* to use in your *configfile* as:
```yaml
General:
@ -180,16 +72,16 @@ General:
Make sure to use `#` as [Walls](marl_factory_grid/environment/entity/wall.py), `-` as free (walkable) floor, and `D` for [Doors](./modules/doors/entities.py).
Other Entities (define your own) may bring their own `Symbols`.
#### Entities
### Entities
Entities are [Objects](marl_factory_grid/environment/entity/object.py) that can additionally be assigned a position.
Abstract Entities are provided.
#### Groups
### Groups
[Groups](marl_factory_grid/environment/groups/objects.py) are entity Sets that provide administrative access to all group members.
All [Entities](marl_factory_grid/environment/entity/global_entities.py) are available at runtime as an EnvState property.
#### Rules
### Rules
[Rules](marl_factory_grid/environment/entity/object.py) define how the environment behaves on a micro scale.
Each of the hooks (`on_init`, `pre_step`, `on_step`, `post_step`, `on_done`)
provides env-access to implement custom logic, calculate rewards, or gather information.
@ -198,7 +90,7 @@ provide env-access to implement customn logic, calculate rewards, or gather info
[Results](marl_factory_grid/environment/entity/object.py) provide a way to return `rule` evaluations such as rewards and state reports
back to the environment.
#### Assets
### Assets
Make sure to bring your own assets for each Entity living in the Gridworld as the `Renderer` relies on them.
PNG-files (transparent background) of square aspect-ratio should do the job, in general.
@ -207,5 +99,3 @@ PNG-files (transparent background) of square aspect-ratio should do the job, in
<html &nbsp&nbsp&nbsp&nbsp html>
<img src="/marl_factory_grid/environment/assets/agent/agent.png" width="5%">


@ -1,5 +1,70 @@
How to modify the environment or write modules
===============================================
Modifying levels
----------------
Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see `levels`_ for examples).
Define which *level* to use in your *config file* as:
.. _levels: marl_factory_grid/levels
>>> General:
        level_name: rooms  # 'simple', 'narrow_corridor', 'eight_puzzle',...
... or create your own. Maybe with the help of `asciiflow.com <https://asciiflow.com/#/>`_.
Make sure to use `#` as `Walls`_ , `-` as free (walkable) floor and `D` for `Doors`_.
Other Entities (define your own) may bring their own `Symbols`.
.. _Walls: marl_factory_grid/environment/entity/wall.py
.. _Doors: modules/doors/entities.py
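To make the symbol convention concrete, here is a minimal sketch of how a level text could be read into tiles. It is not the loader used by the package: `parse_level`, `SYMBOLS` and `EXAMPLE_LEVEL` are hypothetical names chosen for illustration, and only the three symbols named above are assumed.

```python
from collections import Counter

# Symbols named in the text: '#' wall, '-' walkable floor, 'D' door.
# This mapping is an illustrative assumption, not the package's own table.
SYMBOLS = {"#": "wall", "-": "floor", "D": "door"}


def parse_level(text: str) -> dict:
    """Map (row, col) positions to tile kinds for a rectangular level string."""
    rows = [line for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("level text is empty")
    width = len(rows[0])
    tiles = {}
    for r, row in enumerate(rows):
        if len(row) != width:
            raise ValueError(f"row {r} has length {len(row)}, expected {width}")
        for c, symbol in enumerate(row):
            if symbol not in SYMBOLS:
                raise ValueError(f"unknown symbol {symbol!r} at {(r, c)}")
            tiles[(r, c)] = SYMBOLS[symbol]
    return tiles


# A tiny hypothetical level: one room with a door in its right wall.
EXAMPLE_LEVEL = """
#####
#--D#
#---#
#####
"""

tiles = parse_level(EXAMPLE_LEVEL)
counts = Counter(tiles.values())
print(counts)
```

Rejecting unknown symbols early is a design choice of this sketch; a level that relies on custom entity symbols would need those added to the mapping first.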
Modifying Entities
------------------
Entities are `Objects`_ that can additionally be assigned a position.
Abstract Entities are provided.
If you wish to introduce new entities to the environment, just create a new module that implements the entity class. If
necessary, provide additional classes such as custom actions or rewards and load the entity into the environment using
the config file.
.. _Objects: marl_factory_grid/environment/entity/object.py
Modifying Groups
----------------
`Groups`_ are entity Sets that provide administrative access to all group members.
All `Entity Collections`_ are available at runtime as a property of the env state.
If you add an entity, you probably also want a collection of that entity.
.. _Groups: marl_factory_grid/environment/groups/objects.py
.. _Entity Collections: marl_factory_grid/environment/entity/global_entities.py
Modifying Rules
----------------
`Rules`_ define how the environment behaves on a micro scale.
Each of the hooks (`on_init`, `pre_step`, `on_step`, `post_step`, `on_done`) provides env-access to implement custom
logic, calculate rewards, or gather information.
If you wish to introduce new rules to the environment, make sure they implement the Rule class and override its hooks
to implement your own rule logic.
.. _Rules: marl_factory_grid/environment/entity/object.py
.. image:: ./images/Hooks_FIKS.png
:alt: Hooks Image
Modifying Results
-----------------
`Results`_ provide a way to return `rule` evaluations such as rewards and state reports back to the environment.
.. _Results: marl_factory_grid/utils/results.py
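The interplay of hooks and results can be sketched in isolation. The classes below are hypothetical stand-ins (`SketchRule`, `SketchResult`, `StepPenalty`, `run_episode`) and do not reproduce the package's real `Rule` or result classes or their signatures. The hook order used here (`on_init` once, then `pre_step`, `on_step`, `post_step` for every step, then `on_done` once) is an assumption inferred from the hook names.

```python
from dataclasses import dataclass


@dataclass
class SketchResult:
    """Hypothetical result record: who reports, which reward, valid or not."""
    identifier: str
    reward: float = 0.0
    valid: bool = True


class SketchRule:
    """Hypothetical base class exposing the five hooks named in the text."""

    def on_init(self, state):
        return []

    def pre_step(self, state):
        return []

    def on_step(self, state):
        return []

    def post_step(self, state):
        return []

    def on_done(self, state):
        return []


class StepPenalty(SketchRule):
    """Example rule: counts steps and reports a small penalty for each one."""

    def __init__(self, penalty=-0.01):
        self.penalty = penalty

    def on_step(self, state):
        state["steps"] += 1
        return [SketchResult("step_penalty", reward=self.penalty)]

    def on_done(self, state):
        return [SketchResult("episode_finished")]


def run_episode(rules, n_steps):
    """Call every hook of every rule in the assumed order and collect results."""
    state = {"steps": 0}
    results = []
    for rule in rules:
        results += rule.on_init(state)
    for _ in range(n_steps):
        for hook in ("pre_step", "on_step", "post_step"):
            for rule in rules:
                results += getattr(rule, hook)(state)
    for rule in rules:
        results += rule.on_done(state)
    return state, results


state, results = run_episode([StepPenalty()], n_steps=3)
total_reward = sum(r.reward for r in results)
print(state["steps"], len(results), total_reward)
```

Returning a list from every hook keeps the collecting loop uniform; a rule that has nothing to report simply inherits the empty default.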
Modifying Assets
----------------
Make sure to bring your own assets for each Entity living in the Gridworld as the `Renderer` relies on them.
PNG-files (transparent background) of square aspect-ratio should do the job, in general.
.. image:: ./marl_factory_grid/environment/assets/wall.png
:alt: Wall Image
.. image:: ./marl_factory_grid/environment/assets/agent/agent.png
:alt: Agent Image


@ -1,3 +1,65 @@
How to use the environment with your agents
Using the environment with your agents
===========================================
Environment objects, including agents, entities and rules, that are specified in a *yaml*-configfile will be loaded automatically.
Using ``quickstart_use`` creates a default config-file and another one that lists all possible options of the environment.
Also, it generates an initial script where an agent is executed in the environment specified by the config-file.
After initializing the environment using the specified configuration file, the script enters a reinforcement learning loop.
The loop consists of episodes, where each episode involves resetting the environment, executing actions, and receiving feedback.
Here's a breakdown of the key components in the provided script. Feel free to customize it based on your specific requirements:
1. **Initialization:**
>>> path = Path('marl_factory_grid/configs/default_config.yaml')
factory = Factory(path)
factory = EnvMonitor(factory)
factory = EnvRecorder(factory)
- The `path` variable points to the location of your configuration file. Ensure it corresponds to the correct path.
- `Factory` initializes the environment based on the provided configuration.
- `EnvMonitor` and `EnvRecorder` are optional components. They add monitoring and recording functionalities to the environment, respectively.
2. **Reinforcement Learning Loop:**
>>> for episode in trange(10):
        _ = factory.reset()
        done = False
        if render:
            factory.render()
        action_spaces = factory.action_space
        agents = []
- The loop iterates over a specified number of episodes (in this case, 10).
- `factory.reset()` resets the environment for a new episode.
- `factory.render()` is used for visualization if rendering is enabled.
- `action_spaces` stores the action spaces available for the agents.
- `agents` will store agent-specific information during the episode.
3. **Taking Actions:**
>>> while not done:
        a = [randint(0, x.n - 1) for x in action_spaces]
        obs_type, _, reward, done, info = factory.step(a)
        if render:
            factory.render()
- Within each episode, the loop continues until the environment signals completion (`done`).
- `a` represents a list of random actions for each agent based on their action space.
- `factory.step(a)` executes the actions, returning observation types, rewards, completion status, and additional information.
4. **Handling Episode Completion:**
>>> if done:
        print(f'Episode {episode} done...')
- After each episode, a message is printed indicating its completion.
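Put together, the four steps form one nested loop. The sketch below runs without the package installed by using `StubFactory`, a hypothetical stand-in that mimics only the calls shown above (`reset`, `action_space`, `step`). Its five-element return value copies the shape used in the walkthrough, while its episode length and rewards are made up for illustration.

```python
from random import Random


class StubActionSpace:
    """Hypothetical discrete action space exposing only the size `n`."""

    def __init__(self, n):
        self.n = n


class StubFactory:
    """Hypothetical environment stand-in; not the real Factory class."""

    def __init__(self, n_agents=2, n_actions=5, max_steps=20):
        self.action_space = [StubActionSpace(n_actions) for _ in range(n_agents)]
        self.max_steps = max_steps
        self._t = 0

    def reset(self):
        self._t = 0
        return None

    def step(self, actions):
        if len(actions) != len(self.action_space):
            raise ValueError("one action per agent is required")
        self._t += 1
        done = self._t >= self.max_steps  # the stub ends after a fixed step count
        reward = [0.0] * len(actions)
        return None, None, reward, done, {"step": self._t}


rng = Random(0)  # seeded so that repeated runs pick the same actions
factory = StubFactory()
episode_lengths = []

for episode in range(3):
    _ = factory.reset()
    done = False
    steps = 0
    action_spaces = factory.action_space
    while not done:
        # One random action index per agent, bounded by that agent's action space.
        a = [rng.randint(0, x.n - 1) for x in action_spaces]
        obs_type, _, reward, done, info = factory.step(a)
        steps += 1
    episode_lengths.append(steps)
    print(f'Episode {episode} done after {steps} steps')
```

Swapping the random action choice for a policy call is the natural next step once a learning agent is available.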
Evaluating the run
------------------
If monitoring and recording are enabled, the environment states will be traced and recorded automatically.
Plotting: at the moment, a plot of the evaluation score across the episodes is generated automatically.