Merge remote-tracking branch 'origin/documentation' into documentation

2025-07-10 23:22:40 +02:00 · 2023-12-12 08:52:13 +01:00
parent 06e176cfd5 d8ae71bf69
commit ca01dc6d3b
3 changed files with 187 additions and 170 deletions
--- a/README.md
+++ b/README.md
@@ -1,176 +1,68 @@
-# EDYS
+# About EDYS
+
+### Tackling emergent dysfunctions (EDYs) in cooperation with Fraunhofer-IKS
+
+Collaborating with Fraunhofer-IKS, this project is dedicated to investigating Emergent Dysfunctions (EDYs)
+within multi-agent environments.
+
+### Project Objectives:
+
+- Create an environment that provokes emerging dysfunctions.
+
+  - This is achieved by creating a high level of background noise in the domain, where various entities perform diverse tasks,
+    resulting in a deliberately chaotic dynamic.
+  - The goal is to observe and analyze naturally occurring emergent dysfunctions within the complexity generated in this dynamic environment.
+
+
+- Observational Framework:
+
+  - The project introduces an environment that is designed to capture dysfunctions as they naturally occur.
+  - The environment allows for continuous monitoring of agent behaviors, actions, and interactions.
+  - Tracking emergent dysfunctions in real-time provides valuable data for analysis and understanding.
+
+
+- Compatibility
+  - The Framework allows learning entities from different manufacturers and projects with varying representations
+  of actions and observations to interact seamlessly within the environment.
+
+
+- Placeholders
+  
+  - One can provide an agent with a placeholder observation that contains no information and offers no meaningful insights. 
+  - Later, when the environment expands and introduces additional entities available for observation, these new observations can be provided to the agent.
+  - This allows for processes such as retraining on an already initialized policy and fine-tuning to enhance the agent's performance based on the enriched information. 

-Tackling emergent dysfunctions (EDYs) in cooperation with Fraunhofer-IKS

 ## Setup
-Install this environment using `pip install marl-factory-grid`.
+Install this environment using `pip install marl-factory-grid`. For more information click [here](docs/source/installation.rst).
+Refer to [quickstart](_quickstart) for specific scenarios.

-## First Steps
+## Usage

-### Quickstart
-Most of the env. objects (entites, rules and assets) can be loaded automatically. 
-Just define what your environment needs in a *yaml*-configfile like:
+The majority of environment objects, including entities, rules, and assets, can be loaded automatically. 
+Simply specify the requirements of your environment in a [*yaml*-configfile](marl_factory_grid/configs/default_config.yaml).

-<details><summary>Example ConfigFile</summary>    
-    
-    # Default Configuration File
-    
-    General:
-      # RNG-seed to sample the same "random" numbers every time, to make the different runs comparable.
-      env_seed: 69
-      # Individual vs global rewards
-      individual_rewards: true
-      # The level.txt file to load from marl_factory_grid/levels
-      level_name: large
-      # View Radius; 0 = full observatbility
-      pomdp_r: 3
-      # Print all messages and events
-      verbose: false
-      # Run tests
-      tests: false
-    
-    # Agents section defines the characteristics of different agents in the environment.
-    
-    # An Agent requires a list of actions and observations.
-    # Possible actions: Noop, Charge, Clean, DestAction, DoorUse, ItemAction, MachineAction, Move8, Move4, North, NorthEast, ...
-    # Possible observations: All, Combined, GlobalPosition, Battery, ChargePods, DirtPiles, Destinations, Doors, Items, Inventory, DropOffLocations, Maintainers, ...
-    # You can use 'clone' as the agent name to have multiple instances with either a list of names or an int specifying the number of clones.
-    Agents:
-      Wolfgang:
-        Actions:
-          - Noop
-          - Charge
-          - Clean
-          - DestAction
-          - DoorUse
-          - ItemAction
-          - Move8
-        Observations:
-          - Combined:
-              - Other
-              - Walls
-          - GlobalPosition
-          - Battery
-          - ChargePods
-          - DirtPiles
-          - Destinations
-          - Doors
-          - Items
-          - Inventory
-          - DropOffLocations
-          - Maintainers
-    
-    # Entities section defines the initial parameters and behaviors of different entities in the environment.
-    # Entities all spawn using coords_or_quantity, a number of entities or coordinates to place them.
-    Entities:
-      # Batteries: Entities representing power sources for agents.
-      Batteries:
-        initial_charge: 0.8
-        per_action_costs: 0.02
-    
-      # ChargePods: Entities representing charging stations for Batteries.
-      ChargePods:
-        coords_or_quantity: 2
-    
-      # Destinations: Entities representing target locations for agents.
-      # - spawn_mode: GROUPED or SINGLE. Determines how destinations are spawned.
-      Destinations:
-        coords_or_quantity: 1
-        spawn_mode: GROUPED
-    
-      # DirtPiles: Entities representing piles of dirt.
-      # - initial_amount: Initial amount of dirt in each pile.
-      # - clean_amount: Amount of dirt cleaned in each cleaning action.
-      # - dirt_spawn_r_var: Random variation in dirt spawn amounts.
-      # - max_global_amount: Maximum total amount of dirt allowed in the environment.
-      # - max_local_amount: Maximum amount of dirt allowed in one position.
-      DirtPiles:
-        coords_or_quantity: 10
-        initial_amount: 2
-        clean_amount: 1
-        dirt_spawn_r_var: 0.1
-        max_global_amount: 20
-        max_local_amount: 5
-    
-      # Doors are spawned using the level map.
-      Doors:
-    
-      # DropOffLocations: Entities representing locations where agents can drop off items.
-      # - max_dropoff_storage_size: Maximum storage capacity at each drop-off location.
-      DropOffLocations:
-        coords_or_quantity: 1
-        max_dropoff_storage_size: 0
-    
-      # GlobalPositions.
-      GlobalPositions: { }
-    
-      # Inventories: Entities representing inventories for agents.
-      Inventories: { }
-    
-      # Items: Entities representing items in the environment.
-      Items:
-        coords_or_quantity: 5
-    
-      # Machines: Entities representing machines in the environment.
-      Machines:
-        coords_or_quantity: 2
-    
-      # Maintainers: Entities representing maintainers that aim to maintain machines.
-      Maintainers:
-        coords_or_quantity: 1
-    
-      # Zones: Entities representing zones in the environment.
-      Zones: { }
-    
-    
-    # Rules section specifies the rules governing the dynamics of the environment.
-    Rules:
-      # Environment Dynamics
-      # When stepping over a dirt pile, entities carry a ratio of the dirt to their next position
-      EntitiesSmearDirtOnMove:
-        smear_ratio: 0.2
-      # Doors automatically close after a certain number of time steps
-      DoorAutoClose:
-        close_frequency: 10
-      # Maintainers move at every time step
-      MoveMaintainers:
-    
-      # Respawn Stuff
-      # Define how dirt should respawn after the initial spawn
-      RespawnDirt:
-        respawn_freq: 15
-      # Define how items should respawn after the initial spawn
-      RespawnItems:
-        respawn_freq: 15
-    
-      # Utilities
-      # This rule defines the collision mechanic, introduces a related DoneCondition and lets you specify rewards.
-      # Can be omitted/ignored if you do not want to take care of collisions at all.
-      WatchCollisions:
-        done_at_collisions: false
-    
-      # Done Conditions
-      # Define the conditions for the environment to stop. Either success or a fail conditions.
-      # The environment stops when an agent reaches a destination
-      DoneAtDestinationReach:
-      # The environment stops when all dirt is cleaned
-      DoneOnAllDirtCleaned:
-      # The environment stops when a battery is discharged
-      DoneAtBatteryDischarge:
-      # The environment stops when a maintainer reports a collision
-      DoneAtMaintainerCollision:
-      # The environment stops after max steps
-      DoneAtMaxStepsReached:
-        max_steps: 500
+If you only plan on using the environment without making any modifications, use ``quickstart_use``.
+This creates a default config-file and another one that lists all possible options of the environment.
+Also, it generates an initial script where an agent is executed in the specified environment.
+For further details on utilizing the environment, refer to the documentation [here](docs/source/usage.rst).

-   </details>
+Existing modules include a variety of functionalities within the environment:
+- [Agents](marl_factory_grid/algorithms) implement either static strategies or learning algorithms based on the specific configuration.
+- Their action set includes opening [doors](marl_factory_grid/modules/doors/entitites.py), cleaning
+[dirt](marl_factory_grid/modules/clean_up/entitites.py), picking up [items](marl_factory_grid/modules/items/entitites.py) and 
+delivering them to designated drop-off locations.
+- Agents are equipped with a [battery](marl_factory_grid/modules/batteries/entitites.py) that gradually depletes over time if not charged at a chargepod.
+- The [maintainer](marl_factory_grid/modules/maintenance/entities.py) aims to repair [machines](marl_factory_grid/modules/machines/entitites.py) that lose health over time.

-Have a look in [\quickstart](./quickstart) for further configuration examples.
+## Customization

-### Make it your own
+If you plan on modifying the environment by, for example, adding entities or rules, use ``quickstart_modify``.
+This creates a template module and a script that runs an agent, incorporating the generated module.
+More information on how to modify the levels, entities, groups, rules and assets can be found [here](docs/source/modifications.rst).

-#### Levels
-Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see [./environment/levels](./environment/levels) for examples).
+### Levels
+Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see [levels](marl_factory_grid/levels) for examples).
 Define which *level* to use in your *configfile* as: 
 ```yaml
 General:
@@ -180,16 +72,16 @@ General:
 Make sure to use `#` as [Walls](marl_factory_grid/environment/entity/wall.py), `-` as free (walkable) floor, `D` for [Doors](./modules/doors/entities.py).
 Other Entities (define your own) may bring their own `Symbols`.

-#### Entites
+### Entities
 Entities are [Objects](marl_factory_grid/environment/entity/object.py) that can additionally be assigned a position.
 Abstract Entities are provided.

-#### Groups
+### Groups
 [Groups](marl_factory_grid/environment/groups/objects.py) are entity Sets that provide administrative access to all group members. 
 All [Entities](marl_factory_grid/environment/entity/global_entities.py) are available at runtime as EnvState property.


-#### Rules
+### Rules
 [Rules](marl_factory_grid/environment/entity/object.py) define how the environment behaves on a micro scale.
 Each of the hooks (`on_init`, `pre_step`, `on_step`, `post_step`, `on_done`)
 provides env-access to implement custom logic, calculate rewards, or gather information.
@@ -198,7 +90,7 @@ provides env-access to implement custom logic, calculate rewards, or gather info

 [Results](marl_factory_grid/environment/entity/object.py) provide a way to return `rule` evaluations such as rewards and state reports 
 back to the environment.
-#### Assets
+### Assets
 Make sure to bring your own assets for each Entity living in the Gridworld as the `Renderer` relies on them.
 PNG-files (transparent background) of square aspect-ratio should do the job, in general.

@@ -207,5 +99,3 @@ PNG-files (transparent background) of square aspect-ratio should do the job, in
 <html &nbsp&nbsp&nbsp&nbsp html> 
 <img src="/marl_factory_grid/environment/assets/agent/agent.png"  width="5%">

-
-
--- a/docs/source/modifications.rst
+++ b/docs/source/modifications.rst
@@ -1,5 +1,70 @@
 How to modify the environment or write modules
 ===============================================

+Modifying levels
+----------------
+Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see `levels`_ for examples).
+Define which *level* to use in your *config file* as:
+
+.. _levels: marl_factory_grid/levels
+
+>>> General:
+    level_name: rooms  # 'simple', 'narrow_corridor', 'eight_puzzle',...
+
+... or create your own. Maybe with the help of `asciiflow.com <https://asciiflow.com/#/>`_.
+Make sure to use `#` as `Walls`_ , `-` as free (walkable) floor and `D` for `Doors`_.
+Other Entities (define your own) may bring their own `Symbols`.
+
+.. _Walls: marl_factory_grid/environment/entity/wall.py
+.. _Doors: modules/doors/entities.py
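
As a rough illustration of the symbol convention described above, the sketch below parses a level string into wall, floor and door coordinates. The `parse_level` helper, the `SYMBOLS` table and the sample layout are hypothetical and not part of marl-factory-grid; the package's own level loader may work differently.

```python
from collections import defaultdict

# Symbol convention taken from the text above; the names on the right are illustrative.
SYMBOLS = {"#": "wall", "-": "floor", "D": "door"}


def parse_level(text):
    """Map each known symbol in a level string to a list of (row, col) positions."""
    positions = defaultdict(list)
    for row, line in enumerate(text.strip().splitlines()):
        for col, char in enumerate(line):
            kind = SYMBOLS.get(char)
            if kind is None:
                # Custom entities may bring their own symbols; this sketch rejects them.
                raise ValueError(f"Unknown symbol {char!r} at ({row}, {col})")
            positions[kind].append((row, col))
    return dict(positions)


# Hypothetical 3x5 level: a walled corridor with a single door on the right.
SAMPLE_LEVEL = """
#####
#--D#
#####
"""

if __name__ == "__main__":
    level = parse_level(SAMPLE_LEVEL)
    print({kind: len(cells) for kind, cells in level.items()})
```

Rejecting unknown symbols keeps typos in a level file from silently turning into missing tiles; a real loader would instead need to look up symbols registered by custom entities.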


+Modifying Entities
+------------------
+Entities are `Objects`_ that can additionally be assigned a position.
+Abstract Entities are provided.
+
+If you wish to introduce new entities to the environment, just create a new module that implements the entity class. If
+necessary, provide additional classes such as custom actions or rewards and load the entity into the environment using
+the config file.
+
+.. _Objects: marl_factory_grid/environment/entity/object.py
+
+Modifying Groups
+----------------
+`Groups`_ are entity Sets that provide administrative access to all group members.
+All `Entity Collections`_ are available at runtime as a property of the env state.
+If you add an entity, you probably also want a collection of that entity.
+
+.. _Groups: marl_factory_grid/environment/groups/objects.py
+.. _Entity Collections: marl_factory_grid/environment/entity/global_entities.py
+
+Modifying Rules
+----------------
+`Rules`_ define how the environment behaves on a micro scale.
+Each of the hooks (`on_init`, `pre_step`, `on_step`, `post_step`, `on_done`) provides env-access to implement custom
+logic, calculate rewards, or gather information.
+
+If you wish to introduce new rules to the environment, make sure they implement the Rule class and override its hooks
+to implement your own rule logic.
+
+.. _Rules: marl_factory_grid/environment/entity/object.py
+
+.. image:: ./images/Hooks_FIKS.png
+   :alt: Hooks Image
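
The hook mechanism can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Rule` class is a stand-in rather than the package's base class, the hook signatures (a single `state` argument) and the call order in `run_episode` are guesses, and `StepPenalty` is a made-up rule.

```python
class Rule:
    """Stand-in base class (hypothetical); the real Rule class may differ."""

    def on_init(self, state):
        return None

    def pre_step(self, state):
        return None

    def on_step(self, state):
        return None

    def post_step(self, state):
        return None

    def on_done(self, state):
        return None


class StepPenalty(Rule):
    """Illustrative rule: hand out a fixed penalty on every step."""

    def __init__(self, penalty=-1):
        self.penalty = penalty
        self.total = 0

    def on_init(self, state):
        self.total = 0

    def on_step(self, state):
        self.total += self.penalty
        return self.penalty

    def on_done(self, state):
        return self.total


def run_episode(rules, n_steps):
    """Call the hooks in an assumed order and collect every non-None result."""
    state = {"step": 0}  # placeholder for the env state
    results = []
    for rule in rules:
        rule.on_init(state)
    for step in range(n_steps):
        state["step"] = step
        for hook in ("pre_step", "on_step", "post_step"):
            for rule in rules:
                result = getattr(rule, hook)(state)
                if result is not None:
                    results.append((hook, result))
    for rule in rules:
        result = rule.on_done(state)
        if result is not None:
            results.append(("on_done", result))
    return results


if __name__ == "__main__":
    print(run_episode([StepPenalty()], n_steps=3))
```

Only the hooks a rule overrides produce results; the others fall back to the no-op defaults of the base class.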
+
+Modifying Results
+-----------------
+`Results`_ provide a way to return `rule` evaluations such as rewards and state reports back to the environment.
+
+.. _Results: marl_factory_grid/utils/results.py
+
+Modifying Assets
+----------------
+Make sure to bring your own assets for each Entity living in the Gridworld as the `Renderer` relies on them.
+PNG-files (transparent background) of square aspect-ratio should do the job, in general.
+
+.. image:: ./marl_factory_grid/environment/assets/wall.png
+   :alt: Wall Image
+.. image:: ./marl_factory_grid/environment/assets/agent/agent.png
+   :alt: Agent Image
--- a/docs/source/usage.rst
+++ b/docs/source/usage.rst
@@ -1,3 +1,65 @@
-How to use the environment with your agents
+Using the environment with your agents
 ===========================================

+Environment objects, including agents, entities and rules, that are specified in a *yaml*-configfile will be loaded automatically.
+Using ``quickstart_use`` creates a default config-file and another one that lists all possible options of the environment.
+Also, it generates an initial script where an agent is executed in the environment specified by the config-file.
+
+After initializing the environment using the specified configuration file, the script enters a reinforcement learning loop.
+The loop consists of episodes, where each episode involves resetting the environment, executing actions, and receiving feedback.
+
+Here's a breakdown of the key components in the provided script. Feel free to customize it based on your specific requirements:
+
+1. **Initialization:**
+
+>>> path = Path('marl_factory_grid/configs/default_config.yaml')
+    factory = Factory(path)
+    factory = EnvMonitor(factory)
+    factory = EnvRecorder(factory)
+
+    - The `path` variable points to the location of your configuration file. Ensure it corresponds to the correct path.
+    - `Factory` initializes the environment based on the provided configuration.
+    - `EnvMonitor` and `EnvRecorder` are optional components. They add monitoring and recording functionalities to the environment, respectively.
+
+2. **Reinforcement Learning Loop:**
+
+>>> for episode in trange(10):
+        _ = factory.reset()
+        done = False
+        if render:
+            factory.render()
+        action_spaces = factory.action_space
+        agents = []
+
+    - The loop iterates over a specified number of episodes (in this case, 10).
+    - `factory.reset()` resets the environment for a new episode.
+    - `factory.render()` is used for visualization if rendering is enabled.
+    - `action_spaces` stores the action spaces available for the agents.
+    - `agents` will store agent-specific information during the episode.
+
+3. **Taking Actions:**
+
+>>> while not done:
+        a = [randint(0, x.n - 1) for x in action_spaces]
+        obs_type, _, reward, done, info = factory.step(a)
+        if render:
+            factory.render()
+
+    - Within each episode, the loop continues until the environment signals completion (`done`).
+    - `a` represents a list of random actions for each agent based on their action space.
+    - `factory.step(a)` executes the actions, returning observation types, rewards, completion status, and additional information.
+
+4. **Handling Episode Completion:**
+
+>>> if done:
+        print(f'Episode {episode} done...')
+
+    - After each episode, a message is printed indicating its completion.
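
To try the loop structure above without installing the package, the following sketch replaces the environment with a stub. `StubFactory`, `StubSpace` and `run` are hypothetical names introduced only for this example; the stub merely mimics the five-element return shape of `factory.step(a)` shown above and ends each episode after a fixed number of steps.

```python
from random import Random


class StubSpace:
    """Minimal discrete action space exposing `n`, as used by the loop above."""

    def __init__(self, n):
        self.n = n


class StubFactory:
    """Hypothetical stand-in for the environment; finishes after `max_steps` steps."""

    def __init__(self, n_agents=2, n_actions=4, max_steps=5):
        self.action_space = [StubSpace(n_actions) for _ in range(n_agents)]
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return None

    def step(self, actions):
        if len(actions) != len(self.action_space):
            raise ValueError("expected one action per agent")
        self.steps += 1
        done = self.steps >= self.max_steps
        # Same five-element shape as in the snippet above; the values are dummies.
        return None, None, 0.0, done, {}


def run(factory, n_episodes, seed=0):
    """Run random-action episodes and return the number of steps in each."""
    rng = Random(seed)
    lengths = []
    for _ in range(n_episodes):
        factory.reset()
        done = False
        while not done:
            # One random action index per agent, bounded by that agent's space.
            actions = [rng.randint(0, space.n - 1) for space in factory.action_space]
            _, _, reward, done, info = factory.step(actions)
        lengths.append(factory.steps)
    return lengths


if __name__ == "__main__":
    print(run(StubFactory(), n_episodes=3))
```

Swapping the stub for the real environment should only require changing how `factory` is constructed, provided the real `step` keeps the same return shape.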
+
+
+Evaluating the run
+------------------
+
+If monitoring and recording are enabled, the environment states will be traced and recorded automatically.
+
+Plotting: at the moment, a plot of the evaluation score across episodes is generated automatically.