From cb44c7ea5da1f289616cf75bfa5ee774f79deb2d Mon Sep 17 00:00:00 2001
From: Chanumask <joelfriedrich@gmx.de>
Date: Tue, 5 Dec 2023 10:57:43 +0100
Subject: [PATCH] rewrote readme, usage and modifications rst

---
 README.md                     | 228 +++++++++-------------------------
 docs/source/modifications.rst |  58 +++++++++
 docs/source/usage.rst         |  27 +++-
 3 files changed, 143 insertions(+), 170 deletions(-)
diff --git a/README.md b/README.md
index da6344d..dc45c9d 100644
--- a/README.md
+++ b/README.md
@@ -1,176 +1,68 @@
-# EDYS
+# About EDYS
+
+### Tackling emergent dysfunctions (EDYs) in cooperation with Fraunhofer-IKS
+
+In collaboration with Fraunhofer-IKS, this project investigates Emergent Dysfunctions (EDYs)
+within multi-agent environments.
+
+### Project Objectives:
+
+- Create an environment that provokes emerging dysfunctions.
+
+  - This is achieved by creating a high level of background noise in the domain, where various entities perform diverse tasks,
+    resulting in a deliberately chaotic dynamic.
+  - The goal is to observe and analyze naturally occurring emergent dysfunctions within the complexity generated in this dynamic environment.
+
+
+- Observational Framework:
+
+  - The project introduces an environment that is designed to capture dysfunctions as they naturally occur.
+  - The environment allows for continuous monitoring of agent behaviors, actions, and interactions.
+  - Tracking emergent dysfunctions in real-time provides valuable data for analysis and understanding.
+
+
+- Compatibility
+  - The framework allows learning entities from different manufacturers and projects with varying representations
+    of actions and observations to interact seamlessly within the environment.
+
+
+- Placeholders
+  
+  - One can provide an agent with a placeholder observation that contains no information and offers no meaningful insights. 
+  - Later, when the environment expands and introduces additional entities available for observation, these new observations can be provided to the agent.
+  - This allows for processes such as retraining on an already initialized policy and fine-tuning to enhance the agent's performance based on the enriched information. 
 
-Tackling emergent dysfunctions (EDYs) in cooperation with Fraunhofer-IKS
 
 ## Setup
-Install this environment using `pip install marl-factory-grid`.
+Install this environment using `pip install marl-factory-grid`. For more information, see the [installation guide](docs/source/installation.rst).
+Refer to [quickstart](_quickstart) for specific scenarios.
 
-## First Steps
+## Usage
 
-### Quickstart
-Most of the env. objects (entites, rules and assets) can be loaded automatically. 
-Just define what your environment needs in a *yaml*-configfile like:
+The majority of environment objects, including entities, rules, and assets, can be loaded automatically. 
+Simply specify the requirements of your environment in a [*yaml*-configfile](marl_factory_grid/configs/default_config.yaml).
 
-<details><summary>Example ConfigFile</summary>    
-    
-    # Default Configuration File
-    
-    General:
-      # RNG-seed to sample the same "random" numbers every time, to make the different runs comparable.
-      env_seed: 69
-      # Individual vs global rewards
-      individual_rewards: true
-      # The level.txt file to load from marl_factory_grid/levels
-      level_name: large
-      # View Radius; 0 = full observatbility
-      pomdp_r: 3
-      # Print all messages and events
-      verbose: false
-      # Run tests
-      tests: false
-    
-    # Agents section defines the characteristics of different agents in the environment.
-    
-    # An Agent requires a list of actions and observations.
-    # Possible actions: Noop, Charge, Clean, DestAction, DoorUse, ItemAction, MachineAction, Move8, Move4, North, NorthEast, ...
-    # Possible observations: All, Combined, GlobalPosition, Battery, ChargePods, DirtPiles, Destinations, Doors, Items, Inventory, DropOffLocations, Maintainers, ...
-    # You can use 'clone' as the agent name to have multiple instances with either a list of names or an int specifying the number of clones.
-    Agents:
-      Wolfgang:
-        Actions:
-          - Noop
-          - Charge
-          - Clean
-          - DestAction
-          - DoorUse
-          - ItemAction
-          - Move8
-        Observations:
-          - Combined:
-              - Other
-              - Walls
-          - GlobalPosition
-          - Battery
-          - ChargePods
-          - DirtPiles
-          - Destinations
-          - Doors
-          - Items
-          - Inventory
-          - DropOffLocations
-          - Maintainers
-    
-    # Entities section defines the initial parameters and behaviors of different entities in the environment.
-    # Entities all spawn using coords_or_quantity, a number of entities or coordinates to place them.
-    Entities:
-      # Batteries: Entities representing power sources for agents.
-      Batteries:
-        initial_charge: 0.8
-        per_action_costs: 0.02
-    
-      # ChargePods: Entities representing charging stations for Batteries.
-      ChargePods:
-        coords_or_quantity: 2
-    
-      # Destinations: Entities representing target locations for agents.
-      # - spawn_mode: GROUPED or SINGLE. Determines how destinations are spawned.
-      Destinations:
-        coords_or_quantity: 1
-        spawn_mode: GROUPED
-    
-      # DirtPiles: Entities representing piles of dirt.
-      # - initial_amount: Initial amount of dirt in each pile.
-      # - clean_amount: Amount of dirt cleaned in each cleaning action.
-      # - dirt_spawn_r_var: Random variation in dirt spawn amounts.
-      # - max_global_amount: Maximum total amount of dirt allowed in the environment.
-      # - max_local_amount: Maximum amount of dirt allowed in one position.
-      DirtPiles:
-        coords_or_quantity: 10
-        initial_amount: 2
-        clean_amount: 1
-        dirt_spawn_r_var: 0.1
-        max_global_amount: 20
-        max_local_amount: 5
-    
-      # Doors are spawned using the level map.
-      Doors:
-    
-      # DropOffLocations: Entities representing locations where agents can drop off items.
-      # - max_dropoff_storage_size: Maximum storage capacity at each drop-off location.
-      DropOffLocations:
-        coords_or_quantity: 1
-        max_dropoff_storage_size: 0
-    
-      # GlobalPositions.
-      GlobalPositions: { }
-    
-      # Inventories: Entities representing inventories for agents.
-      Inventories: { }
-    
-      # Items: Entities representing items in the environment.
-      Items:
-        coords_or_quantity: 5
-    
-      # Machines: Entities representing machines in the environment.
-      Machines:
-        coords_or_quantity: 2
-    
-      # Maintainers: Entities representing maintainers that aim to maintain machines.
-      Maintainers:
-        coords_or_quantity: 1
-    
-      # Zones: Entities representing zones in the environment.
-      Zones: { }
-    
-    
-    # Rules section specifies the rules governing the dynamics of the environment.
-    Rules:
-      # Environment Dynamics
-      # When stepping over a dirt pile, entities carry a ratio of the dirt to their next position
-      EntitiesSmearDirtOnMove:
-        smear_ratio: 0.2
-      # Doors automatically close after a certain number of time steps
-      DoorAutoClose:
-        close_frequency: 10
-      # Maintainers move at every time step
-      MoveMaintainers:
-    
-      # Respawn Stuff
-      # Define how dirt should respawn after the initial spawn
-      RespawnDirt:
-        respawn_freq: 15
-      # Define how items should respawn after the initial spawn
-      RespawnItems:
-        respawn_freq: 15
-    
-      # Utilities
-      # This rule defines the collision mechanic, introduces a related DoneCondition and lets you specify rewards.
-      # Can be omitted/ignored if you do not want to take care of collisions at all.
-      WatchCollisions:
-        done_at_collisions: false
-    
-      # Done Conditions
-      # Define the conditions for the environment to stop. Either success or a fail conditions.
-      # The environment stops when an agent reaches a destination
-      DoneAtDestinationReach:
-      # The environment stops when all dirt is cleaned
-      DoneOnAllDirtCleaned:
-      # The environment stops when a battery is discharged
-      DoneAtBatteryDischarge:
-      # The environment stops when a maintainer reports a collision
-      DoneAtMaintainerCollision:
-      # The environment stops after max steps
-      DoneAtMaxStepsReached:
-        max_steps: 500
+If you only plan on using the environment without making any modifications, use ``quickstart_use``.
+This creates a default config-file and another one that lists all possible options of the environment.
+Also, it generates an initial script where an agent is executed in the specified environment.
+For further details on utilizing the environment, refer to the documentation [here](docs/source/usage.rst).
 
-   </details>
+Existing modules include a variety of functionalities within the environment:
+- [Agents](marl_factory_grid/algorithms) implement either static strategies or learning algorithms based on the specific configuration.
+- Their action set includes opening [doors](marl_factory_grid/modules/doors/entitites.py), cleaning
+[dirt](marl_factory_grid/modules/clean_up/entitites.py), picking up [items](marl_factory_grid/modules/items/entitites.py) and 
+delivering them to designated drop-off locations.
+- Agents are equipped with a [battery](marl_factory_grid/modules/batteries/entitites.py) that gradually depletes over time if not charged at a chargepod.
+- The [maintainer](marl_factory_grid/modules/maintenance/entities.py) aims to repair [machines](marl_factory_grid/modules/machines/entitites.py) that lose health over time.
 
-Have a look in [\quickstart](./quickstart) for further configuration examples.
+## Customization
 
-### Make it your own
+If you plan on modifying the environment, for example by adding entities or rules, use ``quickstart_modify``.
+This creates a template module and a script that runs an agent, incorporating the generated module.
+More information on how to modify the levels, entities, groups, rules and assets is available [here](docs/source/modifications.rst).
 
-#### Levels
-Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see [./environment/levels](./environment/levels) for examples).
+### Levels
+Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see [levels](marl_factory_grid/levels) for examples).
 Define which *level* to use in your *configfile* as: 
 ```yaml
 General:
@@ -180,16 +72,16 @@ General:
 Make sure to use `#` as [Walls](marl_factory_grid/environment/entity/wall.py), `-` as free (walkable) floor, `D` for [Walls](./modules/doors/entities.py).
 Other Entites (define you own) may bring their own `Symbols`
 
-#### Entites
+### Entities
 Entites are [Objects](marl_factory_grid/environment/entity/object.py) that can additionally be assigned a position.
 Abstract Entities are provided.
 
-#### Groups
+### Groups
 [Groups](marl_factory_grid/environment/groups/objects.py) are entity Sets that provide administrative access to all group members. 
 All [Entites](marl_factory_grid/environment/entity/global_entities.py) are available at runtime as EnvState property.
 
 
-#### Rules
+### Rules
 [Rules](marl_factory_grid/environment/entity/object.py) define how the environment behaves on microscale.
 Each of the hookes (`on_init`, `pre_step`, `on_step`, '`post_step`', `on_done`) 
 provide env-access to implement customn logic, calculate rewards, or gather information.
@@ -198,7 +90,7 @@ provide env-access to implement customn logic, calculate rewards, or gather info
 
 [Results](marl_factory_grid/environment/entity/object.py) provide a way to return `rule` evaluations such as rewards and state reports 
 back to the environment.
-#### Assets
+### Assets
 Make sure to bring your own assets for each Entity living in the Gridworld as the `Renderer` relies on it.
 PNG-files (transparent background) of square aspect-ratio should do the job, in general.
 
@@ -207,5 +99,3 @@ PNG-files (transparent background) of square aspect-ratio should do the job, in
 <html &nbsp&nbsp&nbsp&nbsp html> 
 <img src="/marl_factory_grid/environment/assets/agent/agent.png"  width="5%">
 
-
-
diff --git a/docs/source/modifications.rst b/docs/source/modifications.rst
index c811c93..a0db0c0 100644
--- a/docs/source/modifications.rst
+++ b/docs/source/modifications.rst
@@ -1,5 +1,63 @@
 How to modify the environment or write modules
 ===============================================
 
+Modifying levels
+----------------
+Varying levels are created by defining Walls, Floor or Doors in *.txt*-files (see `levels <marl_factory_grid/levels>`_ for examples).
+Define which *level* to use in your *config file* as::
+
+    General:
+      level_name: rooms  # 'double', 'large', 'simple', ...
+
+... or create your own, maybe with the help of `asciiflow.com <https://asciiflow.com/#/>`_.
+Make sure to use ``#`` for `Walls`_, ``-`` for free (walkable) floor, and ``D`` for `Doors`_.
+Other Entities (define your own) may bring their own ``Symbols``.
+
+.. _Walls: marl_factory_grid/environment/entity/wall.py
+.. _Doors: modules/doors/entities.py
 
 
+Modifying Entities
+------------------
+Entities are `Objects`_ that can additionally be assigned a position.
+Abstract Entities are provided.
+If you wish to introduce new entities to the environment, just create a new module, ...
+
+.. _Objects: marl_factory_grid/environment/entity/object.py
+
+Modifying Groups
+----------------
+`Groups`_ are sets of entities that provide administrative access to all group members.
+All `Entities`_ are available at runtime as ``EnvState`` properties.
+
+.. _Groups: marl_factory_grid/environment/groups/objects.py
+.. _Entities: marl_factory_grid/environment/entity/global_entities.py
+
+Modifying Rules
+----------------
+`Rules`_ define how the environment behaves on a micro scale.
+Each of the hooks (``on_init``, ``pre_step``, ``on_step``, ``post_step``, ``on_done``)
+provides env-access to implement custom logic, calculate rewards, or gather information.
+If you wish to introduce new rules to the environment....
+
+.. _Rules: marl_factory_grid/environment/entity/object.py
+
+.. image:: ./images/Hooks_FIKS.png
+   :alt: Hooks Image
+
+Modifying Results
+-----------------
+`Results`_ provide a way to return ``rule`` evaluations such as rewards and state reports
+back to the environment.
+
+.. _Results: marl_factory_grid/utils/results.py
+
+Modifying Assets
+----------------
+Make sure to bring your own assets for each Entity living in the Gridworld, as the ``Renderer`` relies on them.
+PNG files (transparent background) with a square aspect ratio should generally do the job.
+
+.. image:: ./marl_factory_grid/environment/assets/wall.png
+   :alt: Wall Image
+.. image:: ./marl_factory_grid/environment/assets/agent/agent.png
+   :alt: Agent Image
diff --git a/docs/source/usage.rst b/docs/source/usage.rst
index ac74091..2b4ea45 100644
--- a/docs/source/usage.rst
+++ b/docs/source/usage.rst
@@ -1,3 +1,28 @@
-How to use the environment with your agents
+Using the environment with your agents
 ===========================================
 
+Environment objects, including agents, entities and rules, that are specified in a *yaml*-configfile will be loaded automatically.
+Using ``quickstart_use`` creates a default config-file and another one that lists all possible options of the environment.
+Also, it generates an initial script where an agent is executed in the environment specified by the config-file.
+
+The script initializes the environment, wraps it for monitoring and recording, and runs the reinforcement learning loop::
+
+    path = Path('marl_factory_grid/configs/default_config.yaml')
+    factory = Factory(path)
+    factory = EnvMonitor(factory)
+    factory = EnvRecorder(factory)
+    for episode in trange(10):
+        _ = factory.reset()
+        done = False
+        if render:
+            factory.render()
+        action_spaces = factory.action_space
+        agents = []
+        while not done:
+            a = [randint(0, x.n - 1) for x in action_spaces]
+            obs_type, _, reward, done, info = factory.step(a)
+            if render:
+                factory.render()
+            if done:
+                print(f'Episode {episode} done...')
+                break