intermediate backup

2025-05-02 14:36:19 +02:00
parent 980696aef5
commit 2b0a5728d4
16 changed files with 2780 additions and 316 deletions

rules.md Normal file

@@ -0,0 +1,68 @@
## Coding Style Rules & Paradigms
### Configuration Driven
* Uses **Pydantic** heavily (`utils/models.py`) to define configuration schemas.
* Configuration is loaded from a **YAML file** (`config.yaml`) at runtime (`main.py`).
* The `Config` object (or relevant sub-configs) is **passed down** through function calls, making parameters explicit.
* A **template configuration** (`_config.yaml`) is often included within the package.
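
A minimal sketch of this pattern, assuming hypothetical field names (`input_dir`, `threshold`) and a hypothetical `load_config` helper; only `Config`, `debug`, and `allow_segmentation_errors` are named in these rules, and placing the latter in a `processing` sub-config is an assumption. The YAML is parsed from an inline string here rather than from `config.yaml` so the example stays self-contained.

```python
import yaml
from pydantic import BaseModel, ValidationError


class ProcessingConfig(BaseModel):
    """Hypothetical sub-config; field names are illustrative only."""

    threshold: float = 0.5
    allow_segmentation_errors: bool = False


class Config(BaseModel):
    """Hypothetical top-level schema mirroring the YAML layout."""

    input_dir: str
    debug: bool = False
    processing: ProcessingConfig = ProcessingConfig()


def load_config(yaml_text: str) -> Config:
    """Parse YAML text and validate it against the schema."""
    raw = yaml.safe_load(yaml_text) or {}
    # Raises pydantic.ValidationError if a field is missing or mistyped.
    return Config(**raw)


def run_processing(cfg: ProcessingConfig) -> str:
    """Receives only the sub-config it needs, keeping parameters explicit."""
    return f"threshold={cfg.threshold}"


EXAMPLE_YAML = """
input_dir: data/
debug: true
processing:
  threshold: 0.8
"""

config = load_config(EXAMPLE_YAML)
summary = run_processing(config.processing)
```

Passing `config.processing` rather than the whole `Config` keeps each function's dependencies visible in its signature.
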
### Modularity
* Code is organized into **logical sub-packages** (`io`, `processing`, `pipeline`, `visualization`, `synthesis`, `utils`, `validation`).
* Each sub-package has an `__init__.py`, often used to **expose key functions/classes** to the parent level.
* **Helper functions** (often internal, prefixed with `_`) are frequently used to break down complex logic within modules (e.g., `processing/surface_helper.py`, `pipeline/runner.py` helpers).
### Logging
* Uses the standard **`logging` library**.
* Loggers are obtained per module using `logger = logging.getLogger(__name__)`.
* **Logging levels** (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`) are used semantically:
    * `DEBUG`: Verbose internal steps.
    * `INFO`: Major milestones/stages.
    * `WARNING`: Recoverable issues or deviations.
    * `ERROR`: Specific failures that might be handled.
    * `CRITICAL`: Fatal errors causing exits.
* **Root logger configuration** happens in `main.py`, potentially adjusted based on the `debug` flag in the config.
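
The logging setup described above could look roughly like the following sketch. The function names `configure_root_logger` and `process_item` and the log format are assumptions for illustration; only the per-module `getLogger(__name__)` call and the `debug` flag come from the rules.

```python
import logging

# Per-module logger, as each module would declare it.
logger = logging.getLogger(__name__)


def configure_root_logger(debug: bool) -> int:
    """Configure the root logger once and return the level that was applied."""
    level = logging.DEBUG if debug else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s: %(message)s",
        force=True,  # replace handlers from earlier configuration (Python 3.8+)
    )
    return level


def process_item(name: str) -> None:
    """Log a milestone at INFO and internal detail at DEBUG."""
    logger.info("Processing %s", name)
    logger.debug("Internal details for %s", name)


applied_level = configure_root_logger(debug=True)
process_item("example_series")
```
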
### Error Handling ("Fail Hard but Helpful")
* The main entry point (`main.py`) uses a **top-level `try...except` block** to catch major failures during config loading or pipeline execution.
* **Critical errors** are logged with tracebacks (`exc_info=True`) and result in `sys.exit(1)`.
* Functions often return a **tuple that pairs a result with a status**, either `(result_data, error_message)` or `(success_flag, result_data)`, so callers can decide how to react to a failure.
* Lower-level functions may log errors/warnings but **allow processing to continue** if feasible and configured (e.g., `allow_segmentation_errors`).
* **Specific exceptions** are caught where appropriate (`FileNotFoundError`, `pydicom.errors.InvalidDicomError`, `ValueError`, etc.).
* **Pydantic validation errors** during config loading are treated as critical.
### Typing
* Consistent use of **Python type hints** (`typing` module: `Optional`, `Dict`, `List`, `Tuple`, `Union`, `Callable`, `Literal`, etc.).
* **Pydantic models** rely heavily on type hints for validation.
### Data Structures
* **Pydantic models** define primary configuration and result structures (e.g., `Config`, `ProcessingResult`, `CombinedDicomDataset`).
* **NumPy arrays** are fundamental for image/volume data.
* **Pandas DataFrames** are used for aggregating results, metadata, and creating reports (Excel).
* Standard **Python dictionaries** are used extensively for metadata and intermediate data passing.
### Naming Conventions
* Follows **PEP 8**: `snake_case` for variables and functions, `PascalCase` for classes.
* Internal helper functions are typically prefixed with an **underscore (`_`)**.
* Constants are defined in **`UPPER_SNAKE_CASE`** (often in a dedicated `utils/constants.py`).
### Documentation
* **Docstrings** are present for most functions and classes, explaining purpose, arguments (`Args:`), and return values (`Returns:`).
* Minimal **inline comments**; code aims to be self-explanatory, with docstrings providing higher-level context.
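
As an illustration of the typing, naming, and docstring rules together, here is a short sketch. The function, the helper, and the constant are invented for this example and do not come from the codebase.

```python
from typing import List, Optional

# Hypothetical constant; in the project it would live in utils/constants.py.
DEFAULT_SLICE_THICKNESS_MM = 1.0


def _to_millimetres(value_cm: float) -> float:
    """Convert centimetres to millimetres."""
    return value_cm * 10.0


def total_thickness_mm(
    slice_thicknesses_cm: List[float],
    fallback_mm: Optional[float] = None,
) -> float:
    """Sum slice thicknesses and return the total in millimetres.

    Args:
        slice_thicknesses_cm: Thickness of each slice in centimetres.
        fallback_mm: Value returned for an empty list. Defaults to
            DEFAULT_SLICE_THICKNESS_MM when omitted.

    Returns:
        The summed thickness in millimetres, or the fallback when the
        list is empty.
    """
    if not slice_thicknesses_cm:
        return DEFAULT_SLICE_THICKNESS_MM if fallback_mm is None else fallback_mm
    return sum(_to_millimetres(t) for t in slice_thicknesses_cm)
```
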
### Dependencies
* Managed via `requirements.txt`.
* Uses standard **scientific Python stack** (`numpy`, `pandas`, `scipy`, `scikit-image`, `matplotlib`), **domain-specific libraries** (`pydicom`), **utility libraries** (`PyYAML`, `joblib`, `tqdm`, `openpyxl`), and `pydantic` for configuration/validation.
### Parallelism
* Uses **`joblib`** for parallel processing, configurable via the main config (`mainprocess_core_count`, `subprocess_core_count`).
* Parallelism can be **disabled** via configuration or debug mode.