## Coding Style Rules & Paradigms

### Configuration Driven

* Uses **Pydantic** heavily (`utils/models.py`) to define configuration schemas.
* Configuration is loaded from a **YAML file** (`config.yaml`) at runtime (`main.py`).
* The `Config` object (or relevant sub-configs) is **passed down** through function calls, making parameters explicit.
* A **template configuration** (`_config.yaml`) is often included within the package.

### Modularity

* Code is organized into **logical sub-packages** (`io`, `processing`, `pipeline`, `visualization`, `synthesis`, `utils`, `validation`).
* Each sub-package has an `__init__.py`, often used to **expose key functions/classes** to the parent level.
* **Helper functions** (often internal, prefixed with `_`) are frequently used to break down complex logic within modules (e.g., `processing/surface_helper.py`, `pipeline/runner.py` helpers).

### Logging

* Uses the standard **`logging` library**.
* Loggers are obtained per module using `logger = logging.getLogger(__name__)`.
* **Logging levels** (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`) are used semantically:
  * `DEBUG`: Verbose internal steps.
  * `INFO`: Major milestones/stages.
  * `WARNING`: Recoverable issues or deviations.
  * `ERROR`: Specific failures that might be handled.
  * `CRITICAL`: Fatal errors causing exits.
* **Root logger configuration** happens in `main.py`, potentially adjusted based on the `debug` flag in the config.

### Error Handling ("Fail Hard but Helpful")

* The main entry point (`main.py`) uses a **top-level `try...except` block** to catch major failures during config loading or pipeline execution.
* **Critical errors** are logged with tracebacks (`exc_info=True`) and result in `sys.exit(1)`.
* Functions often return a **tuple indicating success/failure** and results/error messages (e.g., `(result_data, error_message)` or `(success_flag, result_data)`).
* Lower-level functions may log errors/warnings but **allow processing to continue** if feasible and configured (e.g., `allow_segmentation_errors`).
* **Specific exceptions** are caught where appropriate (`FileNotFoundError`, `pydicom.errors.InvalidDicomError`, `ValueError`, etc.).
* **Pydantic validation errors** during config loading are treated as critical.

### Typing

* Consistent use of **Python type hints** (`typing` module: `Optional`, `Dict`, `List`, `Tuple`, `Union`, `Callable`, `Literal`, etc.).
* **Pydantic models** rely heavily on type hints for validation.

### Data Structures

* **Pydantic models** define primary configuration and result structures (e.g., `Config`, `ProcessingResult`, `CombinedDicomDataset`).
* **NumPy arrays** are fundamental for image/volume data.
* **Pandas DataFrames** are used for aggregating results, metadata, and creating reports (Excel).
* Standard **Python dictionaries** are used extensively for metadata and intermediate data passing.

### Naming Conventions

* Follows **PEP 8**: `snake_case` for variables and functions, `PascalCase` for classes.
* Internal helper functions are typically prefixed with an **underscore (`_`)**.
* Constants are defined in **`UPPER_SNAKE_CASE`** (often in a dedicated `utils/constants.py`).

### Documentation

* **Docstrings** are present for most functions and classes, explaining purpose, arguments (`Args:`), and return values (`Returns:`).
* Minimal **inline comments**; code aims to be self-explanatory, with docstrings providing higher-level context.

### Dependencies

* Managed via `requirements.txt`.
* Uses the standard **scientific Python stack** (`numpy`, `pandas`, `scipy`, `scikit-image`, `matplotlib`), **domain-specific libraries** (`pydicom`), **utility libraries** (`PyYAML`, `joblib`, `tqdm`, `openpyxl`), and `pydantic` for configuration/validation.
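The configuration-driven loading and "fail hard but helpful" handling described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the field names (`debug`, `allow_segmentation_errors`) are borrowed from the text, while `ProcessingConfig`, `load_config`, and `main` are hypothetical stand-ins for what lives in `utils/models.py` and `main.py`.

```python
import logging
import sys

import yaml
from pydantic import BaseModel, ValidationError

logger = logging.getLogger(__name__)


class ProcessingConfig(BaseModel):
    # Hypothetical sub-config; the real schemas live in utils/models.py.
    allow_segmentation_errors: bool = False


class Config(BaseModel):
    debug: bool = False
    processing: ProcessingConfig = ProcessingConfig()


def load_config(path: str) -> Config:
    """Load and validate the YAML configuration.

    Args:
        path: Path to the YAML config file.

    Returns:
        A validated Config instance.
    """
    with open(path) as f:
        raw = yaml.safe_load(f)
    return Config(**raw)


def main() -> None:
    try:
        config = load_config("config.yaml")
    except (FileNotFoundError, ValidationError) as exc:
        # Validation errors are treated as critical: log with traceback, exit.
        logger.critical("Failed to load configuration: %s", exc, exc_info=True)
        sys.exit(1)
    logger.info("Configuration loaded (debug=%s)", config.debug)
```

Because Pydantic validates at construction time, a malformed `config.yaml` fails immediately at startup rather than surfacing as an obscure error mid-pipeline.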
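Several of these conventions (type hints, `(result, error_message)` tuple returns, per-module loggers, `Args:`/`Returns:` docstrings, underscore-prefixed helpers, and `UPPER_SNAKE_CASE` constants) combine naturally in a single function. The following sketch is illustrative only; `compute_mean_intensity`, `_validate_volume`, and `MIN_SLICE_COUNT` are invented names, not part of the codebase.

```python
import logging
from typing import Optional, Tuple

import numpy as np

logger = logging.getLogger(__name__)

# Constants in UPPER_SNAKE_CASE (cf. utils/constants.py).
MIN_SLICE_COUNT = 2


def _validate_volume(volume: np.ndarray) -> bool:
    """Internal helper: check that a volume is 3D with enough slices."""
    return volume.ndim == 3 and volume.shape[0] >= MIN_SLICE_COUNT


def compute_mean_intensity(volume: np.ndarray) -> Tuple[Optional[float], Optional[str]]:
    """Compute the mean voxel intensity of a volume.

    Args:
        volume: 3D NumPy array of image data.

    Returns:
        A (result, error_message) tuple; exactly one element is None.
    """
    if not _validate_volume(volume):
        msg = f"Expected a 3D volume with >= {MIN_SLICE_COUNT} slices, got shape {volume.shape}"
        # Recoverable issue: warn and return an error message so the
        # caller can decide whether to continue.
        logger.warning(msg)
        return None, msg
    return float(volume.mean()), None
```

The tuple-return pattern lets callers aggregate per-item failures (e.g., into a report DataFrame) instead of aborting the whole run on the first bad input.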
### Parallelism

* Uses **`joblib`** for parallel processing, configurable via the main config (`mainprocess_core_count`, `subprocess_core_count`).
* Parallelism can be **disabled** via configuration or debug mode.
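A minimal sketch of this joblib pattern, assuming a per-case worker function; `run_cases` and `_process_case` are hypothetical names, and the `core_count`/`debug` parameters stand in for the config's core-count settings and debug flag.

```python
from typing import List

from joblib import Parallel, delayed


def _process_case(case_id: int) -> int:
    """Hypothetical per-case worker; the real pipeline processes imaging cases."""
    return case_id * case_id


def run_cases(case_ids: List[int], core_count: int, debug: bool = False) -> List[int]:
    """Run cases in parallel, honoring the configured core count.

    Args:
        case_ids: Identifiers of the cases to process.
        core_count: Worker count taken from the configuration.
        debug: If True, force serial execution (n_jobs=1).

    Returns:
        Per-case results in input order.
    """
    # Debug mode disables parallelism, which keeps stack traces readable.
    n_jobs = 1 if debug else core_count
    return Parallel(n_jobs=n_jobs)(delayed(_process_case)(c) for c in case_ids)
```

Forcing `n_jobs=1` in debug mode is a common choice because serial execution preserves log ordering and makes exceptions trivially attributable to a single case.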