entrix_case_challange/rules.md
2025-05-02 14:36:19 +02:00

Coding Style Rules & Paradigms

Configuration Driven

  • Uses Pydantic heavily (utils/models.py) to define configuration schemas.
  • Configuration is loaded from a YAML file (config.yaml) at runtime (main.py).
  • The Config object (or relevant sub-configs) is passed down through function calls, making parameters explicit.
  • A template configuration (_config.yaml) is often included within the package.
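A minimal sketch of this pattern, assuming hypothetical field names (`debug`, `input_dir`, `allow_segmentation_errors` is borrowed from the error-handling rules below; the real schemas live in `utils/models.py`):

```python
from pathlib import Path
from pydantic import BaseModel

# Illustrative sub-config and top-level Config; not the actual utils/models.py schemas.
class ProcessingConfig(BaseModel):
    allow_segmentation_errors: bool = False

class Config(BaseModel):
    debug: bool = False
    input_dir: Path = Path(".")
    processing: ProcessingConfig = ProcessingConfig()

# In main.py the dict would come from yaml.safe_load(open("config.yaml"));
# it is inlined here to keep the sketch self-contained.
raw = {"debug": True, "processing": {"allow_segmentation_errors": True}}
config = Config(**raw)

# Sub-configs are passed explicitly down the call stack:
def run_processing(cfg: ProcessingConfig) -> bool:
    return cfg.allow_segmentation_errors

run_processing(config.processing)
```

Invalid YAML values fail Pydantic validation at load time, which is what makes config errors critical rather than silent.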

Modularity

  • Code is organized into logical sub-packages (io, processing, pipeline, visualization, synthesis, utils, validation).
  • Each sub-package has an __init__.py, often used to expose key functions/classes to the parent level.
  • Helper functions (often internal, prefixed with _) are frequently used to break down complex logic within modules (e.g., processing/surface_helper.py, pipeline/runner.py helpers).

Logging

  • Uses the standard logging library.
  • Loggers are obtained per module using logger = logging.getLogger(__name__).
  • Logging levels (DEBUG, INFO, WARNING, ERROR, CRITICAL) are used semantically:
    • DEBUG: Verbose internal steps.
    • INFO: Major milestones/stages.
    • WARNING: Recoverable issues or deviations.
    • ERROR: Specific failures that might be handled.
    • CRITICAL: Fatal errors causing exits.
  • Root logger configuration happens in main.py, potentially adjusted based on the debug flag in the config.
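The pattern can be sketched as follows (the format string and function name are illustrative, not taken from `main.py`):

```python
import logging

# Module-level logger, as each module in the package obtains it:
logger = logging.getLogger(__name__)

def configure_root_logger(debug: bool) -> None:
    """Root configuration as done in main.py; the debug flag is assumed
    to come from the loaded config."""
    logging.basicConfig(
        level=logging.DEBUG if debug else logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s: %(message)s",
        force=True,  # allow reconfiguration when the debug flag changes
    )

configure_root_logger(debug=False)
logger.debug("verbose internal step")    # suppressed at INFO level
logger.info("stage finished")            # major milestone
logger.warning("recoverable deviation")  # processing continues
```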

Error Handling ("Fail Hard but Helpful")

  • The main entry point (main.py) uses a top-level try...except block to catch major failures during config loading or pipeline execution.
  • Critical errors are logged with tracebacks (exc_info=True) and result in sys.exit(1).
  • Functions often return a tuple indicating success/failure and results/error messages (e.g., (result_data, error_message) or (success_flag, result_data)).
  • Lower-level functions may log errors/warnings but allow processing to continue if feasible and configured (e.g., allow_segmentation_errors).
  • Specific exceptions are caught where appropriate (FileNotFoundError, pydicom.errors.InvalidDicomError, ValueError, etc.).
  • Pydantic validation errors during config loading are treated as critical.
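A condensed sketch of the tuple-return convention and the top-level guard; `load_series` and its payload are hypothetical names, not functions from the codebase:

```python
import logging
import sys
from typing import Dict, Optional, Tuple

logger = logging.getLogger(__name__)

def load_series(path: str) -> Tuple[Optional[Dict[str, str]], Optional[str]]:
    """Tuple-return convention: (result_data, error_message)."""
    try:
        if not path.endswith(".dcm"):
            raise ValueError(f"not a DICOM file: {path}")
        return {"path": path}, None
    except (FileNotFoundError, ValueError) as exc:
        # Log and return the error so the caller can decide whether to continue.
        logger.error("loading failed: %s", exc)
        return None, str(exc)

def main() -> None:
    """Top-level 'fail hard but helpful' guard, as in main.py."""
    try:
        data, error = load_series("scan.dcm")
        if error is not None:
            raise RuntimeError(error)
    except Exception:
        logger.critical("pipeline failed", exc_info=True)
        sys.exit(1)
```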

Typing

  • Consistent use of Python type hints (typing module: Optional, Dict, List, Tuple, Union, Callable, Literal, etc.).
  • Pydantic models rely heavily on type hints for validation.
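An illustrative signature in this style (the function itself is invented):

```python
from typing import Dict, List, Literal, Optional, Tuple

def summarize(
    values: List[float],
    mode: Literal["mean", "max"] = "mean",
) -> Tuple[Optional[float], Dict[str, int]]:
    """Return a summary statistic plus counts metadata."""
    if not values:
        return None, {"count": 0}
    stat = sum(values) / len(values) if mode == "mean" else max(values)
    return stat, {"count": len(values)}
```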

Data Structures

  • Pydantic models define primary configuration and result structures (e.g., Config, ProcessingResult, CombinedDicomDataset).
  • NumPy arrays are fundamental for image/volume data.
  • Pandas DataFrames are used for aggregating results, metadata, and creating reports (Excel).
  • Standard Python dictionaries are used extensively for metadata and intermediate data passing.
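The dict-to-DataFrame aggregation step might look like this; the record fields are invented for illustration:

```python
import pandas as pd

# Per-case results collected as plain dicts, then aggregated for reporting.
records = [
    {"patient_id": "P01", "status": "ok", "volume_mm3": 1520.0},
    {"patient_id": "P02", "status": "failed", "volume_mm3": None},
]
df = pd.DataFrame.from_records(records)

# An Excel report would then be written via openpyxl:
# df.to_excel("report.xlsx", index=False)
```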

Naming Conventions

  • Follows PEP 8: snake_case for variables and functions, PascalCase for classes.
  • Internal helper functions are typically prefixed with an underscore (_).
  • Constants are defined in UPPER_SNAKE_CASE (often in a dedicated utils/constants.py).

Documentation

  • Docstrings are present for most functions and classes, explaining purpose, arguments (Args:), and return values (Returns:).
  • Minimal inline comments; code aims to be self-explanatory, with docstrings providing higher-level context.
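An example of the docstring format (the function is hypothetical):

```python
def resample_volume(volume_shape, spacing):
    """Resample a volume to isotropic 1 mm spacing.

    Args:
        volume_shape: Shape of the input volume as (z, y, x).
        spacing: Current voxel spacing in millimetres, per axis.

    Returns:
        The new shape after resampling to 1 mm spacing.
    """
    return tuple(round(n * s) for n, s in zip(volume_shape, spacing))
```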

Dependencies

  • Managed via requirements.txt.
  • Uses the standard scientific Python stack (numpy, pandas, scipy, scikit-image, matplotlib), domain-specific libraries (pydicom), utility libraries (PyYAML, joblib, tqdm, openpyxl), and pydantic for configuration and validation.

Parallelism

  • Uses joblib for parallel processing, configurable via the main config (mainprocess_core_count, subprocess_core_count).
  • Parallelism can be disabled via configuration or debug mode.
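A minimal sketch of the joblib usage, assuming a hypothetical `process_case` work function; in the real pipeline, `core_count` would come from the config keys named above:

```python
from joblib import Parallel, delayed

def process_case(case_id: int) -> int:
    return case_id * case_id  # stand-in for the real per-case work

def run(case_ids, core_count: int, debug: bool):
    # Debug mode falls back to sequential execution (n_jobs=1),
    # which keeps tracebacks readable and deterministic.
    n_jobs = 1 if debug else core_count
    return Parallel(n_jobs=n_jobs)(delayed(process_case)(c) for c in case_ids)
```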