Overview
========

.. verified:: 2025-11-25
   :reviewer: Christof Buchbender

What is ops-db?
---------------

ops-db is the core PostgreSQL database schema and SQLAlchemy ORM layer that serves as the **operational brain** of the CCAT Data Center. It tracks everything from observation planning → execution → data movement → archival → publication.

The database is implemented using SQLAlchemy ORM, providing a Python interface to all operational data. It serves as the backend for two key interfaces:

* **ops-db-api** - RESTful API for programmatic access by instruments and automated systems
* **ops-db-ui** - Web interface for browsing observations and monitoring system state

Design Philosophy
-----------------

**Single Source of Truth**
   All operational metadata lives in ops-db, avoiding duplication across systems. This ensures consistency and simplifies data governance.

**Polymorphic Models**
   Many entity types use SQLAlchemy's polymorphic inheritance to accommodate different subtypes:

   * :py:class:`~ccat_ops_db.models.Source` has
     :py:class:`~ccat_ops_db.models.FixedSource`,
     :py:class:`~ccat_ops_db.models.SolarSystemObject`, and
     :py:class:`~ccat_ops_db.models.ConstantElevationSource` subtypes
   * :py:class:`~ccat_ops_db.models.DataLocation` has
     :py:class:`~ccat_ops_db.models.DiskDataLocation`,
     :py:class:`~ccat_ops_db.models.S3DataLocation`, and
     :py:class:`~ccat_ops_db.models.TapeDataLocation` subtypes
   * :py:class:`~ccat_ops_db.models.PhysicalCopy` has
     :py:class:`~ccat_ops_db.models.RawDataFilePhysicalCopy`,
     :py:class:`~ccat_ops_db.models.RawDataPackagePhysicalCopy`, and
     :py:class:`~ccat_ops_db.models.DataTransferPackagePhysicalCopy` subtypes

**Physical Copy Tracking**
   The database tracks not just what data exists, but **WHERE** it physically exists through :py:class:`~ccat_ops_db.models.PhysicalCopy` models. This enables safe deletion, staged unpacking, and complete audit trails.

**Status-Driven Workflows**
   Entities use status enums (:py:class:`~ccat_ops_db.models.Status`, :py:class:`~ccat_ops_db.models.PackageState`, :py:class:`~ccat_ops_db.models.PhysicalCopyStatus`) to track processing state, enabling automated workflows and retry logic.

**Relationship-Rich**
   Heavy use of SQLAlchemy relationships maintains referential integrity and enables efficient queries across related entities.

Major Entity Categories
-----------------------

The database organizes data into several major categories:

**Observatory Infrastructure**
   The physical telescope, instruments, and modules that produce data: :py:class:`~ccat_ops_db.models.Observatory`, :py:class:`~ccat_ops_db.models.Telescope`, :py:class:`~ccat_ops_db.models.Instrument`, :py:class:`~ccat_ops_db.models.InstrumentModule`.
   See :doc:`observatory_hierarchy` for details.

**Scientific Planning**
   Observing programs, sub-programs, and observation units that define what to observe: :py:class:`~ccat_ops_db.models.ObservingProgram`, :py:class:`~ccat_ops_db.models.SubObservingProgram`, :py:class:`~ccat_ops_db.models.ObsUnit`, :py:class:`~ccat_ops_db.models.Source`, :py:class:`~ccat_ops_db.models.ObsMode`.
   See :doc:`observation_model` for details.

**Execution Tracking**
   Records of actual observations with timing, conditions, and status: :py:class:`~ccat_ops_db.models.ExecutedObsUnit`.
   See :doc:`observation_model` for details.

**Data Management**
   Files, packages, and physical copies across multiple storage locations: :py:class:`~ccat_ops_db.models.RawDataFile`, :py:class:`~ccat_ops_db.models.RawDataPackage`, :py:class:`~ccat_ops_db.models.DataTransferPackage`, :py:class:`~ccat_ops_db.models.PhysicalCopy`.
   See :doc:`data_model` for details.

**Transfer Infrastructure**
   Sites, locations, and routes that define how data moves through the system: :py:class:`~ccat_ops_db.models.Site`, :py:class:`~ccat_ops_db.models.DataLocation`, :py:class:`~ccat_ops_db.models.DataTransferRoute`.
   See :doc:`location_model` and :doc:`transfer_model` for details.

**Archival & Staging**
   Long-term archive transfers and staging jobs for processing: :py:class:`~ccat_ops_db.models.LongTermArchiveTransfer`, :py:class:`~ccat_ops_db.models.StagingJob`.
   See :doc:`transfer_model` for details.

**Access Control**
   Users, roles, and API tokens for authentication and authorization: :py:class:`~ccat_ops_db.models.User`, :py:class:`~ccat_ops_db.models.Role`, :py:class:`~ccat_ops_db.models.ApiToken`.

How Data Flows Through ops-db
-----------------------------

Conceptually, data flows through ops-db as follows:

1. **Planning** - Observing programs and observation units are added before observations take place.
2. **Execution** - Telescope systems create :py:class:`~ccat_ops_db.models.ExecutedObsUnit` records when observations run.
3. **Data Registration** - Raw data files are registered and linked to executed observations.
4. **Packaging** - Files are bundled into :py:class:`~ccat_ops_db.models.RawDataPackage` for efficient archiving and transfer.
5. **Transfer** - Packages are transferred between sites via :py:class:`~ccat_ops_db.models.DataTransferPackage` and :py:class:`~ccat_ops_db.models.DataTransfer` records.
6. **Archive** - Packages are archived to long-term storage via :py:class:`~ccat_ops_db.models.LongTermArchiveTransfer`.
7. **Physical Copies** - :py:class:`~ccat_ops_db.models.PhysicalCopy` records track where each file/package exists at each stage.

For detailed workflow documentation, see the :doc:`/data-transfer/docs/index` documentation.

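
The packaging and physical-copy steps can be illustrated with a minimal, standard-library sketch. It does not use the actual ``ccat_ops_db`` models: the dataclasses below only borrow the names ``RawDataPackage`` and ``PhysicalCopy``, and ``CopyStatus``, ``locations()``, ``can_delete_from()``, the location names, and the file names are hypothetical. The deletion rule shown (a copy may be removed only if another copy is still present elsewhere) is an assumed policy chosen for illustration, not the documented behaviour of ops-db.

.. code-block:: python

   from dataclasses import dataclass, field
   from enum import Enum


   class CopyStatus(Enum):
       # Hypothetical states; the real PhysicalCopyStatus enum may differ.
       PRESENT = "present"
       DELETED = "deleted"


   @dataclass
   class PhysicalCopy:
       # A plain string stands in for a DataLocation record.
       location: str
       status: CopyStatus = CopyStatus.PRESENT


   @dataclass
   class RawDataPackage:
       name: str
       files: list[str] = field(default_factory=list)
       copies: list[PhysicalCopy] = field(default_factory=list)

       def locations(self) -> list[str]:
           """Return every location that still holds a present copy."""
           return [c.location for c in self.copies
                   if c.status is CopyStatus.PRESENT]

       def can_delete_from(self, location: str) -> bool:
           """Assumed policy: deletion is safe only if a copy exists elsewhere."""
           return any(loc != location for loc in self.locations())


   # Steps 3-4: register files and bundle them into a package.
   package = RawDataPackage("pkg-001", files=["scan_0001.fits", "scan_0002.fits"])

   # Step 7: the package initially exists only at the observatory.
   package.copies.append(PhysicalCopy("observatory-disk"))
   safe_before = package.can_delete_from("observatory-disk")

   # Steps 5-6: after transfer and archiving, a second copy is recorded.
   package.copies.append(PhysicalCopy("archive-tape"))
   safe_after = package.can_delete_from("observatory-disk")

   print(safe_before, safe_after)

Because every copy is recorded explicitly, the question "is it safe to free space at this location?" becomes a query over metadata rather than a scan of the storage systems themselves.
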
What ops-db Does NOT Contain
----------------------------

ops-db is a **metadata database**: it tracks information about data, not the data itself. It does not hold:

* **Actual data files** - Files are stored on disk/S3/tape; ops-db only tracks their metadata and locations
* **Processing results** - Processed data is likewise stored on disk/S3/tape; ops-db only tracks its metadata and locations
* **Real-time telescope telemetry** - Telemetry is tracked in the housekeeping system (InfluxDB)
* **Long log files** - Logs are stored on disk; ops-db holds references to the log file paths

Integration Points
------------------

ops-db integrates with several other CCAT components:

* **ops-db-api** - Provides RESTful endpoints for programmatic access to the database
* **ops-db-ui** - Provides a web interface for browsing and managing database records
* **data-transfer** - Reads/writes transfer and archive records, orchestrates actual file movements
* **system-integration** - Handles deployment and infrastructure setup

For details on integration, see :doc:`../integration/related_components`.

Entity Relationships
--------------------

.. mermaid::

   graph TB
       subgraph Infrastructure["Observatory Infrastructure"]
           OBS[Observatory]
           TEL[Telescope]
           INST[Instrument]
           MOD[InstrumentModule]
           OBS --> TEL
           TEL --> INST
           INST --> MOD
       end

       subgraph Planning["Scientific Planning"]
           PROG[ObservingProgram]
           SUB[SubObservingProgram]
           OU[ObsUnit]
           SRC[Source]
           PROG --> SUB
           PROG --> OU
           SUB --> OU
           SRC --> OU
       end

       subgraph Execution["Execution"]
           EOU[ExecutedObsUnit]
           OU --> EOU
       end

       subgraph Data["Data Management"]
           RDF[RawDataFile]
           RDP[RawDataPackage]
           DTP[DataTransferPackage]
           PC[PhysicalCopy]
           EOU --> RDF
           EOU --> RDP
           RDF --> RDP
           RDP --> DTP
           RDF --> PC
           RDP --> PC
           DTP --> PC
       end

       subgraph Locations["Storage Locations"]
           SITE[Site]
           DL[DataLocation]
           SITE --> DL
           PC --> DL
       end

       MOD --> RDF
       MOD --> RDP

Next Steps
----------

Now that you understand the high-level architecture:

* **Learn the observatory hierarchy**: See :doc:`observatory_hierarchy`
* **Understand observation planning**: See :doc:`observation_model`
* **Explore data tracking**: See :doc:`data_model`
* **Learn about storage locations**: See :doc:`location_model`
* **Understand data transfer**: See :doc:`transfer_model`
* **Browse the complete API**: See :doc:`../api_reference/models`