
NEW: Dependency config design and content #2812

Open
tech3371 wants to merge 2 commits into IMAP-Science-Operations-Center:dev from tech3371:dependency_config_doc

Conversation

Contributor

@tech3371 tech3371 commented Mar 3, 2026

Change Summary

closes IMAP-Science-Operations-Center/sds-data-manager#1151

Overview

File changes

This contains the final design of the new config file. It includes information such as the filename convention, the new file content, and the required/optional fields with their defaults. The part I need feedback on most is the time range options and the example content.

Testing

- ``p`` - pointing
- ``h`` - hourly
- ``d`` - daily
- ``l`` - last_processed
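A token such as ``3p`` or ``-7d`` could be parsed with a small helper; a minimal sketch, where the ``parse_cadence`` name and the tuple return shape are assumptions rather than part of the config spec:

```python
import re

# Map of single-letter cadence codes to their meanings (from the list above).
CADENCE_CODES = {
    "p": "pointing",
    "h": "hourly",
    "d": "daily",
    "l": "last_processed",
}


def parse_cadence(token: str) -> tuple[int, str]:
    """Parse a token like '3p' or '-7d' into (count, cadence_name).

    A bare code such as 'l' is treated as a count of 1.
    """
    match = re.fullmatch(r"(-?)(\d*)([phdl])", token)
    if match is None:
        raise ValueError(f"unrecognized cadence token: {token!r}")
    sign, digits, code = match.groups()
    count = int(sign + (digits or "1"))
    return count, CADENCE_CODES[code]
```

For example, ``parse_cadence("3p")`` yields ``(3, "pointing")`` and ``parse_cadence("-7d")`` yields ``(-7, "daily")``.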
Contributor Author


Similar to ``last_processed``, we need a "nearest in the past" option.

But Hi wants the nearest 7 pointings irrespective of past or future.

Contributor Author


Options:
``past_nearest``, ``future_nearest``, ``any_nearest``.

or
(7p) - this means any future or past data.
(1n, 0n) - this means get the nearest data from the past.

What do we do if a Hi science file event comes in and we then need to look up a SWE dependency, which is daily?
When the pointing number 9 file comes in, we look up the nearest 7 files, derive a date range from the earliest and latest pointing IDs of those 7 files using the pointing table, and then use that date range to query for SPICE and the other dependencies. If the dependencies are found, the job can be kicked off.
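The lookup described above could be sketched roughly as follows; every name here (``nearest_pointings``, ``date_range``, the in-memory pointing table) is an illustrative assumption, not the SDC implementation:

```python
from datetime import datetime

# Hypothetical sketch: given the pointing number of a newly arrived Hi file,
# find the 7 nearest pointings, then derive a date range from the pointing
# table to use when querying daily dependencies such as SWE or SPICE.


def nearest_pointings(pointing: int, all_pointings: list[int], n: int = 7) -> list[int]:
    """Return the n pointing numbers closest to the given pointing."""
    return sorted(sorted(all_pointings, key=lambda p: abs(p - pointing))[:n])


def date_range(
    pointings: list[int],
    pointing_table: dict[int, tuple[datetime, datetime]],
) -> tuple[datetime, datetime]:
    """Derive (start, end) from the earliest and latest pointing in the set."""
    start, _ = pointing_table[min(pointings)]
    _, end = pointing_table[max(pointings)]
    return start, end
```

With pointing 9 arriving and pointings 1-12 available, ``nearest_pointings(9, ...)`` selects pointings 6 through 12, and the date range spans from the start of pointing 6 to the end of pointing 12.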

Contributor Author


One class each for daily and for pointing. Then instrument classes can inherit from those as needed, depending on whether they are ENA instruments or not.
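A minimal sketch of that layout, with all class and method names assumed for illustration:

```python
# Illustrative sketch of the class layout suggested above: one base class per
# cadence (daily vs. pointing), which instrument-specific dependency checkers
# inherit as needed. None of these names come from the actual codebase.


class DailyDependency:
    """Dependency lookup for products produced on a daily cadence."""

    def query_range(self, event_date):
        # One calendar day: the triggering file's date bounds the query.
        return event_date, event_date


class PointingDependency:
    """Dependency lookup for products produced per pointing."""

    def query_range(self, pointing_id):
        # Would resolve the pointing to start/end times via the pointing table.
        raise NotImplementedError


class HiDependency(PointingDependency):
    """ENA instruments like Hi inherit the pointing-based behavior."""
```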


This step is what determines if an instrument and level is ready for processing, by checking dependencies. For each file that arrives, the system checks to see what the downstream dependencies are -
meaning, what future files need this file in order to complete processing. For example, if a MAG L1A file arrived, this step would determine that the MAG L1B ``mago`` and ``magi`` files are dependent on
This step is what determines if an instrument and level is ready for processing, by checking dependencies.
Contributor

@subagonsouth subagonsouth Mar 4, 2026


Suggested change
This step is what determines if an instrument and level is ready for processing, by checking dependencies.
After indexing, the batch starter lambda is triggered in order to determine what jobs may be ready for processing.

Comment on lines +56 to +58
For each file that arrives, the system checks to see what the downstream dependencies are -
meaning, what future files need this file in order to complete processing. For example, if a MAG L1A
file arrived, this step would determine that the MAG L1B ``mago`` and ``magi`` files are dependent on
Contributor


Suggested change
For each file that arrives, the system checks to see what the downstream dependencies are -
meaning, what future files need this file in order to complete processing. For example, if a MAG L1A
file arrived, this step would determine that the MAG L1B ``mago`` and ``magi`` files are dependent on
For each file that arrives, the system checks to see what jobs may need to be run by looking at the downstream dependencies. For example, if a MAG L1A
file arrived, this step would determine that the MAG L1B ``mago`` and ``magi`` files are dependent on


The status of different files is recorded in the status tracking table. This table records the status of each anticipated output file as "in progress", "complete", or "failed." Through this,
we can track processing for specific files and determine if a file exists quickly.
Then, for each anticipated job, the batch starter process checks to see if all the upstream
Contributor


Suggested change
Then, for each anticipated job, the batch starter process checks to see if all the upstream
Then, for each possible job, the batch starter process checks to see if all the upstream

dependencies are met. Although we know we have one of the upstream dependencies for an
expected job, it's possible that there are other required dependencies that have not yet
arrived. If we are missing any required dependencies, then the system does not kick off the
processing job. When the missing file arrives, it will trigger the same process of checking
Contributor


Suggested change
processing job. When the missing file arrives, it will trigger the same process of checking
processing job. When the missing upstream dependency arrives, it will trigger the same process of checking

Comment on lines +80 to +82
The status of different files is recorded in the status tracking table. This table records
the status of each anticipated output file as "in progress", "complete", or "failed." Through
this, we can track processing for specific files and determine if a file exists quickly.
Contributor


I think this is talking about the ProcessingJob table, right?

Suggested change
The status of different files is recorded in the status tracking table. This table records
the status of each anticipated output file as "in progress", "complete", or "failed." Through
this, we can track processing for specific files and determine if a file exists quickly.
The status of each job is recorded in the status tracking table as "in progress", "complete", or "failed." Through this, we can track processing for specific files and determine if a file exists quickly.

I think one piece that we are missing is checking for upstream dependencies that have jobs that are "in progress". I have added such a check to the Hi Goodtimes special handling. The idea is to avoid race conditions where multiple jobs for the same product get triggered in fast succession.
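Such a guard might look roughly like this; the status strings mirror the tracking-table values discussed above, but the ``should_start`` function and the ``job_status`` callable are assumptions for illustration:

```python
# Hypothetical guard against the race condition described above: before
# starting a job, skip it if the job itself is already running or done, or
# if any upstream dependency is not yet "complete". The job_status lookup
# (job name -> status string or None) is illustrative.


def should_start(job: str, upstream: list[str], job_status) -> bool:
    """Return True only if the job is not already running or done and
    every upstream dependency has completed."""
    if job_status(job) in ("in progress", "complete"):
        return False
    return all(job_status(dep) == "complete" for dep in upstream)
```

A dict's ``get`` method can serve as ``job_status`` in a quick test: a job whose upstream dependency is still "in progress" is skipped rather than triggered twice.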

~~~~~~~~~~~~~~~~~~~~

Primary descriptor can be one of the following:
Upstream Product Name
Contributor


I personally like Descriptor better for this. I think that product name has several possible meanings.

Contributor Author


Cool. Then let's keep descriptor, and I will make the remaining changes.

- For science or ancillary data, the product names are defined by the instrument and SDC.

- For ``spin`` and ``repoint`` data types, ``historical`` is the only valid descriptor.
- For ``spice`` data types, ``historical`` and ``best`` are the valid product names.
Contributor


What about predict? I specify Hi SPICE dependencies individually, for example:

ephemeris_reconstructed, spice, historical, hi, l1c, 45sensor-pset, HARD_NO_TRIGGER, DOWNSTREAM
ephemeris_predicted, spice, best, hi, l1c, 45sensor-pset, HARD_NO_TRIGGER, DOWNSTREAM

Contributor Author


We had the intention of using 'best' at one time, but maybe we didn't enforce it. We can remove that option.

~~~~~~~~~~~~~~~~~~~~

Same as primary_data_type, but for the dependent file.
Kickoff_job (Optional)
Contributor


My nitpick on terminology: It seems like we use trigger more widely.

Suggested change
Kickoff_job (Optional)
Trigger_job (Optional)

- (imap_frames, spice, historical)

(l1b, 45sensor-goodtimes):
- (hi, l1b, 45sensor-de, true, true, (-3p, 3p))
Contributor


One possible way to do this:

(l1b, 45sensor-goodtimes):
      # Entry for getting the 7 nearest pointings
      - (hi, l1b, 45sensor-de, {required: true, trigger: true, nearest: 7p})
      # Entry for getting the past 3 and future 3 pointings, if they exist
      - (hi, l1b, 45sensor-de, {required: false, trigger: true, past: 3p, future: 3p})
      # Entry for getting the 3 nearest available pointings in the past
      - (hi, l1b, 45sensor-de, {required: false, trigger: true, nearest_past: 3p})

Co-authored-by: Tim Plummer <timothy.plummer@lasp.colorado.edu>


Development

Successfully merging this pull request may close these issues.

Feature: New dependency config
