NEW: Dependency config design and content #2812
tech3371 wants to merge 2 commits into IMAP-Science-Operations-Center:dev from
Conversation
| - ``p`` - pointing | ||
| - ``h`` - hourly | ||
| - ``d`` - days | ||
| - ``l`` - last_processed |
Similar to last_processed, we need nearest-in-the-past.
But Hi wants the nearest 7 irrespective of past or future.
options:
past_nearest, future_nearest, any_nearest.
or
(7p) - this means any future or past data.
(1n, 0n) - this means get me the nearest data from the past.
What do we do if a Hi science file event comes in and then needs to look up a SWE dependency, which is daily?
When the pointing number 9 file comes in, we look up the nearest 7 files, then derive a date range using the earliest and latest pointing IDs of those 7 files and look up the date range from the pointing table. Then we use that date range to query for SPICE and other dependencies. If a dependency is found,
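A rough sketch of that lookup (the pointing table contents, names, and date math here are all made up for illustration, not the SDC's actual schema):

```python
from datetime import date, timedelta

# Hypothetical pointing table: pointing id -> (start_date, end_date).
POINTING_TABLE = {
    pid: (date(2025, 1, 1) + timedelta(days=pid),
          date(2025, 1, 1) + timedelta(days=pid + 1))
    for pid in range(1, 20)
}

def nearest_pointings(center: int, count: int) -> list[int]:
    """Return the `count` pointing ids nearest to `center` (past or future)."""
    return sorted(POINTING_TABLE, key=lambda pid: (abs(pid - center), pid))[:count]

def query_date_range(center: int, count: int = 7) -> tuple[date, date]:
    """Derive the SPICE/dependency query range from the nearest pointings."""
    pids = nearest_pointings(center, count)
    start = POINTING_TABLE[min(pids)][0]
    end = POINTING_TABLE[max(pids)][1]
    return start, end
```

So for pointing 9, the nearest 7 pointings span 6 through 12, and the query range is the start of pointing 6 through the end of pointing 12.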
One class for daily and one for pointing. Then instrument classes can inherit from those as needed, depending on whether the instrument is an ENA instrument or not.
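For example (class names here are hypothetical, just to show the shape of the hierarchy):

```python
# Illustrating "one class for daily and one for pointing", with
# instrument classes inheriting whichever cadence they need.
class DependencyQuery:
    """Shared base for all dependency-cadence queries."""
    def query(self, anchor: str) -> str:
        raise NotImplementedError

class DailyQuery(DependencyQuery):
    """Cadence used by most (non-ENA) instruments."""
    def query(self, anchor: str) -> str:
        return f"daily data for {anchor}"

class PointingQuery(DependencyQuery):
    """Cadence used by ENA instruments such as Hi."""
    def query(self, anchor: str) -> str:
        return f"pointing data for {anchor}"

class HiDependencyQuery(PointingQuery):
    """An ENA instrument picks up the pointing behavior by inheritance."""
```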
| This step is what determines if an instrument and level is ready for processing, by checking dependencies. For each file that arrives, the system checks to see what the downstream dependencies are - | ||
| meaning, what future files need this file in order to complete processing. For example, if a MAG L1A file arrived, this step would determine that the MAG L1B ``mago`` and ``magi`` files are dependent on | ||
| This step is what determines if an instrument and level is ready for processing, by checking dependencies. |
| This step is what determines if an instrument and level is ready for processing, by checking dependencies. | |
| After indexing, the batch starter lambda is triggered in order to determine what jobs may be ready for processing. |
| For each file that arrives, the system checks to see what the downstream dependencies are - | ||
| meaning, what future files need this file in order to complete processing. For example, if a MAG L1A | ||
| file arrived, this step would determine that the MAG L1B ``mago`` and ``magi`` files are dependent on |
| For each file that arrives, the system checks to see what the downstream dependencies are - | |
| meaning, what future files need this file in order to complete processing. For example, if a MAG L1A | |
| file arrived, this step would determine that the MAG L1B ``mago`` and ``magi`` files are dependent on | |
| For each file that arrives, the system checks to see what jobs may need to be run by looking at the downstream dependencies. For example, if a MAG L1A | |
| file arrived, this step would determine that the MAG L1B ``mago`` and ``magi`` files are dependent on |
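The downstream lookup being described might look roughly like this (the table contents are illustrative, not the real config):

```python
# Illustrative downstream-dependency table: an arriving (instrument, level)
# maps to the jobs that consume it.
DOWNSTREAM = {
    ("mag", "l1a"): [("mag", "l1b", "mago"), ("mag", "l1b", "magi")],
}

def downstream_jobs(instrument: str, level: str) -> list[tuple[str, str, str]]:
    """Return the (instrument, level, descriptor) jobs that depend on this file."""
    return DOWNSTREAM.get((instrument, level), [])
```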
| The status of different files is recorded in the status tracking table. This table records the status of each anticipated output file as "in progress", "complete", or "failed." Through this, | ||
| we can track processing for specific files and determine if a file exists quickly. | ||
| Then, for each anticipated job, the batch starter process checks to see if all the upstream |
| Then, for each anticipated job, the batch starter process checks to see if all the upstream | |
| Then, for each possible job, the batch starter process checks to see if all the upstream |
| dependencies are met. Although we know we have one of the upstream dependencies for an | ||
| expected job, it's possible that there are other required dependencies that have not yet | ||
| arrived. If we are missing any required dependencies, then the system does not kick off the | ||
| processing job. When the missing file arrives, it will trigger the same process of checking |
| processing job. When the missing file arrives, it will trigger the same process of checking | |
| processing job. When the missing upstream dependency arrives, it will trigger the same process of checking |
| The status of different files is recorded in the status tracking table. This table records | ||
| the status of each anticipated output file as "in progress", "complete", or "failed." Through | ||
| this, we can track processing for specific files and determine if a file exists quickly. |
I think this is talking about the ProcessingJob table right?
| The status of different files is recorded in the status tracking table. This table records | |
| the status of each anticipated output file as "in progress", "complete", or "failed." Through | |
| this, we can track processing for specific files and determine if a file exists quickly. | |
| The status of each job is recorded in the status tracking table as "in progress", "complete", or "failed." Through this, we can track processing for specific files and quickly determine whether a file exists. |
I think one piece that we are missing is checking for upstream dependencies that have jobs that are "in progress". I have added such a check to the Hi Goodtimes special handling. The idea is to avoid race conditions where multiple jobs for the same product get triggered in fast succession.
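The guard is essentially the following sketch (the table shape and status strings are assumptions, not the actual ProcessingJob schema):

```python
# Race-condition guard: only trigger a job if the status tracking table
# has no "in progress" or "complete" entry for it already.
status_table: dict[str, str] = {}

def try_trigger(job_id: str) -> bool:
    """Trigger a job unless an earlier event already kicked it off."""
    if status_table.get(job_id) in ("in progress", "complete"):
        return False  # another event already started this job
    status_table[job_id] = "in progress"
    return True
```

A second event for the same product then becomes a no-op instead of a duplicate job.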
| ~~~~~~~~~~~~~~~~~~~~ | ||
| Primary descriptor can be one of the following: | ||
| Upstream Product Name |
I personally like Descriptor better for this. I think that product name has several possible meanings.
cool. Then let's keep descriptor and I will make remaining changes.
| - For science or ancillary data, the product names are defined by the instrument and SDC. | ||
| - For ``spin`` and ``repoint`` data types, ``historical`` is the only valid descriptor. | ||
| - For ``spice`` data types, ``historical`` and ``best`` are the valid product names. |
What about predict? I specify Hi SPICE dependencies individually, for example:
ephemeris_reconstructed, spice, historical, hi, l1c, 45sensor-pset, HARD_NO_TRIGGER, DOWNSTREAM
ephemeris_predicted, spice, best, hi, l1c, 45sensor-pset, HARD_NO_TRIGGER, DOWNSTREAM
We had the intention of using 'best' at one time, but maybe we didn't enforce it. We can remove that option.
| ~~~~~~~~~~~~~~~~~~~~ | ||
| Same as primary_data_type, but for the dependent file. | ||
| Kickoff_job (Optional) |
My nitpick on terminology: It seems like we use trigger more widely.
| Kickoff_job (Optional) | |
| Trigger_job (Optional) |
| - (imap_frames, spice, historical) | ||
| (l1b, 45sensor-goodtimes): | ||
| - (hi, l1b, 45sensor-de, true, true, (-3p, 3p)) |
One possible way to do this:
(l1b, 45sensor-goodtimes):
# Entry for getting the 7 nearest pointings
- (hi, l1b, 45sensor-de, {required: true, trigger: true, nearest: 7p})
# Entry for getting the past 3 and future 3 pointings, if they exist
- (hi, l1b, 45sensor-de, {required: false, trigger: true, past: 3p, future: 3p})
# Entry for getting the 3 nearest available pointings in the past
- (hi, l1b, 45sensor-de, {required: false, trigger: true, nearest_past: 3p})
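To make the semantics concrete, here is how I'd expect those three selector options to resolve against a set of available pointing numbers (the keys mirror the example above, but the function and everything else is made up):

```python
# Resolve the proposed selectors: nearest N, past N + future N,
# or nearest N in the past, relative to the arriving pointing `center`.
def select_pointings(available, center, *, nearest=None, past=None,
                     future=None, nearest_past=None):
    if nearest is not None:
        # N closest pointings regardless of direction (ties prefer the past).
        return sorted(sorted(available, key=lambda p: (abs(p - center), p))[:nearest])
    if nearest_past is not None:
        # N closest pointings that actually exist before `center`.
        return sorted(p for p in available if p < center)[-nearest_past:]
    selected = []
    if past is not None:
        # Exactly the N preceding pointing numbers, if they exist.
        selected += [p for p in (center - i for i in range(past, 0, -1)) if p in available]
    if future is not None:
        # Exactly the N following pointing numbers, if they exist.
        selected += [p for p in (center + i for i in range(1, future + 1)) if p in available]
    return selected
```

Note the difference: `past`/`future` ask for specific pointing numbers and silently drop the missing ones, while `nearest_past` always returns the N closest pointings that do exist.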
Co-authored-by: Tim Plummer <timothy.plummer@lasp.colorado.edu>
Change Summary
closes IMAP-Science-Operations-Center/sds-data-manager#1151
Overview
File changes
This contains the final design of the new config file. It contains information such as the filename convention, the new file content, and the required/optional fields and defaults used. The part I need feedback on the most is the time range options and the example content.
Testing