With Talos we have created a process which can both analyse data as a one-time event, and run re-analysis, building on previous results with each run. By referencing and building on previous results, we can prioritise new variants (novel classifications, or changed evidence) and reduce the amount of work needed to re-analyse a case by allowing users to filter out variants which were seen previously and have not changed.
Talos does this through storing a representation of each run's results, and feeding those forwards into future runs. By incorporating the variants and timestamps from a series of executions, we gradually build a record of all previously seen results, each with the original date of its observation. This allows us to see when a variant was first classified, and how its classification has changed over time. This is then reflected in the report, where the date shows the most recent observation of changed evidence.
For Nextflow, this is mediated through the history column in the input TSV.
- We advise that with each completed run, the `history` parameter is updated to the latest full_results JSON in the output folder.
- If the file does not exist or was not provided, all variant discovery dates will be set to the time of the current run.
During the run, Talos analyses data as standard. Once the result set has been generated, the previous file is loaded up and current results are compared to its contents:
- If a variant is newly detected, its dates are left as the current date
- If a variant was seen before:
  - `first_tagged` is set to the date of its first observation, in any category
  - `evidence_last_updated` is set to the most recent date a category was assigned for the first time
- If a variant was seen before, and is now seen with a novel comp-het partner, `evidence_last_updated` is set to today's date
- `date_of_phenotype_match` is None if there is no phenotype match; otherwise it is set to the earliest date a phenotype match was observed
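
The date rules above can be sketched roughly as follows. This is an illustrative sketch only, not the Talos implementation: the `VariantRecord` class, the `apply_history` function, and the `categories`, `comp_het_partners` and `phenotype_match` attributes are all hypothetical names chosen for this example. Only the three date fields (`first_tagged`, `evidence_last_updated`, `date_of_phenotype_match`) come from the description above. It also assumes the previous record holds every category and comp-het partner ever seen for that variant.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class VariantRecord:
    """Hypothetical, simplified record of one reported variant."""

    categories: set = field(default_factory=set)
    comp_het_partners: set = field(default_factory=set)
    phenotype_match: bool = False
    first_tagged: Optional[date] = None
    evidence_last_updated: Optional[date] = None
    date_of_phenotype_match: Optional[date] = None


def apply_history(current: dict, previous: dict, today: date) -> dict:
    """Set the date fields on `current` records using a previous run's records.

    Both arguments map a variant key (assumed: sample + variant ID) to a
    VariantRecord. An empty `previous` models a missing history file.
    """
    for key, var in current.items():
        # Default: treat the variant as newly detected, all dates are today
        var.first_tagged = today
        var.evidence_last_updated = today
        var.date_of_phenotype_match = today if var.phenotype_match else None

        old = previous.get(key)
        if old is None:
            continue

        # Seen before: keep the date of the first observation
        var.first_tagged = old.first_tagged

        # Evidence only counts as updated if a category or comp-het partner is novel
        new_category = bool(var.categories - old.categories)
        new_partner = bool(var.comp_het_partners - old.comp_het_partners)
        if not (new_category or new_partner):
            var.evidence_last_updated = old.evidence_last_updated

        # Keep the earliest phenotype-match date, if one was recorded previously
        if var.phenotype_match and old.date_of_phenotype_match is not None:
            var.date_of_phenotype_match = old.date_of_phenotype_match

    return current
```

Calling `apply_history(current, {}, date.today())` corresponds to the case where no history file was provided: every variant is treated as new and all dates are set to the current run.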