Skip to content

Step 06: Modularising the DAGs#18

Open
sbfnk-bot wants to merge 3 commits intomainfrom
step06
Open

Step 06: Modularising the DAGs#18
sbfnk-bot wants to merge 3 commits intomainfrom
step06

Conversation

@sbfnk-bot
Copy link
Collaborator

@sbfnk-bot sbfnk-bot commented Feb 27, 2026

Summary

This PR closes #7.

  • Rewrites step06_model_modules.md grounded in step 05 decisions (7 estimated parameters, direct β/α estimation, exponential kernel, compound delay)
  • Decomposes the model into three progressive fitting modules: spillover only (5 params), spillover + local transmission (7 params), full model with movement (7 params, p_mov fixed)
  • Each module is a complete fittable model sharing the same observation model and likelihood structure
  • Defines validation criteria and diagnostics at each stage to determine when to proceed to the next module
  • Covers DAG variant handling (exponential vs Cauchy kernel) and delay sensitivity analysis

Summary by CodeRabbit

  • Documentation
    • Expanded design documenting a modular three-part epidemiological modelling framework (spillover, transmission, observation) and their composition into a full force-of-infection model.
    • Added standalone module specifications, validation protocols (prior checks, synthetic recovery, identifiability, posterior checks), data requirements, fitting strategy and staged development plan.
    • Mapped module compositions to research questions and analysis workflows.

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
@coderabbitai
Copy link

coderabbitai bot commented Feb 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a1f244 and 9d9f33d.

📒 Files selected for processing (1)
  • hpai-challenge/notes/step06_model_modules.md

Walkthrough

Introduces a comprehensive modular specification for the DAG-based epidemiological model, splitting the joint model into three standalone, fittable modules (Spillover, Transmission, Observation) with shared components, parameter lists, process equations, staged validation workflow and mappings to research questions.

Changes

Cohort / File(s) Summary
Model modules documentation
hpai-challenge/notes/step06_model_modules.md
Adds a new, detailed specification that modularises the model into Module A (Spillover), Module B (Transmission: local + movement), and Module C (Observation). Defines shared components (species susceptibility, removal/culling, zone biosecurity), full force-of-infection composition, parameter listings (fixed vs estimated), validation checks (unit tests, prior predictive, synthetic recovery, real-data fits), staged fitting strategy (A+C → A+B+C), data requirements and links to research questions. No code changes.

Sequence Diagram(s)

(Skipped — changes are documentation/specification only; no new multi-component control flow implemented in code.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • seabbs-bot
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Step 06: Modularising the DAGs' directly and clearly summarizes the main change in this pull request, which introduces a comprehensive modularization of DAG-based epidemiological models into three independent modules.
Linked Issues check ✅ Passed The pull request successfully addresses all coding requirements from issue #7: it documents six modules (spillover, local transmission, movement transmission, species modifier, removal processes, observation model) with their parameters, fitting strategy, validation criteria, and diagnostic procedures in notes/step06_model_modules.md.
Out of Scope Changes check ✅ Passed All changes are directly scoped to issue #7 requirements; the single file modified (notes/step06_model_modules.md) exclusively documents the modularized model structure, validation strategy, and diagnostic criteria with no extraneous alterations.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch step06

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
hpai-challenge/notes/step06_model_modules.md (4)

33-33: Consider justifying the duck susceptibility prior.

The Beta(2, 8) prior for β_duck has mean 0.2, implying ducks are substantially less susceptible than chickens. This is a relatively strong prior assumption. If this is based on prior knowledge or literature, briefly noting the justification would strengthen the specification. If data are expected to be uninformative (as suggested by the diagnostic at line 129), documenting the rationale becomes more important since the prior will dominate the posterior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hpai-challenge/notes/step06_model_modules.md` at line 33, The Beta(2, 8)
prior for β_duck is a strong assumption (mean 0.2); either add a brief
justification citing prior knowledge or literature supporting lower duck
susceptibility, or relax the prior (e.g., use a less informative Beta or
hierarchical/specify a wider prior) so it doesn’t dominate uninformative data;
update the model notes around the β_duck/Beta(2,8) line to state which option
you chose and why, and mention any sensitivity checks performed.

129-129: Consider documenting the rationale for the 80% threshold.

The diagnostic "if 95% CrI covers > 80% of prior range, fall back to scenario analysis" uses a specific threshold. Whilst this is a pragmatic rule, briefly noting why 80% was chosen (e.g., based on prior experience, simulation studies, or simply as a conservative threshold) would make the decision criterion more transparent and defensible.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hpai-challenge/notes/step06_model_modules.md` at line 129, The doc line
referencing the diagnostic "if 95% CrI covers > 80% of prior range, fall back to
scenario analysis" needs a brief rationale added for transparency: update the
text around β_{\text{duck}} posterior vs prior to state why the 80% threshold
was chosen (e.g., empirical/simulation results, conservative heuristic from
prior experience, or trade-off between identifiability and over‑sensitivity),
and cite any supporting evidence or note that it is a pragmatic/default value
subject to sensitivity checks; ensure the added sentence mentions the 95% CrI,
the prior range comparison, and that this threshold can be adjusted with a short
note to run sensitivity analyses or simulations if different thresholds are
used.

52-52: Clarify the biosecurity reduction mechanism.

The reference "Zone biosecurity reduction applies as in step 05" makes Module 1 dependent on external documentation for a core mechanism. For independent testability and clear interfaces, either document the biosecurity logic here or add it to the shared components section (§Module dependencies) with a pointer to step 05 for full derivation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hpai-challenge/notes/step06_model_modules.md` at line 52, Module 1 currently
references "Zone biosecurity reduction applies as in step 05" without specifying
the mechanism; update the document so Module 1 is self-contained by either (A)
embedding the biosecurity reduction formula and rules directly alongside the
definitions of phi_j, phi_hrz, phi_non and beta_species[j] (including how
zone-level modifiers are computed and applied), or (B) moving the zone
biosecurity reduction logic into the shared "Module dependencies" section with a
clear identifier and brief summary here and a pointer to step 05 for full
derivation; reference the symbols phi_j, phi_hrz, phi_non, beta_species[j],
beta_duck and the Module 1 header so reviewers can find and verify the exact
computation and interface.

149-149: Provide justification for the baseline p_mov value.

The parameter p_mov is fixed at 0.01, yet line 177 indicates sensitivity analysis will vary it over [0.001, 0.05]. Briefly documenting why 0.01 was selected as the baseline (e.g., literature estimates, exploratory analysis, or mid-range of plausible values) would clarify the modelling choice and help readers interpret sensitivity results.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hpai-challenge/notes/step06_model_modules.md` at line 149, Add a one- or
two-sentence justification for the baseline p_mov = 0.01 in the model notes:
explain why 0.01 was chosen (e.g., cite a literature estimate, report it as the
midpoint of plausible values, or note it came from exploratory
calibration/sensitivity runs) and mention that sensitivity analysis then
explores [0.001, 0.05]; update the text near the p_mov table entry and any
accompanying paragraph so readers can see the rationale and its relation to the
sensitivity range.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hpai-challenge/notes/step06_model_modules.md`:
- Line 162: The conditional for p_eff(i,t) is ambiguous because "regulated zone"
and "HRZ" are not defined relative to each other; update the documentation and
implementation to state their relationship (e.g., whether HRZ is a subset of
regulated zones or mutually exclusive). Specifically, clarify in the text around
p_eff(i,t) that "regulated zone" refers to zones with active movement bans and
that "HRZ" (High-Risk Zone) is/is not included in that set (choose one) and
adjust the logic in any code using p_eff, p_mov, and sigma_test to reflect that
relationship (e.g., ensure the condition order in the p_eff calculation matches
the intended hierarchy and add a comment near p_eff, regulated zone, and HRZ
definitions referencing the data source/field names).

---

Nitpick comments:
In `@hpai-challenge/notes/step06_model_modules.md`:
- Line 33: The Beta(2, 8) prior for β_duck is a strong assumption (mean 0.2);
either add a brief justification citing prior knowledge or literature supporting
lower duck susceptibility, or relax the prior (e.g., use a less informative Beta
or hierarchical/specify a wider prior) so it doesn’t dominate uninformative
data; update the model notes around the β_duck/Beta(2,8) line to state which
option you chose and why, and mention any sensitivity checks performed.
- Line 129: The doc line referencing the diagnostic "if 95% CrI covers > 80% of
prior range, fall back to scenario analysis" needs a brief rationale added for
transparency: update the text around β_{\text{duck}} posterior vs prior to state
why the 80% threshold was chosen (e.g., empirical/simulation results,
conservative heuristic from prior experience, or trade-off between
identifiability and over‑sensitivity), and cite any supporting evidence or note
that it is a pragmatic/default value subject to sensitivity checks; ensure the
added sentence mentions the 95% CrI, the prior range comparison, and that this
threshold can be adjusted with a short note to run sensitivity analyses or
simulations if different thresholds are used.
- Line 52: Module 1 currently references "Zone biosecurity reduction applies as
in step 05" without specifying the mechanism; update the document so Module 1 is
self-contained by either (A) embedding the biosecurity reduction formula and
rules directly alongside the definitions of phi_j, phi_hrz, phi_non and
beta_species[j] (including how zone-level modifiers are computed and applied),
or (B) moving the zone biosecurity reduction logic into the shared "Module
dependencies" section with a clear identifier and brief summary here and a
pointer to step 05 for full derivation; reference the symbols phi_j, phi_hrz,
phi_non, beta_species[j], beta_duck and the Module 1 header so reviewers can
find and verify the exact computation and interface.
- Line 149: Add a one- or two-sentence justification for the baseline p_mov =
0.01 in the model notes: explain why 0.01 was chosen (e.g., cite a literature
estimate, report it as the midpoint of plausible values, or note it came from
exploratory calibration/sensitivity runs) and mention that sensitivity analysis
then explores [0.001, 0.05]; update the text near the p_mov table entry and any
accompanying paragraph so readers can see the rationale and its relation to the
sensitivity range.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c7b9e6 and ec9d40c.

📒 Files selected for processing (1)
  • hpai-challenge/notes/step06_model_modules.md

…d thresholds

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
hpai-challenge/notes/step06_model_modules.md (4)

124-124: Clarify the β₀ reparameterisation.

The β₀ reparameterisation is mentioned as a solution to high β–α correlation but is not defined in this document. For readers working through Module 2 in isolation, consider adding a brief inline description (e.g., "the β₀ reparameterisation—defining a reference distance and reparameterising β as transmission intensity at that distance—reduces correlation by changing the parameterisation geometry") or expanding the reference to be more explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hpai-challenge/notes/step06_model_modules.md` at line 124, Add a brief inline
definition of the β₀ reparameterisation where it's first mentioned: state that
the β₀ reparameterisation redefines β as the transmission intensity at a chosen
reference distance (β₀) and expresses the original β parameter via this
reference and the distance-decay parameter α, which reduces β–α posterior
correlation by changing the parameterisation geometry; reference the earlier
step where it is introduced (step 05, §4) and optionally give a one-line formula
sketch (β = β₀ * f(distance; α)) so readers of Module 2 can apply it without
needing the full previous section.

272-272: Clarify δ_reactive reference.

Line 272 references "modify $\delta_{\text{reactive}}$" for Q5, but this parameter is not defined in the current document. For completeness, consider adding a brief inline clarification (e.g., "modify $\delta_{\text{reactive}}$ (reactive culling delay)") or ensuring that the parameter is introduced in the Module 3 or observation model sections.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hpai-challenge/notes/step06_model_modules.md` at line 272, Update the
document to define or clarify the symbol δ_reactive where it is referenced for
Q5: either add an inline parenthetical after "modify δ_reactive" (e.g., "modify
δ_reactive (reactive culling delay)") on the Q5 table entry, or introduce a
brief definition in Module 3 or the observation model section that explains
δ_reactive as the reactive culling delay and its role; ensure the symbol
δ_reactive is consistently named and cross-referenced with Module 3/observation
model so readers can find the full definition.

235-235: Note missing imputation requirement in data inputs sections.

Line 235 mentions "impute missing preventive cull dates" as a preprocessing step, but the data inputs sections for each module (lines 58-64, 116-119, 168-172) list prev_culls.csv without noting that dates may be missing. Consider adding a note in the first data inputs section (Module 1, around line 64) to indicate that preventive cull dates may require imputation during preprocessing, so implementers are aware of this requirement earlier.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hpai-challenge/notes/step06_model_modules.md` at line 235, Add a brief note
to the Module 1 data inputs section clarifying that the prev_culls.csv file may
contain missing preventive cull dates and that those dates are expected to be
imputed during the shared "Data preprocessing" step (which includes "impute
missing preventive cull dates"); update the Module 1 inputs text near the
prev_culls.csv entry to explicitly state the imputation requirement so
implementers see it earlier.

42-42: Consider listing all fixed parameters explicitly.

The reference "as step 05" assumes the reader has step 05 readily available. For better self-containment, consider listing the fixed parameters explicitly: $r = 1.0$ day⁻¹, $\tau_{\min} = 1$ day, $\varepsilon = 0.5$, $\sigma_{\text{test}} = 0.9$. This would make the module specification complete without requiring cross-referencing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hpai-challenge/notes/step06_model_modules.md` at line 42, Replace the vague
phrase "as step 05" with an explicit list of the fixed parameters so the module
is self-contained: state r = 1.0 day⁻¹, τ_min = 1 day, ε = 0.5, and σ_test = 0.9
(use the symbols $r$, $\tau_{\min}$, $\varepsilon$, and $\sigma_{\text{test}}$
exactly as in the diff) in the same sentence or a short bullet so readers need
not cross-reference step 05.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@hpai-challenge/notes/step06_model_modules.md`:
- Line 124: Add a brief inline definition of the β₀ reparameterisation where
it's first mentioned: state that the β₀ reparameterisation redefines β as the
transmission intensity at a chosen reference distance (β₀) and expresses the
original β parameter via this reference and the distance-decay parameter α,
which reduces β–α posterior correlation by changing the parameterisation
geometry; reference the earlier step where it is introduced (step 05, §4) and
optionally give a one-line formula sketch (β = β₀ * f(distance; α)) so readers
of Module 2 can apply it without needing the full previous section.
- Line 272: Update the document to define or clarify the symbol δ_reactive where
it is referenced for Q5: either add an inline parenthetical after "modify
δ_reactive" (e.g., "modify δ_reactive (reactive culling delay)") on the Q5 table
entry, or introduce a brief definition in Module 3 or the observation model
section that explains δ_reactive as the reactive culling delay and its role;
ensure the symbol δ_reactive is consistently named and cross-referenced with
Module 3/observation model so readers can find the full definition.
- Line 235: Add a brief note to the Module 1 data inputs section clarifying that
the prev_culls.csv file may contain missing preventive cull dates and that those
dates are expected to be imputed during the shared "Data preprocessing" step
(which includes "impute missing preventive cull dates"); update the Module 1
inputs text near the prev_culls.csv entry to explicitly state the imputation
requirement so implementers see it earlier.
- Line 42: Replace the vague phrase "as step 05" with an explicit list of the
fixed parameters so the module is self-contained: state r = 1.0 day⁻¹, τ_min = 1
day, ε = 0.5, and σ_test = 0.9 (use the symbols $r$, $\tau_{\min}$,
$\varepsilon$, and $\sigma_{\text{test}}$ exactly as in the diff) in the same
sentence or a short bullet so readers need not cross-reference step 05.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec9d40c and 7a1f244.

📒 Files selected for processing (1)
  • hpai-challenge/notes/step06_model_modules.md

…tal fitting

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

HPAI Workflow Step 06: Modularising DAGs

1 participant