Skip to content

Tutorials on building PyRenew multisignal models#819

Open
cdc-mitzimorris wants to merge 68 commits into
mainfrom
mem_tutorial_h_e_approx
Open

Tutorials on building PyRenew multisignal models#819
cdc-mitzimorris wants to merge 68 commits into
mainfrom
mem_tutorial_h_e_approx

Conversation

@cdc-mitzimorris
Copy link
Copy Markdown
Collaborator

@cdc-mitzimorris cdc-mitzimorris commented May 15, 2026

This PR adds tutorial "Multi-signal Model for Weekly Hospital and Daily ED Data" which compares a PyRenew MultiSignalModel to the production H+E model fitting both on synthetic data.

@dylanhmorris
Copy link
Copy Markdown
Collaborator

@cdc-mitzimorris can you address the CI website build failure?

@cdc-mitzimorris
Copy link
Copy Markdown
Collaborator Author

@cdc-mitzimorris can you address the CI website build failure?

website should build now.

I highly recommend rendering the docs and reading the rendered tutorial before diving into the Quarto markdown.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 15, 2026

Thank you for your contribution @cdc-mitzimorris 🚀! Your github-pages is ready for download 👉 here 👈!
(The artifact expires on 2026-05-22T18:15:16Z. You can re-generate it by re-running the workflow here.)

The components are:

- **Latent infections**: `PopulationInfections` produces a single aggregate infection trajectory.
- **$\mathcal{R}(t)$process**: `WeeklyTemporalProcess` samples weekly values and broadcasts them to the daily renewal equation.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- **$\mathcal{R}(t)$process**: `WeeklyTemporalProcess` samples weekly values and broadcasts them to the daily renewal equation.
- **$\mathcal{R}(t)$ process**: `WeeklyTemporalProcess` samples weekly values and broadcasts them to the daily renewal equation.

)
```

This tutorial builds a small hospital + emergency department model using PyRenew's high-level components and `PyrenewBuilder` class to specify a `MultiSignalModel` object, then fits it to a synthetic dataset.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This tutorial builds a small hospital + emergency department model using PyRenew's high-level components and `PyrenewBuilder` class to specify a `MultiSignalModel` object, then fits it to a synthetic dataset.
This tutorial builds a small hospital (H) + emergency department (E) model using PyRenew's high-level components and `PyrenewBuilder` class to specify a `MultiSignalModel` object, then fits it to a synthetic dataset.

It requires the following:

- **Generation interval**: PMF for secondary infection timing.
- **Initial conditions ($I(0)$ and $\log(\mathcal{R}(0))$)**: Infection prevalence at time $0$ as a proportion of the population, and the starting point of the $\log(\mathcal{R}(t))$ trajectory.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- **Initial conditions ($I(0)$ and $\log(\mathcal{R}(0))$)**: Infection prevalence at time $0$ as a proportion of the population, and the starting point of the $\log(\mathcal{R}(t))$ trajectory.
- **Initial conditions ($I(0)$ and $\log(\mathcal{R}(0))$)**: Infection incidence at time $0$ as a proportion of the population, and the starting point of the $\log(\mathcal{R}(t))$ trajectory.

The logit-scale representation keeps sampled rates between 0 and 1 after transformation back to the probability scale.

We use a common generic baseline rate and a logit-scale standard deviation of $0.3$, matching the prior scale used by custom-H-E.
The prior correlation of $0.5$ says that the two rates may differ, but prior draws that are high for hospital ascertainment also tend to be high for ED ascertainment.
Copy link
Copy Markdown
Collaborator

@damonbayer damonbayer May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention that correlated ascertainment is important for identifiability.

The day-of-week effect multiplies the expected daily ED visits before the count observation model is applied.
Each day of the week has its own multiplier: a value above 1 increases expected ED visits on that day, and a value below 1 decreases them.

Following the custom H+E model, we place a symmetric Dirichlet prior with concentration value $5$ on these weights and scale the draw by $7$.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dylanhmorris Can you suggest a justification to include here?

)
hosp_plot_df["week_start"] = hosp_plot_df["week_end"] - pd.Timedelta(days=6)

raw_obs_plot = (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking two dot plots would be a better visualization, but I kind of like that the thickness of the bars calls attention to the daily vs weekly scale.

)
```

We compare the model estimates to the data-generating parameters.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it easier to compare the true value to the posterior? By putting them in the same table?

rt_comparison_df = pd.concat(rt_comparison_parts, ignore_index=True)
```

```{python}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This figure is quite cramped. Override the default size?

Image

Comment on lines +1092 to +1111
```{python}
#| label: synthetic-recovery-check

rhat = pyrenew_parameter_summary["r_hat"].astype(float)
ess = pyrenew_parameter_summary["ess_bulk"].astype(float)
print(f"Max R-hat: {rhat.max():.3f}")
print(f"Min bulk ESS: {ess.min():.0f}")

latent_inf = idata_trimmed.posterior["latent_infections"]
inf_q05 = latent_inf.quantile(0.05, dim=["chain", "draw"]).values
inf_q50 = latent_inf.quantile(0.50, dim=["chain", "draw"]).values
inf_q95 = latent_inf.quantile(0.95, dim=["chain", "draw"]).values

true_infections = daily_infections["true_infections"].to_numpy()
n_compare = min(len(true_infections), len(inf_q05))
inf_covered = (true_infections[:n_compare] >= inf_q05[:n_compare]) & (
true_infections[:n_compare] <= inf_q95[:n_compare]
)
print(f"90% interval coverage for true infections: {np.mean(inf_covered):.1%}")
```
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table needs more commentary. I'm not sure what it's about.

)
+ theme_tutorial
)
```
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense here to show the posterior predictive for E and H, not just a simple MA.

)
+ theme_tutorial
)
```
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we learn from this plot that we didn't learn from the previous one? Why bin into two week estimates?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After further reading, I see that this introduces the idea of relative widths, which are an important comparison point between the models.

rt_plot
```

Rt is useful as a companion to latent infections because it focuses on epidemic shape rather than absolute infection scale.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Rt is useful as a companion to latent infections because it focuses on epidemic shape rather than absolute infection scale.
$\mathcal{R}(t)$ is useful as a companion to latent infections because it focuses on epidemic shape rather than absolute infection scale.

```

Rt is useful as a companion to latent infections because it focuses on epidemic shape rather than absolute infection scale.
The plot below compares Rt across all three models against the synthetic data-generating Rt trajectory.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intro only mentions 2 models. Perhaps I was not reading carefully enough. Can we state what the three models are at the very beginning?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the names could be a bit clearer, as well.

Something like

pyrenew-H-E -> independent-ascertainment
linked-pyrenew-H-E -> correlated-ascertainment
custom-H-E -> ratio-ascertainment

?

.reset_index()
)
latent_width_summary
```
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above table needs much more detail, I think. Mean relative width of what?


```{python}
#| label: linked-latent-infections
#| fig-cap: Posterior latent infections under the two PyRenew parameterizations
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized these figure captions aren't showing in the html (I have just been reviewing the html, not the qmd). We should figure out how to display them.

)
+ theme_tutorial
)
```
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit cramped. Can we make it taller?

Image

)
+ theme_tutorial
)
```
Copy link
Copy Markdown
Collaborator

@damonbayer damonbayer May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yikes. They all fail to recover the IEDR and IHR. Can we plot the priors too? Perhaps in a facet above?

)
+ theme_tutorial
)
```
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would love to see priors here too.

Copy link
Copy Markdown
Collaborator

@damonbayer damonbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work @cdc-mitzimorris! I have only reviewed the new tutorial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants