Missing data: partial missingness, unroll_missing, diagonal obs models #149
…arch/dynestyx into db-missingness-partial
…cordingly; added a demo notebook
This is fantastic @dimkab, thank you! So far I went through the notebook:
Some thoughts / questions (will look more carefully and possibly answer my own questions):
Thanks @mattlevine22!
I agree, this makes more sense as the default. The question is: what if users aren't aware they have missing data at all and so don't specify anything? Should we at least emit a warning that missing data is present and say what we are doing about it (unrolling by default)?
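A minimal sketch of the warning idea above, assuming observations arrive as a `(T, D)` array with NaN marking missing entries. The helper name `warn_if_missing` and the message wording are hypothetical and not part of dynestyx:

```python
import warnings

import numpy as np


def warn_if_missing(obs_values, unroll_missing=True):
    """Hypothetical helper: warn when NaNs are present and say how they are handled."""
    obs = np.asarray(obs_values, dtype=float)
    nan_mask = np.isnan(obs)
    if not nan_mask.any():
        return False  # nothing missing, stay silent

    # Count rows (time steps) that contain at least one missing entry
    n_rows = int(nan_mask.any(axis=-1).sum())
    handling = (
        "unrolling over the full time grid"
        if unroll_missing
        else "filtering out fully missing rows"
    )
    warnings.warn(
        f"Found NaNs in {n_rows} of {obs.shape[0]} observation rows; {handling}.",
        UserWarning,
        stacklevel=2,
    )
    return True
```

Calling this once at the start of simulation would tell users who never set `unroll_missing` which behaviour they are getting.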
yes, see
good idea, will do. This is actually how I'm using this in CIS anyway.
In the notebook there are a couple of cells which print the actual returned trajectory length in this case (actually in both cases); the question is whether a plot would add anything substantial here.
take a look now
@mattlevine22 actually regarding building the SDE model - this may not work because
That's for state evolution... in the code, it doesn't look like you require decoupled state transitions (although now I see that the notebook only treats this case). The decoupled state case is a little bit niche... it won't really work for the CIS application, right?
not sure I understand what you mean by building the SDE model then... you suggested going through a Discretizer(), which as far as I can see returns MultivariateNormal - unless I implement my own DiagonalEulerMaruyama assuming a diagonal diffusion coefficient... OR: do you mean I switch to a Diagonal observation model here?
made |
Euler-Maruyama is a time-discretization -- i.e., it translates the
dynestyx/dynestyx/discretizers.py Lines 189 to 197 in e1fb88a
So if you have an observation model with diagonal Gaussians, that should persist.
my observation model is Dirac here...
Sorry, I guess I only half-read... but the point is the same, that whatever special observation model structure you have is preserved by
I'm also concerned that this is addressed w/ the Instead, @DanWaxman and I were picturing that things in
Doing the above will:
The algebra is straightforward: For the full MVN you would need to apply these
I am not sure what the big plan is for this PR. Are you OK with solving the diagonal Gaussian / Dirac case first? The interactions are encoded in the drift, so I don't think we lose anything.
Yes, I'm OK with solving the simpler problem first. But I'm worried about doing it in a way that:
but how can you do
To be precise: for Gaussian observations you need a diagonal observation covariance (and this is enforced); the state transition covariance can be anything because the states are fully sampled anyway. The enforcement for the state noise and the initial condition covariance comes into play only for Dirac observations.
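To illustrate why the diagonal restriction makes partial rows easy: with a diagonal covariance the Gaussian log-density is a sum of independent per-dimension terms, so scoring only the observed dimensions amounts to dropping the missing terms from the sum. The sketch below is a standalone NumPy illustration under that assumption; the function name and signature are hypothetical and are not the `masked_log_prob` implementation in this PR.

```python
import numpy as np


def masked_diag_gaussian_log_prob(y, obs_mask, mean, scale):
    """Sum per-dimension Normal log-densities over observed dims only.

    Hypothetical standalone function; `scale` holds per-dimension standard
    deviations (the square roots of the diagonal of R).
    """
    y = np.asarray(y, dtype=float)
    obs_mask = np.asarray(obs_mask, dtype=bool)
    mean = np.asarray(mean, dtype=float)
    scale = np.asarray(scale, dtype=float)

    # Replace missing entries with the mean so no NaN enters the arithmetic
    safe_y = np.where(obs_mask, y, mean)
    z = (safe_y - mean) / scale
    per_dim = -0.5 * z**2 - np.log(scale) - 0.5 * np.log(2.0 * np.pi)

    # Missing dims contribute exactly zero to the total
    return np.where(obs_mask, per_dim, 0.0).sum(axis=-1)
```

With a full covariance the same score would require the marginal over the observed dimensions, i.e. selecting the observed sub-block of the covariance row by row, which is the extra work the diagonal case avoids.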
From the new notebook: "The two dimensions are independent in both transitions and observations, which is precisely why
yeah sorry, this should only concern observations.
Summary
- `DiscreteTimeSimulator`: `unroll_missing=True` (default) produces full-length output with gap imputation when entire rows are missing
- `ObservationModel.masked_log_prob` interface for per-dimension scoring
- `DiagonalLinearGaussianObservation` and `DiagonalGaussianObservation` observation models

Core changes

dynestyx/simulators.py

- `unroll_missing: bool = True` field on `DiscreteTimeSimulator`. The `_simulate` method dispatches to:
  - `_simulate_missing_scan`: when partial NaNs or `unroll_missing=True` with missing rows
  - `_simulate_row_filter`: when `unroll_missing=False` with missing rows
  - `_simulate_plate` / `_simulate_scan`: no missingness
- `_simulate_missing_scan` has two sub-paths:

Sub-path A (`DiracIdentityObservation`), factor + masked-sample:

- `numpyro.factor(obs_lp)` scores the transition log-prob
- `numpyro.handlers.mask(mask=~obs_mask) { numpyro.sample(trans_dist) }`: the mask zeros out both model and guide ELBO contributions at observed dims, so `AutoNormal` allocates parameters but they are inert (zero gradient)
- `_step_dirac_partial` (per-dim vector mask, requires `Independent(Normal, 1)` transitions, uses `base_dist`) and `_step_dirac_wholerow` (scalar mask, works with any transition including `MultivariateNormal`)

Sub-path B (non-Dirac obs models):

- `obs_model.masked_log_prob(y, obs_mask, x, ...)` for partial rows, or full `log_prob` zeroed at missing rows via `numpyro.handlers.mask`

dynestyx/models/

- `ObservationModel.masked_log_prob`: scores only observed dims; default raises `NotImplementedError`
- `DiagonalLinearGaussianObservation`: linear H with diagonal R, supports `masked_log_prob`
- `DiagonalGaussianObservation`: nonlinear obs function with diagonal R, supports `masked_log_prob`

Tests

- `tests/test_with_missing_data.py`: 39 tests covering all combinations of:
  - `unroll_missing`: True/False
- `DYNESTYX_SMOKE_TEST` env var (set in `scripts/test.sh`)

Tutorial notebook (08)
Sections: whole-row block missingness (default + row-filter paths, side-by-side posterior comparison), per-dimension partial missingness, per-particle contiguous gaps with SVI.
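The dispatch rule listed under Core changes can be sketched as follows. This is only an illustration of the stated conditions, assuming a `(T, D)` observation array with NaN for missing entries; the function name and the returned labels are hypothetical stand-ins for the private `_simulate_*` methods.

```python
import numpy as np


def choose_simulation_path(obs_values, unroll_missing=True):
    """Hypothetical sketch: pick a simulation path from the NaN pattern."""
    nan_mask = np.isnan(np.asarray(obs_values, dtype=float))

    # A row is "whole-row missing" if every dimension is NaN,
    # and "partial" if some but not all dimensions are NaN.
    row_all_missing = nan_mask.all(axis=-1)
    row_partial = nan_mask.any(axis=-1) & ~row_all_missing

    if row_partial.any():
        # Partial NaNs always take the scan-based path
        return "missing_scan"
    if row_all_missing.any():
        # Whole-row gaps: unroll (impute the gaps) or drop the rows
        return "missing_scan" if unroll_missing else "row_filter"
    return "plate_or_scan"


partial = [[1.0, np.nan], [2.0, 3.0]]
whole_row = [[np.nan, np.nan], [2.0, 3.0]]
print(choose_simulation_path(partial, unroll_missing=False))
print(choose_simulation_path(whole_row, unroll_missing=False))
```

Note that `unroll_missing=False` only changes the outcome when every missing row is missing in full; any partially observed row forces the scan-based path.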
Test plan
- `scripts/test.sh`: smoke tests pass (~2 min)
- `scripts/test_full.sh`: full convergence tests pass (~16 min)

🤖 Generated with Claude Code