vectorized d1s #102
Three plausible shapes, each with a different tradeoff:

- (a) Polymorphic index: keep one function, accept `int | Sequence[int]`. Return `Tally` for scalar, `list[Tally]` for sequence. Cleanest API surface, but the return type flips
- (b) Add `indices` kwarg: keep `index` as-is, add a new `indices=None` parameter. If supplied, vectorized path runs and returns `list[Tally]` (or arrays). Lower review risk than
- (c) Two functions (what I did): clearest discoverability, can return `ndarray` from the series path without violating any existing contract, but more API surface.

This PR does (c).
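To make the return-type tradeoff concrete, here is a minimal sketch of shapes (a) and (c). Everything in it is hypothetical: the `Tally` class is a stand-in so the snippet runs without OpenMC, and `correct_a`, `correct_c`, and `correct_c_series` are illustrative names, not the signatures in this PR. Shape (b) is omitted for brevity.

```python
from __future__ import annotations

from typing import Sequence, overload

import numpy as np


class Tally:
    """Hypothetical stand-in for openmc.Tally; holds only a mean array."""

    def __init__(self, mean):
        # Assumed layout: (n_bins, n_radionuclides)
        self.mean = np.asarray(mean, dtype=float)


# (a) Polymorphic index: the return type depends on the argument type,
# so type checkers need overloads and callers need to know which they get.
@overload
def correct_a(tally: Tally, factors: np.ndarray, index: int) -> Tally: ...
@overload
def correct_a(tally: Tally, factors: np.ndarray, index: Sequence[int]) -> list[Tally]: ...
def correct_a(tally, factors, index):
    if isinstance(index, (int, np.integer)):
        return Tally(tally.mean @ factors[index])
    return [Tally(tally.mean @ factors[i]) for i in index]


# (c) Two functions: each one has a single, fixed return type.
def correct_c(tally: Tally, factors: np.ndarray, index: int = -1) -> Tally:
    return Tally(tally.mean @ factors[index])


def correct_c_series(
    tally: Tally, factors: np.ndarray, indices: Sequence[int]
) -> np.ndarray:
    # Returns a plain array of shape (N, n_bins) instead of N Tally objects.
    return factors[list(indices)] @ tally.mean.T


tally = Tally([[1.0, 2.0], [3.0, 4.0]])
factors = np.array([[1.0, 0.0], [0.5, 0.5]])  # (n_times, n_radionuclides)

single = correct_a(tally, factors, 1)              # a Tally
many = correct_a(tally, factors, [0, 1])           # a list of Tally
series = correct_c_series(tally, factors, [0, 1])  # an ndarray
```

With (a), the caller has to branch on what came back; with (c), the series function can return an array without changing what the existing single-index function promises.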
Description
Adds `apply_time_correction_series` to `openmc.deplete.d1s`, a vectorized variant of `apply_time_correction` that evaluates many time indices in a single matrix multiplication.

Calling `apply_time_correction` in a loop over N time indices deep-copies the tally and re-multiplies its `sum`/`sum_sq`/`mean`/`std_dev` arrays N times. The new function builds an `(N, n_radionuclides)` factor matrix and folds the radionuclide-axis sum into a single matmul, so all indices are evaluated in one pass.

Returns NumPy arrays rather than a list of derived `Tally` objects: constructing N derived tallies (each with its own copy of `_sum`/`_sum_sq`/`_mean`/`_std_dev`) defeats the memory advantage on fine-mesh tallies. Users who need a `Tally` per index can build one from the returned arrays.

Motivation: shutdown dose-rate analysis routinely needs a full dose-vs-time curve, which means evaluating the same `time_correction_factors` dict at every index in the cooling schedule. For a 90-timestep schedule on a ~10⁶-voxel mesh the loop spends most of its time in repeated copy + elementwise multiply; the matmul-based path is ~5–15× faster on typical workloads (matmul hits BLAS, the per-iteration `copy(tally)` is gone, and `_sum`/`_sum_sq` are no longer materialized for results that are typically read-only).

Fixes # (issue): N/A
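As a rough illustration of the matmul formulation described above, the following NumPy-only sketch compares a per-index loop with a single matrix product. It is not the code in this PR: the function names, the `(n_bins, n_radionuclides)` layout of `mean`, and the dense `(n_times, n_radionuclides)` `factors` array are all assumptions made for the example, and only the mean is handled (no uncertainty propagation).

```python
import numpy as np


def loop_correction(mean, factors, indices):
    """Baseline: one weighted radionuclide-axis sum per time index."""
    # mean:    (n_bins, n_radionuclides)   assumed layout
    # factors: (n_times, n_radionuclides)  assumed layout
    return np.stack([(mean * factors[i]).sum(axis=1) for i in indices])


def series_correction(mean, factors, indices):
    """Vectorized: gather the (N, n_radionuclides) factor rows, then one matmul."""
    factor_matrix = factors[list(indices)]  # (N, n_radionuclides)
    # The matmul performs the multiply and the radionuclide-axis sum together.
    return factor_matrix @ mean.T           # (N, n_bins)


rng = np.random.default_rng(0)
mean = rng.random((1000, 5))     # 1000 mesh bins, 5 radionuclides
factors = rng.random((90, 5))    # 90 time steps
indices = range(90)

looped = loop_correction(mean, factors, indices)
vectorized = series_correction(mean, factors, indices)
assert np.allclose(looped, vectorized)
```

Both paths produce an `(N, n_bins)` array; the vectorized one avoids the per-index temporary of shape `(n_bins, n_radionuclides)` and hands the reduction to the BLAS-backed matmul.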
Checklist