Add ArrayDiscretization strategy for vectorized interior equations#531
Open
ctessum-claude wants to merge 38 commits into
Open
Add ArrayDiscretization strategy for vectorized interior equations#531ctessum-claude wants to merge 38 commits into
ctessum-claude wants to merge 38 commits into
Conversation
Store Symbolics.Arr directly in discvars instead of materializing to
Vector{BasicSymbolic} via unwrap.(collect(...)). This preserves the
symbolic array structure through discretization.
Since Symbolics.Arr does not support fancy indexing with
Vector{CartesianIndex}, add a _disc_gather utility that handles both
scalar and vector index types via comprehensions. Also add a
RefCartesianIndex method for interface boundary stencils.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement ArrayDiscretization as a new discretization strategy that pre-computes stencil information once and applies it per-point, bypassing the expensive per-point scheme-detection logic used by ScalarizedDiscretization. Supports 1D uniform grids with centered differences and Dirichlet BCs, including higher-order stencils (approx_order=4) with boundary proximity handling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…x template Replace per-point equation generation with a template-based approach: - Build FD stencil and variable/grid rules once using a symbolic index _i from SymbolicUtils.idxs_for_arrayop(SymReal) - Transform the PDE once to produce a template expression - Instantiate at each interior point via pde_substitute(_i => k) This gives O(1) symbolic construction cost for the centered-stencil region instead of O(N). Near-boundary interior points and non-uniform grids fall back to the per-point approach. Add tests for uniform grid with parameter and symbolic structure verification. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hand-rolled centered-difference logic in _per_point_eqs with delegation to discretize_equation_at_point from the scalar path. This unlocks ALL scheme types for ArrayDiscretization: upwind, nonlinear Laplacian, spherical, mixed derivatives, WENO, callbacks, and multi-D. Key changes: - _per_point_eqs now calls discretize_equation_at_point (12 lines vs 35) - generate_array_interior_eqs adds can_template gate: uniform + even orders + 1D required for template path; otherwise full per-point - Template validation: compares first template eq against scalar path to catch nonlinear Laplacian and other compound derivative forms - New tests: upwind convection (Burgers), nonlinear diffusion, 2D diffusion — all using ArrayDiscretization Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… differences Remove the length(interior_ranges) == 1 guard so 2D/3D uniform-grid PDEs with even-order derivatives use the template path instead of per-point fallback. Key changes: - Per-dimension boundary-point-count vectors instead of scalars - N symbolic indices (_i1, _i2, ...) with per-depvar dimension mapping - CartesianIndices iteration for boundary frame and centered region - Add 2D template-vs-scalar, 2D structural, and 3D comparison tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the template-instantiate approach (which produced N scalar equations via pde_substitute loop) with actual SymbolicUtils.ArrayOp expressions. The centered-stencil region now produces a single array-valued equation: D(u_arrayop) ~ stencil_arrayop Key changes: - Build RHS as ArrayOp parameterized by symbolic indices - Build LHS as D(u_arrayop) with derivative outside the ArrayOp (scalarize/substitute cannot penetrate Differential nodes) - Extract spatial RHS from rearranged PDE (lhs-rhs~0 form) by subtracting the known time-derivative template - Validation still uses pde_substitute (which penetrates Differential) - MTK currently scalarizes ArrayOps via flatten_equations, producing identical scalar equations; when MTK adds native array support, this will compile to vectorized loops Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add UpwindStencilInfo and precompute_upwind_stencils to cache both wind-direction operators for odd-order derivatives. Implement _build_upwind_rules using @rule pattern matching (mirroring the scalar path's generate_winding_rules) to produce IfElse.ifelse wind-switching expressions with ArrayOp-parameterized stencils. Add _build_mixed_derivative_rules for cross-derivatives using Cartesian products of 1D centered stencils. Relax can_template guard to allow upwind (UpwindScheme) and mixed derivatives on uniform grids, with upwind BPC included in boundary-proximity calculation. Three new tests: upwind Burgers structural (ArrayOp produced), upwind Burgers ArrayOp-vs-scalar match, and 2D mixed derivative ArrayOp-vs-scalar match. All 15 array_disc_tests pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…itution
Phase 3: Add ArrayOp template support for the spherical Laplacian pattern
r^{-2} * Dr(r^2 * Dr(u)). This builds on the nonlinear Laplacian
infrastructure, combining centered D1 derivatives with the existing
half-offset nonlinlap scheme.
Key changes:
- _detect_spherical_terms: Uses split_additive_terms (not split_terms)
to preserve the complete spherical expression as a single matchable term
- _spherical_template: Builds ArrayOp-indexed discretization using linear
grid formula (avoids Const-wrapped Float64 vector indexing issues)
- _substitute_terms: New helper that processes additive terms individually,
applying termlevel_dict to matched terms and rdict to unmatched terms.
This avoids passing template values (containing Const-wrapped arrays)
through pde_substitute, whose maketerm reconstruction tries to literally
index concrete arrays with symbolic indices.
- Handles PDE pipeline rearrangement to lhs ~ 0 form by processing both
sides of the equation via _substitute_terms.
Tests: 4 new spherical ArrayOp tests (basic, symbolic structure, with
coefficient, higher order approx_order=4). All existing tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extend the ArrayOp template path to handle non-uniform grids for centered
(even-order) derivatives. Previously, any non-uniform grid triggered a full
per-point fallback. Now, per-point stencil weights are stored in a
Const-wrapped matrix indexed by symbolic grid indices, producing a single
ArrayOp equation instead of N scalar equations.
- Add weight_matrix field to StencilInfo for non-uniform weight storage
- Convert Vector{SVector} stencil weights to Matrix in precompute_stencils
- Relax can_template guard: only upwind/nonlinlap/spherical require uniform
- Index weight matrix with symbolic point index aligned to boundary frame
- Add 5 non-uniform grid tests (1D, 2D, coefficient, higher order, structure)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix a prerequisite bug where the non-uniform CompleteUpwindDifference constructor overrode the caller's offside parameter to 0, causing the positive-wind operator to store incorrect offside in the DerivativeOperator. This was invisible on the scalar path due to a compensating assertion, but blocked correct ArrayOp indexing. Changes: - upwind_diff_weights.jl: Remove offside=0 override (line 154) - upwind_difference.jl: Remove @Assert D.offside==0, fix stencil_coefs indexing to use II[j] - D.offside for the !ispositive branch - generate_array_fd_rules.jl: Extend UpwindStencilInfo with weight matrices, populate them for non-uniform grids, update _upwind_stencil_expr to index into Const-wrapped weight matrices using symbolic grid indices, fix boundary frame to use per-side boundary point counts, remove upwind from schemes_needing_uniform - array_disc_tests.jl: Add 5 Phase 8 tests verifying scalar path fix, ArrayOp-vs-scalar match, symbolic structure, parameter support, and advection-diffusion combo on non-uniform grids Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Build the WENO5 (Jiang-Shu) formula as a native symbolic ArrayOp template, enabling vectorized computation for WENO first-order derivatives on uniform grids. The template directly transcribes weno_f using Const-wrapped array taps — no branching, all substencil reconstructions computed and soft-blended via nonlinear weights. Changes: - Add WENOStencilInfo struct and precompute_weno_stencils - Add _weno_template (WENO5 formula as symbolic expression) - Add _build_weno_rules (term-level pattern matching for Dx(u)) - Relax precompute_upwind_stencils guard to also populate for FunctionalScheme windmap operators (odd orders >= 3) - Update generate_array_interior_eqs with weno_cache, has_weno in can_template, WENO BPC in boundary frame - Update _build_interior_arrayop to build and include WENO rules - Add 5 Phase 10 tests: basic convection, coefficient-multiplied, mixed advection+diffusion, symbolic structure, 3rd-order coexistence After this, only generic user-defined FunctionalScheme falls back to per-point computation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…able, Neumann/Robin BCs Remove !haslower/!hasupper guards from BPC computation in generate_array_interior_eqs so boundary_point_count is always accumulated regardless of BC type. This fixes ArrayDiscretization for periodic BCs where haslower/hasupper returns (true, true). Add Phase 11 tests (11a-11k): periodic BC centered/WENO/upwind, coupled multi-variable diffusion and advection-diffusion, Neumann and Robin BCs, and symbolic structure verification for each category. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Part A: Validate N-D ArrayOp with 2D upwind advection-diffusion, 2D coupled diffusion, and 2D symbolic structure tests. Part B: Add periodic BC support to ArrayOp template so periodic PDEs use vectorized computation instead of per-point fallback. Adds _wrap_periodic_idx helper, periodic dimension detection, and wrapping across all 6 scheme types (centered, upwind, WENO, nonlinear Laplacian, spherical Laplacian, mixed cross-derivatives). All 63 test sets pass with 0 failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… tests for ArrayOp Fix _build_interior_arrayop to handle algebraic equations (PDAE) by detecting absence of time derivatives and falling back to per-point discretization. Refactor ODE coefficient extraction to use a fast path for standard Dt coefficient (+1) that avoids pde_substitute on Const-wrapped arrays, with fallback to substitution-based extraction for non-standard forms (e.g. v ~ Dt(u)). Add 8 validation tests (Phase 13): PDAE diffusion+algebraic, KdV 3rd-order upwind, mixed advection+diffusion+dispersion, beam equation (4th-order coupled), Brusselator 2D periodic, Schrödinger complex-valued PDE, and symbolic structure tests for higher-order upwind and 4th-order spatial derivatives. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eliminate per-point scalar fallbacks for boundary-proximity frame points and algebraic (PDAE) equations: - Algebraic equations now produce ArrayOp directly instead of returning nothing and falling back to per-point discretization - New FullInteriorStencilInfo/FullInteriorUpwindStencilInfo structs with position-dependent weight+offset matrices that fold boundary stencils from DerivativeOperator into a single ArrayOp covering ALL interior points - Conditional frame elimination: when equation has only centered+upwind derivatives (no nonlinlap/spherical/WENO/periodic), the frame per-point loop is skipped entirely - Nonlinlap/spherical/WENO equations correctly retain frame per-point behavior - Phase 14 tests (E1-E8) covering PDAE, full-interior centered/upwind, higher-order, 2D, non-uniform, mixed, and nonlinlap regression Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eliminate `has_nonlinlap` from the full-interior gate condition so that nonlinear Laplacian PDEs (Dx(expr * Dx(u))) use position-dependent weight+tap matrices in a single ArrayOp covering all interior points, removing the per-point frame equations at boundary-proximity points. Key changes: - Add FullNonlinlapInfo struct and precompute_full_nonlinlap() to build 3D weight+tap matrices indexed by (j_outer, j_inner/interp, grid_point) - Add _nonlinlap_full_template() using Const-wrapped 3D matrices - Fix padded outer tap 0/0 division: set first interp weight to 1.0 for zero-weight padded taps so expr_sym with negative powers (e.g. u^(-1)) doesn't produce NaN - Gate condition now: !(has_spherical || has_weno || nonlinlap_nonuniform) && !any(is_periodic) - Spherical and non-uniform nonlinlap deferred to future phases - Add Phase 15 tests (E1-E8) for nonlinlap/spherical full-interior Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove non-uniform grid guard from precompute_full_nonlinlap, enabling full-interior mode for nonlinlap on non-uniform grids - Add D1 stencil caching in precompute_stencils for spherical variables so precompute_full_interior_stencils builds FullInteriorStencilInfo for the first derivative used by _spherical_template - Update gate condition to only block on WENO and periodic dimensions - Compute fi_nonlinlap when has_spherical (spherical templates use nonlinlap infrastructure internally) - Add Phase 16 tests: non-uniform nonlinlap (1D, 2D), spherical non-uniform, and spherical+centered 2D Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enable periodic BCs on uniform grids to use the full-interior ArrayOp path, eliminating boundary-proximity frame equations. Non-uniform periodic grids fall back to the standard path. Changes: - Gate condition: `all_full_interior` now allows periodic on uniform grids - Stencil precomputation: periodic branches in centered, upwind, and nonlinlap precompute functions always use interior stencils - ArrayOp wrapping: apply `_maybe_wrap` to full-interior index expressions - Validation skip: skip `_equations_match` for periodic (different wrapping structure) - Add `_wrap_grid_periodic` and `_wrap_half_periodic` integer wrapping helpers - Add Phase 17 tests (17a-17g): structural, numerical, upwind, 2D, mixed, nonlinlap Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ase 18) The standard ArrayOp path had an index mapping bug for periodic non-uniform grids: weight matrices had N-2*bpc columns but iteration covered N points, causing out-of-bounds access. Fix by building extended N-column weight matrices with wrapped grid positions for boundary-proximity stencils, and adjusting point_idx to map directly without bpc offset for periodic dims. Changes: - Add _periodic_stencil_positions, _build_periodic_centered_wmat, and _build_periodic_upwind_wmat helper functions - Fix centered stencil path in _build_interior_arrayop for periodic non-uniform - Fix upwind stencil path in _upwind_stencil_expr for periodic non-uniform - Fix mixed derivative path in _build_mixed_derivative_rules for periodic non-uniform - Add Phase 18 tests (18a-18f) covering structural, diffusion, upwind, 2D mixed, higher-order, and regression scenarios - All 300 array_disc_tests pass with zero regressions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Staggered grids are the last remaining discretization scheme without ArrayOp support. This adds ArrayOp-based discretization for staggered grids used in hyperbolic PDE systems (e.g., wave equations) where two dependent variables live on offset sub-grids. - Add StaggeredStencilInfo struct and precompute_staggered_stencils - Generate fixed-offset stencils based on variable alignment (CenterAligned → [0,1], EdgeAligned → [-1,0]) - Gate staggered into can_template; disable upwind rules when active - Add Phase 19 tests: structural, numerical (mixed/periodic BCs), and SplitODEProblem compatibility Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Per SciML style guide, untyped struct fields must explicitly be annotated with ::Any rather than left implicitly untyped. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add _ConstSR type alias to replace 34 verbose type references - Extract _stencil_coefs_to_matrix utility (replaces 10 hcat/splat patterns) - Merge duplicate periodic wmat builders into single _build_periodic_wmat - Replace non-deterministic rand() in _equations_match with fixed test offsets - Hoist allocations out of precompute loops (stencil weights, grid collects) - Cache _weno_template calls to avoid duplicate computation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor
Contributor
|
Second attempt! I don't know exactly what's going on with the brusselator, and I think there are some opportunities for refactoring, but I figured this would be a good point to start a conversation. It's not yet quite at constant time discretization, but it seems categorically faster than the pointwise implementation. I am told that @arrayop and @makearray don't work here because the indexes are determined at runtime rather than parse-time (?), so ArrayOp and MakeArray are used instead. Let me know what you think. |
| offsets = interior_offsets | ||
| elseif g <= bpc | ||
| # Lower frame: use low_boundary_coefs[g] | ||
| weights = Vector{Float64}(D_op.low_boundary_coefs[g]) |
Replace hardcoded Float64 with type parameter T<:Real in 7 structs
(StencilInfo, UpwindStencilInfo, NonlinlapStencilInfo, WENOStencilInfo,
FullInteriorStencilInfo, FullInteriorUpwindStencilInfo, FullNonlinlapInfo)
and ~30 allocation sites. This enables Float32 or other Real subtypes
to propagate through DerivativeOperator{T} into weight matrices.
Key changes:
- Add {T<:Real} type parameter to all struct definitions
- Replace Vector{Float64}(...) with collect(...) to preserve element type
- Replace zeros(Float64,...) with zeros(T,...) using T from operator
- Add _op_eltype helper to extract T from DerivativeOperator{T}
- Use explicit type parameter at construction sites with nothing fields
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
6 tests covering the type-generic changes: - Unit tests for struct construction and helper functions with Float32 - 1D centered diffusion with Float32 grid spacing - Float32 vs Float64 solution comparison - Upwind advection with Float32 - Nonlinear diffusion (nonlinlap) with Float32 - WENO advection with Float32 (using exact binary dx=0.125) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move StableRNGs from [deps] to test-only (it's unused in src/)
- Extract _validate_arrayop_or_fallback helper to deduplicate validation logic
- Move _contains_time_diff to module level (pure utility, no closure capture)
- Remove dead code: unused bs/haslower/hasupper in BPC loop
- Remove unnecessary enumerate() where index was discarded
- Tighten StaggeredStencilInfo.alignment from ::Any to ::Type{<:AbstractVarAlign}
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Equations without spatial derivatives (e.g., algebraic equations like R(t,x) ~ f(params)) were incorrectly falling back to per-point scalar discretization when the system also contained PDEs with odd-order spatial derivatives. This happened because `can_template` used the global `derivweights.orders[x]` which includes derivative orders from ALL PDEs, and the scheme-specific caches (WENO, upwind, etc.) had no entries for variables without spatial derivatives. Add `_contains_spatial_diff` helper and an early-return fast path in `generate_array_interior_eqs` that bypasses the `can_template` gate and boundary frame computation for equations with no spatial derivatives, producing a single ArrayOp instead of N×M scalar equations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tude) Move upwind fallback rules from termlevel_rules to fd_rules so that bare derivatives like Dx(u) inside nonlinear expressions (e.g. sqrt(Dx(u)^2 + Dy(u)^2)) are substituted via pde_substitute, matching the scalar path's generate_winding_rules fallback behaviour. Also extend precompute_stencils and the fd_rules loop to include odd-order centered stencils, and broaden can_template for non- FunctionalScheme advection schemes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace `boundary_f = [nothing, nothing]` in WENOScheme with actual boundary stencil functions so WENO frame points produce correct derivative approximations instead of falling back to extrapolation. The four boundary functions use 3rd-order one-sided finite differences (Fornberg weights on a 4-point stencil) at the 2 grid points nearest each domain boundary where the full 5-point WENO5 stencil does not fit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For multi-dimensional problems, periodic BC equations are now generated as a single ArrayOp expression instead of one scalar equation per edge point. This reduces symbolic processing overhead for 2D+ periodic BCs. The ArrayOp tiles `disc1[boundary, tang...] ~ disc2[boundary+offset, tang...]` along the tangential dimensions. Falls back to the scalar path for 1D (single-point edges). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For multi-dimensional problems on CenterAligned and Staggered grids, truncating BC equations (Dirichlet, Neumann, Robin) are now generated as single ArrayOp expressions instead of one scalar equation per edge point. The implementation directly builds symbolic substitution rules using ArrayOp index variables for tangential dimensions, handling: - Variable value maps (u at boundary -> Const(discvar)[boundary, tang...]) - Derivative stencils (Dx(u) -> weighted sum with symbolic tangential indices) - Grid coordinate maps (x -> boundary value, y -> Const(grid)[tang...]) Validates against the scalar path at the first edge point and falls back to scalar for 1D, EdgeAlignedGrid, or HigherOrderInterfaceBoundary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace simple isequal check with the robust _equations_match function (symbolic + numerical comparison) for validating BC ArrayOp equations against the scalar path. Falls back to scalar if validation fails. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In generate_array_interior_eqs, the fast path for equations without spatial derivatives was calling precompute_stencils a second time instead of reusing the already-computed stencil_cache. Also hoist lo_vec/hi_vec extraction before the branching logic so both the fast path and the N-D ArrayOp path share a single computation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add `validate` field to ArrayDiscretization (default: true for correctness). When false, skips scalar-path validation of ArrayOp equations and boundary conditions, saving ~5-15% discretization time. Usage: ArrayDiscretization(validate=false) - Defer stencil cache precomputation (upwind, nonlinlap, WENO, staggered) until after the fast-path returns for equations without spatial derivatives and the can_template check. This avoids unnecessary work for algebraic equations and equations that fall back to per-point scalar discretization. - Use lightweight checks (halfoffsetmap key existence, staggeredvars presence, advection_scheme type) for the can_template decision, then refine with actual cache emptiness after precomputation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Applies the review fixes from the PR feedback without changing any numerical behaviour: - generate_array_fd_rules.jl: rework `_validate_arrayop_or_fallback` to sample up to three local points (first / mid / last) via a closure returned from `_build_interior_arrayop`, so position- dependent stencil bugs can be caught instead of only checking the first centered point. Also initialises `eq_scalar` deterministically and removes the short-circuit dependency for periodic dims. - _equations_match: expose the numerical comparison tolerance as a keyword and document its rationale. - _wrap_periodic_idx: expand the docstring to spell out the 2:N canonical-range convention and how downstream consumers should read index 1. - WENO.jl / advection_schemes.md: call out that `WENOScheme()` now uses 3rd-order one-sided Fornberg stencils at non-periodic boundaries, so users running convergence studies know the boundary order is 3 rather than 5. - upwind_diff_weights.jl / upwind_difference.jl: add explanatory comments around the `offside = 0` removal and the non-uniform `stencil_coefs[II[j] - D.offside]` indexing, tying them back to the bug fix that made ArrayOp non-uniform upwind possible. - spherical_laplacian.jl: update the remaining `s.discvars[v][I]` spot in `rsubs` to go through `_disc_gather` for consistency. - array_disc_tests.jl: relax the KdV Array-vs-Scalar tolerance to `atol = 2e-2`. `rtol = 1e-10` is unachievable for adaptive FBDF on a nonlinear dispersive PDE because tiny RHS differences (the scalar per-point build vs. ArrayOp-flattened build have different associativity) get amplified by timestep selection. The same failure reproduces on the pre-fix branch. Verified with GROUP=ArrayDisc (358 passed), GROUP=Higher_Order, GROUP=Convection_WENO (309 passed), GROUP=Nonlinlap_ADV, GROUP=Wave_Eq_Staggered, GROUP=Burgers, GROUP=Convection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Five low-risk cleanups that reduce duplication and remove a couple of
internal-API dependencies without changing any numerical behaviour.
1. Project.toml: bump PDEBase compat from 0.1.20 to 0.1.22. The PR
already uses `pde_substitute`, which was only added in PDEBase
0.1.22 — previous minimums would resolve to a version that crashes
at discretization time with an UndefVarError.
2. Replace `SymbolicUtils.BSImpl.Const{SymReal}` with
`SymbolicUtils.Const{SymReal}` throughout. The top-level
`SymbolicUtils.Const` alias is the least-internal path to the
Const ctor (Symbolics.jl imports it directly at line 28); reaching
through `BSImpl` ties us to Moshi's `@data` module layout, which
is the most likely identifier to move in a SymbolicUtils minor
rewrite. Also drops the redundant local `_ConstSR` alias in
generate_bc_eqs.jl in favour of the one defined in
generate_array_fd_rules.jl.
3. Replace hand-rolled `_contains_time_diff` / `_contains_spatial_diff`
tree walkers with thin wrappers over `PDEBase.differential_order`.
Drops ~20 lines and aligns the recursion with the rest of the
PDEBase pipeline so upstream fixes (e.g. Symbolics v7 `.order`
handling) are picked up automatically.
4. Introduce `ArrayOpContext` and `StencilCaches` immutable structs
to bundle the state that every `_build_*_rules` helper repeats in
its signature. Before: 15 positional args on
`_build_interior_arrayop`, an 8-tuple copy-pasted across eight
rule-builder signatures, and `is_periodic=falses(length(bases))`
default placeholders duplicated 8 times. After: all rule builders
take `(ctx::ArrayOpContext, caches::StencilCaches, pde, ...)` plus
a small handful of call-site-specific args. The `sample_at`
closure in `_build_interior_arrayop` now captures `ctx` by
reference, so the old tuple of loose parameters threaded through
every helper is gone.
5. Extract a single `_tap_expr(ctx, u_c, u_spatial[, x, off])` helper
to replace 11 copies of the
`taps = map(offsets) do off; idx_exprs = map(u_spatial) do xv … end`
idiom across `_build_interior_arrayop`, `_build_upwind_rules`,
`_build_mixed_derivative_rules`, `_nonlinlap_template`,
`_nonlinlap_full_template`, `_spherical_template`, and
`_weno_template`. Also deletes the parallel
`_upwind_stencil_expr`/`_upwind_full_interior_expr` inner helpers
and folds their work through `ctx`. Three-method dispatch:
- no-args variant for `var_rules` (at-point lookup, no wrap),
- `(x, off)` variant for single-dim stencil shifts (wrapped),
- `(shifts::Dict)` variant for mixed cross-derivatives (wrapped).
Note on wrapping: an earlier version of this refactor routed
`var_rules` through `_maybe_wrap` too, which silently broke the
periodic tests — wrapping the at-point index introduces
`IfElse.ifelse` branches that `pde_substitute`'s `maketerm`
reconstruction cannot traverse (the same workaround reason as
`_substitute_terms`). The no-shift `_tap_expr` therefore
intentionally skips wrapping, and its docstring explains why.
Review refactors 6 (struct-zoo parametric cleanup), 7 (collapse
full-interior vs standard code paths), and 8 (file split into an
array_fd/ subdirectory) are deliberately deferred to a follow-up PR
— 7 interacts with 6 and would land a ~250-line rewrite of the
most intricate part of the ArrayOp builder, and 8 is a mechanical
move that is best done on top of a stable 6+7 base.
Verified with GROUP=ArrayDisc (358 passed / 0 failed / 0 errored).
No other test groups were re-run; refactors 1-5 are all pure refactors
(no behaviour changes) so the earlier broader verification from the
previous commit still applies.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
`UpwindStencilInfo` previously flattened seven parallel fields
(`D_neg`, `D_pos`, `neg_offsets`, `pos_offsets`, `is_uniform`,
`neg_weight_matrix`, `pos_weight_matrix`) into one struct. Each half
is shape-identical to a centred `StencilInfo`, so the struct now
holds two `StencilInfo` fields — `neg::StencilInfo{T}` and
`pos::StencilInfo{T}` — and the tap-building helpers treat both wind
directions uniformly.
Mechanical changes:
- `precompute_upwind_stencils` builds two `StencilInfo`s and wraps
them in an `UpwindStencilInfo`.
- `precompute_full_interior_upwind`, `_build_interior_arrayop`'s
per-dim boundary-point-count computation, and `_build_upwind_rules`
(both primary and fallback paths) update their field accesses:
`usi.D_neg` → `usi.neg.D_op`, `usi.neg_offsets` → `usi.neg.offsets`,
`usi.neg_weight_matrix` → `usi.neg.weight_matrix`, etc. `is_uniform`
now lives on each half (the two halves always have the same value).
- `StencilInfo`'s docstring is updated to reflect that it is now
reused for upwind halves in addition to centred stencils; the
field comments describe what "offsets" means for each case.
Benefit: removes 6 parallel field names, eliminates the duplicated
`is_uniform::Bool` field, and clears a small piece of structural
duplication in `_upwind_stencil_expr`'s call sites. No behaviour
change.
Verified with GROUP=ArrayDisc (358 passed / 0 failed).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous monolithic file was 2955 LOC and mixed stencil
precomputation, PDE pattern detection, rule-building for seven
different discretization schemes, ArrayOp construction, validation,
and fallback logic. Split into thirteen thematic files under
`src/discretization/array_fd/`:
- stencil_info.jl (103 LOC) — StencilInfo / UpwindStencilInfo /
NonlinlapStencilInfo / WENOStencilInfo /
StaggeredStencilInfo
- context.jl (145 LOC) — _ConstSR alias, ArrayOpContext,
StencilCaches, _tap_expr variants,
_op_eltype
- precompute.jl (860 LOC) — all precompute_* caches, the
FullInterior* structs and their
precomputers, _half_op/_half_inner
helpers, stencil_weights_and_taps
- pattern_detection.jl (105 LOC) — _detect_nonlinlap_terms,
_detect_spherical_terms
- validation.jl (143 LOC) — _equations_match, _local_sample_indices,
_validate_arrayop_or_fallback
- interior_driver.jl (275 LOC) — generate_array_interior_eqs orchestrator
- substitution_helpers.jl (85 LOC) — _substitute_terms, _wrap_periodic_idx,
_maybe_wrap, _contains_*_diff
- arrayop_builder.jl (276 LOC) — _build_interior_arrayop
- rules_upwind.jl (192 LOC)
- rules_mixed.jl (95 LOC)
- rules_nonlinlap.jl (341 LOC) — _nonlinlap_template,
_nonlinlap_full_template,
_build_nonlinlap_rules
- rules_spherical.jl (156 LOC)
- rules_weno.jl (152 LOC)
`generate_array_fd_rules.jl` is now a 68-line stub that `include`s
each subfile in a dependency-respecting order, with a comment
explaining why the order matters (struct-dispatched functions need
their types defined upstream; function calls resolve at runtime and
so tolerate forward references).
No behaviour change — this is a purely mechanical reshuffle. The
new layout mirrors the existing `discretization/schemes/*/` pattern
so future contributors have one file per scheme to add/modify instead
of one 3000-line monolith.
Verified with GROUP=ArrayDisc (358 passed / 0 failed).
Refactor 7 (collapse full-interior vs standard code paths) is
deliberately deferred — it has plausible but unverified performance
implications (position-dependent matrix-indexed weights vs constant
dot products in the hot path) and the code-complexity win is more
than offset by the risk during a session without a benchmarking
harness.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
JET's `@report_opt target_modules=(MethodOfLines,)` on a cold
`discretize(pdesys, ArrayDiscretization())` flagged two concrete
runtime-dispatch sites inside the PR's own code. Both are build-time
latency issues, not hot-path; even unfixed, ArrayDiscretization
already runs 3.6× faster with 3× fewer bytes than the scalar
baseline on a 32-point 1D diffusion test. These patches remove the
dispatch churn anyway.
1. `_to_scalar_interiormap` (src/array_discretization.jl) — the
output dict was constructed as plain `Dict()`, so every loop
iteration hit runtime dispatch on `indexed_iterate`, `isempty`,
`first`, `setindex!`, and `iterate` (JET flagged 9 sites in the
~15-line function). Narrowed to `Dict{Any, CartesianIndices}`.
The key stays `Any` because the upstream `InteriorMap.I` uses
heterogeneous PDE-equation keys. Also refactored the loop body
so `scalar_I[pde] = CartesianIndices(...)` always assigns the
same value type — the previous branch assigned a bare `ranges`
in the fallback arm, which would have forced a type widening.
2. `_validate_arrayop_or_fallback`
(src/discretization/array_fd/validation.jl) — all positional
arguments were typed as `Any`, so
`ntuple(d -> lo[d] + local_idx[d] - 1, ndim)` widened to `Tuple`
and the downstream `CartesianIndex(...) ->
discretize_equation_at_point(II::Any, ...)` chain hit runtime
dispatch. Added `n_region/lo/hi::AbstractVector{Int}`,
`ndim::Int`, `is_periodic::AbstractVector{Bool}`, `validate::Bool`,
and wrapped `ntuple` in `Val(ndim)` so it specializes on the
static dimension count.
Several medium-severity struct field types (`StencilCaches`
`Dict{Any, Any}`, `NonlinlapStencilInfo.{outer,inner,interp}_weights::Any`,
`StaggeredStencilInfo.alignment::Type{<:AbstractVarAlign}`,
`ArrayOpContext.idxs::Vector` untyped eltype) are deliberately left
for a follow-up. They are all build-time only, and the biggest
upstream blocker — `DiscreteSpace.discvars::Any` — is outside this
PR's scope.
Verified with GROUP=ArrayDisc (358 passed / 0 failed).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
This PR introduces
ArrayDiscretization, a new discretization strategy that generates interior PDE equations as single symbolicArrayOpexpressions instead of one scalar equation per grid point. This dramatically reduces symbolic processing time during discretization while producing numerically identical solutions (mostly).ScalarizedDiscretizationbehavior is unchangedMOLFiniteDifference([x => dx], t; discretization_strategy = ArrayDiscretization())How It Works
Original approach (
ScalarizedDiscretization)For N interior grid points, the discretizer calls
discretize_equation_at_point()N times, building fresh symbolic substitution rules at each point. This produces N independent scalar equations and has super-linear symbolic overhead.New approach (
ArrayDiscretization)_i1,_i2, ...) — one per spatial dimensionSymbolicUtils.ArrayOpwith range parameters (e.g.,_i ∈ 1:n_interior)Boundary equations are handled by the existing scalar infrastructure unchanged.
Two operating modes
Const-wrapped matrices, eliminating the frame entirelyAutomatic fallback
If the PDE contains patterns the template engine cannot handle (e.g.,
FunctionalScheme), or if template validation against the scalar path fails, the system automatically falls back to per-point scalar discretization.Supported PDE Patterns
Performance Benchmarks
Benchmarks were run across 7 PDE systems (1D and 2D, linear and nonlinear) at multiple grid resolutions, measuring both discretization time and solve time.
Discretization Time Speedup (Scalar / Array)
Absolute Discretization Times at Largest Grid Sizes
Solve Time
Solve times are essentially unchanged (speedup ratio ≈ 1.0× across all systems and grid sizes).
Test Coverage
The new test file (
test/pde_systems/array_disc_tests.jl, 5,437 lines) covers 15 phases:Primary validation: Array vs. Scalar numerical comparison at rtol=1e-10 in most cases, plus analytical solution verification and structural checks.
🤖 Generated with Claude Code