v3: enum-tagged dispatch for singleton search strategies (DRAFT — ignore until reviewed by @ChrisRackauckas)#73
Conversation
965878f to
2170a2c
Compare
Update: rebased + correctness fixes (2170a2c)Rebased onto current Review of the props-aware UniformStep path found four bugs, all fixed in
Also in this push: Tests: 105,020 pass on Julia 1.12 and 1.10 locally (was 104,964; +56 regression tests). Runic clean. |
|
CI triage for this PR and On this PR's own CI: all Core cells (1/lts on 3 OSes) and QA pass; the docs breakers are fixed here too ( Note for merge ordering: #86 and #87 touch |
|
Production cleanup ( Also fixed Re-ran the full suite after the cleanup: 105,020/105,020 pass on Julia 1.12; Runic clean. No TODO/FIXME/dev markers anywhere in the diff. |
Update: back-compat layer removed (
|
Replaces the v2 `Base.searchsortedlast(::SearchStrategy, ...)` multimethod
dispatch on strategy struct singletons with a single FFF-owned dispatcher
tagged by a `StrategyKind` enum. The runtime `if/elseif` over the enum
value is well-predicted in hot loops, the kernel bodies inline, and the
return path stays concrete (`Int`) regardless of which kind is picked at
runtime — none of the `Union`-return pathology that v2's design suffered
from when the chosen strategy depended on runtime data (e.g. `Auto`'s
decision tree returning `Union{BracketGallop, LinearScan, BinaryBracket}`).
The `bench/enum_vs_typeparam_dispatch.jl` sweep confirms ~0 ns overhead
vs. the v2 path across 20 representative cells (worst case +2.2%, several
cells show enum dispatch beating legacy multimethod by 3-6%).
Stateful strategies (`Auto`, `GuesserHint`) stay on the multimethod path
because they carry per-instance data that doesn't fit into a singleton
tag.
Layout:
- src/kinds.jl: `@enum StrategyKind` + `search_last` / `search_first`
enum dispatchers.
- src/kernels.jl: per-strategy kernel functions
(`_kernel_last_bracket_gallop`, etc.), lifted out of the v2 method
bodies. No semantic changes.
- src/legacy_dispatch.jl: `Base.searchsortedlast(::S, ...)` shims for
each singleton strategy struct, forwarding to `search_last(KIND_X, ...)`.
Scheduled for removal in v4.
- src/strategies.jl: `Auto` now holds a `StrategyKind` field. `Auto()`
defaults to `KIND_BINARY_BRACKET`; `Auto(v)` resolves the kind from
`length(v)` + `SearchProperties(v)`.
- src/auto.jl: `Auto`'s per-query `search_last` / `search_first` is a
one-line forward to the stored kind. The batched dispatcher
re-resolves the kind from `(v, queries)` (gap heuristic).
- src/guesser.jl: `GuesserHint` dispatches via `search_last(::GuesserHint, ...)`
method (no kind tag; stateful).
- src/findequal.jl: `findequal(::StrategyKind, v, x[, hint])` direct.
Back-compat: every v2 call site (e.g.
`searchsortedlast(BracketGallop(), v, x, hint)`) still works via the
shim — confirmed by the existing 100,472-test suite (now 104,796 tests
with the new v3 API safetestset).
DataInterpolations.jl: needs zero updates to work with v3. Optional
optimisation: replace `A.strategy = FindFirstFunctions.BracketGallop()`
with `A.strategy = FindFirstFunctions.KIND_BRACKET_GALLOP` and
`searchsortedlast(strat, ...)` with
`FindFirstFunctions.search_last(strat, ...)` to skip the shim layer.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Make `SearchProperties` parametric on the data ratio type
`T = typeof(oneunit(eltype(v)) / oneunit(eltype(v)))` and add two new
fields:
- `first_val::T` — `v[1]` (or `first(r)` for an AbstractRange)
- `inv_step::T` — precomputed `1 / step`, or `(n-1)/(v[end]-v[1])`
for a uniform AbstractVector
These fields are populated when `is_uniform = true` and zero otherwise.
They feed a new props-aware `UniformStep` kernel invoked by `Auto(v)`
when the resolved kind is `KIND_UNIFORM_STEP`: closed-form O(1) lookup
with one subtract, one multiply, one truncate (no per-query float
division). This folds in the never-merged `DirectStep` strategy.
`Auto{T}` is now parametric, carrying `SearchProperties{T}`. `Auto(v)`
returns `Auto{T}` where `T` is the ratio type of `eltype(v)`. Two
`Auto`s constructed from data with the same ratio type share one
concrete type (e.g. `Vector{Int}` and `Vector{Float64}` both promote
to `Auto{Float64}`), so `Vector{Auto{Float64}}` is concrete.
The raw `UniformStep()` singleton keeps its old behaviour for
back-compat: `searchsortedlast(UniformStep(), r, x)` still does
`fld(diff, step)` per query. Only `Auto(v)` routes through the
props-aware path.
Bench (per-query latency, n = 10k, m = 1000):
Sorted queries on AbstractRange (0.0:0.5:N):
Auto(r) [props]: 13.74 ns/q
UniformStep() [fld]: 65.02 ns/q
BracketGallop+hint: 44.29 ns/q
Sorted queries on uniform Vector{Float64}:
Auto(v) [props]: 7.6 ns/q
UniformStep() [bb]: 81.79 ns/q (vector path falls back to BinaryBracket)
BracketGallop+hint: 25.43 ns/q
Random queries on AbstractRange:
Auto(r) [props]: 13.73 ns/q
UniformStep() [fld]: 67.5 ns/q
BracketGallop+hint: 238.41 ns/q (miss path)
The props-aware UniformStep beats raw UniformStep by 5-10× and is
~3-20× faster than BracketGallop on uniformly-spaced data.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
`SearchProperties(::AbstractVector{<:Number})` used to call `T(v[1])`
unconditionally, which trips a `DimensionError` on Unitful `Quantity`
(or any other non-`Real` numeric type whose `T(::Quantity)` conversion
is ill-defined). Split the overload:
- `AbstractVector{<:Real}` keeps the props-aware path: populate
`first_val::T` / `inv_step::T` so `Auto(v)` routes through the
closed-form `KIND_UNIFORM_STEP` kernel.
- `AbstractVector{<:Number}` (non-`Real`) runs only the linearity probe
(which works on any `Number`), returns `SearchProperties{Float64}`
with `is_uniform = false`, and leaves `first_val` / `inv_step` zero.
Auto then resolves to `KIND_BRACKET_GALLOP` (or `KIND_LINEAR_SCAN`),
matching v2 behaviour.
`_auto_is_uniform` is correspondingly narrowed: `AbstractRange{<:Real}`
remains "always uniform", but for `AbstractRange{<:Number}` (e.g.
`StepRange{Quantity}`) we consult `props.is_uniform`, which is `false`
under the new overload. This prevents the closed-form kernel from
being invoked on Unitful ranges.
Fixes the `DataInterpolations.jl` test failure under Unitful eltypes:
`LinearInterpolation(u::Vector{Quantity{...,𝐋}}, t::StepRange{Quantity{...,𝐓}})`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
`SearchProperties{Float64}` now carries `first_val::Float64` and
`inv_step::Float64`, growing `Auto{Float64}` from 16 to 32 bytes. Julia
1.10's kwarg trampoline boxes the resulting NamedTuple as exactly 64
bytes; 1.11 elides the allocation entirely. The `< 64` bound (which fit
1.11 with margin) now reads as `64 < 64` on 1.10. Bump to `<= 64` to
cover both — the call is still a tight closed-form lookup, just with
one more boxed kwarg pointer than v2 had.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Three correctness fixes for the closed-form O(1) lookup: - Clamp the float position f = diff * inv_step in the float domain before truncating. unsafe_trunc(Int, f) is UB once |f| exceeds typemax(Int), which any finite extreme query reaches; on x86 it returned typemin and the clamp then produced the wrong end of the vector (search_last(Auto(v), v, 1.0e300) returned firstindex). - Validate is_uniform exactly. The 11-point sampled probe can be fooled by data that is uniform at the probed points but jittered between them, and a false positive makes the closed form land in the wrong cell. A positive from the sampled pre-filter is now confirmed by an exact O(n) scan over every element before is_uniform is set. - Replace the kernels' single-step roundoff correction with a walk, and fall back to binary search when f is NaN (caller-forced is_uniform = true on a zero-span vector gives inv_step = Inf). A wrong closed-form guess now degrades to a slower search instead of a silently wrong index, which also keeps the searchsorted contract when a caller forces is_uniform on non-uniform data. Also guard the per-query Auto dispatch on props.has_props: an Auto holding the sentinel SearchProperties() has inv_step = 0, which made the closed-form guess degenerate into an O(n) walk from firstindex (~49,000x slower on a 1e6-element range). The sentinel case now routes to the fld-based kind kernel like the batched path already did. Regression tests for all four cases. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
The enum-tagged findequal forms went through the generic
search_first + post-check path for every kind, so the documented v3
migration findequal(BisectThenSIMD(), v, x) ->
findequal(KIND_BISECT_THEN_SIMD, v, x) silently lost the
DenseVector{Int64} bisect-then-SIMD shortcut. Forward that kind to the
struct form, which carries the specialized dispatch.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
- kinds.jl / legacy_dispatch.jl claimed the back-compat shims emit a depwarn; they deliberately do not (NEWS documents why). State the actual v4-removal plan instead. - strategy_kind docstring said Auto throws; it returns the stored kind. - searchsortedrange docstring now describes the actual hint seeding (max(first_idx, hint) for the upper endpoint). - Drop the unused (and broken) bench_per_query function in the props bench. - Strip change-history narration from comments per repo convention. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
makedocs(strict on missing_docs/cross_references) failed on four counts carried over from the v2 docs: - [`searchsortedlast`](@ref Base.searchsortedlast) in batched.jl docstrings cannot resolve (Base docstrings are not in this build) — use plain text. - The [Equality search](@ref) link was ambiguous between the equality.md page title and a strategies.md heading of the same name — rename the heading and target the page slug explicitly. - _simd_scan_ir has no docstring, so its @ref fails — de-link it. - searchsortedrange has a docstring but appeared in no @docs block — add it to interface.md. Verified with a full local docs build (zero errors). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
enum_vs_typeparam_dispatch.jl and uniform_step_props_bench.jl were development sweeps used to justify the v3 dispatch design; their findings are recorded in NEWS.md and the PR discussion, and they have no ongoing maintenance value. The maintained sweeps referenced from the docs (auto_sweep.jl, analyze.jl, bitinterp_sweep.jl) stay. bench/Project.toml pinned FindFirstFunctions = "2.1.0", which cannot resolve against the dev'd 3.0.0 — bumped to "3". Verified the bench environment instantiates with the local v3 package and that every FindFirstFunctions symbol referenced by the remaining scripts exists in v3. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
v3 is a breaking release — drop the back-compat layer instead of carrying it to v4. FindFirstFunctions no longer extends Base.searchsortedlast / Base.searchsortedfirst with strategy methods; search_last / search_first are the only search entry points. - Delete the per-singleton Base shims (legacy_dispatch.jl) and the Base shims for Auto and GuesserHint. - Add search_last / search_first methods that accept a singleton strategy struct directly, forwarding through strategy_kind (which constant-folds for literal strategies, so the struct form costs nothing over the kind form). The structs stay as the friendly strategy names; the file is now strategy_kind.jl. - findequal's struct form and searchsortedrange / the batched sorted loops now route through search_first / search_last internally. - Precompile workload, tests, NEWS, and the docs migration guide updated to the v3-only API. New guard testset asserts no Base.searchsorted strategy methods exist, against accidental reintroduction. The FFF-owned batched API (searchsortedlast! / searchsortedfirst! / searchsortedrange) and the equality API are unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
interface.md still described the v2 contract (extending Base.searchsortedlast/searchsortedfirst per strategy); it now documents search_last / search_first as the API surface and shows custom strategies extending the FFF-owned functions — which also means third-party strategies no longer commit type piracy on Base. auto.md's per-query section described the v2 pick-at-every-query decision tree; it now shows the v3 construction-time kind resolution including the KIND_UNIFORM_STEP props path. Remaining searchsortedlast(strategy, ...) examples in index.md / guessers.md / equality.md renamed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
f4cb7ff to
a64751d
Compare
|
Rebased onto current One expected CI note: the downstream DataInterpolations cells on this PR will now genuinely fail — DI |
|
Correction to my previous note: CI on the rebased push is fully green (all 9 Core cells, QA, Documentation, Downgrade, typos, Runic, and all 6 downstream cells) except The downstream DataInterpolations cells don't fail as I predicted — the canonical downstream workflow handles breaking releases by design: it detects the SemVer resolve conflict (DI master's |
The v3 dispatcher names dropped the 'sorted' cue that Base's searchsortedfirst/searchsortedlast carried — but these functions share that precondition exactly (v must be sorted; assumed, not checked). Put 'sorted' back in the name. The underscore form stays distinct from Base's exported searchsortedfirst/searchsortedlast (so it remains exportable and bare calls resolve to FFF's) and matches the existing FFF family searchsortedfirst!/searchsortedlast!/searchsortedrange. Mechanical rename of the public functions, their methods (Auto, GuesserHint, the struct fallback, the kind dispatchers), the internal _search_*_ dispatch helpers, all call sites, exports, tests, NEWS, and docs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
State explicitly that v must be sorted ascending under order (assumed, not checked, as with Base.searchsortedlast) — the summary previously only implied it via the polarity reference. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Renamed
|
…ecated 'Preferred' implied a lesser-but-available alternative (the v2 Base.searchsortedlast strategy methods), but those are removed. Reframe the strategies overview and the SearchStrategy docstring so the enum tag and the strategy struct read as two ways to call the single search API, not as 'preferred vs legacy'. Drop the stale '(v3 preferred path)' export comment. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Status: Draft. Ignore until reviewed by @ChrisRackauckas. This is a breaking change targeted at v3.0.0.
Summary
Two related refactors land in this PR:
Enum-tagged dispatch for singleton search strategies. Replace the v2
Base.searchsortedlast(::SearchStrategy, ...)multimethod dispatch with a single FFF-owned dispatcher tagged by aStrategyKindenum. The runtimeif/elseifover the enum is well-predicted in hot loops, kernel bodies inline, and the return path stays concrete (Int) regardless of which kind is picked at runtime — none of theUnion-return pathology that v2's design suffered from when the chosen strategy depended on runtime data (e.g.Auto's decision tree returningUnion{BracketGallop, LinearScan, BinaryBracket}).Parametric
SearchProperties{T}+ props-awareUniformStep(folds in Add DirectStep strategy for closed-form O(1) range lookup #74DirectStep).SearchPropertiescarriesfirst_val::Tandinv_step::Tprecomputed at construction. WhenAuto(v)resolves toKIND_UNIFORM_STEP, per-query lookup uses a closed-form O(1) kernel (one subtract, one multiply, one truncate) — no per-query float division. This subsumes theDirectStepstrategy from PR Add DirectStep strategy for closed-form O(1) range lookup #74, which is closed as superseded.Stateful strategies (
Auto,GuesserHint) stay on the multimethod path because they carry per-instance data.Auto{T}is parametric on the data ratio type.Bench
Enum dispatch overhead vs v2 multimethod path
Across 20 representative cells, median Δ = -0.04 ns/q. Worst regression +2.2%; several cells show enum dispatch beating legacy by 3-6% (better register pressure / inlining).
Props-aware UniformStep (per-query latency, n = 10k, m = 1k, BenchmarkTools median):
fld)Vector{Float64}Vector{Float64}The props-aware UniformStep beats raw
UniformStepby 5-10× and is 3-20× faster thanBracketGallopon uniformly-spaced data.DataInterpolations.jl impact (DI PR #531)
DI PR #531 switches
_resolve_strategy(t)fromBracketGallop()toAuto(t). Per-query latency onn=10kLinear:Wins: 15-33% on uniform data. Cost: ~5-20 ns/q on non-uniform Vector (Auto's per-query
s.kind === KIND_UNIFORM_STEPbranch adds register pressure on the BracketGallop path). DI still 4-25× slower than FastInterp on Range — the remaining gap is DI's per-query overhead (Guesser, extrapolation check, linear-interp arithmetic), not the strategy.Breaking changes
Auto()'s per-query path now defers to its stored kind (KIND_BINARY_BRACKETby default) instead of re-running_auto_pick(length(v), hint)on every call. Callers that previously gotLinearScan-on-short-vectors behavior fromAuto()should switch toAuto(v).SearchPropertiesis nowSearchProperties{T}. Code that constructed via the positional 5-tuple constructor (e.g.SearchProperties(true, true, false, false, false)) needs to use the parametric 7-tuple formSearchProperties{Float64}(true, true, false, false, false, 0.0, 0.0). The keyword-argument and array-based constructors are unchanged.Autois nowAuto{T}. Code that wrote::Autoas a type annotation needs to either drop the parameter or useAuto{Float64}/Auto{T} where Tdepending on context.The v2
Base.searchsortedlast(::S, ...)/Base.searchsortedfirst(::S, ...)extensions are removed — v3 does not extend those Base functions at all.search_last/search_firstaccept aStrategyKind, a strategy struct (forwards throughstrategy_kind, constant-folds for literals), or a statefulAuto/GuesserHint. The FFF-owned batched API (searchsortedlast!/searchsortedfirst!/searchsortedrange) is unchanged.Test pass count
CoreQADataInterpolations.jl breakage assessment
DI's PR #531 is updated alongside:
get_idxand the Mooncake@zero_adjointdeclarations migrated from the removedBase.searchsortedlast(::Auto, ...)form toFindFirstFunctions.search_last/search_first.DI Mooncake ext needed a one-line addition:
increment_and_get_rdata!gains a method forRData{<:NamedTuple}to handle the newfirst_val::T/inv_step::Tfields onSearchProperties{T}(they show up asFloat64rdata but aren't differentiable — they're compile-time constants from the knot vector). All 5 DI test groups pass.Layout
Test plan
search_last/search_firstand pass; a guard testset asserts noBase.searchsorted*strategy methods exist.Enum-tagged dispatch (v3 API)safetestset (4,324 tests) exercises everyKIND_Xwith parity vsBase.searchsortedlast.SearchProperties{T} parametric eltypesafetestset (168 tests) covers T-resolution (Float64/Float32, Int → Float64 promotion, non-Real Number, non-Number).Auto{T} parametric + props-aware UniformStep kernelsafetestset covers Range / Vector / LinRange parity and Reverse ordering.🤖 Generated with Claude Code