feat(Query): add query complexity framework by kim-em · Pull Request #376 · leanprover/cslib

kim-em · 2026-02-27T01:47:04Z

This PR adds infrastructure for proving upper bounds on the number of queries
(comparisons, oracle calls, etc.) an algorithm makes, using monad parametricity to ensure validity of the bounds.

- TickT m monad transformer with tick counting, Costs predicate, and combinators (Costs.pure, Costs.bind, Costs.monadLift, Costs.ite, etc.)
- RunsIn/RunsInT predicates packaging the parametricity argument
- Demo: insertion sort with quadratic query bound
- Demo: monadic mapSum with functional correctness under tick instrumentation

Quoting from the module doc for RunsIn:

RunsIn f bound (and its generalization RunsInT) assert that a monad-generic algorithm f
makes at most bound x queries on input x.

An algorithm like

def insertionSort [Monad m] (cmp : α × α → m Bool) : List α → m (List α) := ...

is written generically over the monad m. To measure its query complexity, we specialize
m to TickM (or TickT n for algorithms with additional effects) and provide a
cmp implementation that calls tick once per invocation.

Because insertionSort is parametric in m, it cannot observe the tick instrumentation.
It must call cmp the same number of times regardless of which monad it runs in.
Therefore any upper bound proved via TickM is a true bound on query count in all monads.
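The argument above can be tried out on a small scale. The following is a minimal sketch, not the PR's code: `Tick`, `countedLe`, `insertM` and `sortM` are hypothetical names, and the counter is a bare `StateT Nat Id` rather than the PR's `TickT`. It only shows a monad-generic sort being specialized to a counting monad.

```lean
/-- Hypothetical stand-in for the PR's `TickM`: the state is the number of queries so far. -/
abbrev Tick := StateT Nat Id

/-- A comparison on `Nat` that bumps the counter once per call. -/
def countedLe (p : Nat × Nat) : Tick Bool := do
  modify (· + 1)
  pure (decide (p.1 ≤ p.2))

/-- Monad-generic insertion into a sorted list; it only sees `cmp`, never the counter. -/
def insertM {m : Type → Type} [Monad m] {α : Type} (cmp : α × α → m Bool) (x : α) :
    List α → m (List α)
  | [] => pure [x]
  | y :: ys => do
    let le ← cmp (x, y)
    if le then
      pure (x :: y :: ys)
    else
      let rest ← insertM cmp x ys
      pure (y :: rest)

/-- Monad-generic insertion sort built on `insertM`. -/
def sortM {m : Type → Type} [Monad m] {α : Type} (cmp : α × α → m Bool) :
    List α → m (List α)
  | [] => pure []
  | x :: xs => do
    let sorted ← sortM cmp xs
    insertM cmp x sorted

-- Specialize `m` to `Tick` and read off the sorted list together with the query count.
#eval Id.run (StateT.run (sortM countedLe [3, 1, 2]) 0)
```

On this input the sketch should report `([1, 2, 3], 3)`: one comparison to insert 1 into [2], and two to insert 3 into [1, 2]. Because `sortM` never mentions `Tick`, the same definition runs unchanged with a comparison in `Id` or `IO`.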

🤖 Prepared with Claude Code

kim-em · 2026-02-27T01:57:42Z

I have intentionally kept this at the level of "single tick" counting of costs. It is a fairly straightforward change to make this more flexible, which I have not done here to ease reviewing of the essential idea of using monad parametricity.

kim-em · 2026-02-27T01:58:30Z

Cslib/Algorithms/Lean/Query/Basic.lean

+namespace Cslib.Query
+
+structure TickT.State where
+  count : Nat


I would like to make this private, but this breaks a proof below in a way I haven't yet understood.

I think you reported this issue to me in Dec (I hope it's the same one). I handed over the following reproducer to @Kha:

module

public structure State where
  private count : Nat

private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by
  set_option trace.Meta.Tactic.simp true in
  set_option trace.Debug.Meta.Tactic.simp true in
  set_option trace.Debug.Meta.Tactic.simp.congr true in
  set_option trace.Meta.realizeConst true in
  simp [*]

private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by
  omega

private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by
  grind

we traced the crash down to mkCongrSimp?, IIRC it was due to getFunInfo. But I think we never fixed it; should have kept better track of this.

Actually no, it appears this issue has been fixed! At least the reproducer works, and the following works as well (but proofs are broken on latest Mathlib due to DefEq changes...)

module

public import Std.Do.Triple.Basic
import Std.Tactic.Do

open Std.Do

public section

structure TickM.State where
  private count : Nat

@[expose] def TickM (α : Type) := StateM TickM.State α

namespace TickM

instance : Monad TickM := inferInstanceAs (Monad (StateM TickM.State))
instance : LawfulMonad TickM := inferInstanceAs (LawfulMonad (StateM TickM.State))
instance : Std.Do.WP TickM (.arg TickM.State .pure) := inferInstanceAs (Std.Do.WP (StateM TickM.State) _)

def tick : TickM Unit := fun s => ⟨(), ⟨s.count + 1⟩⟩

@[spec]
private theorem tick_spec {Q : PostCond Unit (.arg TickM.State .pure)} :
    Triple tick (fun s => Q.1 () ⟨s.count+1⟩) Q := by
  simp [tick, Triple, wp, Id.run]

So maybe try again to make the field private?

eric-wieser · 2026-02-27T02:10:37Z

Cslib/Algorithms/Lean/Query/Basic.lean

+@[expose] def TickT (m : Type → Type) (α : Type) := StateT TickT.State m α
+
+/-- The tick-counting monad, specializing `TickT` to `Id`. -/
+@[expose] def TickM (α : Type) := TickT Id α


This is effectively identical to TimeM Nat

Add a general correctness theorem for mapSum parameterized by an abstract predicate family `pre : Int → Assertion ps`, which captures "the Int state is c" for any postcondition shape. Both mapSum_spec and mapSum_spec_tick are now derived as corollaries, removing the TODO comment about the desired generalization.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sgraf812

Nice! It would be great if we could prove specs about Costs using mvcgen, though. Can take a look once I'm back from PTO

sgraf812 · 2026-02-27T02:30:26Z

Cslib/Algorithms/Lean/Query/Basic.lean

+namespace Cslib.Query
+
+structure TickT.State where
+  count : Nat


I think you reported this issue to me in Dec (I hope it's the same one). I handed over the following reproducer to @Kha:

module

public structure State where
  private count : Nat

private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by
  set_option trace.Meta.Tactic.simp true in
  set_option trace.Debug.Meta.Tactic.simp true in
  set_option trace.Debug.Meta.Tactic.simp.congr true in
  set_option trace.Meta.realizeConst true in
  simp [*]

private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by
  omega

private example {P : State → Prop} (h : P ⟨State.count s⟩) : P ⟨State.count s⟩ := by
  grind

we traced the crash down to mkCongrSimp?, IIRC it was due to getFunInfo. But I think we never fixed it; should have kept better track of this.

sgraf812 · 2026-02-27T02:33:12Z

Cslib/Algorithms/Lean/Query/Basic.lean

+  count : Nat
+
+/-- A monad transformer that adds tick-counting to any monad `m`. -/
+@[expose] def TickT (m : Type → Type) (α : Type) := StateT TickT.State m α


From my past (limited) experience with defeq abuse, it may make sense to make TickT @[irreducible] and add an explicit injection/projection pair.

sgraf812 · 2026-02-27T02:34:00Z

Cslib/Algorithms/Lean/Query/Basic.lean

+/-- Run a `TickT` computation, starting with tick count 0,
+    returning the result and the final tick count. -/
+def run [Monad m] (x : TickT m α) : m (α × Nat) := do
+  let (a, s) ← StateT.run x ⟨0⟩


(defeq abuse, but probably fine if you formulate your lemmas about .run.)

sgraf812 · 2026-02-27T02:40:00Z

Cslib/Algorithms/Lean/Query/Basic.lean

+
+/-- Instrument a pure function as a tick-counted query.
+    `counted f a` increments the tick counter by 1 and returns `f a`. -/
+@[expose] def counted [Monad m] (f : α → β) (a : α) : TickT m β := do tick; pure (f a)


This function suggests the existence of counted2 [Monad m] (f : α → β → γ) (a : α) (b : β) : TickT m γ etc., similar to liftA<n> in Haskell. It's probably convenient to have counted, but I think it would be convenient to have counted2 and counted3 as well.

Although it would be reasonable for callers to just uncurry their functions before using counted. I think it doesn't scale if we also provide variants of RunsInT below.
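The uncurrying route mentioned above can be sketched as follows. This is an illustration under assumptions: `counted'` and `counted2'` are hypothetical names, and the counter is a bare `StateT Nat m` rather than the PR's `TickT m`.

```lean
/-- Hypothetical stand-in for the PR's `counted`, over a bare `StateT Nat` counter. -/
def counted' {m : Type → Type} [Monad m] {α β : Type} (f : α → β) (a : α) :
    StateT Nat m β := do
  modify (· + 1)
  pure (f a)

/-- A two-argument variant obtained by uncurrying, so no new primitive is needed. -/
def counted2' {m : Type → Type} [Monad m] {α β γ : Type} (f : α → β → γ) (a : α) (b : β) :
    StateT Nat m γ :=
  counted' (fun p : α × β => f p.1 p.2) (a, b)

-- One tick is charged per call, regardless of arity.
#eval Id.run (StateT.run (counted2' Nat.add 2 3) 0)
```

Since `counted2'` is defined through `counted'`, any lemma about the one-argument version transfers by unfolding, which is the reason a separate family of `RunsInT` variants may not be needed.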

sgraf812 · 2026-02-27T02:50:54Z

Cslib/Algorithms/Lean/Query/Basic.lean

+  simp only [Triple.iff]
+  unfold tick
+  show _ ⊢ₛ (PredTrans.pushArg fun s => wp (pure ((), { count := s.count + 1 }) : n _)).apply Q
+  simp only [PredTrans.apply_pushArg, WP.pure]; exact .rfl


This works, but I think it would be simpler and yield a more reusable proof setup if

- You defined injection (TickT.mk) and projection (TickT.runn : TickT m α -> StateT TickT.State m α, because run is taken) functions with simp lemma (TickT.mk x).runn = x
- You defined wp[x : TickT m α] = wp[x.runn : StateT TickT.State m α]
- You proved a simp lemma wp[TickT.mk x : TickT m α] Q = wp[x : StateT TickT.State m α] Q
- You proved a simp lemma for tick.runn being equal to modify (\. + 1), presumably by defining it as TickT.mk (modify (\. + 1))

Then you do ext Q : 1; simp? in this proof to end up with wp[modify (\. + 1)] Q, which is easily discharged by a single mvcgen, I think.
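The first and last items of that setup can be sketched in plain Lean, without the `wp` parts. This is only an illustration: `TickT'` is a hypothetical structure wrapper over `StateT Nat m` (the review suggests an `@[irreducible]` definition over `TickT.State` instead), and the `Std.Do` instances from the middle two items are not shown.

```lean
/-- Hypothetical opaque wrapper around the state transformer.
    `TickT'.mk` is the injection and `TickT'.runn` the projection. -/
structure TickT' (m : Type → Type) (α : Type) where
  runn : StateT Nat m α

namespace TickT'

/-- Projection after injection is the identity; for a structure this holds by `rfl`. -/
theorem runn_mk {m : Type → Type} {α : Type} (x : StateT Nat m α) :
    (TickT'.mk x).runn = x := rfl

/-- `tick` is defined through the injection, so its projection is a plain `modify`. -/
def tick {m : Type → Type} [Monad m] : TickT' m Unit :=
  TickT'.mk (modify (· + 1) : StateT Nat m Unit)

/-- Rewriting lemma exposing the underlying `StateT` action of `tick`. -/
@[simp] theorem runn_tick {m : Type → Type} [Monad m] :
    (tick : TickT' m Unit).runn = (modify (· + 1) : StateT Nat m Unit) := rfl

end TickT'
```

With a wrapper like this, proofs go through `runn` explicitly rather than relying on `TickT'` unfolding to `StateT`, which is the kind of defeq abuse the review comments warn about.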

sgraf812 · 2026-02-27T02:57:00Z

Cslib/Algorithms/Lean/Query/Basic.lean

+private theorem ExceptConds.false_and_self (ps : PostShape) :
+    (ExceptConds.false (ps := ps) ∧ₑ ExceptConds.false).entails ExceptConds.false := by


Ah yes, the generalization to any e : ExceptConds ps would be good to upstream.

sgraf812 · 2026-02-27T03:11:55Z

Cslib/Algorithms/Lean/Query/RunsIn.lean

+/-- `RunsInT n f bound` asserts that when the monad-generic function `f`
+    is specialized to `TickT n`, with any query that calls `tick` at most once per invocation,
+    the total number of ticks is bounded by `bound x`.
+
+    The function `f` is generic over monads that extend `n` via `MonadLift`,
+    ensuring it cannot observe the tick instrumentation. -/
+@[expose] def RunsInT {n : Type → Type} {ps : PostShape} [Monad n] [WP n ps]
+    (f : ∀ {m : Type → Type} [Monad m] [MonadLiftT n m], (α → m β) → γ → m δ)
+    (bound : γ → Nat) : Prop :=
+  ∀ (query : α → TickT n β), (∀ a, TickT.Costs (query a) 1) →
+    ∀ x, TickT.Costs (f query x) (bound x)


Neat!!

I think you need to add somewhere (maybe to the top-level comment) that relying on parametricity is only credible when everything is computable. Otherwise you can have if h : α = Nat then ... else ... by Classical.choice.

I wonder if callers can provide an f that uses higher-order constructs such as for loops in do notation. Can't quite play it out in my head right now, but I think it's one of the reasons that the IteratorLoop class is so complicated.
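The computability caveat can be made concrete with a small sketch. The names `Counter` and `sneaky` are hypothetical, and `Counter` is only a stand-in for the PR's tick monad; the point is that classical decidability lets a "monad-generic" function branch on which monad it was given.

```lean
/-- Hypothetical stand-in for the tick-counting monad. -/
abbrev Counter := StateT Nat Id

open Classical in
/-- Not parametric in any useful sense: classical logic lets the body test
    which monad it runs in, so it can skip the query only when instrumented. -/
noncomputable def sneaky {m : Type → Type} [Monad m] (query : Nat → m Bool) (x : Nat) :
    m Bool :=
  if m = Counter then pure true else query x
```

Specialized to `Counter`, `sneaky` makes no queries, while in every other monad it makes one, so a bound proved through instrumentation would say nothing about its real behaviour. Lean forces the `noncomputable` marker here, which is why restricting attention to computable definitions rules this trick out.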

sgraf812 · 2026-02-27T03:18:11Z

Cslib/Algorithms/Lean/Query/MonadicExample.lean

+  induction xs with
+  | nil => exact TickT.Costs.pure ()
+  | cons x xs ih =>
+    simp only [List.length]; rw [Nat.add_comm]
+    have ih : TickT.Costs (mapSum query xs) xs.length := ih
+    exact TickT.Costs.bind (hquery x) (fun y => by
+      have := TickT.Costs.bind
+        (TickT.Costs.monadLift (modify (· + y) : StateM Int Unit) (fun P => by mvcgen))
+        (fun _ => ih)
+      rwa [Nat.zero_add] at this)


I guess it would be nice to use mvcgen for this kind of proof. Can take a look when I'm back from PTO! You shouldn't need to replicate your own reasoning framework with Costs.pure/Costs.bind etc.

sgraf812 · 2026-02-27T03:20:22Z

Cslib/Algorithms/Lean/Query/MonadicExample.lean

+    The predicate family `pre c` captures "the Int state is c" within the
+    abstract postcondition shape `ps`. The hypotheses `hf` and `h_modify`
+    assert that `f` preserves this predicate and the lifted `modify` transitions it. -/


My gut says it yields simpler VCs if you write specs that are schematic in the post (like tick_spec) rather than the precondition

sgraf812 · 2026-02-27T03:23:00Z

Cslib/Algorithms/Lean/Query/MonadicExample.lean

+    (h_modify : ∀ v c, ⦃pre c⦄
+      (MonadLiftT.monadLift (modify (· + v) : StateM Int Unit) : m Unit)
+      ⦃⇓ _ => pre (c + v)⦄)
+    (xs : List Int) :
+    ∀ c, ⦃pre c⦄ mapSum f xs ⦃⇓ _ => pre (c + (xs.map g).sum)⦄ := by


This sounds to me like it could be expressed as a loop invariant lemma, similar to Spec.forIn_list etc.

Shreyas4991 · 2026-02-27T08:06:07Z

@kim-em: This is duplicating #372 (a refinement of #275), is it not? Also, in that PR queries have a specific meaning: they are pre-declared inductive types. I think reusing the word "query" here might cause confusion.

@sgraf812: Related question: could we set up mvcgen for the framework in #372? Essentially it runs code in the Id monad and measures complexity in TimeM (which is just an additive writer monad).

Shreyas4991 · 2026-02-27T08:22:43Z

Cslib/Algorithms/Lean/Query/InsertionSort/Defs.lean

+    List α → m (List α)
+  | [] => pure [x]
+  | y :: ys => do
+    let lt ← cmp (x, y)


You say
"Because insertionSort is parametric in m, it cannot observe the tick instrumentation.
It must call cmp the same number of times regardless of which monad it runs in.
Therefore any upper bound proved via TickM is a true bound on query count in all monads."

But you could just as easily use a pure function version of cmp. So you can still sneak in a 0-cost comparison in this model. The same holds in the other examples. The basic limitation of using monadic DSLs applies here too: anything can be snuck into pure, in principle.

If your counter-argument here is the absence of BEq, we already use that in the linearSearch example in #372

I think you are misunderstanding this PR. I'm offline for the weekend, but see if you can subvert it! I claim you can't write any monad polymorphic function from lists to lists that sorts, and has a "too small" RunsIn.

I see that you simply take a comparison function as a parameter. In this case our models have the same security from misuse of return. However I have noted the strong limitations of this approach as opposed to mine below. In my case, I only interpret this function concretely in my model.

kim-em · 2026-02-27T11:28:19Z

This is duplicating #372 (refinement of #275) is it not?

No, I am proposing that this is a better alternative.

Shreyas4991 · 2026-02-27T12:16:12Z

This is duplicating #372 (refinement of #275) is it not?

No, I am proposing that this is a better alternative.

I strongly disagree. You don't state your queries up front in your model, so you can't state reductions, lower bounds, etc.; all of these make use of explicit queries and custom cost functions. Instead you use a Lean function as a parameter. This is a huge limitation in algorithms theory and complexity.

In the model of #372, one can formally state multiple RAM models and even DSLs for Turing machines and circuits from circuit complexity. Therefore we can already define uniform circuit classes in complexity which involves writing programs in two different models with two different cost models (TMs and circuits). The fact that I can write circuits tells me that I can also encode parallel algorithms and the equivalence between circuits and parallel algorithms (there are several standard equivalences between these models). I have kept my pr limited to simple examples to keep it focused (the maintainers explicitly asked this). In fact it can even talk about lower bounds.

In summary: the other PR allows for a comprehensive top-down treatment of the multitude of models in algorithms and the relationships between them (one of the standard models like RAM and TM could be sink nodes in this relationship network).

What this PR is doing is equivalent to compressing the process of defining a query and an evaluation function (and a cost function) in my PR, by directly providing the interpreted version as a parameter (cmp). But it is losing a lot in the process. So, you get a slight generalisation from evaluating in Id as #372 does, to evaluating in a parametric monad m and adding mvcgen machinery. While this might be a positive, it could just as easily have been done on top of #372, with all the benefits of that approach. This suggests (per project contribution guidelines), that at the very least this PR should have been:

Discussed in Zulip first

However, for any major development, it is strongly recommended to discuss first on Zulip (or via a GitHub issue) so that the scope, dependencies, and placement in the library are aligned.

Built on top of the existing PR #372 (feat: query complexity model for algorithms theory) mentioned above.

New definitions should instantiate existing abstractions whenever appropriate

kim-em force-pushed the feat/tickm-complexity-demo branch from a889930 to 5299afa · February 27, 2026 01:51

feat(Query): add query complexity framework with insertion sort demo

6b6a308

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kim-em force-pushed the feat/tickm-complexity-demo branch from 61d9a60 to 6b6a308 · February 27, 2026 02:01
