Skip to content

feat: query complexity model with combined history#685

Open
Shreyas4991 wants to merge 63 commits into
leanprover:mainfrom
Shreyas4991:query-final-squash-with-401-history
Open

feat: query complexity model with combined history#685
Shreyas4991 wants to merge 63 commits into
leanprover:mainfrom
Shreyas4991:query-final-squash-with-401-history

Conversation

@Shreyas4991

Copy link
Copy Markdown
Contributor

This is PR #372 that appends the history of #401 as requested by the maintainers of CSLib. It preserves the design and code of #372, and mirrors a similar action performed on #401. It also upgrades #372 to more recent toolchains.

Shreyas4991 and others added 26 commits March 17, 2026 23:54
Co-authored-by: Ethan Ermovick <61568556+Arleee1@users.noreply.github.com>
Add a framework for proving upper and lower bounds on query complexity
of comparison-based algorithms, using `Prog` (free monad over query
types) with oracle-parametric evaluation and structural query counting.

Results:
- Insertion sort: correctness + O(n²) upper bound
- Merge sort: correctness + n·⌈log₂ n⌉ upper bound
- Lower bound: any correct comparison sort on an infinite type
  needs ≥ ⌈log₂(n!)⌉ queries (via adversarial pigeonhole on
  QueryTree depth)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Shrys <shreyasss94@gmail.com>
Add `Prog.cost`, a weighted generalization of `Prog.queriesOn` where
each query type can have a different cost. Demonstrate this with complex
multiplication: naive (4 muls + 2 adds) vs Gauss's trick (3 muls + 5 adds),
proving correctness, exact parametric costs, and the crossover condition.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Shrys <shreyasss94@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Shrys <shreyasss94@gmail.com>
Prog Q α was already a definitional `abbrev` for FreeM Q α; this commit
deletes the Prog namespace and moves the eval/queriesOn/cost interpreters
to a new Cslib/Algorithms/Lean/Query/FreeM.lean. All call sites in the
query subtree (sorting, arith examples, bounds) now refer to FreeM
directly. One-step query constructors (LEQuery.ask, ArithQuery.doAdd/
doSub/doMul) now use FreeM.lift rather than raw .liftBind … .pure.

QueryTree and the Prog→QueryTree bridge (now FreeM.toQueryTree) remain in
place; deleting QueryTree requires generalising the lower-bound lemma
and lands separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The QueryTree decision-tree datatype was a single-response-type
specialisation of FreeM, kept around because the existing
combinatorial lower-bound lemma was easier to state with a fixed
response type. This commit ports the lemma directly to FreeM:

  FreeM.exists_queriesOn_ge_clog : if every response type has
    cardinality ≤ r, n distinct injective oracles force some oracle
    to make ≥ ⌈log_r n⌉ queries.

The proof mostly mirrors the QueryTree version, using @liftBind to
bind the existential response type, and one extra ceiling-division
step (Nat.div_le_div_left) to relate the per-node branching factor
to the global bound r.

Sort/LowerBound.lean now applies the FreeM lemma directly, with
LEQuery.fintypeResponse / cardResponse_le_two witnessing that
LEQuery responses are always Bool. The Prog→QueryTree bridge
(toQTOracle / fromQTOracle / toQueryTree / *_eval / *_queriesOn) is
gone; only LEQuery.oracleOf survives, renamed and moved into
Sort/LEQuery.lean.

Both QueryTree.lean and Sort/QueryTree.lean are deleted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ment

The threshold theorem `gauss_le_naive` uses `3 * c_add ≤ c_mul` (inclusive),
so the section header should say "at least 3×", not "more than 3×".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…liftM

Addresses Eric Wieser's review (Mar 5 2026, on the original Prog.lean):
"pattern matching on the free monad is exploiting an implementation detail,
and that everything should really go through the universal property,
FreeM.liftM."

All three interpreters are now defined as `liftM` into a target monad:

  eval      : liftM (m := Id) oracle
  cost      : liftM (m := Tally) (fun op => ⟨weight op, oracle op⟩) |>.cost
  queriesOn : cost oracle (fun _ => 1)

where `Tally` is a tiny accumulator monad (a value paired with a `Nat`-valued
running cost) introduced in this file with `Monad` and `LawfulMonad` instances.

The right primitive turned out to be `def`, not `abbrev`. With `def`, the
constructor-form simp lemmas (eval_pure, eval_liftBind, cost_pure,
cost_liftBind, queriesOn_pure, queriesOn_liftBind) all reduce by rfl, so
downstream proof ergonomics are unchanged from the original pattern-match
definitions. simp normal form is determined by the explicit @[simp] theorems
rather than opportunistic abbrev unfolding (which would otherwise mix
`queriesOn` and `cost _ (fun _ => 1)` forms in goals and confuse omega).

Net effect: the universal property is the actual definition, not a post-hoc
characterisation. queriesOn_eq_cost_one is rfl. No downstream proof needed
updating.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…linter

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Shreyas4991 Shreyas4991 marked this pull request as ready for review June 30, 2026 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants