ProofAtlas

Architecture

ProofAtlas turns a Lean 4 declaration into a hypergraph and then measures the geometry of that hypergraph. The library is organised in four layers; each layer has a single responsibility and exposes a clean interface to the layers above and below.

   ┌──────────────────────────────────────────────────────────────┐
   │  Layer 4 — Quality                                            │
   │  scripts/check_bdf_metric_instances.sh · IsHypergraphMetric  │
   │  CI gate, sorry audit, axiom whitelist                        │
   └──────────────────────────────────────────────────────────────┘
                              ▲
   ┌──────────────────────────┴───────────────────────────────────┐
   │  Layer 3 — Compute (Float, #eval-able)                        │
   │  Pipeline/IncidenceData    Pipeline/Linalg                    │
   │  Pipeline/Hyperbolicity    Pipeline/Cache                     │
   │  Mapping/{Expr, Environment, Declaration}                     │
   │  Command/* (#atlas.*)      Export/* (lake exe atlas-export)   │
   └──────────────────────────┬───────────────────────────────────┘
                              ▲
   ┌──────────────────────────┴───────────────────────────────────┐
   │  Layer 2 — Metric (proof side, ℝ-valued)                      │
   │  Metric/Axioms.lean: IsHypergraphMetric spec                  │
   │  Metric/{Resistance, CommuteTime, JensenShannon,              │
   │          Diffusion}/{Basic, Instance}.lean                    │
   │  RandomWalk/{KemenySnell, CommuteTime, Spectral/*}            │
   │  Hyperbolicity/Basic.lean: generic δ, Hausdorff, GH theorems  │
   └──────────────────────────┬───────────────────────────────────┘
                              ▲
   ┌──────────────────────────┴───────────────────────────────────┐
   │  Layer 1 — Proof (combinatorial)                              │
   │  Hypergraph/Basic.lean: Hypergraph V C                │
   │  Mapping/Expr.lean: Lean.Expr → Hypergraph        │
   └──────────────────────────────────────────────────────────────┘

Layer 1 — Proof

Hypergraph/Basic.lean defines Hypergraph V C: a directed, ordered, coloured, acyclic, connected hypergraph. The four structural axioms (D, O, C, A) are encoded as fields of the structure, so they hold by construction. The translation from any Lean proof term lives in Mapping/Expr.lean:

def Lean.Expr.toHypergraph (e : Expr) :
    Hypergraph (Fin (countSubterms e)) ExprColor

The load-bearing lemma is connected: every non-root DFS vertex is incident to its parent compound subterm. With that sealed, every Lean declaration in any library ships with a Hypergraph for free.

Layer 2 — Metric

Metric/Axioms.lean defines the contract a candidate distance must satisfy to count as a "hypergraph metric":

structure IsHypergraphMetric
    (d : Hypergraph V C → WalkParams C → V → V → ℝ) where
  d_self, d_symm, d_triangle, d_nonneg           -- metric axioms
  sensitive_to_direction, _to_ordering, _to_coloring  -- existence witnesses

Four concrete metrics are registered against this spec:

metricfilestatus
resistanceDistMetric/Resistance/Instance.leancertified
commuteTimeDistMetric/CommuteTime/Instance.leancertified
jsDistMetric/JensenShannon/Instance.leanin progress (sorry)
diffusionDistMetric/Diffusion/Instance.leanin progress (sorry)

All large-scale geometry (Hyperbolicity/Basic.lean for δ-hyperbolicity, the Hausdorff and Gromov–Hausdorff constructions) is parameterised by {d} (isMetric : IsHypergraphMetric d). Swap a metric and every downstream theorem follows.

Layer 3 — Compute

The proof layer is noncomputable (it uses abstract and carries proofs). For end-to-end pipelines we need an #eval-able representation: Pipeline/IncidenceData.lean gives the same combinatorial object as arrays of Nat vertex indices and Float weights. Everything in Pipeline/ is total, computable, and free of proof obligations.

Two extraction modes feed IncidenceData:

  • Expr-level (Mapping/Environment.lean). Exprs.toIncidenceData : Array Expr → IncidenceData × Array Nat glues proof terms via const-sharing — every reference to the same .const Name collapses to one shared vertex.
  • Decl-level (Mapping/Declaration.lean). Environment.toDeclIncidence : Environment → Array Name → IncidenceData builds the co-reference graph of a namespace: vertices are declarations, hyperedges are maximal application spines inside each declaration's proof body.

From IncidenceData, Pipeline/Linalg.lean computes the Laplacian and the effective-resistance matrix, and Pipeline/Hyperbolicity.lean computes the four-point δ. Pipeline/Cache.lean memoises both per (env, namespace) so repeated .ns commands don't re-pay the work.

The interactive surface is Command/*: each #atlas.<verb> resolves identifiers, calls into Pipeline/, and either logs a number or renders a widget. The same compute machinery is wrapped by the atlas-export executable in Export/ for shell pipelines and CI.

Layer 4 — Quality

scripts/check_bdf_metric_instances.sh runs in CI on every push. For each registered metric in the registry, it walks #print axioms of the corresponding *_isHypergraphMetric instance theorem and refuses to merge if any sorry is reachable.

The Util/Audit.lean linter (the MathTag linter) forces every public declaration's docstring to begin with one of **Math.**, **Eng.**, or **Mixed.** so it's always obvious which lines correspond to paper content and which are type-theoretic plumbing.

Where to touch

If you want toTouchRecipe
Add a new mappingMapping/MyMapping.lean + Command/MyCmd.leanadd-a-mapping
Add a new metricMetric/MyMetric/{Basic,Instance}.lean + script registryadd-a-metric
Add a new commandCommand/MyCmd.lean using Command/Common.lean helpersfollow Command/Resistance.lean as a template
Add a downstream consumerExport/MyFormat.lean + lakefile.toml targetfollow Export/Profile.lean as a template
Fill an open sorrythe relevant Metric/*/Basic.lean(how-to coming in Wave 2)

Two things this architecture deliberately does

Separates spec from instance. IsHypergraphMetric is the contract; the four metrics are instances of it. Any new theorem in Layer 2 or 3 takes an IsHypergraphMetric d hypothesis, not a specific distance. New metrics drop in without breaking downstream results.

Separates the proof layer from the compute layer. Layer 1 + 2 prove things about abstract -valued metrics; Layer 3 evaluates their Float shadows. A future bridging lemma will pin the correspondence formally; for now they're kept on speaking terms by convention. The CI gate on IsHypergraphMetric instances ensures the proof side stays honest; the #eval-able compute side ensures the demo loop is fast.

Pointers

  • For a tour of the #atlas.* commands: tutorial
  • For the command reference: commands
  • For the theory side: theory
  • For the canonical source: docs/architecture.md in the repo