ProofAtlas

How to add a new metric

A metric is a function of type

Hypergraph V C → WalkParams C → V → V → ℝ

that satisfies IsHypergraphMetric: four metric-space axioms (identity, symmetry, triangle, non-negativity) and three sensitivity existence witnesses (direction, ordering, colouring). Four metrics are currently registered: resistanceDist, commuteTimeDist, jsDist, and diffusionDist.

This guide walks through commuteTimeDist as the worked example because it is the second-most-mature metric in the tree and exhibits every part of the recipe. Once you can follow this example, the open work on diffusionDist becomes a productive exercise.

1. Pick a definition with mathematical content

A new metric is worth adding if it answers a question the existing four don't. Three questions to write down before you start coding:

  • What does this metric measure? A geometric quantity, an information-theoretic one, a spectral one?
  • What is the closed-form or characterisation? A formula, an optimisation problem, a fixed-point equation?
  • Why is it a metric? Are the four axioms easy (resistance distance: spectral formula gives them in 3 lines) or hard (commute time: routes through the Markov / Doyle–Snell identity)?

commuteTimeDist answers "how long does the random walk take to make a round trip between u and v" with the closed form dct(u,v)=h(u,v)+h(v,u)d_{ct}(u, v) = h(u, v) + h(v, u) where hh is the expected first passage time of the harmonic random walk on the hypergraph.

2. Drop a Basic.lean defining the metric

Create lean/ProofAtlas/Metric/<MyMetric>/Basic.lean. Two deliverables:

  • The noncomputable def myDist itself, with type Hypergraph V C → WalkParams C → V → V → ℝ.
  • The four metric-axiom lemmas: myDist_self, myDist_symm, myDist_triangle, myDist_nonneg. Mathlib namespacing convention: each lives in namespace Hypergraph.

For commuteTimeDist the deliverable is split across Metric/CommuteTime/{HittingTime, KilledChain, CommuteTime}.lean because the proofs lean on a sub-development of hitting-time analysis. For simpler metrics one Basic.lean is enough.

If a lemma needs a hypothesis the user must supply (e.g., connectedness of the random walk, or full-rank conductance), declare that hypothesis explicitly. Don't bake it into a global axiom. The pattern is to require the hypothesis on each metric-axiom lemma and re-export it as a top-level hypothesis in the Instance.lean theorem (see step 3).

3. Drop an Instance.lean proving IsHypergraphMetric

Create lean/ProofAtlas/Metric/<MyMetric>/Instance.lean. The whole file is one theorem:

theorem myDist_isHypergraphMetric
    {V C : Type*} [Fintype V] [DecidableEq V]
    [hFin : ∀ H : Hypergraph V C, Fintype H.edges]
    -- (a) preconditions that propagate from the metric-axiom lemmas
    (hPre : ∀ (H : Hypergraph V C) (params : WalkParams C), ...)
    -- (b) three existence witnesses
    (hWit_direction : ∃ (H H' : Hypergraph V C) (params : WalkParams C) (u v : V),
        H.myDist params u v ≠ H'.myDist params u v)
    (hWit_ordering : ... )
    (hWit_coloring : ... ) :
    IsHypergraphMetric (fun (H : Hypergraph V C) params u v =>
        H.myDist params u v) := by
  refine ⟨?_, ?_, ?_, ?_, hWit_direction, hWit_ordering, hWit_coloring⟩
  · intro H params u                -- d_self
    exact H.myDist_self params u
  · intro H params u v              -- d_symm
    exact H.myDist_symm params u v
  · intro H params u v w            -- d_triangle
    exact H.myDist_triangle (hPre H params) u v w
  · intro H params u v              -- d_nonneg
    exact H.myDist_nonneg params u v

The pattern is:

  • Metric-space axioms close by re-using the lemmas from Basic.lean. Their preconditions become hypotheses of the instance theorem. For commuteTimeDist this is hErgodic, a five-part conjunction bundling row-stochasticity, non-negativity, normalised stationary, strict positivity, and convergence to the stationary projection.
  • Sensitivity witnesses are existence statements. They're passed in as hypotheses for now. The fully-discharged versions are a follow-up — see Metric/Resistance/Instance.lean for the current treatment.

The commuteTimeDist_isHypergraphMetric declaration is the cleanest template; copy its shape.

4. Register in the CI script

Edit lean/scripts/check_bdf_metric_instances.sh:

METRICS=(
  "resistanceDist:ProofAtlas/Metric/Resistance/Instance.lean"
  "commuteTimeDist:ProofAtlas/Metric/CommuteTime/Instance.lean"
  "jsDist:ProofAtlas/Metric/JensenShannon/Instance.lean"
  "diffusionDist:ProofAtlas/Metric/Diffusion/Instance.lean"
  "myDist:ProofAtlas/Metric/MyMetric/Instance.lean"
)

The CI gate refuses to merge if your instance file contains any sorry. If your first version is incomplete, leave the instance theorem sorry'd and the script will tag it in progress (existing examples: jsDist, diffusionDist).

5. (Optional) Compute-layer Float implementation

If you want the metric to be #eval-able from a #atlas.<verb> command, mirror the definition in Pipeline/MyMetric.lean returning Float numbers. For commuteTimeDist this is currently TODO — the hitting-time matrix inversion is noncomputable on the proof side and hasn't been ported to the Float pipeline.

6. The minimum recipe (cheat sheet)

  1. Metric/MyMetric/Basic.lean — definition + four axiom lemmas.
  2. Metric/MyMetric/Instance.leanmyDist_isHypergraphMetric theorem assembling the four axioms and routing through three existence witnesses.
  3. scripts/check_bdf_metric_instances.sh — add myDist to the METRICS=() registry.
  4. Optional: Pipeline/MyMetric.lean for the Float compute path; Command/MyCmd.lean for an #atlas.my command.
  5. PR: use the New metric proposal issue template to discuss the mathematical content and the witness shapes before implementing.

What reviewers will check

  • The instance file is sorry-free if you advertise the metric as certified. CI enforces this.
  • #print axioms myDist_isHypergraphMetric reports only the whitelisted axioms (propext, Classical.choice, Quot.sound).
  • The Mathlib-style docstring tags are present.
  • The key identity (Doyle–Snell or equivalent) is documented with a paper reference if you cite one.

Now try: fill an open sorry

Two metrics in tree are currently in progress:

  • jsDist — Jensen–Shannon divergence between stationary neighbourhoods.
  • diffusionDist — heat-kernel distance at finite time t.

Both follow the exact shape of commuteTimeDist_isHypergraphMetric: d_self, d_symm, d_nonneg are direct; d_triangle is the genuinely hard piece. For diffusionDist, the triangle inequality reduces to Cauchy–Schwarz on the heat-kernel row vectors at time tt, so the discharge is a Mathlib inner_mul_le_norm_mul_norm manipulation plus the heat-kernel positivity already proved in Metric/Diffusion/Basic.lean.

If you want a first PR, diffusionDist's d_triangle discharge is the shortest path to a green CI tick on a previously-yellow row.

Pointers

  • Existing instances to copy: Metric/Resistance/Instance.lean (the cleanest template), Metric/CommuteTime/Instance.lean (the one this how-to walks through). Both in the repo.
  • For the mapping side, see add-a-mapping.