ProofAtlas

How to add a new mapping

A mapping is an extractor that turns some Lean object into an IncidenceData (and, optionally, a proof-layer Hypergraph). The two mappings currently in tree are:

MappingInputVertex setHyperedge
Expr.toIncidenceLean.ExprSubterm occurrencesone per compound constructor (app, lam, ...)
Environment.toDeclIncidenceEnvironment + Array NameDeclarationsone per maximal application spine in a proof body

This guide walks through Environment.toDeclIncidence as the worked example because it is the most recent and exhibits all the moving parts: declaration filtering, vertex labelling, allowed-set filtering, deduplication, command wiring, and cache integration.

1. Pick the input and decide the vertex set

A mapping is defined by three choices:

  • Input type. What Lean object are you extracting from? Environment, Array Name, Module, a single ConstantInfo?
  • Vertex set. What does each vertex represent? Subterms, decls, modules, namespaces?
  • Hyperedge structure. What does each hyperedge represent? An inference step, a co-reference, an import dependency?

For the decl-level mapping these are: (Environment, Array Name), declarations as vertices, and "one hyperedge per maximal Expr.app spine inside a proof body" as the edge rule (filtered to the namespace, deduped, self-references removed).

The hyperedge rule is the load-bearing choice. Two practical constraints from IncidenceData:

  • inputs are Array Nat; vertex IDs must be allocated upfront in a deterministic order. The decl mapping uses 0 .. declNames.size - 1 in the order returned by declsInNamespace.
  • output is one Nat. If the mapping is "many inputs → many outputs", decompose into one edge per output.

2. Write the extractor in Pipeline/

Drop a new file lean/ProofAtlas/Mapping/MyMapping.lean. The file should export a single primary function with signature Source → IncidenceData (or Source → IncidenceData × Array Nat if you want to return root vertex IDs).

For decl-level:

def Environment.toDeclIncidence
    (env : Environment) (declNames : Array Name) : IncidenceData := Id.run do
  let mut declToVid : Std.HashMap Name Nat := {}
  let mut labels : Array String := #[]
  let mut colors : Array String := #[]
  for h : i in [:declNames.size] do
    let n := declNames[i]
    declToVid := declToVid.insert n i
    labels := labels.push (toString n)
    -- ... per-vertex colour by declaration kind
  let allowed : Std.HashSet Name :=
    declNames.foldl (fun s n => s.insert n) {}
  let mut allEdges : Array EdgeRec := #[]
  for h : i in [:declNames.size] do
    let n := declNames[i]
    match env.find? n with
    | some info =>
      match info.value? (allowOpaque := true) with
      | some body =>
        let es := collectDeclEdges allowed declToVid n body
        allEdges := allEdges ++ es
      | none => pure ()
    | none => pure ()
  return { numVertices := declNames.size
           edges := allEdges
           vertexLabel := labels
           vertexColor := colors }

Three patterns to copy:

  • Vertex ID is the index in the input array. Deterministic, cheap, gives the caller a stable handle.
  • Per-vertex labels and colours go in vertexLabel and vertexColor. These flow straight to the widget; empty arrays fall back to defaults.
  • Filter inputs to the allowed set in the edge builder. Drop any reference outside your vertex set; otherwise edges point to ghost vertices and the resistance matrix gets confusing rows.

3. (Optional) Proof-layer mirror

If you want quality guarantees on top of the numerics, define a proof-layer companion that returns a real Hypergraph V C and discharges the structural axioms.

For v1 of a new mapping this is usually deferred. Ship the IncidenceData version first; add the proof-layer mirror once the numerical results show the mapping is worth proving theorems about.

Mapping/Expr.lean is the example to copy.

4. Wire a command in Command/

Drop lean/ProofAtlas/Command/MyCmd.lean. Boilerplate is in Command/Common.lean.

syntax (name := atlasMyCmd) "#atlas.my" ppSpace colGt ident : command

@[command_elab atlasMyCmd]
def elabAtlasMyCmd : CommandElab := fun stx => do
  let arg : Name := stx[1].getId
  let env ← getEnv
  let data := Pipeline.Environment.toMyMapping env arg
  if data.numVertices = 0 then
    throwError "#atlas.my: no vertices for {arg}"
  logInfo m!"My-mapping ({arg}): {data.numVertices} vertices, {data.numEdges} edges"
  let report : Pipeline.HypergraphReport :=
    { incidence := data, resistance := #[], delta := 0.0, roots := #[] }
  renderGraphWithLegend stx report

Conventions:

  • The command name mirrors the mapping: Foo.toX#atlas.x.
  • A namespace-scoped mapping gets a .ns suffix on the command.
  • Errors throw with the command name as prefix (#atlas.my: ...) so users can grep for them.

5. Wire into the facade and the cache

Two cheap edits:

  • lean/ProofAtlas/Command.lean: add the import and a line in the docstring's quick-reference table.
  • lean/ProofAtlas/Pipeline/Cache.lean (only if the mapping is expensive enough to warrant caching, like decl-level): add a getOrBuildMyMapping mirroring getOrBuildDeclProfile.

6. The minimum recipe (cheat sheet)

  1. lean/ProofAtlas/Mapping/MyMapping.lean — extractor function returning IncidenceData.
  2. lean/ProofAtlas/Command/MyCmd.lean#atlas.my <args> wrapper.
  3. lean/ProofAtlas/Command.lean — add the import.
  4. lean/ProofAtlas/Demos/Pipeline.lean — at least one example invocation.
  5. Optional: Pipeline/Cache.lean for memoisation, a proof-layer Hypergraph mirror for quality guarantees, a site doc page.
  6. PR: use the New mapping proposal issue template to discuss before implementing.

What reviewers will check

  • The mapping is total (no Option, no exceptions on well-formed input).
  • Vertex IDs are deterministic across runs.
  • The resulting hyperedges satisfy inputs.size ≥ 1 and output ∉ inputs.toList.
  • The command has a single demo invocation in Demos/Pipeline.lean.
  • The MathTag linter is satisfied (every public def has an **Eng.** / **Math.** / **Mixed.** tagged docstring).

Pointers

  • Existing mappings to copy from: Mapping/Environment.lean (expr-level), Mapping/Declaration.lean (decl-level), both in the repo.
  • For the metric side, see add-a-metric.