Step 1 · Hypergraph

Definition

A hypergraph (in the sense of BDF26) is a tuple $H = (V, E, c)$ where:

$V$ is a finite set of vertices, interpreted as propositions of a formal mathematical language;
$E$ is a finite set of hyperedges, each interpreted as a single application of a deductive rule;
$c : E \to C$ is a colouring, assigning to every hyperedge the label of the deductive rule it applies.

Each hyperedge $e \in E$ is itself a tuple

e \;=\; (e^{\mathrm{in}};\, e^{\mathrm{out}}) \;=\; \bigl( (e^{\mathrm{in}}_1, \ldots, e^{\mathrm{in}}_{p_e}); \; (e^{\mathrm{out}}_1, \ldots, e^{\mathrm{out}}_{q_e}) \bigr)

with $e^{\mathrm{in}} \in V^{p_e}$ the (ordered) tuple of premises, $e^{\mathrm{out}} \in V^{q_e}$ the (ordered) tuple of conclusions, and $p_e, q_e \ge 1$ .

Within a single edge, all vertices are distinct: premises are pairwise distinct, conclusions are pairwise distinct, and no premise coincides with a conclusion of the same edge.

The four structural axioms

Following Barkeshli–Douglas–Freedman (BDF26), the hypergraph satisfies:

(D) Directed. Every edge has a distinguished input side and output side. Inputs and outputs are typed differently and never coincide within an edge.
(O) Ordered. Inputs and outputs are tuples, not sets: $e^{\mathrm{in}}_1 \ne e^{\mathrm{in}}_2$ as positions even if the underlying vertices were equal. The order matters because the position carries the deductive role (which premise is the major premise, etc.).
(C) Coloured. Each edge carries a colour $c(e) \in C$ recording which deductive rule was applied. Two edges with identical input and output tuples but different colours are different edges.
(A) Acyclic. No proposition is its own descendant. Formally, the transitive closure of the input-to-output relation
$v \to w \;\iff\; \exists\, e \in E.\; v \in e^{\mathrm{in}} \text{ and } w \in e^{\mathrm{out}}$
is irreflexive.

We additionally require $H$ to be connected: any two vertices are linked by an undirected path through hyperedge incidence (a vertex is incident to an edge if it appears in its input or output tuple).

Example: modus ponens

The single inference step

\frac{\;A,\; A \Rightarrow B\;}{B}\;\;\text{(MP)}

corresponds to a hyperedge with $p_e = 2$ , $q_e = 1$ , $e^{\mathrm{in}} = (A,\; A{\Rightarrow}B)$ , $e^{\mathrm{out}} = (B)$ , and $c(e) = \text{MP}$ . Drawn with vertices as circles and the hyperedge as a coloured box:

Example: conjunction introduction

The rule

\frac{\;A,\; B\;}{A \wedge B}\;\;(\wedge\text{-I})

is a hyperedge with the same shape ( $p_e = 2$ , $q_e = 1$ ) but a different colour: $c(e) = \wedge\text{-intro}$ . The colouring is what distinguishes these two two-in-one-out edges.

Example: a small derivation

Stringing two edges together: from premises $A$ , $B$ , $B \Rightarrow C$ we derive $A \wedge C$ in two steps (one MP, then one $\wedge$ -I). The resulting hypergraph has five vertices and two hyperedges:

Lean encoding

The structure lives in ProofAtlas/Hypergraph/Basic.lean:

structure Hyperedge (V : Type u) (C : Type v) where
  inArity  : Nat
  inVertex : Fin inArity → V
  outArity : Nat
  outVertex : Fin outArity → V
  color    : C
  inArity_pos          : 0 < inArity
  outArity_pos         : 0 < outArity
  inputOutputDisjoint  : ∀ i j, inVertex i ≠ outVertex j
  inVertex_injective   : Function.Injective inVertex
  outVertex_injective  : Function.Injective outVertex

structure Hypergraph (V : Type u) (C : Type v) where
  edges    : Type w
  edge     : edges → Hyperedge V C
  acyclic  : Hypergraph.IsAcyclic edge
  connected : ∀ u v, Relation.ReflTransGen (incidence) u v

The axioms (D), (O), (C) live per edge as Hyperedge fields; (A) is the cross-edge acyclic field on Hypergraph.

From `Lean.Expr` to hypergraph

Every Lean proof term e : Lean.Expr induces a hypergraph automatically, via

Lean.Expr.toHypergraph e : Hypergraph (Fin (countSubterms e)) ExprColor

defined in ProofAtlas/Mapping/Expr.lean. The five structural proofs (inArity_pos, outArity_pos, inputOutputDisjoint, inVertex_injective, outVertex_injective, plus acyclic and connected) are all sorry-free.

The load-bearing lemma is vertex_has_parent: every non-root DFS vertex is incident to the hyperedge of its parent compound subterm. It is proved by induction on the twelve Lean.Expr constructors; the six compound cases each split into a child-block case analysis with shifted induction hypotheses.

References

[BDF26] Maissam Barkeshli, Michael R. Douglas, Michael H. Freedman, Artificial Intelligence and the Structure of Mathematics, arXiv:2604.06107, 2026. §2.1. arxiv.org/abs/2604.06107