We look at the more general case of inductive types defining tree-like data structures
We also introduce inductive predicates, which correspond to proof trees
Finally, we explain how to help Lean when it fails at proving well-foundedness of a recursive function
Inductive types with constructors taking several recursive arguments define tree-like objects
There are various kinds of trees, notably:
Binary trees have nodes with at most two children
Rose trees have nodes with a list of children
Examples:
inductive bin_tree (α : Type) : Type
| empty {} : bin_tree
| node (a : α) (l : bin_tree) (r : bin_tree) : bin_tree
inductive rose_tree (α : Type) : Type
| empty {} : rose_tree
| node (a : α) (sub : list rose_tree) : rose_tree
The type aexp
of arithmetic expressions was an example
The nodes of a tree, whether inner nodes or leaf nodes, often carry labels or other annotations
Inductive trees contain no infinite branches, not even cycles
This is less expressive than pointer- or reference-based data structures (in imperative languages) but easier to reason about algebraically
Recursive definitions (and proofs by induction) work roughly as for lists, but we may need to recurse (or invoke the induction hypothesis) on several child nodes
Inductive types are modeled after the datatypes of typed functional programming languages (e.g. Haskell, ML, OCaml)
Inductive predicates, or (more precisely) inductively defined propositions, are familiar from mathematics—e.g.
The set
E
of even natural numbers is defined as the smallest set closed under the following rules:0 ∈ E
for everyn ∈ ℕ
, ifn ∈ E
, thenn + 2 ∈ E
In Lean, we write
inductive even : ℕ → Prop
| zero : even 0
| add_two (n : ℕ) : even n → even (n + 2)
If this looks familiar, it is because it should
By Curry–Howard, what we have effectively done is introduced a new unary type constructor, even
(or, equivalently, a ℕ
-indexed family of propositions)
even
is equipped with two constructors, zero
and add_two
; these can be used to build proof terms
even
can be seen as a tree type, the trees being the corresponding proof terms (or proof trees)
Thanks to the no junk guarantee of inductive definitions, zero
and add_two
are the only two ways to construct even
Inductive types are modeled after the datatypes of typed functional programming languages (e.g. Haskell, ML, OCaml)
Similarly, inductive predicates are reminiscent of the Horn clauses of Prolog-style logic programming languages
But Lean offers a much stronger logic than Prolog, so strong in fact that we need to do some work (interactive proving) to establish theorems instead of just running the Prolog interpeter
A possible view of Lean:
Lean = ML + Prolog + lemmas with proofs
The truth values false
and true
, the connectives ∧
and ∨
, the ∃
quantifier, and the equality predicate =
are all defined as inductive propositions or predicates:
inductive false : Prop
inductive true : Prop
| intro : true
inductive and (a b : Prop) : Prop
| intro : a → b → and
inductive or (a b : Prop) : Prop
| intro_left : a → or
| intro_right : b → or
inductive Exists {α : Type} (p : α → Prop) : Prop
| intro : ∀a : α, p a → Exists
inductive eq {α : Type} : α → α → Prop
| refl (a : α) : eq a a
The notations ∃x : α, p
and x = y
are syntactic sugar for Exists (λx : α, p)
and eq x y
, respectively
In contrast, ∀
(= Π
) and →
are built directly into the logic
Even subtypes are defined as an inductive type (and some syntactic sugar)
inductive subtype {α : Type} (p : α → Prop) : Type
| mk : Πx : α, p x → subtype
{x : α // p x}
is syntactic sugar for subtype (λx : α, p x)
Let p
be an inductive predicate
The constructors are introduction rules, which typically have the form ∀..., ... → p ...
; they help prove properties of the form p ...
Elimination works the other way; it extracts information from a lemma or hypothesis of the form p ...
Elimination takes various forms: pattern matching (at the top-level or with match
), the cases
and induction
tactics, and more
Just as we can perform induction on a term, we can perform induction on a proof term
This is sometimes called rule induction, because the induction is on the introduction rules (i.e. the constructors of the proof term)
Thanks to Curry–Howard, this works in the expected way
Often it is convenient to rewrite concrete terms of the form p (c ...)
, where c
is typically a constructor
We can state and prove an inversion rule to support such eliminative reasoning
Format:
∀x y, p (c x y) → (∃..., ... ∧ ...) ∨ ... ∨ (∃..., ... ∧ ...)
Often we find it convenient to combine introduction and elimination into one lemma, which can be used for rewriting both hypotheses and goals
∀x y, p (c x y) ↔ (∃..., ... ∧ ...) ∨ ... ∨ (∃..., ... ∧ ...)
Example:
∀n : ℕ, even n ↔ n = 0 ∨ ∃m : ℕ, n = m + 2 ∧ even m
Sometimes we want to define a function only on a subset of its domain
There are several approaches:
We can use a subtype to restrict the input
By Curry–Howard, we can specify the proof term of the precondition as an argument
For α
, we can require [inhabited α]
and return default α
If we know the type, we can usually return a default value (e.g. 0
, []
)
Use option
to wrap the return type
Despite their different syntaxes, approaches 1 and 2 are essentially the same
Approach 3 is not ideal because it restricts the applicability of our function
To handle impossible cases in pattern matching, we can use false.elim
or absurd
When Lean fails to establish well-foundedness (or termination) of a recursive function f
automatically, we can help it by providing, essentially
a well-founded relation <
(i.e. a relation <
such that there exists no infinite descending chains ... < x₂ < x₁ < x₀
)
a proof that <
is well founded
a proof that f
's call graph is included in <
For point 3, for each equation of the form
| p1 ... pN := ... (f q1 ... qN) ...
in the definition of f
, we must prove that (q1, ..., qN) < (p1, ..., pN)
Lean will look for this proof in a have
before the call to f
:
| p1 ... pN :=
...
have (q1, ..., qN) < (p1, ..., pN) := ...,
(f q1 ... qN)
...
For point 1, there are two options:
we can provide the relation <
directly
we can provide a measure m
, which is a function into ℕ
; the relation is then λx y, m x < m y
We supply point 2 using the using_well_founded
keyword immediately following the last equation of the definition; point 1 is extracted from the proof's proposition
If we provide a measure, well-foundedness is proved already as measure_wf
(since (<) : ℕ → ℕ → Prop
is well founded)
The red-black tree example was taken and adapted from Andrew W. Appel's Software Foundations: Volume 3: Verified Functional Algorithms