Lecture 2.2: Functional Programming — Trees

We look at the more general case of inductive types defining tree-like data structures

We also introduce inductive predicates, which correspond to proof trees

Finally, we explain how to help Lean when it fails at proving well-foundedness of a recursive function

Trees as inductive types

Inductive types with constructors taking several recursive arguments define tree-like objects

There are various kinds of trees, notably:

Binary trees have nodes with at most two children
Rose trees have nodes with a list of children

Examples:

inductive bin_tree (α : Type) : Type
| empty {} : bin_tree
| node (a : α) (l : bin_tree) (r : bin_tree) : bin_tree

inductive rose_tree (α : Type) : Type
| empty {} : rose_tree
| node (a : α) (sub : list rose_tree) : rose_tree

The type aexp of arithmetic expressions was an example

The nodes of a tree, whether inner nodes or leaf nodes, often carry labels or other annotations

Inductive trees contain no infinite branches, not even cycles

This is less expressive than pointer- or reference-based data structures (in imperative languages) but easier to reason about algebraically

Recursive definitions (and proofs by induction) work roughly as for lists, but we may need to recurse (or invoke the induction hypothesis) on several child nodes

Inductive predicates

Inductive types are modeled after the datatypes of typed functional programming languages (e.g. Haskell, ML, OCaml)

Inductive predicates, or (more precisely) inductively defined propositions, are familiar from mathematics—e.g.

The set E of even natural numbers is defined as the smallest set closed under the following rules: 0 ∈ E for every n ∈ ℕ, if n ∈ E, then n + 2 ∈ E

In Lean, we write

inductive even : ℕ → Prop
| zero : even 0
| add_two (n : ℕ) : even n → even (n + 2)

If this looks familiar, it is because it should

By Curry–Howard, what we have effectively done is introduced a new unary type constructor, even (or, equivalently, a ℕ-indexed family of propositions)

even is equipped with two constructors, zero and add_two; these can be used to build proof terms

even can be seen as a tree type, the trees being the corresponding proof terms (or proof trees)

Thanks to the no junk guarantee of inductive definitions, zero and add_two are the only two ways to construct even

Logic programming

Inductive types are modeled after the datatypes of typed functional programming languages (e.g. Haskell, ML, OCaml)

Similarly, inductive predicates are reminiscent of the Horn clauses of Prolog-style logic programming languages

But Lean offers a much stronger logic than Prolog, so strong in fact that we need to do some work (interactive proving) to establish theorems instead of just running the Prolog interpeter

A possible view of Lean:

Lean = ML + Prolog + lemmas with proofs

Logical symbols

The truth values false and true, the connectives ∧ and ∨, the ∃ quantifier, and the equality predicate = are all defined as inductive propositions or predicates:

inductive false : Prop

inductive true : Prop
| intro : true

inductive and (a b : Prop) : Prop
| intro : a → b → and

inductive or (a b : Prop) : Prop
| intro_left : a → or
| intro_right : b → or

inductive Exists {α : Type} (p : α → Prop) : Prop
| intro : ∀a : α, p a → Exists

inductive eq {α : Type} : α → α → Prop
| refl (a : α) : eq a a

The notations ∃x : α, p and x = y are syntactic sugar for Exists (λx : α, p) and eq x y, respectively

In contrast, ∀ (= Π) and → are built directly into the logic

Subtypes

Even subtypes are defined as an inductive type (and some syntactic sugar)

inductive subtype {α : Type} (p : α → Prop) : Type
| mk : Πx : α, p x → subtype

{x : α // p x} is syntactic sugar for subtype (λx : α, p x)

Introduction and elimination rules

Let p be an inductive predicate

The constructors are introduction rules, which typically have the form ∀..., ... → p ...; they help prove properties of the form p ...

Elimination works the other way; it extracts information from a lemma or hypothesis of the form p ...

Elimination takes various forms: pattern matching (at the top-level or with match), the cases and induction tactics, and more

Rule induction

Just as we can perform induction on a term, we can perform induction on a proof term

This is sometimes called rule induction, because the induction is on the introduction rules (i.e. the constructors of the proof term)

Thanks to Curry–Howard, this works in the expected way

Rule inversion

Often it is convenient to rewrite concrete terms of the form p (c ...), where c is typically a constructor

We can state and prove an inversion rule to support such eliminative reasoning

Format:

∀x y, p (c x y) → (∃..., ... ∧ ...) ∨ ... ∨ (∃..., ... ∧ ...)

Often we find it convenient to combine introduction and elimination into one lemma, which can be used for rewriting both hypotheses and goals

∀x y, p (c x y) ↔ (∃..., ... ∧ ...) ∨ ... ∨ (∃..., ... ∧ ...)

Example:

∀n : ℕ, even n ↔ n = 0 ∨ ∃m : ℕ, n = m + 2 ∧ even m

Preconditions

Sometimes we want to define a function only on a subset of its domain

There are several approaches:

We can use a subtype to restrict the input
By Curry–Howard, we can specify the proof term of the precondition as an argument
For α, we can require [inhabited α] and return default α
If we know the type, we can usually return a default value (e.g. 0, [])
Use option to wrap the return type

Despite their different syntaxes, approaches 1 and 2 are essentially the same

Approach 3 is not ideal because it restricts the applicability of our function

To handle impossible cases in pattern matching, we can use false.elim or absurd

Well-foundedness proofs

When Lean fails to establish well-foundedness (or termination) of a recursive function f automatically, we can help it by providing, essentially

a well-founded relation < (i.e. a relation < such that there exists no infinite descending chains ... < x₂ < x₁ < x₀)
a proof that < is well founded
a proof that f's call graph is included in <

For point 3, for each equation of the form

| p1 ... pN := ... (f q1 ... qN) ...

in the definition of f, we must prove that (q1, ..., qN) < (p1, ..., pN)

Lean will look for this proof in a have before the call to f:

| p1 ... pN :=
   ...
   have (q1, ..., qN) < (p1, ..., pN) := ...,
   (f q1 ... qN)
   ...

For point 1, there are two options:

we can provide the relation < directly
we can provide a measure m, which is a function into ℕ; the relation is then λx y, m x < m y

We supply point 2 using the using_well_founded keyword immediately following the last equation of the definition; point 1 is extracted from the proof's proposition

If we provide a measure, well-foundedness is proved already as measure_wf (since (<) : ℕ → ℕ → Prop is well founded)

Reference

Theorem Proving in Lean, Sections 7.3, 7.6, 7.7, 8.4, 8.6, and 8.8

Acknowledgment

The red-black tree example was taken and adapted from Andrew W. Appel's Software Foundations: Volume 3: Verified Functional Algorithms