Nottingham FP Lab Blog

Induction-Recursion and Impredicativity

by Venanzio on December 6, 2012.
Tagged as: Lunches, Type theory meetings.

The type theory meeting of Wednesday 5 December was a short introduction to induction-recursion with a divagation about impredicativity.

Type Universes

Martin-Löf distinguishes universes à la Russell and universes à la Tarski. The difference is that a Russell-style Universe \({\mathsf{U}}\) is a type such that any of its elements \(A:{\mathsf{U}}\) is itself a type; while a Tarski-style universe is a set \({\mathsf{U}}\) of codes together with a decoding function \({\mathsf{Elem}}: {\mathsf{U}}\rightarrow {\mathsf{Type}}\).

A simple Tarski universe with non-dependent types can be defined by plain induction and the decoding function is given by recursion on it: \[ \begin{array}{l} {\mathsf{U}}: {\mathsf{Type}}\\ {\mathsf{zero}}: {\mathsf{U}}\\ {\mathsf{one}}: {\mathsf{U}}\\ {\mathsf{plus}}: {\mathsf{U}}\rightarrow {\mathsf{U}}\rightarrow {\mathsf{U}}\\ {\mathsf{times}}: {\mathsf{U}}\rightarrow {\mathsf{U}}\rightarrow {\mathsf{U}}\\ \end{array} \quad \begin{array}{l} {\mathsf{Elem}}: {\mathsf{U}}\rightarrow {\mathsf{Type}}\\ {\mathsf{Elem}}\,{\mathsf{zero}}= \emptyset\\ {\mathsf{Elem}}\,{\mathsf{one}}= \{\bullet\}\\ {\mathsf{Elem}}\,({\mathsf{plus}}\,a\,b) = {\mathsf{Elem}}\,a + {\mathsf{Elem}}\,b\\ {\mathsf{Elem}}\,({\mathsf{times}}\,a\,b) = {\mathsf{Elem}}\,a \times {\mathsf{Elem}}\,b \end{array} \]

But when we want to add constructors \({\mathsf{pi}}\) and \({\mathsf{sigma}}\) for dependent products and sums to \({\mathsf{U}}\), then the definitions of \({\mathsf{U}}\) and \({\mathsf{Elem}}\) become entangled: \({\mathsf{Elem}}\) occurs in the type of the constructors of \({\mathsf{U}}\): \[ \begin{array}{l} {\mathsf{pi}}: {\Pi\,}a:{\mathsf{U}}, ({\mathsf{Elem}}\, a \rightarrow {\mathsf{U}}) \rightarrow {\mathsf{U}}\\ {\mathsf{sigma}}: {\Pi\,}a:{\mathsf{U}}, ({\mathsf{Elem}}\, a \rightarrow {\mathsf{U}}) \rightarrow {\mathsf{U}}\\ \ \\ {\mathsf{Elem}}\, ({\mathsf{pi}}\,a\,b) = {\Pi\,}x:{\mathsf{Elem}}\, a, {\mathsf{Elem}}\, (b\, x)\\ {\mathsf{Elem}}\, ({\mathsf{sigma}}\,a\,b) = {\Sigma\,}x:{\mathsf{Elem}}\, a, {\mathsf{Elem}}\, (b\, x) \end{array} \] This was the first instance of an inductive-recursive definition: the simultaneous specification of an inductive type and a recursive function on it.

The construction can be iterated and generalized. We can define another universe \({\mathsf{U}}_1\) that contains \({\mathsf{U}}\) by the same recursive-inductive specification plus a code \(\mathsf{u}_0:{\mathsf{U}}_1\) with associated type \({\mathsf{Elem}}\, \mathsf{u}_0 = {\mathsf{U}}\). In fact, if we already defined an entire family of universes, we can use this construction to create an new universe that contains them all. Eric Palmgren introduced a superuniverse that is closed under this universe-creation rule.

Induction-Recursion

Peter Dybjer further generalized the construction and invented simultaneous inductive-recursive definitions. These allow us to define an inductive type together with a recursive function on it. The function is allowed to appear in the type of the constructors; but there must be syntactic restrictions to guarantee that it is applied to arguments that are smaller than the one generated by the constructor. Later Dybjer and Anton Setzer devised a elegant type of codes for inductive-recursive definitions (we'll look at it on a future meeting).

A simple example is the type of lists without repetitions, by Catarina Coquand. The type is defined simultaneously with a freshness relation that guarantees that we can only add elements that are not already in the list. \[ \begin{array}{l} {\mathsf{FList}}\, (A:{\mathsf{Set}}): {\mathsf{Set}}\\ {\mathsf{fnil}}: {\mathsf{FList}}\, A\\ {\mathsf{fcons}}: {\Pi\,}x:A, {\Pi\,}l:{\mathsf{FList}}\, A, {\mathsf{Fresh}}\, x\, l \rightarrow {\mathsf{FList}}\, A\\ \ \\ {\mathsf{Fresh}}: A \rightarrow {\mathsf{FList}}\, A \rightarrow {\mathsf{Prop}}\\ {\mathsf{Fresh}}\, x\, {\mathsf{fnil}}= \top\\ {\mathsf{Fresh}}\, x\, ({\mathsf{fcons}}\, y\, l\, p) = (x\neq y) \land {\mathsf{Fresh}}\, x\, l \end{array} \]

Ana Bove and I used inductive-recursive definitions to formalize general recursive functions in type theory. The idea is to characterize the domain of a function by an inductive predicate and then define the function itself by recursion on the proof of the predicate. Take for example the quicksort algorithm. In Haskell it is written as:

quicksort :: [A] -> [A]
quicksort [] = []
quicksort (x:l) = quicksort ll ++ x : quicksort lr
   where ll = [y | y <- l, y<=x]
         lr = [y | y <- l, y>x]

The arguments of the recursive calls are not subterms of the principal argument, so this definition is not accepted in systems like Coq or Agda. Instead we can first define an inductive domain predicate: \[ \newcommand{\dnil}{\mathsf{d}_{{\mathsf{nil}}}} \newcommand{\dcons}{\mathsf{d}_{{\mathsf{cons}}}} \newcommand{\append}{\mathop{+\!\!+}} \begin{array}{l} {\mathsf{D}_{{\mathsf{qs}}}} : {\mathsf{List}}\, A \rightarrow {\mathsf{Prop}}\\ \dnil : {\mathsf{D}_{{\mathsf{qs}}}}\,{\mathsf{nil}}\\ \dcons : {\Pi\,}x:A, {\Pi\,}l:{\mathsf{List}}\, A, {\mathsf{D}_{{\mathsf{qs}}}}\, {l_{\leq x}}\rightarrow {\mathsf{D}_{{\mathsf{qs}}}}\, {l_{>x}}\rightarrow {\mathsf{D}_{{\mathsf{qs}}}}\, ({\mathsf{cons}}\, x\, l) \end{array} \] where \({l_{\leq x}}\) and \({l_{>x}}\) are the lists of elements of \(l\) respectively less or equal and larger than \(x\). The actual sorting algorithm can be defined by structural recursion on the proof that the list satisfies the domain predicate: \[ \begin{array}{l} {\mathsf{qs}}: {\Pi\,}l:{\mathsf{List}}\,A, {\mathsf{D}_{{\mathsf{qs}}}} \rightarrow {\mathsf{List}}\,A\\ {\mathsf{qs}}\,{\mathsf{nil}}\,\dnil = {\mathsf{nil}}\\ {\mathsf{qs}}\,({\mathsf{cons}}\,x\,l)\,(\dcons\,x\,l\,p_1,p_2) = ({\mathsf{qs}}\,{l_{\leq x}}\,p_1) \append {\mathsf{cons}}\,x\,({\mathsf{qs}}\,{l_{>x}}\,p_2) \end{array} \]

In this case the function happens to be total: we can prove that the domain predicate is always satisfied, \(H:\forall l:{\mathsf{List}}\,A, {\mathsf{D}_{{\mathsf{qs}}}}\,l\). Therefore, we can eliminate the dependency on the proof: \({\mathsf{quicksort}}: {\mathsf{List}}\,A \rightarrow {\mathsf{List}}\,A\), \({\mathsf{quicksort}}\,l = {\mathsf{qs}}\,l\,(H\,l)\).

If the recursive function we try to implement has nested recursive calls, then we need induction-recursion. Take for example McCarthy's 91 function:

mccarthy :: Int -> Int
mccarthy n | n > 100 = n - 10
           | otherwise = mccarthy (mccarthy (n+11))

The function always terminates and returns 91 on every argument less or equal to 101. However, the recursive calls are made on arguments that may be larger than the original input and are nested. We can formalize it in type theory in the same style as the quicksort algorithm:

\[ \newcommand{\mcc}{\mathsf{mc}} \newcommand{\dgr}{\mathsf{d}_{0}} \newcommand{\dls}{\mathsf{d}_{1}} \begin{array}{l} {\mathsf{D}_{\mcc}} : {\mathbb{N}}\rightarrow {\mathsf{Prop}}\\ \dgr : {\Pi\,}n:{\mathbb{N}}, n>100 \rightarrow {\mathsf{D}_{\mcc}}\,n\\ \dls : {\Pi\,}n:{\mathbb{N}}, n\leq 100 \rightarrow \\ \qquad {\Pi\,}h: {\mathsf{D}_{\mcc}}\,(n+11), {\mathsf{D}_{\mcc}}\,(\mcc\,(n+11)\,h) \rightarrow {\mathsf{D}_{\mcc}}\,n \\ \ \\ \mcc : {\Pi\,}n:{\mathbb{N}}, {\mathsf{D}_{\mcc}}\,n \rightarrow {\mathbb{N}}\\ \mcc\,n\,(\dgr\,n\,p) = n - 10\\ \mcc\,n\,(\dls\,n\,p\,h_1\,h_2) = \mcc\,(\mcc\,(n+11)\,h_1)\,h_2 \end{array} \] In this case we used the function \(\mcc\) in the type of the constructor \(\dls\), which makes it a simultaneous inductive-recursive definition.

Impredicativity

I've been a bit too casual in defining the domain predicates as functions to \({\mathsf{Prop}}\). There are issues related to the consistency of elimination rules between type universes. You can simply adopt the Curry-Howard correspondence in full and replace \({\mathsf{Prop}}\) everywhere with \({\mathsf{Set}}\). However, if instead you have a separate impredicative universe of proposition (as, for example, in Coq), then there are consistency problems.

Impredicativity refers to the possibility of quantifying over a type universe and obtaining an element of that same universe. In general, the formation rule of product types can be formulated as follows: \[ \dfrac {A:s_1 \quad x:A \vdash B[x]:s_2} {{\Pi\,}x:A, B[x] : s_3} \] where \(s_1\), \(s_2\) and \(s_3\) are sorts (type universes). One can define a Pure Type System by specifying a set of such rules simply by giving the triples \((s_1,s_2,s_3)\) and some axioms in the form \(s_1:s_2\). We have that a sort \(s_1\) is impredicative if there is an axiom \(s_1:s_2\) and a rule for \({\Pi\,}\)-types with triple \((s_2,s_1,s_1)\). In our specific case, \(s_1 = {\mathsf{Prop}}\) and \(s_2 = {\mathsf{Set}}\): we can quantify over any set, including the set of all propositions.

The most famous impredicative theory is Girard's system F. Girard proved that the normalization property of his system was equivalent to the consistency of second order Peano Arithmetic, that is, analysis. He then proved normalization with the new technique of reducibility candidates.

Although pure type systems only have function types, in an impredicative theory we can encode inductive types by the polymorphic quantification of their iterators. For example, the natural numbers can be defined in system F as \[ {\mathbb{N}}= {\Pi\,}X:{\mathsf{Set}}, X \rightarrow (X\rightarrow X) \rightarrow X \] (In this case it is the sort \({\mathsf{Set}}\) that's impredicative).

If \({\mathsf{Prop}}\) is impredicative, eliminating an inductive proposition over a set like I have done is forbidden. Allowing this kind of elimination in general would lead to a contradiction, a version of Girard's paradox described by Thierry Coquand.

However, there is a nice trick to get around it; it was taught to me by Christine Paulin. There is no contradiction if the proposition you eliminate has at most one proof. This is in fact the case for our domain predicates. To convince Coq to accept our formalization we need to modify it slightly by defining destructors for the proofs, for example: \[ \newcommand{\getlx}{\mathsf{get}_{\leq x}} \newcommand{\getgx}{\mathsf{get}_{>x}} \begin{array}{l} \getlx : {\Pi\,}x:A, {\Pi\,}l:{\mathsf{List}}\,A,{\mathsf{D}_{{\mathsf{qs}}}}\, ({\mathsf{cons}}\, x\, l) \rightarrow {\mathsf{D}_{{\mathsf{qs}}}}\, {l_{\leq x}}\\ \getlx\,({\mathsf{cons}}\, x\, l)\,(\dcons\,x\,l\,h_1\,h_2) = h_1\\ \ \\ \getgx : {\Pi\,}x:A, {\Pi\,}l:{\mathsf{List}}\,A, {\mathsf{D}_{{\mathsf{qs}}}}\, ({\mathsf{cons}}\, x\, l) \rightarrow {\mathsf{D}_{{\mathsf{qs}}}}\, {l_{\leq x}}\\ \getgx\,({\mathsf{cons}}\, x\, l)\,(\dcons\,x\,l\,h_1\,h_2) = h_2\\ \ \\ \end{array} \] Then we use these destructors to extract the subproofs when defining the function: \[ \begin{array}{l} {\mathsf{qs}}\,({\mathsf{cons}}\,x\,l)\,h =\\ \qquad ({\mathsf{qs}}\,{l_{\leq x}}\,(\getlx\,x\,l\,h)) \append {\mathsf{cons}}\,x\,({\mathsf{qs}}\,{l_{>x}}\,(\getgx\,x\,l\,h)). \end{array} \]

References


comments powered by Disqus