This article is meant to be a quick introduction to dependently typed programming, and theorem proving through the Curry–Howard isomorphism.

I assume basic knowledge of programming in statically typed imperative languages, as well as familiarity with some basic concepts from first order classical logic (quantifiers, implication, conjunction, disjunction). I will be using Idris for the examples, but I will explain the syntax and all necessary functional programming concept as we go. Still, this article covers a lot of material in a short amount of text. Take a break if you need space to process the new concepts, and use the table of contents to navigate.

Note that often an explanation for a new code sample is only offered afterwards, so make sure to read on. Sometimes it takes multiple paragraphs just to explain every aspect of a single example.

## Table Of Contents

# Motivating examples

Let’s look at a regular function in C++.

```
int id(int x) { return x; }
```

This is one of the simplest functions imaginable, the identity function:

$\mathrm{id} : \mathtt{int} \to \mathtt{int} \\ \mathrm{id}(x) = x$

The domain and codomain of this function is fixed. If we wanted an identity function for any other imaginable type out there, we would have to redefine it for that type.

In C++, we have templates to address this issue.

```
template<typename T>
T id(T x) { return x; }
```

This is what is usually called a polymorphic identity function. When calling `id`

, you have to supply two arguments: a type for the parameter `T`

, and a value of type `T`

for the parameter `x`

. The two parameters are quite different though: one is a type, and the other is a value. They are also clearly separated when calling the function:

```
id<int>(1)
id<char>('x')
```

Okay, so even though the syntax is a bit awkward, we can take types as arguments. Next, what about returning them?

```
template<typename A, typename B>
struct pair {
A fst;
B snd;
};
```

If you squint at this hard enough, it is actually a function taking two type arguments, and returning the type of pairs constructed from them. Again, I am deliberately using a language where it is awkward to express these type level concepts.

Next, what about a function that takes a value, and returns a type? Here is an attempt:

```
template<bool b>
using f = char;
template<true>
using f = double;
```

`f<false>`

returns `char`

, and `f<true>`

returns `double`

. This is almost what we are looking for, but something is not quite right. The issue with this example is that `b`

is not truly a value, it can only be a compile time constant. In particular, you can’t do this:

```
auto g(bool b) -> f<b> {
return b ? 1.0 : 'x';
}
```

Let’s examine what this function is trying to do. It takes a boolean, and depending on its value it either returns a `char`

or a `double`

. Such a function is called *dependently typed*, since its return type depends on the actual value supplied to it. The obvious issue here is that `b`

might only be known at runtime (it could be input from the user for instance), so it’s impossible to know the type of this function at compile time. Allowing something like this would clearly break *static* typing, right?

# Pure functional programming

For the rest of this article I will be using Idris, but the ideas should be applicable for any dependently typed functional language.

So, how do functional programming languages differ from the more common *imperative* languages? The most important difference is that functions are much more like mathematical functions. That is:

- They have exactly one input and one output.
- They can’t have side effects. When you “call” $\sin(x)$ in mathematics, it can’t mutate global variables, or affect anything else.

The notation $f : A \to B$ is used in mathematics to denote a function with domain $A$ and codomain $B$. The same notation is used in Idris, but unlike in regular set theory, the domain and codomain are *types* instead of sets.

```
f : A -> B
```

Here is how you would define the function $f(x) = x + 1$ in Idris:

```
f : Int -> Int
f x = x + 1
```

Note the lack of parenthesis. As functional languages are all about functions, the notation around them is simplified as much as possible. Function application also omits parenthesis, so $f(1)$ would simply be written as `f 1`

.

A very important concept in functional programming is that functions are first class values. (*First class* in this context means that functions are like any other value, they can be received and returned by functions.) The above definition is just syntactic sugar for defining `f`

as the anonymous function value `\x => x + 1`

:

```
f : Int -> Int
f = \x => x + 1
```

In mathematics we don’t usually define functions over functions, but in functional programming this is perfectly normal:

```
plus : Int -> (Int -> Int)
plus = \x => (\y => x + y)
```

In this example, we return a function from a function. This technique is called *currying*, and it is the most common way to define multi argument functions. The terse application syntax makes using such functions natural: `plus 1 2`

. The meaning of this is to apply `1`

to the `plus`

function, then apply `2`

to the function returned by `plus 1`

.

Syntactic sugar also allows us to simplify the definition of `plus`

. We can remove the parenthesis from the type as well, since the `->`

type constructor is right associative:

```
plus : Int -> Int -> Int
plus x y = x + y
```

One thing you might have glossed over so far is that I haven’t actually described what `x + y`

means. It just seems natural, everybody knows what addition is, right?

In fact, in Idris, operators are just regular functions.^{1There are some exceptions, notably -> is built in.} `x + y`

is syntactic sugar for `(+) x y`

. Just remember this for now, later we will be defining some of our own infix operators as well.

At this point you may wonder how any actually useful programming gets done in a pure language. Let’s say, how do you print something? That certainly counts as a side effect.

Various techniques can be used to emulate imperative constructs while preserving functional purity, but this topic is not really relevant to this article, so I won’t go into any details. Look for any decent Haskell introductory material if you are interested in learning more.

# First class types

So functions are first class. A much more unique feature of Idris is that *types* are also first class. You can manipulate types just like you would any other values.

Despite this, static typechecking is fully retained. The details on how such dependently typed programs can be typechecked and compiled are somewhat complicated, but they are well studied, and backed by solid formal mathematics. Let’s set aside our worries about the implementation for now, and explore what first class types can do. First consider our familiar identity function over `Int`

:

```
id : Int -> Int
id x = x
```

We can extend it to be polymorphic by taking another parameter of type `Type`

, which is the *type of types*:^{2And if you are perhaps wondering what is the type of Type, you are in for a ride. Fortunately though most of the time you can just ignore the complicated theory and pretend that Type : Type.}

```
id : (a : Type) -> a -> a
id _ x = x
```

An interesting new piece of syntax here is that we can name parameters in the function type. (In general, any time you see a colon, it means that the thing on its left hand side has the type on its right hand side.) Naming `a`

is necessary since we want to refer to it in the rest of the type. This new `id`

function has two parameters, but we don’t need to use the type parameter in the body, so we can write `_`

to just ignore it.

With this new definition, `id Int`

would be equivalent to our previous `id`

definition. We can of course now use this new definition with any other type as well:

```
id Int 1
id Char 'x'
```

Notice how the types are just regular arguments, not distinguished in any special way.

The above example is good for introducing the concept of first class types, but writing out the types every time we use `id`

would be a bit verbose. In Idris we can mark some parameters as *implicit*, and they will be inferred at the call site instead of having to be explicitly provided:

```
id : {a : Type} -> a -> a
id x = x
```

With this definition we can just write `id 1`

and `id 'x'`

, and Idris will figure out what `a`

should be.

We can go even one step further. This definition is equivalent to the previous one:

```
id : a -> a
id x = x
```

Since using implicit parameters is so common, Idris provides special syntactic sugar for them. In particular, any lowercase names in types will automatically get turned into implicit parameters.

We ended the motivating examples section with attempting to write a function that returned different types based on the value of the input. Let’s now write such a function in Idris:

```
g : (b : Bool) -> if b then Double else Char
g b = if b then 1.0 else 'x'
```

Notice how the same `if`

construct is used at the type and the value level: types are first class, and there isn’t really any difference between types and values anymore.

Of course we could have easily written such a function in any dynamically typed language. The novel thing here is that the Idris typechecker is perfectly on board with the above definition, and will ensure that any usage of `g`

is correct: the type of `g True`

is `Double`

, and the type of `g False`

is `Char`

.

One final piece of syntax I want to introduce before we move on is pattern matching definitions. We also could have written `g`

like this:

```
g : (b : Bool) -> if b then Double else Char
g True = 1.0
g False = 'x'
```

We can pattern match on any inductive type in such a manner. So… what are inductive types exactly?

# Inductive types

The only method we have seen so far for constructing types was `->`

, the function type constructor. Idris also supports inductive type definitions, a very powerful mechanism for creating our own types.

## Algebraic data types

Let’s start with a subset of inductive types commonly referred to as algebraic data types. It turns out that `Bool`

is not built into Idris, but defined as an algebraic data type:

```
data Bool = False | True
```

The names don’t matter, we could also define our own special isomorphic variant of `Bool`

if we wanted to:

```
data Light = Off | On
```

Next, let’s extend this type a bit:

```
data Color = Red | Green | Blue
data Light = Off | On Color
```

`Color`

is straightforward, it has 3 possible values instead of 2. This new `Light`

type is also similar to the previous one, but we are attaching some extra data to the `On`

state, namely its color. There are 4 possible values of this type: `Off`

, `On Red`

, `On Green`

, and `On Blue`

. `On`

is called a *data constructor*, and it takes a single argument of type `Color`

to *construct* a value of the type `Light`

. (`Off`

is also called a constructor, it just takes no arguments.)

```
data Intensity = Low | High
data Light = On Color Intensity | Off
```

Here I added another parameter to the `On`

constructor. This `Light`

type has 7 possible values, and this is how we can calculate that:

$\underbrace{\overbrace{3}^{\mathtt{Color}} \times \overbrace{2}^{\mathtt{Intensity}}}_{\mathtt{On}} + \underbrace{1}_{\mathtt{Off}} = 7$

At this point the reason for calling these algebraic data types should become apparent. The number of values is the sum of the number of values for each constructor, and the number of values for each constructor is the product of the number of values of each of its parameter. (You will often see types with these properties called *sum* types and *product* types respectively.)

## Parameterized and recursive types

Parameterized types are supported as well:

```
data Prod a b = Pair a b
data Sum a b = Left a | Right b
```

Here we can see the concept of algebraic data types distilled into the notion of sums and products. `Prod`

is not the type of pairs by itself, it is a *type constructor* taking two parameters: `Prod Bool Bool`

is the type of `Pair True False`

, and `Prod Int Char`

is the type of `Pair 1 'x'`

.

Note that I am glossing over a technical detail here, which is that we also have to supply the type arguments to the data constructors, but Idris automatically makes these parameters implicit.

For the next example, you need to be familiar with how the Peano axioms define natural numbers:

- $0$ is a natural number.
- For any natural number $n$, $S(n)$ is also a natural number.

$S$ is called the *successor* function. $S(0)$ is 1, $S(S(S(0)))$ is 3, and so on.

We can encode natural numbers with a *recursively* defined type:

```
data Nat = Z | S Nat
```

Zero is represented as `Z`

, one as `S Z`

, three as `S S S Z`

, just like in the Peano axioms. Idris in fact provides syntactic sugar for the built in `Nat`

type, so that instead of `S S S Z`

we can simply write `3`

.

At this point it’s important to note that `Nat`

should not be confused with `Int`

, even though the numeric notation can be used for both. `Nat`

is the type of natural numbers $\mathbb{N}$, its cardinality is $\aleph_0$. `Int`

is the type of machine integers, usually $[-2^{63}, 2^{63} - 1]$, with cardinality $2^{64}$.

Also I would suggest not really thinking about cardinalities for anything beyond simple algebraic types. Infinities in set theory are quite complicated, for instance the cardinality of `Nat -> Nat`

should be $\aleph_0^{\aleph_0}$. You can run into issues if you define types as sets in regular set theory, so just remember that types are not sets, they are defined by a completely separate system of axioms.

With that said, let’s get back to exploring inductive types. In this next example, we combine type parameters and recursion to define lists:^{3I have no idea why the list constructors are named Nil and Cons, but everyone names them like this. Just roll with it.}

```
data List a = Nil | Cons a (List a)
```

`List Int`

is the type of the lists of integers. `Nil`

is the empty list, `Cons 1 Nil`

is a list with one element, `Cons 1 (Cons 2 (Cons 3 Nil))`

is a three element list containing 1, 2, and 3. (And again syntactic sugar comes to the rescue, the list notation `[1, 2, 3]`

expands to constructors of the built in `List`

type.)

## Type indices

The syntax we have been using so far to define types is actually just the abbreviated syntax. The more verbose way to define `List`

would be the following:

```
data List : Type -> Type where
Nil : List a
Cons : a -> List a -> List a
```

Here you can see the type of the type and data constructors fully written out. (Note that there is nothing special about the type parameter `a`

here, it gets automatically inserted as an implicit parameter just like it would be when defining a function.)

When we write `List Int`

it really is just a function application. When we write `Nil`

the type parameter `a`

is automatically inferred from the context. (Note that type and data constructors are *not* functions, but their types are the same, and they can be used in the exact same ways. The special thing about data constructors is that they can be deconstructed in pattern matching definitions.)

A question that arises is whether we have to parameterize the type constructor the same way in each data constructor? Inductive types in fact do allow for such heterogeneity. In this case these are usually called type indices instead of parameters, and the whole definition an *indexed type family*.^{4I don’t think Idris actually distinguishes type parameters and indices, but some languages like Coq do.}

With all that said, let’s consider the ubiquitous example when introducing type indices, the length indexed list. It is called `Vect`

to differentiate it from the regular `List`

.

```
data Vect : Nat -> Type -> Type where
Nil : Vect 0 a
Cons : a -> Vect n a -> Vect (n + 1) a
```

Let’s unpack this definition. The `Vect`

type constructor has two parameters: a `Nat`

that represents its length, and the type of the contained elements. So for instance `Vect 3 Int`

is the type of 3 element `Int`

vectors.

Let’s look at the constructors. `Nil`

is quite straightforward: it says that the empty vector has length `0`

.

The magic mostly happens in the `Cons`

constructor: it says that for any `Vect n a`

, when you add one element, it will have length `Vect (n + 1) a`

. (Note that both `a`

and `n`

are still just implicit parameters of `Cons`

, even if the syntax makes it seem like they come out of thin air.)

So, how can such a type be used? We will look at an example in the next section.

# Programming with dependent types

Indexed types allow for a level of expressiveness not seen outside dependently typed languages.

Consider the `head`

function that returns the first item of a list:

```
head : List a -> a
head (Cons h _) = h
head Nil = ?
```

The `Cons`

case is straightforward, but what should we do when the list is empty? In other languages, we would throw an exception, return a null value, or maybe an option type. Unfortunately none of these are very good solutions, and they make it harder to reason about our programs. What we would really want is a function that’s guaranteed to receive a non-empty list:

```
head : Vect (n + 1) a -> a
head (Cons x _) = x
head Nil impossible
```

In this example, we take a `Vect`

with length `n + 1`

as the argument. This guarantees, at the type level, that it is not empty: $\forall n,\ n + 1 \gt 0$. So, what do we write in the `Nil`

case? We now know that it’s impossible. Idris can in fact handle this for us: by using the `impossible`

keyword, Idris will go ahead and prove for itself that no `n`

exists for which `n + 1 = 0`

. (In more complicated cases we might have to provide our custom proof that a certain case is impossible.)

Except… there is a slight error. Idris won’t accept the above definition, and gives us a cryptic error message:

```
While processing left hand side of head. Can't solve constraint between: ?n [no locals in scope] and ?_ [no locals in scope].
```

Can you tell what’s the error based on this? Me neither. I think Idris is trying to be too smart for its own good here by attempting to unify some terms when pattern matching, and failing.

The fix is relatively simple in this case. Change occurrences of `n + 1`

to the more “natural” `S n`

. While Idris tries to handle unification automatically, it can still fail even in simple cases like this. This is not Idris’ fault: unification is undecidable in general, and no algorithm can exist that covers every case. In this case, Idris can’t figure out by itself that `n + 1`

and `S n`

are the same.

The main issue with dependently typed programming I have experienced so far is that you often can’t just write the code you would without strong types. The above failure is a contrived example, but issues like this also occur in practice in cases where you can’t just solve the issue with such simple tweaks.

Often you have to add explicit rewriting annotations, or even large hand written proofs to your code. This is not really surprising though: as you can express very complex properties with dependent types, a type checker that wouldn’t need annotations would essentially be a completely automatic theorem prover.

Using dependent types sparingly to enforce some key properties of your program can work amazingly. (The Idris compiler itself, written in Idris, does exactly this, and is able to encode some important guarantees in the types.) What definitely won’t work is putting any property you can think of indiscriminately in the types, and expecting to be able to program normally.

# The Curry–Howard correspondence

```
data And a b = Conj a b
modus_ponens : And p (p -> q) -> q
modus_ponens (Conj prf impl) = impl prf
```

Look at the type of the suggestively named `modus_ponens`

. It is supposed to mean $(p\ \wedge\ (p \rightarrow q)) \rightarrow q\text{,}$ that is, the proposition that if $p$ is true, and $p$ implies $q$, then $q$ is true.

So… what am I talking about here? `And`

is an inductive type, `modus_ponens`

is just a function, `p`

and `q`

are types. What does this all have to do with logic?

As it turns out, quite a lot actually. In fact, the famous Curry–Howard isomorphism states a strong equivalence between programs and mathematical proofs. In this view, types are propositions, and programs are proofs.^{5Note however that we won’t get a classical logic here, but rather a kind of constructive logic, but we can ignore this technical detail for now.}

An important thing to note here is that this logical correspondence only applies to *terminating* programs. If you had an infinitely recursive function, the whole system would break down, and you would be able to prove any propositions.^{6Idris does not verify this by default, you have to provide the %default total directive to tell it to only accept functions for which it can prove termination. This built in checking works well for most cases.}

So, let’s go through the computational features we have learned about so far, and uncover their hidden logical meaning. (By the end, the meaning of the `modus_ponens`

proof above should also become clear.)

First, as I already said, types correspond to propositions. `p`

, `q`

, `p -> q`

, and `And p (p -> q) -> q`

are all types, and can be interpreted as propositions as well. In turn, a value of a type corresponds to a proof of the proposition.

As a very simple example, let’s prove the proposition `And a b`

. This is possible if we have a proof of `a`

and a proof of `b`

. Let’s create a function taking these proofs:

```
prove_and : a -> b -> And a b
prove_and prf_a prf_b = Conj prf_a prf_b
```

Once we have both of them, the proof of `And a b`

can be simply constructed using the `Conj`

data constructor we defined previously.

Actually, remember our `Prod`

and `Sum`

types from the previous section?

```
data Prod a b = Pair a b
data Sum a b = Left a | Right b
```

Our definition of `And`

is exactly the same as `Prod`

! Looking at it this way, a proof of a conjunction is really just a pair of proofs.

The logical meaning of `Sum`

should also be obvious: it corresponds to disjunction. A proof of either `a`

or `b`

is enough to prove $a \vee b$.

Next, as you might have already noticed, function types correspond to implications. Our `modus_ponens`

proof above is actually superfluous, as modus ponens corresponds directly to function application: if you have `prf_p : p`

and `prf_pq : p -> q`

, then a proof of `q`

is simply `prf_pq prf_p`

.

As it turns out, inductive types can be used to encode many complex mathematical relations. Let’s take a look at our definition of the less than or equal to relation (note that `S`

and `Z`

are the constructors of the `Nat`

type we discussed previously):

```
data (<=) : Nat -> Nat -> Type where
LeN : n <= n
LeS : n <= m -> n <= S m
```

Notice how we are defining the `<=`

infix operator to be an inductive type. `LeN`

defines the base case saying that $\forall n,\ n \leq n$. `LeS`

defines that given a proof of $n \leq m$, $n \leq m + 1$ is also true. These two rules are sufficient to define the $\leq$ relation.

Let’s now state the proposition that $\forall n,\ 0 \leq n$:

```
all_geq_z : (n : Nat) -> Z <= n
```

Notice that this is not a regular function type, but a *dependent* function type. Dependent function types correspond exactly to universal quantification.

Let’s now prove this proposition. The case where `n`

is zero is quite simple, $0 \leq 0$ by definition.

```
all_geq_z Z = LeN
```

In the case where `n`

is nonzero, we want to somehow construct a chain of `LeS`

applications until we get down to zero. We can do this quite easily with a recursive function call:

```
all_geq_z (S n) = LeS (all_geq_z n)
```

What we just did is a *proof by induction*. We first proved our proposition for zero, then for any $n + 1$ by using the proof for $n$.

As our final example, let’s define equality:^{7In reality Idris doesn’t accept this definition because equality is already defined by the standard library, and = is a reserved symbol.}

```
data (=) : a -> a -> Type where
Refl : (x : a) -> x = x
```

The above definition might seem a bit strange at first though.

```
eq3 : 3 = 3
eq3 = Refl 3
```

Our equality works well for reflexive cases, but how could we ever prove two things equal that are not literally the same?

```
eq_2_1_plus_1 : 2 = 1 + 1
eq_2_1_plus_1 = Refl 2
```

Turns out Idris is perfectly content with this proof. So what’s going on? Can Idris just magically prove things equal?

```
plus_z : (n : Nat) -> n + 0 = n
plus_z n = Refl n
```

This proof won’t be accepted:

```
Can't solve constraint between: n and plus n 0.
```

What the typechecker actually is doing is called *reduction*. The `+`

operator is just defined as the `plus`

function. When faced with `1 + 1`

, Idris will apply the `plus`

function and arrive at the answer `2`

. On the other hand, it can’t reduce `n + 0`

since it doesn’t know anything about `n`

yet.

Let me now show what it actually takes to prove `plus_z`

:

```
replace : (p : a -> Type) -> x = y -> p x -> p y
replace _ (Refl _) prf = prf
plus_z : (n : Nat) -> n + 0 = n
plus_z 0 = Refl 0
plus_z (S n) = replace (\x => S (n + 0) = S x) (plus_z n) (Refl (S (n + 0)))
```

That’s quite a mouthful, so let’s go through it step by step.

Let’s look at the definition of `replace`

. It might not look too complex, but it’s actually a very powerful tool: it can replace arbitrary terms with each other as long as you have an equality proof between them. Suddenly our definition of equality becomes an immensely powerful tool for manipulating propositions, and we no longer have to rely just on reduction to unify terms for us. (Something similar called `rewrite`

is actually built into Idris that can deduce `p`

automatically, but I am deliberately showing here how it could be defined by hand.)

Using `replace`

, we can prove `plus_z`

by induction:

- The
`0`

case is trivial. - In the
`S n`

case, we have to prove that`S n + 0 = S n`

- Idris’ reduction can still partially help us here: it reduces the above term to
`S (n + 0) = S n`

, by the definition of`plus`

. - We can trivially obtain a proof of
`S (n + 0) = S (n + 0)`

, this is what`Refl (S (n + 0))`

stands for. - We have
`plus_z n`

as a proof of`n + 0 = n`

. We can use this to rewrite the right hand side of the above equality, and obtain`S (n + 0) = S n`

, which is exactly what we wanted to prove.

- Idris’ reduction can still partially help us here: it reduces the above term to

# Conclusion

We went over a lot here, building up from just introducing functional programming to writing mechanized proofs using dependent types. I really just went over the basics here, but these are the foundations modern type theory based proof assistants are built on. (Writing proofs like this by hand is not really practical though. Coq for example has a whole separate language just for automating the creation of proofs like these, so that most of the time you don’t even have to think about the details of what’s going on under the hood.)

If this topic has caught your interest, here is a list of resources I would recommend:

- Software Foundations: A very good introduction to Coq, building up to quite advanced levels.
- Certified Programming with Dependent Types: A book about dependently typed programming and advanced proof automation in Coq.
- Functional Programming in Idris 2: A comprehensive introduction if you are interested in Idris specifically. (Idris does not have anything like Coq’s advanced proof automation capabilities though, it is more suited as a practical dependently typed programming language.)

There are some exceptions, notably

`->`

is built in. ↩︎And if you are perhaps wondering what is the type of

`Type`

, you are in for a ride. Fortunately though most of the time you can just ignore the complicated theory and pretend that`Type : Type`

. ↩︎I have no idea why the list constructors are named

`Nil`

and`Cons`

, but everyone names them like this. Just roll with it. ↩︎I don’t think Idris actually distinguishes type parameters and indices, but some languages like Coq do. ↩︎

Note however that we won’t get a classical logic here, but rather a kind of constructive logic, but we can ignore this technical detail for now. ↩︎

Idris does not verify this by default, you have to provide the

`%default total`

directive to tell it to only accept functions for which it can prove termination. This built in checking works well for most cases. ↩︎In reality Idris doesn’t accept this definition because equality is already defined by the standard library, and

`=`

is a reserved symbol. ↩︎