Theorem proving in Coq June 18, 2023 on György Kurucz's blog
~14 minutes, 2953 words

Introduction

Coq is a proof assistant that lets you state and prove propositions in a kind of constructive logic, the Calculus of Inductive Constructions. The main reason for using such a system is because proofs written in it can be efficiently machine checked.

Coq is built around the Coq kernel, the component responsible for checking the validity of proofs. The kernel is deliberately kept small, and carefully checked, since a bug in its implementation would mean that incorrect proofs could be accepted.

No other part of Coq can compromise the correctness of proofs that the kernel checks. In particular, the tactics that we will use in the main section of this article to construct proofs are not part of the kernel.

This is the most important concept about Coq, and worth reiterating: it doesn’t matter how we construct the proofs. We can use the built-in tactic language. We could write our own tactics. We could even ask ChatGPT to write the proofs for us. Doesn’t matter. If the kernel says that a proof is correct, it’s irrelevant where it came from.

The first section in this article will go over the basic syntax of Coq. This is mostly useful to bring you up to speed if you are already familiar with other dependently typed functional languages. (I cover most of the knowledge I am building on here regarding dependent types in my previous article.)

On the surface Coq tries to mimic regular mathematical practice with its syntax, and you can get pretty far with proving theorems in it without actually understanding the underlying type theory. Don’t expect to understand everything though, beneath the surface it relies on a system fundamentally different to ZFC, the set theoretic axiomatic system most of mathematics is built upon. Any non-trivial usage of Coq requires a proper understanding of its foundations.

In the theorem proving section, I will go over a very basic proof step-by-step. I will not attempt to explain any proofs in detail beyond this first example though. Coq’s interactive proofs are very hard to understand without stepping through it with an IDE yourself and seeing the proof state at each step. (An IDE plugin is just a useful way to display the proof state of Coq in real time, while you are editing. Plugins exist for most popular text editors like Emacs, Vim, or VS Code. You can also use Coq’s interactive command line interface, though it’s not very convenient. Coq also ships with its own IDE called CoqIDE, but I would recommend just going with the plugin for the editor you are already using.)

The last section will showcase some more advanced proofs to show how they can be heavily “automated” in Coq.

In summary, the goal of this article is to give you a very basic overview of Coq, and hopefully get you interested in learning it for yourself. At the end I summarize my tips for getting into and learning Coq.

Syntax

Global definitions can be created with the Definition command:

Definition one : nat := 1.

Each Coq top level command (called “The Vernacular”) is terminated with a period. (Definition is just one of the possible commands, we will see a few others too throughout this article.)

Functions are defined using the fun keyword, the identity function is denoted as fun x => x.

A convenient syntax is provided for function parameters when defining functions, making these two definitions equivalent:

Definition f : nat -> nat -> nat := fun x => fun y => x + y.
Definition f' (x y : nat) : nat := x + y.

(Notice the use of the apostrophe in an identifier. This is allowed so that names can more closely reflect the common mathematical practice of denoting closely related things with apostrophes.)

Coq also features very powerful type inference, we could have also written the previous definition without any type annotations at all:

Definition f'' x y := x + y.

Coq knows that the + operator takes nats, and can infer everything from there.

Dependent function types are denoted with the forall keyword. (Remember that on the logical side of the Curry–Howard isomorphism dependent function types mean universal quantification, hence the “forall” name.)

Definition g : forall (T : Type), T -> T := fun _ => fun x => x.

The above definition looks quite clunky, fortunately the simplified definition syntax works for dependent function types as well:

Definition g' (T : Type) (x : T) := x.

Notations

Coq has a powerful extensible parser. Let’s look at the definition of + from the standard library:

Notation "x + y" := (plus x y) (at level 50, left associativity).

(The actual definition is more complicated, but that doesn’t concern us right now.)

This should be quite straightforward: we ask Coq to translate the x + y infix notation to the plus x y function application.

Some additional information is provided to the parser between the second set of parentheses:

One thing that might surprise you is that the -> function type constructor is just a notation as well:

Notation "A -> B" := (forall (_ : A), B) (at level 99, right associativity).

This should be fairly straightforward: non-dependent function types are simply a special case of dependent function types.

Theorem proving

We will start with proving some simple tautologies, considered the “Hello World” of proof assistants.

Let’s suppose we have two arbitrary propositions A and B. We can declare them in Coq with the Parameters command:

Parameters (A B : Prop).

Prop is the type of propositions in Coq. It is very similar to Type, but there are a few subtle differences. For now just remember that we will use Prop for propositions instead of Type.

The tautology we want to prove is this one:

A -> B -> A /\ B

/\ means conjunction of course, and it is defined as the and type constructor. The proposition then says that AA and BB implies ABA \land B.

We could prove it like this:

Definition prove_and : A -> B -> A /\ B :=
    fun prf_a prf_b => conj prf_a prf_b.

While this is a perfectly fine proof, directly writing out the proof term is inconvenient and quickly becomes infeasible for larger proofs. Let’s look at the Coq way of proving things:

Theorem prove_and : A -> B -> A /\ B.
Proof.
    intros prf_a prf_b.
    split.
    - exact prf_a.
    - exact prf_b.
Qed.

Let’s break this down.

Theorem is the same as Definition1The Theorem command also makes terms “opaque”, meaning that the definition cannot be unfolded. This is useful for proofs since once the proof is finished, we only care about its existence and not the actual proof term., except instead of providing a value straight away, Coq enters its special “interactive proof mode” to help us interactively construct one. (Proof. is a no-op, it just helps visually separate the interactive proof script.)

And here is where the magic of interactive proof construction comes in: using an appropriate IDE, you can see see the “proof state” while writing the proof. Just after entering Proof., this is what the proof state looks like:

1 goal
___________________(1/1)
A -> B -> A /\ B

The goal (what needs to be proven) is shown below the line, while hypotheses are above the line. Currently there are no hypotheses, but we can introduce some with the intros tactic. The intros prf_a prf_b. tactic introduces the 2 hypotheses, namely the proof of A and the proof of B.

1 goal
prf_a : A
prf_b : B
___________________(1/1)
A /\ B

The hypotheses are now shown above the line. (We named them prf_a and prf_b with the intros tactic.) We have to prove A /\ B now. We will use the split tactic, which lets us prove the two sides of the conjunction separately. The state after split looks like this:

2 goals
prf_a : A
prf_b : B
___________________(1/2)
A
___________________(2/2)
B

Now we have two goals. Coq conveniently lets us focus goals with “bullet points”. After the first - bullet, the proof state looks like this:

1 goal
prf_a : A
prf_b : B
___________________(1/1)
A

The goal is to prove A. At this point this is trivial, since we do have a proof of A as a hypothesis. We can tell Coq which hypothesis we want with the exact tactic. After exact prf_a., Coq will tell us this:

This subproof is complete, but there are some unfocused goals:

___________________(1/1)
B

We completed one branch of the proof, but Coq tells us that to finish the whole thing we have to take care of the other branch as well. We can again go to the next goal with the - bullet:

1 goal
prf_a : A
prf_b : B
___________________(1/1)
B

The situation is very similar to the first branch. We have to prove B, and we have it as a hypothesis, so just use it. After the exact prf_b. tactic, Coq will show the following:

No more goals.

This means that we proved everything, and now the proof is complete. Again, the whole proof looks like this:

Theorem prove_and : A -> B -> A /\ B.
Proof.
    intros prf_a prf_b.
    split.
    - exact prf_a.
    - exact prf_b.
Qed.

The Qed. command terminates the proof, and sends the constructed proof term to the kernel to be verified.

Proof automation

Here is a proof of the commutativity of addition:

Lemma plus_n_O n : n = n + 0.
Proof.
    induction n.
    - reflexivity.
    - simpl. rewrite <- IHn. reflexivity.
Qed.

Lemma plus_n_Sm n m : S (n + m) = n + S m.
Proof.
    induction n.
    - reflexivity.
    - simpl. rewrite IHn. reflexivity.
Qed.

Theorem plus_comm a b : a + b = b + a.
Proof.
    induction a.
    - rewrite <- plus_n_O. reflexivity.
    - rewrite <- plus_n_Sm. rewrite <- IHa. reflexivity.
Qed.

You may wonder, if it takes this much just to prove that a+b=b+aa + b = b + a, how much can it take to prove something actually useful?

Here is a shorter proof of the above theorem:

Require Import Lia.

Theorem plus_comm a b : a + b = b + a.
Proof.
    lia.
Qed.

lia is a tactic for linear integer arithmetic. Proving plus_comm is just a trivial case for it. (It relies on existing proofs for the properties of the standard mathematical operators.) Let’s look at a more complex case:

Theorem binomial a b : (a + b) * (a + b) = a*a + 2*a*b + b*b.
Proof.
    lia.
Qed.

This is not even completely linear, but lia can handle some simple non-linear cases. If you still can’t quite appreciate how much heavy lifting lia is doing here for us, let me show you my handwritten proof that just appeals to the basic properties of addition and multiplication:

Theorem binomial a b : (a + b) * (a + b) = a*a + 2*a*b + b*b.
Proof.
    rewrite mul_add_distr_r.
    rewrite mul_add_distr_l.
    rewrite mul_add_distr_l.
    rewrite (mul_comm b a).
    rewrite add_assoc.
    rewrite <- (mul_1_l (a * b)).
    rewrite <- add_assoc.
    rewrite <- add_assoc.
    rewrite (add_assoc (1 * _)).
    rewrite <- mul_add_distr_r.
    change (1 + 1) with 2.
    rewrite add_assoc.
    rewrite mul_assoc.
    reflexivity.
Qed.

You probably learn to do the above multiplication in high school, but might not appreciate just how much goes into it if you rigorously write out every step.

Writing your own automation tactics

lia is a generally useful automation tactic based on well established algorithms, implemented in OCaml2Coq is written in OCaml, so the easiest way to write plugins is in OCaml. for performance. You probably don’t want to start with writing your own tactics in OCaml though.

A language called Ltac ships by default with Coq for writing tactics. (In fact, all of the interactive proofs you have seen so far were written in Ltac, they just weren’t too complex, mostly consisting of simple sequences of tactic invocations.)

Ltac is a domain specific language that specifically evolved to be used with Coq. It is a dynamically typed, Turing-complete scripting language that features built-in backtracking for proof search. The language is not great by modern standards, it has a lot of inconsistencies stemming from its organic evolution alongside Coq. Efforts are underway to replace it, and alternative tactic languages exist as well. Nonetheless, Ltac is still the most widely used tactic language today for Coq, and probably the best starting point for learning proof automation.

Let’s use some logical tautologies as examples again:

Theorem taut1 A B C : (A -> B -> C) -> (A -> B) -> A -> C.
Proof.
    intros abc ab a.
    specialize (ab a).
    specialize (abc a).
    specialize (abc ab).
    exact abc.
Qed.

The proof is quite straightforward. Just apply hypotheses to implications until we reach C.

Let’s say we also need to prove a slight variant of taut1:

Theorem taut2 A B C : (B -> C) -> (A -> B) -> A -> C.
Proof.
    intros bc ab a.
    specialize (ab a).
    specialize (bc ab).
    exact bc.
Qed.

We can just copy-paste our previous proof script, and modify it a bit. Not too bad, right?

Well, imagine if we had to do a 100 proofs like this. All with very slight variations. Something like this can easily happen in real-world proof developments. It’s not uncommon to encounter a lot of very similar goals (usually as a result of some combinatorial explosion) that need to be proven. We clearly need some more scalable approach.

This is where proof automation comes in. We will implement our intuition (“just apply hypotheses to implications until we reach the goal”) in Ltac:

Ltac my_tauto :=
    intros;
    repeat (match goal with
    | x : ?H |- ?H => exact x
    | x : ?H, p : ?H -> _ |- _ => specialize (p x)
    end).

Let’s break this down. First, intros will introduce as many hypotheses as it can. (In this case their names are auto-generated by Coq.)

Next, we will use the very common repeat and match pattern. It will repeatedly choose a branch and execute it, until either the goal is solved or it can’t progress anymore.

The heart of this script are the two match patterns. Patterns on the left side of the turnstile (|-) match hypotheses, while the pattern on the right side matches the goal. Underscores are wildcards, while names prefixed with question marks are unification variables. (For example, the pattern ?A matches any proposition, while ?A -> ?B matches any implication. Any unification variables we don’t care about can just be replaced with underscores: _ -> _ matches implications as well, we just can’t extract the matched propositions afterwards.)

The first branch looks for any hypothesis directly proving the goal. This is achieved by using the unification variable ?H to match the goal, and also a hypothesis. If we find such a hypothesis, we can just directly prove the goal, and we are done. (We name this hypothesis x here. Of course the pattern can match a hypothesis with any name, and Coq will allow us to refer to the hypothesis with the name x in this branch.)

The second branch looks for any hypothesis proving ?H, and any implication with an antecedent ?H. (We don’t care about the goal in this case.) If it finds a pair like this, it specializes the implication to obtain its consequent as a hypothesis.

This simple algorithm makes short work of our two previous example tautologies:

Theorem taut1' A B C : (A -> B -> C) -> (A -> B) -> A -> C.
Proof.
    my_tauto.
Qed.

Theorem taut2' A B C : (B -> C) -> (A -> B) -> A -> C.
Proof.
    my_tauto.
Qed.

I hope this section was able to give you a glimpse of just how powerful proof automation can be.

Learning resources

I hope this short introduction gave you a good picture of what Coq is capable of, and maybe got you interested in learning more. Here a few resources I would recommend:

The Coq website also has a long list of Coq related resources, but I have only read the ones I listed above, so I can’t vouch for the quality of any others.


  1. The Theorem command also makes terms “opaque”, meaning that the definition cannot be unfolded. This is useful for proofs since once the proof is finished, we only care about its existence and not the actual proof term. ↩︎

  2. Coq is written in OCaml, so the easiest way to write plugins is in OCaml. ↩︎