A formally verified R-tree implementation November 12, 2023 on György Kurucz's blog
~60 minutes, 12696 words

Table Of Contents


Bugs are ubiquitous in computer programming. Widely applied solutions exist to verify certain properties like memory or type safety. On the other hand, most programming languages only offer very rudimentary tools to enforce logical properties.

Proof assistants offer a potential solution, and significant portion of them today is based on the Curry–Howard isomorphism [1]H. B. Curry, J. R. Hindley, and J. P. Seldin, Eds., To H.B. Curry: Essays on combinatory logic, lambda calculus, and formalism. London ; New York: Academic Press, 1980.. The idea behind using the Curry–Howard isomorphism for theorem proving is that strongly typed functional programs can also be interpreted as proofs of mathematical theorems. As such, proof assistants based on the Curry–Howard isomorphism are just as powerful functional programming languages as they are proof assistants.

Coq is a proof assistant and a programming language based on Martin-Löf type theory. It allows arbitrary logical propositions to be stated and proved about programs.

Proving the correctness of algorithms in proof assistants is still very labor intensive, and an active research area. I present an implementation of the R-tree data structure and its correctness proof in Coq. (This is the first correctness proof of an R-tree specifically that I am aware of.) You can find the source code for this project here.

The reader is referred to my previous article for an introduction to type theory and its use for theorem proving, and this article for an introduction to the specific features of Coq.


An R-tree [2]A. Guttman, “R-trees: A dynamic index structure for spatial searching,” in Proceedings of the 1984 ACM SIGMOD international conference on Management of data - SIGMOD ’84, Boston, Massachusetts: ACM Press, 1984, p. 47. doi: 10.1145/602259.602266. is a data structure that can efficiently query a set of axis-aligned bounding boxes (AABBs) in Rd\mathbb{R}^d that intersect a query box γ\gamma.

The R-tree is a bounding volume hierarchy, meaning that each node in the tree has an associated volume, and each child is completely contained in its parent’s volume. In the case of the R-tree, all of the volumes are AABBs, and each non-leaf node’s volume is simply the minimum bounding rectangle of the volumes of its children.

The number of children of each non-leaf non-root node should be between mm and MM, which are freely chosen parameters of a particular R-tree. (In the example below m=2m=2 and M=4M=4.) It’s the job of the insertion and deletion procedures to make sure that these bounds are respected. They also make sure that the tree is balanced. (That is, each leaf node is at the same height.)

In the example below, we are storing the rectangles {A,B,C,D,E,F,G,H,I}\{A,B,C,D,E,F,G,H,I\} in the tree, while {X,Y,Z}\{X,Y,Z\} and the root are inner nodes that were generated during the insertion procedure.

A visualization of the bounding boxes associated with the nodes in the tree, and a visualization of the tree hierarchy.

Note that in bounding volume hierarchies there is nothing preventing sibling nodes from overlapping. While the R-tree insertion procedure does try to minimize overlaps, in most cases it can’t completely eliminate them.

First, let’s look at how querying works using an example. We want to find all rectangles that intersect a query rectangle γ\gamma:

First we check whether it intersects the root node, which it does in this case. Next, we look at the children of the root node. γ\gamma intersects XX and ZZ, but it does not intersect YY. Since all descendants of YY are completely contained within YY, we can be sure that no descendant of YY intersects γ\gamma, and can completely skip searching its subtree. The same search procedure is now repeated recursively for XX and ZZ. In the end, we find that the rectangles intersecting γ\gamma were CC and FF. Here is a visualization of how we searched the tree:

In this particular case, the R-tree didn’t save us anything compared to brute force search: we had to do 11 intersection tests, while brute force would have only done 9. Gains are achieved when we can throw away large subtrees during search, which may or may not be the case even with larger trees, depending on their structure. Unfortunately, there is no complexity guarantee better than O(n)O(n) for the plain R-tree.


Let’s say we want to insert a new item JJ into the tree:

The insertion algorithm will traverse the tree, and find that ZZ is the best leaf node to put it in, since it already fully contains it. (The actual algorithm is to look for the leaf node that needs the minimum area enlargement to contain the new item. In this case, ZZ needs zero enlargement, so it’s the best choice.) Let’s insert JJ into ZZ:

Now we encounter an issue with ZZ: we said at the beginning that each node can have a maximum of M=4M=4 children.

The insertion algorithm’s solution is to recursively split the nodes going up the tree until the constraint imposed by MM is satisfied again for all of the nodes.

In this case, splitting only ZZ will be enough:

Visually, it’s obvious that partitioning {F,G,H,I,J}\{F,G,H,I,J\} into {G,H,I}\{G,H,I\} and {F,J}\{F,J\} is a “good” split. A good split should minimize the probability that some arbitrary future query rectangle γ\gamma overlaps both. A heuristic algorithm tries to achieve this by minimizing the combined area of the two resulting rectangles. (There are actually multiple variations that can be used for this heuristic, I will go into the details later.)


Deletion is conceptually easy after we have insertion. Delete the item, and if its parent has too few items, remove it as well. (And do this recursively all the way up the tree.)

Now we have a well-formed tree, but unfortunately we also removed a bunch of unrelated items.

The solution is to simply reinsert all of them using the insertion algorithm we already have. That’s it.


Finally, just how fast are R-tree operations? Unfortunately we can’t say anything better than O(n)O(n) (nn being the number of items) for the worst case complexity (for any R-tree operation).

I have seen claims of O(logMn)O(\log_M{n}) average complexity for search, which does intuitively make sense if you assume that your data is uniformly distributed, and on average the search only continues down on a single path at each level.

Unfortunately, I haven’t seen any rigorous derivations of an average case complexity.


We are going to write an R-tree implementation in Coq, and prove it functionally correct. But, what does it mean for an implementation to be correct?

Correctness can only be established in relation to a specification, so we will have to come up with one. Fortunately in our case the specification is quite straightforward. (Well, straightforward compared to the proof at least. This exact specification would apply to any other spatial data structure as well.)

Informal specification

In this section I will lay out the specification without making everything perfectly rigorous, while in the next section I will properly formalize everything in Coq.

Let’s define bounding boxes to be hyperrectangles in dd dimensional Eucledian space, with intersection and all geometric concepts having their usual meanings. (Note that while we develop the specification here for arbitrary dd, my implementation later will be specialized to d=2d=2.)

We are going to denote the empty tree with ε\varepsilon, and the multiset of all items contained in a tree tt with ι(t)\iota(t). (Multiset because an item is allowed to be inserted multiple times.) We define three operations (functions) on our trees:

First let’s state the trivial specification of the empty tree: ι(ε)=\iota(\varepsilon) = \empty. We are saying that the empty tree has no items.

Next up, insertion: ι(insert(t,i))={i}+ι(t)\iota(\operatorname{insert}(t, i)) = \{i\} + \iota(t). The items in the tree with ii inserted are exactly the items of the original tree plus ii. (++ instead of \cup is important here, since we are dealing with multisets.)

Next, let’s specify search: search(t,q)={iι(t)qi}\operatorname{search}(t, q) = \{ i \in \iota(t) \mid q \cap i \neq \empty \}. Searching with a bounding box qq means returning all items that overlap with this box.

Finally, deletion: ι(delete(t,q,f))={iι(t)qi=f(i)}\iota(\operatorname{delete}(t, q, f)) = \{ i \in \iota(t) \mid q \cap i = \empty \lor f(i) \}. We define deletion as a kind of filtering: we keep items that are either completely outside of qq, or which satisfy ff.

Formal specification in Coq

Multisets would be somewhat tricky to represent in Coq, so I will just use plain lists instead. Duplicate items are now represented simply by including them in the list multiple times.

Regular equality between lists would give wrong results though, since the items can be ordered differently. Instead, we will use the concept of permutations. Two lists are permutations of each other iff the multisets they represent are equal. (More plainly, if you can reorder the items of a list to get the other one, that means they have the same items.)

The Coq standard library already includes a the permutation relation, so we will just use it. For convenience, let’s denote the proposition that two lists aa and bb are permutations of each other with aba \sim b.

Let’s also tell Coq about this notation (you don’t need to understand the exact details of this definition, it just works as you would expect):

Notation "a ~ b" := (Permutation a b) (at level 70, no associativity).

The Coq standard library defines list concatenation as ++. negb stands for boolean negation, || for disjunction. bbox is the type of AABBs, and bbox_intersect is a function determining whether two bounding boxes intersect. filter is the usual filtering function from functional programming languages.

Here is the whole specification. I will break it down step-by-step afterwards:

Context {I : Type} {item_bbox : I -> bbox}.
Class Container := {
  T : Type;
  items : T -> list I;

  empty : T;
  insert : T -> I -> T;
  search : T -> bbox -> list I;
  delete : T -> bbox -> (I -> bool) -> T;

  empty_correct : items empty ~ [];
  insert_correct : forall t i,
    items (insert t i) ~ [i] ++ items t;
  search_correct : forall t q, search t q ~
    filter (fun i => bbox_intersect q (item_bbox i)) (items t);
  delete_correct : forall t q f, items (delete t q f) ~ filter
    (fun i => negb (bbox_intersect q (item_bbox i)) || f i) (items t);

First we declare that there is some type I (the type of items), and a projection item_bbox that tells us the bounding box of any item.

The Container interface1Container is actually a typeclass, a feature popularized by Haskell. In Coq they are mostly equivalent to record types, which are just syntax sugar for inductive types. then consists of the following entries:

The goal of the rest of this article is to provide an implementation of Container.


This section introduces some general techniques I use in this project.


The transpilation of a Coq program into some other non-dependently typed functional programming language is called extraction. The best supported extraction target is OCaml, so I will be focusing on that. (Note that extraction is not needed for the proofs themselves, but it is needed to run the verified implementations in a reasonable manner.)

The reason it’s called extraction is because we are erasing a lot of type information, and importantly all proofs.2While in theory there is no clear delineation between proofs and programs, in any reasonable Coq development all proofs will be in the Prop sort. Extraction erases everything in Prop.

Let’s look at this simple example first:

Require Coq.extraction.ExtrOcamlBasic.
Definition double_list {A} (l : list A) := l ++ l.
Recursive Extraction double_list.

The first line tells Coq to configure common options for OCaml extraction. (E.g. to map common types, like list, to their native OCaml equivalents.) The word Recursive in the last line tells Coq to extract not just double_list, but also anything that it depends on. We get this as the result of the extraction:

(** val app : 'a1 list -> 'a1 list -> 'a1 list **)

let rec app l m =
  match l with
  | [] -> m
  | a :: l1 -> a :: (app l1 m)

(** val double_list : 'a1 list -> 'a1 list **)

let double_list l =
  app l l

This is mostly what we expected. app is also extracted beside double_list, since it depends on it.

Erasing proofs

Let’s take this implementation of the list head function:

Definition hd {A} (l : list A) {prf : length l > 0} :=
  match l return length l > 0 -> _ with
  | [] => fun prf => ltac:(simpl in prf; lia)
  | h :: _ => fun _ => h
  end prf.

It receives a proof that the list is non-empty, and with this it can prove the empty-list case impossible. Coq’s extraction does a wonderful job with this function:

(** val hd : 'a1 list -> 'a1 **)

let hd = function
| [] -> assert false (* absurd case *)
| h :: _ -> h

All of the complicated proof manipulation is erased, and the impossible case is marked with an assert false.

The main takeaway you should have here is that proof manipulation in Coq has no runtime performance impact, since all proofs are erased. And as long you trust the extractor, the proofs still ensure the correctness of the extracted program.

Dependent types

Now you might be thinking that sure, proofs can be erased, but Coq is dependently typed, while OCaml is not, what happens when we try to extract a dependently typed function? Here is an example (you are going to have to wait until the next section for an explanation of what Z is and how it extracts to a literal 1):

Definition good_luck_with_this_ocaml (b : bool)
  : if b then Z else bool
    := match b with true => 1 | false => false end.

And here is what it extracts to:

type __ = Obj.t

(** val good_luck_with_this_ocaml : bool -> __ **)

let good_luck_with_this_ocaml = function
| true -> Obj.magic 1
| false -> Obj.magic false

Obj.magic is OCaml’s name for an unsafe cast between two arbitrary types. Unlike in Java or other languages, there is no runtime type information in OCaml, so a bad unsafe cast leads to memory corruption. For this reason, its use is very much discouraged. But, in this instance, since Coq’s typechecker (the kernel) already checked the types, we can be pretty sure that these Obj.magic usages are correct.

In fact, the extractor could just use Obj.magic everywhere and ignore OCaml’s type system alltogether. Extracting to an untyped functional programming language like Scheme is actually much easier than to OCaml: just erase all of the types, and that’s it. In this case we are just using Scheme as a convenient implementation of lambda calculus.

Extracting the above function to Scheme yields no weird types, since there are no static types in Scheme:

(define good_luck_with_this_ocaml (lambda (b)
(match b
   ((True) `(Zpos ,`(XH)))
   ((False) `(False)))))

The extractor actually goes to great lengths to come up with some reasonable OCaml translation of the Coq types, and it only falls back to Obj.magic when absolutely necessary. You can read more about the extraction mechanism in the Coq reference manual and in [3]P. Letouzey, “A New Extraction for Coq,” in Types for Proofs and Programs, G. Goos, J. Hartmanis, J. Van Leeuwen, H. Geuvers, and F. Wiedijk, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 200–219. doi: 10.1007/3-540-39185-1_12..


Arithmetic by default does not play very well with extraction. Let’s take this example:

Definition double (a : nat) := 2 * a.

This double function gets extracted to this OCaml code:

type nat =
| O
| S of nat

module Nat =
  (** val add : nat -> nat -> nat **)

  let rec add n m =
	match n with
    | O -> m
    | S p -> S (add p m)

  (** val mul : nat -> nat -> nat **)

  let rec mul n m =
    match n with
    | O -> O
    | S p -> add m (mul p m)

(** val double : nat -> nat **)

let double a =
  Nat.mul (S (S O)) a

This is not… great. nat is good when proving mathematical theorems because of its conceptual simplicity, but storing a number nn using O(n)O(n) storage is pretty bad. What’s wrong with OCaml’s built in integers and arithmetic operators? Why can’t we just extract double simply to let double a = 2 * a in OCaml?

The issue is that that such an extraction would technically be incorrect. OCaml’s integers are 63 bit, and can’t represent all integers. In everyday programming practice we usually just ignore such limits, as they are rarely exceeded.

A solution to the aforementioned problem is to use Z instead of nat for arithmetic. I will let the comments from the standard library explain how Z works:

(** [positive] is a datatype representing the strictly positive integers
   in a binary way. Starting from 1 (represented by [xH]), one can
   add a new least significant digit via [xO] (digit 0) or [xI] (digit 1).
   Numbers in [positive] will also be denoted using a decimal notation;
   e.g. [6%positive] will abbreviate [xO (xI xH)] *)

Inductive positive : Set :=
  | xI : positive -> positive
  | xO : positive -> positive
  | xH : positive.

(** [Z] is a datatype representing the integers in a binary way.
    An integer is either zero or a strictly positive number
    (coded as a [positive]) or a strictly negative number
    (whose opposite is stored as a [positive] value).
    Numbers in [Z] will also be denoted using a decimal notation;
    e.g. [(-6)%Z] will abbreviate [Zneg (xO (xI xH))] *)

Inductive Z : Set :=
  | Z0 : Z
  | Zpos : positive -> Z
  | Zneg : positive -> Z.

I won’t show the extraction results using Z as it would be even longer, but with Z we at least don’t lose asymptotic complexity with arithmetic.

Z is still quite inefficient though, as we are storing each bit of a number as a heap object.

For some applications where arithmetic is not critical for performance, this can be worth it in exchange for the correctness guarantees. (E.g. CompCert does this.)

If you want to go further and step into dangerous territory, you can tell Coq to extract Z to OCaml’s native integers. This:

Require Import Coq.extraction.ExtrOcamlZInt.
Definition double (a : Z) := 2 * a.

Now translates into this:

module Z =
  (** val mul : int -> int -> int **)

  let mul = ( * )

(** val double : int -> int **)

let double a =
  Z.mul ((fun p->2*p) 1) a

Close enough, the OCaml compiler can optimize this to achieve glorious full native performance.3Well, the thing is… OCaml actually has 63 bit integers, because it uses the lowest bit for tagging. And unfortunately this does affect performance. In return, we lose correctness, as overflow now becomes possible.

(For completeness I would also like to mention that Coq does have built-in support for 63 bit integers, which are in fact safe to directly extract to OCaml integers. With this you are stepping into the territory of modular arithmetic though, which can complicate some proofs.)

Transpiling to JavaScript

Chaining together the OCaml extraction process with an OCaml to JavaScript transpiler like js_of_ocaml or melange works surprisingly well to get Coq code running in the browser. js_of_ocaml is more mature, and focuses more on producing code with correct behavior, while melange focuses on producing readable JavaScript, that is as close to the original OCaml code as possible. For this reason, I will be using melange’s output in the following few examples.

Taking our previous good_luck_with_this_ocaml example, and transpiling the extracted OCaml with melange, we obtain perfectly legible JavaScript:

function good_luck_with_this_ocaml(param) {
  if (param) {
    return 1;
  } else {
    return false;

The main reason I wanted to show some JavaScript output is to demonstrate why tail-recursion is useful. Here is an example tail-recursive function to sum a list of numbers:

Fixpoint sum (l : list Z) (acc : Z) :=
  match l with
  | [] => acc
  | h :: l' => sum l' (acc + h)

And here is the JavaScript it translates to:

function sum(_l, _acc) {
  while(true) {
    var acc = _acc;
    var l = _l;
    if (!l) {
      return acc;
    _acc = acc + l.hd | 0;
    _l = l.tl;
    continue ;

The variable names are a bit ugly, but most importantly, it got translated into a loop, without any potentially slow (or stack overflowing) recursive function calls.

The sig type

In this section we are going to take a look at the sig type, which can be used for adding pre- and postconditions to functions.

Let’s look at the definition of the sig (which stands for sigma) type from the standard library:

Inductive sig {A : Type} (P : A -> Prop) : Type :=
  exist (x : A) (_ : P x) : sig P.

This is a special kind of pair, where the first item is a value of some type A, while the second member is a proof of some predicate P defined on A. Basically, this lets us package together some value, and a proof about it.

Let’s look at an example:

Theorem five_gt_zero : 5 > 0. lia. Qed.

Definition a_positive_number
  : sig (fun n => n > 0) := exist _ 5 five_gt_zero.

sig (fun n => n > 0) is the type of pairs of a number, and a proof that this number is greater than zero. If we don’t want to be too pedantic with the language, we can simply say that sig (fun n => n > 0) is the type of positive numbers. (For this reason sig is also called a subset type.)

The standard library also defines a notation for sig:

Notation "{ x : A | P }" := (sig (A:=A) (fun x => P)).

This is inspired by the set builder notation, as the sig type is similar to sets defined in this way. {nNn>0}\{ n \in \mathbb{N} \mid n > 0 \} in set theory and {n:natn>0}\{ n : \mathtt{nat} \mid n > 0 \} in Coq’s type theory basically represent the same idea: one is the set of positive numbers, while the other is the type of positive numbers.

The way we use sig for specifying functions is by taking in or returning sig types. A sig argument is basically a precondition, while a returned sig is like a postcondition. You are going to see such uses of sig in this section.

Coq’s #[program] attribute

The #[program]4Note that you will more often see the legacy Program form used elsewhere, but for the sake of consistency I will be using the attribute form #[program]. attribute is a little weird, it provides some loosely related additions to Coq’s term language. I use it extensively in this project, so I wanted to introduce it briefly. What does it do exactly? From the reference manual:

The goal of #[program] is to program as in a regular functional programming language whilst using as rich a specification as desired and proving that the code meets the specification using the whole Coq proof apparatus.

To this end, it enriches Coq’s term language with a couple new features:

#[program] is not a magic bullet, but it still helps a lot when working with rich specifications, and I am using it for most non-trivial functions in my R-tree implementation.


In Coq, all functions have to terminate. If non-terminating functions were allowed, proving False would be trivial:

Fixpoint proof_of_false (n : nat) : False
  := proof_of_false (n + 1).

When trying to check the above in Coq, we get an error message:

Error: Recursive definition of proof_of_false is ill-formed.
In environment
proof_of_false : nat -> False
n : nat
Recursive call to proof_of_false has principal argument equal to
"(n + 1)%nat" instead of a subterm of "n".
Recursive definition is: "fun n : nat => proof_of_false (n + 1)%nat".

On the other hand, this (terminating) function is correctly accepted:

Fixpoint fib (n : nat) :=
  match n with
  | 0 => 0
  | S n' =>
    match n' with
    | 0 => 1
    | S n'' => fib n' + fib n''

So given that the halting problem is undecidable and all, how does Coq magically know which functions terminate?

It doesn’t. It only accepts a small subset of terminating functions, whose termination can be proven using a simple algorithm. The above definition of fib is already a bit contrived to get it accepted by Coq’s termination checker.

Coq uses something called structural recursion. All recursive functions need to have one of their arguments designated as the decreasing argument. (Most of the time Coq automatically guesses which argument is decreasing, so we don’t have to specify it.)

It then checks whether in each recursive call, the decreasing argument gets structurally smaller. According to the Coq manual “the definition of being structurally smaller is a bit technical”, but you are welcome to dive into it if you are interested in the details. Intuitively, it means that the decreasing argument is always contained in (a subterm of) the previous value of the decreasing argument.

This is easy to see in the fib example above: both n' and n'' are obtained from n by pattern matching on it.

Now let’s examine why the straightforward way to write fib doesn’t get accepted by Coq:

Fixpoint fib (n : nat) :=
  match n with
  | 0 => 0
  | 1 => 1
  | _ => fib (n - 1) + fib (n - 2)

n - 1 and n - 2 are not structurally smaller than n, as far as Coq’s simple termination checking algorithm is concerned.

Unfortunately, a lot of useful recursive functions cannot be written in a structurally recursive form. I will discuss a solution in the next section.

Proving termination manually

A commonly used technique for going beyond structural recursion is using well-founded relations.

A relation \prec is well-founded if it contains no infinite sequence of elements x0,x1,x2,x_0, x_1, x_2, \ldots such that nN,xn+1xn\forall n \in \mathbb{N}, x_{n+1} \prec x_n.

Intuitively, this means that the relation behaves like << behaves on the natural numbers: you can’t have an infinite sequence of decreasing numbers. Eventually, you are going to hit a minimal element (00 in the case of N\mathbb{N}). (And indeed, << is well-founded.)

If you don’t quite get well-foundedness yet, don’t worry: natural numbers are all I am going to use in the next example. Just N\mathbb{N} equipped with << already gives rise to a very powerful primitive for proving termination: if I can measure the arguments of a recursive function with a natural number, all it takes to prove termination is proving that the measure decreases with every recursive call. (A “measure” in this case simply means any function that outputs a natural number.)

We will skip over all the stuff about proving relations well-founded in Coq, and how this actually lets us make Coq accept our functions as terminating. (Spoiler: it’s still structural recursion, but now it’s structural recursion on well-foundedness proofs. I recommend [4]A. Chlipala, Certified programming with dependent types: A pragmatic introduction to the Coq proof assistant. Cambridge, MA: The MIT Press, 2013. Available: http://adam.chlipala.net/cpdt/ as a good introduction to well-founded recursion in Coq if you would like to know more.)

We will use Coq’s #[program] attribute, which takes care of everything in the background and lets us use measures directly. So, let’s try again with the “straightforward” fib definition, but now equip it with a measure:

#[program] Fixpoint fib (n : nat) {measure n} :=
  match n with
  | 0 => 0
  | 1 => 1
  | _ => fib (n - 1) + fib (n - 2)

In this case we don’t even have to define a separate measure function, as conveniently n is its own measure. #[program] leaves us with two obligations to prove. By using the Next Obligation command, we can jump straight into proving it:

n : nat
fib : forall n0 : nat, n0 < n -> nat
H0 : 0 <> n
H : 1 <> n
n - 1 < n

Well this is easy, we just have to prove that n - 1 < n. (And also notice how #[program] helpfully added the hypotheses that n is not equal to 0 or 1.) lia makes short work of this. The other obligation is exactly the same, except that we have to prove that n - 2 < n. The finished proofs are very simple:

Next Obligation. lia. Qed.
Next Obligation. lia. Qed.

And we are done! We defined a function and proved that it’s terminating using a measure, which is a much more powerful primitive than structural recursion.

Rewriting with permutations

In Coq, you can use equality proofs for rewriting terms. (I mean, what else would you be using them for?) Let’s say we have a proof state like this (a b c cut from the context for brevity):

ab : a = b
bc : b = c
a = c

By using the rewrite ab tactic, we can rewrite occurrences of a to b in the goal:

ab : a = b
bc : b = c
b = c

Now we could finish the proof either by using rewrite bc to obtain c = c and then appealing to reflexivity, or just by writing exact bc since b = c is already a hypothesis.

Now let’s say we have a very similar proof state, but instead of equality we have Permutation relations:

ab : a ~ b
bc : b ~ c
a ~ c

Here we could of course just say that Permutation is transitive by definition: apply (perm_trans ab bc). But curiously, we can also use rewrite in this situation, just like in the previous example. After using rewrite bc, we get exactly what you would expect:

ab : a ~ b
bc : b ~ c
b ~ c

So… why does this work? As it turns out, Permutation is an equivalence relation: it’s reflexive, symmetric, and transitive. A set (or in this case, a type) equipped with an equivalence relation is called a setoid in mathematics. Coq allows extending the rewrite tactic to work with any setoid, and not just regular equality. (The standard library already contains the relevant proofs about Permutation being an equivalence relation, so we can use rewrite with it.)

Rewriting with morphisms

Let’s say we have this proof state (where f is some arbitrary function):

ab : a = b
f a = f b

This is again easy to prove just by using rewrite ab to rewrite a to b on the left side, and then appealing to reflexivity. Let’s try the same when we are using permutations:

ab : a ~ b
f a = f b

Issuing rewrite ab here yields an error:

setoid rewrite failed: Unable to satisfy the following constraints:
 ?X39==[T f a b ab |- Relation_Definitions.relation (list T)] (internal placeholder) {?r}
 ?X40==[T f a b ab (do_subrelation:=Morphisms.do_subrelation)
         |- Morphisms.Proper (Morphisms.respectful (Permutation (A:=T)) ?r) f] (internal placeholder) {?p}
 ?X42==[T f a b ab |- Relation_Definitions.relation (list T)] (internal placeholder) {?r0}
 ?X43==[T f a b ab (do_subrelation:=Morphisms.do_subrelation)
         |- Morphisms.Proper (Morphisms.respectful ?r (Morphisms.respectful ?r0 (Basics.flip Basics.impl))) eq]
         (internal placeholder) {?p0}
 ?X44==[T f a b ab |- Morphisms.ProperProxy ?r0 (f b)] (internal placeholder) {?p1}
TYPECLASSES:?X39 ?X40 ?X42 ?X43 ?X44
FUTURE GOALS STACK:?X44 ?X43 ?X42 ?X40 ?X39||

Uh-oh. What a clear and concise error message… But if we think about what we are actually trying to do, Coq is definitely right to fail here: f a = f b is not true in this case for an arbitrary f. For example if f is a function that checks whether a list is sorted, this equality is definitely false.

We somehow need to restrict f to being a function that satisfies the above equality. In the language of Coq’s setoid rewriting system, this is what we need to prove about f:

Proper ((@Permutation A) ==> Logic.eq) f

This basically means that if x ~ y, then f x = f y also holds. The ==> here denotes “morphisms that are both covariant and contravariant”. What does “morphism” mean in this context? What does “proper” mean? I am not sure exactly, but importantly they do exactly what we need them to do. You are welcome to dive into the relevant chapter of the Coq reference manual if would like to know more, godspeed.

What’s important is that with a proof of the above property in scope, the previous rewrite works now:

ab : a ~ b
f a = f b

Trying rewrite ab here again does exactly what we wanted:

ab : a ~ b
f b = f b

If you are interested in an actual example, I will be establishing a Proper instance for one of my functions in this section. To see the proof scripts utilizing permutation rewriting, you can check out the source code.

Proof automation

In general, I tried to follow the advice given in Adam Chlipala’s book, Certified Programming with Dependent Types [4]A. Chlipala, Certified programming with dependent types: A pragmatic introduction to the Coq proof assistant. Cambridge, MA: The MIT Press, 2013. Available: http://adam.chlipala.net/cpdt/:

In this book, I have been following an unusual style, where proofs are not considered finished until they are “fully automated,” in a certain sense. Each such theorem is proved by a single tactic.

My position is that any ideas that standard automation can find are not very big after all, and the real big ideas should be expressed through lemmas that are added as hints.

He advocates for a style where “important” ideas are expressed as separate lemmas, while individual proofs are automated to the furthest extent possible.

While I tried to follow this style, my Ltac skills are still only rudimentary. As such, I won’t discuss the details of any non-single-tactic proofs I made, since they probably don’t contain any important ideas, and are just a failure on my part to automate them sufficiently.

Similarly, since I only started getting my feet wet with more advanced proof automation, my Ltac scripts are still quite messy. I am not sure that showcasing them in detail would provide anything valuable, since there are probably better ways to achieve everything I am doing here. Nonetheless, I will show two tactics here as examples for what you can do with proof automation.

Reflection for use with lia

lia can only work with propositions and not booleans, so I made a tactic to turn boolean relations into their propositional equivalents:

Ltac tac_reflect _ :=
  match goal with
  (** normalize equality orders *)
  | H : true = ?x |- _ => symmetry in H
  | H : false = ?x |- _ => symmetry in H
  (** reflect leb as le *)
  | H : (?a <=? ?b) = true |- _ => apply leb_complete in H
  | H : (?a <=? ?b) = false |- _ => apply leb_iff_conv in H
  | H : (?a <? ?b) = true |- _ => apply Nat.ltb_lt in H
  | H : (?a <? ?b) = false |- _ => apply Nat.ltb_ge in H

This tactic can for example turn (x <? y) = false into x >= y. lia can only work with the second form, so this is quite useful.

I also have lemmas for turning boolean operators into propositions:

Theorem reflect_andb_true : forall (a b : bool),
  a && b = true <-> a = true /\ b = true.
Theorem reflect_andb_false : forall (a b : bool),
  a && b = false <-> a = false \/ b = false.

I haven’t integrated these with tac_reflect yet though.

(Note that reflection is a much more general technique that can be used in Coq, check out the relevant chapter in Software Foundations for an introduction. My relatively simplistic tactics were enough for this project though.)

Solving permutations

The following tactic solves goals of the form _ ~ _, where the two sides can be made equal just by rearranging them.

Ltac bring_to_front x :=
  repeat (
    try rewrite (app_assoc _ x);
    rewrite (Permutation_app_comm _ x);
    try rewrite <- (app_assoc x _)

Ltac solve_permutation :=
  solve [
  repeat rewrite <- app_assoc;
  rewrite <- ? Permutation_rev;
  repeat match goal with
  | |- ?x ++ _ ~ _ =>
      bring_to_front x;
      apply Permutation_app_head
  | |- ?x ~ ?x => reflexivity

solve_permutation expects both sides to be a sequence of terms separated by the ++ operator. It looks at the first term on the left hand side, and tries to find the same term and bring it to the front on the right hand side as well. bring_to_front works by repeatedly appealing to the associativity and commutativity of ++. (++ is not commutative with respect to regular equality of course, but it is with respect to permutation.)

If bring_to_front succeeds, the two head terms can be removed from both sides, and the procedure is repeated again. The loop terminates when both sides are equal, and the goal can be solved by simply appealing to the reflexivity of ~.


Now that you have a basic overview of all the techniques that go into developing a verification project like this, we are ready to jump into discussing the actual implementation.

Axis-aligned bounding boxes

First, we have to develop a theory of the bounding boxes that the whole data structure is build upon.

We define bounding boxes as follows:

Definition interval : Type := Z * Z.
Definition bbox : Type := interval * interval.

An interval is a pair of integers, and a bounding box is simply a pair of intervals. (We could of course generalize to dd dimensions by defining a hyper-bounding box to be a vector of dd intervals, but for now I am satisfied with the d=2d=2 special case.)

We define the following functions on intervals:

Definition interval_intersect : interval -> interval -> bool :=
  fun '(pair a_lo a_hi) '(pair b_lo b_hi) =>
    (a_lo <=? b_hi) && (b_lo <=? a_hi).
Definition interval_inside : interval -> interval -> bool :=
  fun '(pair a_lo a_hi) '(pair b_lo b_hi) =>
    (b_lo <=? a_lo) && (a_hi <=? b_hi).
Definition interval_mbr : interval -> interval -> interval :=
  fun '(pair a_lo a_hi) '(pair b_lo b_hi) =>
    (Z.min a_lo b_lo, Z.max a_hi b_hi).
Definition interval_measure : interval -> Z :=
  fun '(pair lo hi) => hi - lo.

Nothing fancy going on here, you would write these functions the same way in any other programming language. (We won’t even be proving them correct, e.g. that for any a=[a1,a2]a = [a_1, a_2] and b=[b1,b2]b = [b_1, b_2], ab    a1b2b1a2a \cap b \neq \empty \iff a_1 \leq b_2 \land b_1 \leq a_2. We will just prove a few select properties which we need.)

Now that we have these definitions, extending them to bounding boxes is straightforward:

Definition bbox_intersect : bbox -> bbox -> bool :=
  fun '(pair ax ay) '(pair bx by_) =>
    interval_intersect ax bx && interval_intersect ay by_.
Definition bbox_inside : bbox -> bbox -> bool :=
  fun '(pair ax ay) '(pair bx by_) =>
    interval_inside ax bx && interval_inside ay by_.
Definition bbox_mbr : bbox -> bbox -> bbox :=
  fun '(pair ax ay) '(pair bx by_) =>
    (interval_mbr ax bx, interval_mbr ay by_).
Definition bbox_measure : bbox -> Z :=
  fun '(pair x y) => interval_measure x * interval_measure y.

I will be using \sube and {\lvert\,\cdot\,\rvert} to denote bbox_inside and bbox_measure respectively.

Next, here are the facts that we will need about bounding boxes:

Theorem bbox_mbr_comm (a b : bbox)
  : bbox_mbr a b = bbox_mbr b a.
Theorem bbox_mbr_assoc (a b c : bbox)
  : bbox_mbr (bbox_mbr a b) c = bbox_mbr a (bbox_mbr b c).
Theorem bbox_inside_refl (a : bbox)
  : bbox_inside a a = true.
Theorem bbox_inside_mbr (a b : bbox)
  : bbox_inside a (bbox_mbr a b) = true.
Theorem bbox_mbr_monotonic a b c
  : bbox_inside a b = true ->
    bbox_inside a (bbox_mbr b c) = true.
Theorem bbox_inside_not_intersect a b c
  : bbox_inside a b = true ->
    bbox_intersect c b = false ->
    bbox_intersect c a = false.

For these particular set of theorems, I was able to put together a single proof script to completely automate all of the proofs. This script can prove all of the theorems above:

Ltac t :=
  repeat match goal with
  | H : interval |- _ => destruct H
  | H : bbox |- _ => destruct H
  simpl in *;
  repeat match goal with
  | |- (_, _) = (_, _) => f_equal

The idea is to completely break up the goal into a system of linear equations and simple logical connectives. For example, for bbox_inside_not_intersect, here is the proof state just before the final lia tactic:

z9, z10, z7, z8, z5, z6, z3, z4, z1, z2, z, z0 : Z
H : (z5 <= z9 /\ z10 <= z6) /\ z3 <= z7 /\ z8 <= z4
H0 : ~ ((z1 <= z6 /\ z5 <= z2) /\ z <= z4 /\ z3 <= z0)
~ ((z1 <= z10 /\ z9 <= z2) /\ z <= z8 /\ z7 <= z0)

This form is already beyond the limit of easy human comprehension. (12 inequalities, 11 variables, various logical connectives interspersed.) Fortunately, a specialized automated solver like lia shines in a situation like this, and makes quick work of the proof.

Next, we extend the theory about minimum bounding rectangles to any list of boxes, not just two:

Definition mbr (l : list bbox) :=
  match l with
  | [] => ((0, 0), (0, 0))%Z
  | x :: l => fold_left bbox_mbr l x

The ((0, 0), (0, 0)) is just a dummy value for when the list is empty. It doesn’t really matter what we put here, in the theorems where it’s necessary we will specify that the list is non-empty anyway.

We extend the idea behind bbox_inside_mbr to our new mbr function as well:

Theorem mbr_inside l i : In i l -> bbox_inside i (mbr l) = true.

This should be quite intuitive, we are just saying that for any list of boxes, all of them are inside their minimum bounding rectangle.

I already talked about Coq’s generalized rewriting ability, and how we can use it to rewrite with the \sim relation. We want to be able to do this rewriting in the argument of the mbr function as well, and essentially what we have to do is prove that the result of mbr does not depend on the order of its inputs.

As discussed in a previous section already, this is how we can express this in the language of Coq’s setoid rewriting system:

Instance Permutation_mbr
  : Proper (@Permutation bbox ==> Logic.eq) mbr.

The proof is quite straightforward, we are basically proving that mbr does not depend on the order of the boxes in its argument.

This should conclude everything we wanted to know about bounding boxes. To recap:

List utility functions

In this section we will define a few simple functions for manipulating lists which can’t be found in the standard library. The catch is that I have a few special requirements:

This is the first time we will be defining functions together with their own specifications (as opposed to proving separate theorems about them as we did in the previous section about bounding boxes), using the sig types I previously introduced.

Removing items from a list

Let’s look at the definition of the remove_ith function, which we will discuss afterwards:

#[program] Fixpoint remove_ith'
  {T : Type}
  (rev_front back : list T)
  (i : { i : nat | i < length back })
  {measure i}
    : { '(e, l) : T * list T | e :: l ~ rev_front ++ back } :=
  match back, i with
  | [], _ => _
  | (x :: back'), 0 => (x, rev_append rev_front back')
  | (x :: back'), (S i') => remove_ith' (x :: rev_front) back' i'

#[program] Definition remove_ith
  {T : Type}
  (l : list T)
  (i : { i : nat | i < length l })
    : { '(x, l') : T * list T | x :: l' ~ l } :=
  proj1_sig (remove_ith' [] l i).

remove_ith' follows an accumulator pattern to achieve tail recursion, while remove_ith is simply a wrapper around it with the initial accumulator values.

The specification for remove_ith states that it returns a pair (x,l)(x, l') which satisfies x::llx \mathop{::} l' \sim l. As you can see, the specification is now directly embedded in the function’s type, unlike in the previous section, where the bounding box functions and the theorems about them were separate.

Also notice how the parameter i : { i : nat | i < length l } is a subset type as well. It ensures that you don’t attempt to index the list out of bounds. In other programming languages handling simple constraints like this can be tricky, and no good solution exists really:

In Coq, there is another unique way to deal with this problem. Let me repeat the relevant match expression from remove_ith':

  match back, i with
  | [], _ => _
  | (x :: back'), 0 => (x, rev_append rev_front back')
  | (x :: back'), (S i') => remove_ith' (x :: rev_front) back' i'

The first branch pattern [], _ matches when the back list is empty. The branch is simply a single underscore, which is a placeholder to be filled later. So, how is it filled?

In Coq, anywhere where a regular value is expected, we can also provide an impossibility proof (a proof of False) instead.5Actually this is not much of a special case at all. If we have a proof p of False, we can just pattern match on it like this: match p with end. This is a weird looking pattern match, which in fact has zero branches, since False has zero constructors. This pattern match can give you any value of your choosing. In Coq, ex falso quodlibet means you can not only prove anything, you can also get a value for anything.6 Here, this is quite easy: i carries a proof of i < length back, which in this case reduces to i < 0. lia can readily prove that this is a contradiction. (And it is part of the automation script that #[program] automatically applies, so the hole is automatically taken care of.)

I also defined remove_pair, which is similar except that it removes two items from a list. (Understanding its definition is an exercise left to the reader.)

Finding minimal items in a list

We start this section by declaring some common parameters:

Context {T M : Type} `{Ord_M : Ord M} (measure : T -> M).

T is the type of items, while M is the type of comparable values that the measure function (not to be confused with the measures we use for proving termination) produces from any T. Ord_M is an implementation of the total order for M, which importantly provides the leq comparator function. (We are over-generalizing here a bit, since we will only be substituting Z for M in later parts of the development, but this way we can ensure that we are only depending on Z being ordered, and not on any of its other properties.)

There is nothing fancy going on with get_min, we tail-recursively iterate the list and select the minimal element:

#[program] Fixpoint get_min'
  (i : nat)
  (l : list T)
  (mini : { mini | mini < i })
  (min : M)
    : { '(n, _) : nat * M | n < length l + i } :=
  match l with
  | [] => (mini, min)
  | x :: l' =>
    let mx := measure x in
    if leq mx min then
      get_min' (S i) l' i mx
      get_min' (S i) l' mini min

#[program] Definition get_min
  (l : { l : list T | 0 < length l })
    : { '(n, _) : nat * M | n < length l }:=
  match l with
  | [] => _
  | x :: l' => get_min' 1 l' 0 (measure x)

Notice how we don’t actually prove get_min correct. In theory my implementation could be wrong and it might not return the minimal item. It turns out that we don’t actually need this for the R-tree correctness proof. (A theoretical bug existing in get_min cannot impact overall correctness, but it could degrade performance.)

I also needed a function to find minimal pairs of items in a list. The parameters are very similar, but now measure takes two Ts:

Context {T M : Type} `{Ord_M : Ord M} (measure : T -> T -> M).

Definition get_min_pair (l : { l : list T | 2 <= length l })
  : { '(ia, ib, _) : nat * nat * M | ia + ib + 1 < length l }.

The signature of get_min_pair is a bit more complicated since it has to handle two indices. The implementation is the straightforward quadratic brute force search.

Partitioning (for node splitting)

When explaining how R-tree insertion works, I glossed over what heuristic algorithm is used exactly to do the node partitioning. In this section we will cover the particular algorithm I used in detail.

The original R-tree paper [2]A. Guttman, “R-trees: A dynamic index structure for spatial searching,” in Proceedings of the 1984 ACM SIGMOD international conference on Management of data - SIGMOD ’84, Boston, Massachusetts: ACM Press, 1984, p. 47. doi: 10.1145/602259.602266. describes three different algorithms for doing node splitting:

I chose to implement the quadratic algorithm as a good middle ground. It works roughly as follows:

Picking the seeds can easily be implemented using get_min_pair:

#[program] Definition partition_pick_seeds
  (l : { l : list T | 2 <= length l })
    : { '(a, b, l') : T * T * list T | a :: b :: l' ~ l } :=
  let '(mini, _) := get_min_pair
    (fun a b => dead_space (item_bbox a) (item_bbox b)) l in
  remove_pair l mini (prf:=_).

The function for picking the next entry is not very difficult either using get_min:

#[program] Definition partition_pick_next
  (bbox1 bbox2 : bbox)
  (l : { l : list T | 1 <= length l })
    : { '(e, l') : T * list T | e :: l' ~ l } :=
  let neg_diff x :=
    let x := item_bbox x in
    Z.opp $ Z.abs (enlargement bbox1 x - enlargement bbox2 x) in
  let '(i, _) := get_min neg_diff l in
  remove_ith l i.

Assembling these together is a bit involved, you can look at the source code if you are interested in the details. The final signature of our partition function looks like this:

Definition partition (l : { l : list T | length l = M + 1 })
  : { '((la, mbra), (lb, mbrb)) : (list T * bbox) * (list T * bbox) |
      m <= length la <= M /\ m <= length lb <= M /\
      la ++ lb ~ l /\
      mbra = mbr (map item_bbox la) /\ mbrb = mbr (map item_bbox lb)

The postcondition says that:

The tree data structure

Now we need to define the type of trees. I initially did try to encode the needed invariants into the tree data structure itself, but I ran into issues with Coq’s positivity requirements, so the tree data structure itself is just like what you would see in non-dependent languages:

Inductive tree : Type :=
  | Leaf : bbox -> list item -> tree
  | Node : bbox -> list tree -> tree.

A leaf has items as its children, while inner nodes have other trees as their children.

Instead of encoding the invariants directly in tree, we will encode them as a separate inductive proposition called wf_tree (“wf” here stands for well-formed):

Inductive wf_tree : tree -> Prop :=
  | wf_leaf
      (b : bbox)
      (l : list item)
      (p : m <= length l <= M)
      (pmbr : b = mbr (map item_bbox l))
        : wf_tree (Leaf b l)
  | wf_node
      (b : bbox)
      (l : list tree)
      (p : m <= length l <= M)
      (pl : forall t, In t l -> wf_tree t)
      (pmbr : b = mbr (map tree_bbox l))
        : wf_tree (Node b l).

A leaf is well-formed if the number of its children is between the required bounds, and its bounding box is the minimum bounding rectangle of its children. Inner nodes have to satisfy the same properties, and additionally all of their children have to be well-formed as well.

We also define a property called wf_root, which is identical to wf_tree, except that the lower bounds for the number of children are relaxed. (Root nodes are a bit special, since we can’t guarantee that they will have at least m children.)

We will often need to deal with well-formed trees, and a pattern you will be seeing is a tree and a proof of its well-formedness packaged together into a single sig type: { t : tree | wf_tree t }.

We can now straightforwardly define tree_items (which we denoted with ι\iota):

Fixpoint tree_items (t : tree) :=
  match t with
  | Leaf _ items => items
  | Node _ nodes => flat_map tree_items nodes

We also prove an intuitively true, but important lemma, namely that all items in a well-formed tree are inside its bounding box (this is just a corollary of pmbr in wf_tree):

Definition Items_inside (t : tree) :=
  forall i, In i (tree_items t) ->
  bbox_inside (item_bbox i) (tree_bbox t) = true.

Lemma wf_tree_items_inside {t} (w : wf_tree t) : Items_inside t.

Last but not least, we need to define a child_tree relation, and prove that it’s well-founded (not to be confused with well-formed):

Definition child_tree (a b : tree) :=
  match b with
  | Leaf _ _ => False
  | Node _ nodes => In a nodes

Theorem wf_child_tree : well_founded child_tree.

The child_tree relation should be straightforward. well_founded comes from Coq’s standard library: we are proving that child_tree is a well-founded relation. (See this previous section for an explanation of well-foundedness.)

Again we are doing a lot of work just to prove something seemingly obvious: when we only recurse on child trees, eventually we have to get to the bottom, and we can’t get into an infinite loop!

Tree operations


The core of the insert implementation looks like this:

#[program] Fixpoint insert'
  (t : { t : tree | wf_tree t })
  (i : item)
  {wf child_tree t}
    : { l |
        1 <= length l <= 2 /\
        (forall t, In t l -> wf_tree t) /\
        i :: tree_items t ~ flat_map tree_items l
      } :=
  match t with
  | Leaf mbr items =>
      if length items <? M then
        [Leaf (bbox_mbr (item_bbox i) mbr) (i :: items)]
        let '((lx, mbrx), (ly, mbry)) := partition item_bbox (i :: items) in
        [Leaf mbrx lx; Leaf mbry ly]
  | Node _ nodes =>
      let '(idx, _) := get_min
        (fun n => enlargement (tree_bbox n) (item_bbox i)) nodes in
      let '(L, nodes) := remove_ith nodes idx in
      let nodes := insert' L i ++ nodes in
      if M <? length nodes then
        let '((lx, mbrx), (ly, mbry)) := partition tree_bbox nodes in
        [Node mbrx lx; Node mbry ly]
        [Node (mbr (map tree_bbox nodes)) nodes]

We take a well-formed tree and an item to insert. We return either 1 or 2 well-formed trees, whose items are exactly the items of the original tree plus the inserted item.

The implementation uses partition to do the heavy lifting when splitting nodes, and is relatively straightforward. All the proofs that are returned by partition, get_min, and insert' itself in the recursive case can be used to assemble the proof for the specification of insert'.

Finally we wrap insert' with a separate function to handle the special cases with the root node, which will be our actual insert implementation:

#[program] Definition insert
  (t : { t : tree | wf_root t })
  (i : item)
    : { t' | wf_root t' /\ i :: tree_items t ~ tree_items t' }.

This signature is actually quite simple compared to some others we have seen. It says that insert takes a well-formed tree and an item, and it returns a well-formed tree, whose items are exactly the items of the original tree plus the new item.

To search for items intersecting a query rectangle q, we traverse the whole tree and skip any subtrees that do not intersect q. The central lemma stating that this optimization is correct is called tree_item_intersect:

Lemma tree_item_intersect
  {q t i}
  (Hwf : wf_tree t)
  (Hin : In i (tree_items t))
  (Hint : bbox_intersect q (tree_bbox t) = false)
    : bbox_intersect q (item_bbox i) = false.

This lemma says that if q does not intersect the bounding box of a tree, then it cannot intersect the bounding box of any of its items either. This fact is a simple corollary of wf_tree_items_inside (all items of a tree are inside its bounding box) and bbox_inside_not_intersect.

Equipped with this lemma, we are ready to prove range_search correct:

#[program] Fixpoint range_search
  (t : { t | wf_root t })
  (q : bbox)
  {wf child_tree t}
    : { l : list item | l ~ filter
        (fun i => bbox_intersect q (item_bbox i)) (tree_items t) } :=
  match t with
  | Leaf _ items => filter (fun i => bbox_intersect q (item_bbox i)) items
  | Node mbr nodes =>
      let ns := filter
        (fun n => bbox_intersect q (tree_bbox (proj1_sig n)))
        (forall_distr (P:=(fun n => In n nodes)) nodes _) in
      flat_map (fun n => range_search (exist _ (proj1_sig n) _) q) ns

The specification is quite straightforward: range_search returns all items of the tree that intersect q. The implementation is quite straightforward, except for a couple extra things (forall_distr, proj1_sig, and exist) needed to shuffle around the correctness proof. (The proof script following the above definition is not long either, around 20 lines.)


I define deletion as a kind of filtering: we are filtering items intersecting a bounding box q with a predicate f.

Context (f : item -> bool) (q : bbox).

For use in the specification of delete, we define f' to determine which items are actually kept: items either not intersecting q, or items which satisfy f.

Let f' i := negb (bbox_intersect q (item_bbox i)) || f i.

The specification of delete is then quite straightforward:

Definition delete (t : { t | wf_root t })
  : { t' | wf_root t' /\ (tree_items t' ~ filter f' (tree_items t)) }.

The items of the returned tree are just the items of the original tree filtered with f'.

To recap, deletion works by removing items that need to be removed, and also dropping any nodes that would become invalid as a result. At the end, we reinsert all items removed due to this “collateral damage”. To do this, we need a little helper function to insert a list of items into a tree:

#[program] Fixpoint insert_many
  (t : { t | wf_root t })
  (items : list item)
  {measure (length items)}
    : { t' | wf_root t' /\ items ++ tree_items t ~ tree_items t' } :=
  match items with
  | [] => t
  | i :: items' => insert_many (insert t i) items'

We can’t just do fold_left insert items t because we also need a correctness proof for insert_many.

The structure of the implementation is similar to that of insertion, with a tail recursive delete' function operating on the inner nodes, and a wrapper called delete handling special cases for the root.

In delete' we return an option { t | wf_tree t } * list tree pair. We either return the updated subtree or nothing (if the whole subtree is just gone as the result of the deletion), and a list of subtrees whose items need to be reinserted at the end.

The specification became quite verbose, so I pulled it out into a separate definition:

Definition delete_R (t : tree)
  : option { t | wf_tree t } * list tree -> Prop := fun '(t', N) =>
    match t' with
    | None => []
    | Some t' => tree_items (proj1_sig t')
    end ++ flat_map tree_items N ~ filter f' (tree_items t).

This basically says that the set of items in the returned trees (the optional one and the list that needs to be reinserted) is the same as the items in the original tree filtered with f'.

With delete_R, the signature of delete' becomes very simple:

Fixpoint delete' (t : { t | wf_tree t })
  : sig (delete_R (proj1_sig t)).

I won’t show the implementation here, it’s a mess, here is the source code if you are interested.

You can check out the final delete implementation here. It handles some special cases for the root, and at the end it uses insert_many to reinsert the needed items.

Putting everything together

Believe it or not, the hard part is over! Remember the Container interface from the section on specifications? All we have to do now is assemble all of our work into an implementation of this interface.

First, let’s define some parameters:

Context {I : Type} {item_bbox : I -> bbox}.

And some constants:

Let m := 4.
Let M := 8.
Let pm1 : 2 <= m := ltac:(lia).
Let pm2 : 2 * m <= M := ltac:(lia).

(We could make m and M parameters as well, but for the sake of simplicity I will leave them as constants for now.)

Let’s define the type of our container to be the well-formed tree:

Definition T := { data : tree I | wf_root m M I item_bbox data }.

Then we have to define some wrapper functions which operate on T (this basically amounts to plugging in m and M, and shuffling around some proofs):

Definition empty :=
  exist _ (empty I) (wf_empty m M pm1 pm2 I item_bbox).
Definition insert t i :=
  let '(exist _ data' (conj wf' _))
    := insert m M pm1 pm2 I item_bbox t i in
  (exist _ data' wf').
Definition search t q :=
  let '(exist _ res _)
    := (range_search m M pm1 pm2 I item_bbox t q) in
Definition delete t q f :=
  let '(exist _ data' (conj wf' _))
    := delete m M pm1 pm2 I item_bbox f q t in
  (exist _ data' wf').

Next, we have to prove the necessary theorems about these wrapper functions, exactly the ones that Container expects (again, this is very easy and just amounts to plugging in the proofs that the wrapped functions already return about themselves):

Ltac t :=
  unfold all_items, insert, delete, search;
  repeat (match goal with
  | |- context[let '(exist _ _ _) := ?x in _] => destruct x
  | prf : _ /\ _ |- _ => destruct prf
  | prf : _ ~ _ |- _ => simpl; simpl in prf; now rewrite prf

Theorem empty_correct
  : all_items empty ~ [].
Proof. reflexivity. Qed.

Theorem insert_correct
  : forall t i, all_items (insert t i) ~ [i] ++ all_items t.
Proof. t. Qed.

Theorem delete_correct
  : forall t q f, all_items (delete t q f) ~ filter
    (fun i => negb (bbox_intersect q (item_bbox i)) || f i) (all_items t).
Proof. t. Qed.

Theorem search_correct
  : forall t q, search t q ~ filter
    (fun i => bbox_intersect q (item_bbox i)) (all_items t).
Proof. t. Qed.

And now we have exactly what’s needed to define an implementation of Container:

Instance rtree_container : Container := {
  T := T;
  items := all_items;

  empty := empty;
  insert := insert;
  search := search;
  delete := delete;

  empty_correct := empty_correct;
  insert_correct := insert_correct;
  search_correct := search_correct;
  delete_correct := delete_correct;

And that’s it. rtree_container is now an R-tree implementation that carries the proof of its own functional correctness with it.


In a famous quote Donald Knuth once wrote:

Beware of bugs in the above code; I have only proved it correct, not tried it.

So, let’s actually run the code. While performance was not the focus of this project, I still wanted to know how a program that has been directly extracted from a verified Coq implementation performed. Here are the results, comparing it to various other implementations:

my C implementation0.05 ms
rbush (JavaScript) [5]V. Agafonkin, “RBush,” GitHub repository. Available: https://github.com/mourner/rbush0.07 ms
brute force in C0.60 ms
my JavaScript implementation0.74 ms
brute force (JavaScript)1.15 ms
verified-rtree (OCaml)1.26 ms
verified-rtree (js_of_ocaml)2.65 ms
ocaml-rtree [6]M. Eriksen and P. Ferris, “Ocaml-rtree,” GitHub repository. Available: https://github.com/geocaml/ocaml-rtree3.11 ms
brute force (OCaml)4.93 ms

Querying was measured with N=2105N = 2 \cdot 10^5 boxes, k=214k = 214 results from the query, and M=8M = 8 maximum branching factor. Results are averaged over 10 runs. Both Z and nat were extracted to OCaml integers for performance, I talked about the caveats here. Language implementations used: OCaml 5.1.0, Node.js 18.17.1, and GCC 12.3.0.

My naive unoptimized C implementation is the fastest, I am only mildly surprised by this. The big surprise in this benchmark is rbush [5]V. Agafonkin, “RBush,” GitHub repository. Available: https://github.com/mourner/rbush, written in JavaScript, which is basically on par with the C implementation in terms of speed. Shout out to whoever optimized it this well.

Next up is brute force (linear search) in C, not too surprised with this one either, it is well known that on modern CPUs simple linear algorithms which utilize cache prefetching and other hardware optimizations often outperform smarter algorithms.

The other results seem to suggest that in general JavaScript outperforms OCaml, which is quite surprising to me. Even brute force search in JavaScript is better than anything in OCaml.

My verified R-tree performs quite well in OCaml (it beats ocaml-rtree by a significant margin), but not so well in JavaScript (it’s outperformed by brute force search).

The worst one is ocaml-rtree, the only one it beats is brute force search in OCaml.

Conclusion? A lot of these results are surprising to me. If you discover something wrong with my testing setup, please let me know. It would also be interesting to dig into why some implementations are performing unexpectedly badly (like my R-tree compiled to JavaScript, or ocaml-rtree), and why some (like rbush) are performing so well.

The most important takeaway though in the context of this article is that formal verification doesn’t incur performance overhead. As this benchmark demonstrates, there are R-tree implementations out there (ocaml-rtree) that are conclusively beaten by my verified implementation in terms of performance.


On the topic of writing dependently typed programs in Coq, I have mixed feelings. When reading Adam Chlipala’s book [4]A. Chlipala, Certified programming with dependent types: A pragmatic introduction to the Coq proof assistant. Cambridge, MA: The MIT Press, 2013. Available: http://adam.chlipala.net/cpdt/, I got the feeling that dependently typed programming is something that he recommends you do in Coq. I unfortunately ran into obstacles when trying to scale it beyond the simple examples showcased in the book, manually inserting convoy patterns and inline proofs everywhere is simply prohibitive. The #[program] facility in Coq helps a bit with these issues (and I used #[program] extensively for this project), but #[program] has many issues and isn’t developed anymore as far as I can tell.

I still need to get better at writing proofs in Coq, I feel like too many things were done just by trial-and-error instead of advance planning. Still, this is by far the largest proof development I have done to date, and I learned a lot during the course of this project. If you made it this far, I hope you learned something new as well, and I managed to get you interested in using formal methods for improving software correctness.

  1. Container is actually a typeclass, a feature popularized by Haskell. In Coq they are mostly equivalent to record types, which are just syntax sugar for inductive types. ↩︎

  2. While in theory there is no clear delineation between proofs and programs, in any reasonable Coq development all proofs will be in the Prop sort. Extraction erases everything in Prop↩︎

  3. Well, the thing is… OCaml actually has 63 bit integers, because it uses the lowest bit for tagging. And unfortunately this does affect performance↩︎

  4. Note that you will more often see the legacy Program form used elsewhere, but for the sake of consistency I will be using the attribute form #[program]↩︎

  5. Actually this is not much of a special case at all. If we have a proof p of False, we can just pattern match on it like this: match p with end. This is a weird looking pattern match, which in fact has zero branches, since False has zero constructors. This pattern match can give you any value of your choosing. In Coq, ex falso quodlibet means you can not only prove anything, you can also get a value for anything.6Actually, it is kind of a special case, since pattern matching on a type of sort Prop to get something other than Prop is usually not allowed. As a special case, it is allowed for types with exactly zero or one constructors. ↩︎

  6. Actually, it is kind of a special case, since pattern matching on a type of sort Prop to get something other than Prop is usually not allowed. As a special case, it is allowed for types with exactly zero or one constructors. ↩︎


H. B. Curry, J. R. Hindley, and J. P. Seldin, Eds., To H.B. Curry: Essays on combinatory logic, lambda calculus, and formalism. London ; New York: Academic Press, 1980.
A. Guttman, “R-trees: A dynamic index structure for spatial searching,” in Proceedings of the 1984 ACM SIGMOD international conference on Management of data - SIGMOD ’84, Boston, Massachusetts: ACM Press, 1984, p. 47. doi: 10.1145/602259.602266.
P. Letouzey, “A New Extraction for Coq,” in Types for Proofs and Programs, G. Goos, J. Hartmanis, J. Van Leeuwen, H. Geuvers, and F. Wiedijk, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 200–219. doi: 10.1007/3-540-39185-1_12.
A. Chlipala, Certified programming with dependent types: A pragmatic introduction to the Coq proof assistant. Cambridge, MA: The MIT Press, 2013. Available: http://adam.chlipala.net/cpdt/
V. Agafonkin, “RBush,” GitHub repository. Available: https://github.com/mourner/rbush
M. Eriksen and P. Ferris, “Ocaml-rtree,” GitHub repository. Available: https://github.com/geocaml/ocaml-rtree