Get Complete Project Material File(s) Now! »

## Proofs of ML programs

The system presented in this thesis is not only a programming language, but also a proof assistant focusing on program proving. Its proof mechanism relies on equality types of the form t u, where t and u are arbitrary (possibly untyped) terms of the language itself. Such an equality type is inhabited by {} (i.e., the record with no {elds) if the denoted equivalence is true, and it is empty otherwise. Equivalences are managed using a partial decision procedure that is driven by the construction of programs. An equational context is maintained by the type checker to keep track of the equivalences that are assumed to be true during the construction of proofs. This context is extended whenever a new equation is learned (e.g., when a lemma is applied), and equations are proved by looking for contradictions (e.g., when two di|ferent variants are supposed equivalent).

To illustrate the proof mechanism, we will consider simple examples of proofs on unary natural number (a.k.a. Peano numbers). Their type is given below, together with the corresponding addition function de{ned using recursion on its {rst argument.

type rec nat = [Zero ; Succ of nat]

val rec add : nat nat nat =

fun n m {

case n { Zero m | Succ[k] Succ[add k m] }

}

As a {rst example, we will show that add Zero n n for all n. To express this property we can use the type n: , add Zero n n, where can be thought of as the set of all the usual program values. This statement can then be proved as follows. val add_z_n : n: , add Zero n n = {}

Here, the proof is immediate (i.e., {}) as we have add Zero n n by de{nition of the add function. Note that this equivalence holds for all n, whether it corresponds to an element of nat or not. For instance, it can be used to show add Zero true true.

Let us now show that for every n we have add n Zero n. Although this property looks similar to add_z_n, the following proof is invalid. // val add_n_z : n: , add n Zero n = {}

Indeed, the equivalence add n Zero n does not hold when n is not a unary natural number. In this case, the computation of add n Zero will produce a runtime error. As a consequence, we need to rely on a form of quanti{cation that only ranges over unary natural numbers. This can be achieved with the type n nat, add n Zero n, which corresponds to a (dependent) function taking as input a natural number n and returning a proof of add n Zero n. This property can then be proved using induction and case analysis as follows.

val rec add_n_z : n nat, add n Zero n =

fun n {

case n {

Zero {}

Succ[k] let ih = add_n_z k; {}

}

}

If n is Zero, then we need to show add Zero Zero Zero, which is immediate by de{nition of add. In the case where n is Succ[k] we need to show add Succ[k] Zero Succ[k]. By de{nition of add, this can be reduced to Succ[add k Zero] Succ[k]. We can then use the induction hypothesis (i.e., add_n_z k) to learn add k Zero k, with which we can conclude the proof.

It is important to note that, in our system, a program that is considered as a proof needs to go through a termination checker. Indeed, a looping program could be used to prove anything otherwise. For example, the following proof is rejected.

// val rec add_n_z_loop : n nat, add n Zeron =

// fun n { let ih = add_n_z_loop n; {} }

It is however easy to see that add_z_n and add_n_z are terminating, and hence valid. In the following, we will always assume that the programs used as proofs have been shown terminating. There are two main ways of learning new equations in the system. On the one hand, when a term t is matched in a case analysis, a branch can only be reached when the corresponding pattern C[x] matches. In this case we can extend the equational context with the equivalence t C[x]. On the other hand, it is possible to invoke a lemma by calling the corresponding function. In particular, this must be done to use the induction hypothesis in proofs by induction like in add_z_n or the following lemma.

val rec add_n_s : n m nat, add n Succ[m] Succ[add n m] =

fun n m {

case n {

Zero {}

Succ[k] let ind_hyp = add_n_s k m; {}

}

}

In this case, the equation corresponding to the conclusion of the used lemma is directly added to the context. Of course, more complex results can be obtained by combining more lemmas. For example, the following proves the commutativity of addition using a proof by induction with add_n_z and add_n_s.

val rec add_comm : n m nat, add n m add m n =

fun n m {

case n {

Zero let lem = add_n_z m; {}

Succ[k] let ih = add_comm k m;

let lem = add_n_s m k; {}

}

}

‘a ref

Many more examples of proofs and programs are provided in Chapter 7 (and even more with the implementation of the system). Each of them (including those in the current chapter) have been automatically checked upon the generation of this document. They are thus correct with respect to the implementation.

### A brief history of value restriction

A soundness issue related to side-e|fects and call-by-value evaluation arose in the seventies with the advent of ML. The problem stems from a bad interaction between side-e|fects and Hindley-Milner polymorphism. It was {rst formulated in terms of references, as explained in [Wright 1995, Section 2]. To extend an ML-style language with references, the naive approach consist in de{ning an abstract type and polymorphic procedures with the following signature (given in OCaml syntax).

type ‘a ref

val ref : ‘a ‘a ref

val (:=) : ‘a ref ‘a unit

val (!) : ‘a ref ‘a

Here, the function ref takes as input a value of some type and creates a new reference cell containing an element of the corresponding type. The value of a reference can then be updated using the in{x operator (:=) (to be pronounced ÷setø), and its value can be obtained using the pre{x operator (!) (to be pronounced ÷getø).

These immediate additions quickly lead to trouble when working with polymorphic references. The problem can be demonstrated by the following example, which is accepted by the naive extension of the type system.

let l = ref [] in

l := [true]; (List.hd !l) + 1

On the {rst line, variable l is given the polymorphic type ‘a list ref, which can be uni{ed both with bool list ref and int list ref on the second line. This is an obvious violation of type safety, which is the very purpose of a type system.

To solve the problem, alternative type systems such as [To{te 1990, Damas 1982, Leroy 1991, Leroy 1993] were designed. However, they all introduced a complexity that contrasted with the elegance and simplicity of ML systems (see [Wright 1995, Section 2] and [Garrigue 2004, Section 2] for a detailed account). A simple and elegant solution was {nally found by Andrew Wright in the nineties. He suggested restricting generalisation (i.e., introduction of polymorphism) to syntactic values [Wright 1994, Wright 1995].

In ML, generalisation usually happens in expressions of the form ÷let x = u in tø, called let-bindings. The type-checking of such an expression proceeds by inferring the type of the term u, which may contain uni{cation variables. The type of u is then generalized by universally quantifying over these uni{cation variables. Finally, the term t is type-checked under the assumption that x has the most general type of u. With value restriction, the generalisation of the type of u only happens if u is a syntactic value. Consequently, the example above is rejected since ref [] is not a value, and hence its inferred type ‘_a list ref is only weakly polymorphic (i.e., it can only be uni{ed with exactly one, yet unknown type). Thus, it cannot be uni{ed with both bool list ref and nat list ref.

As mentioned in Section 1.1, the system presented in this thesis does not include references, but control structures. One way of extending ML with control structures is again to introduce an abstract type equipped with polymorphic operations.

type ‘a cont

val callcc : (‘a cont ‘a) ‘a

val throw : ‘a cont ‘a ‘b

The function callcc corresponds to the control operator call/cc, which was {rst introduced in the Scheme programming language. When called, this function saves the current continu-ation (i.e., the current state of the program’s environment) and feeds it to the function it is given as an argument. The continuation can be restored in the body of this function using the throw function.

As for references, the addition of control structures breaks the type safety of ML in the presence of polymorphism. A complex counterexample was {rst discovered by Robert Harper and Mark Lillibridge [Harper 1991].

let c = callcc

(fun k ((fun x x), (fun f throw k (f, (fun _ ())))))

in

print_string ((fst c) « Hello world! »);

(snd c) (fun x x+2)

Intuitively, the program {rst saves the continuation and builds a pair c containing two functions. The {rst one is simply the identity function. The second one takes a function f as argument and calls throw to restore the previously saved continuation. It then replaces c with a pair containing f and a constant function. Consequently, the {rst element of the pair c can be used as the (polymorphic) identity function as long as the second element of c has not been used. However, when the second element of c is called with a function g, then g becomes the {rst element of c and the computation restarts. This is problematic since the function fun x x+2 is then applied to a value of type string, which is thus fed to an integer addition.

During type-checking, the type that is inferred for the pair c (prior to generalisation) is (‘_a ‘_a) * ((‘_a ‘_a) unit). Thus, in absence of value restriction, the last two lines of the counterexample are type-checked under the assumption that c has the polymorphic type (‘a ‘a) * ((‘a ‘a) unit). In particular, the type ‘a ‘a can be uni{ed with both string string and int int. As with references, the value restric-tion forbids such uni{cations.

Note that it is relatively easy to translate the counter example into our language. Indeed, terms of the form callcc (fun k t) are translated to save k t and terms of the form throw k u to restore k u. Moreover, as our system contains system F, value restric-tion needs to be stated di|ferently. It appears on the typing rule for the introduction of the universal quanti{er. In the system, value restriction corresponds to only applying this rule to terms that are values.

**Dependent functions and relaxed restriction**

One of the main features of our system is a dependent function type. It is essential for building proofs as it provides a form of typed quanti{cation. However, combining call-by-value evaluation, side-e|fects and dependent functions is not straightforward. Indeed, if t is a dependent function of type x a, b x and if u has type a, then is it not always the case that t u evaluates to a value of type b u . As a consequence, we need to restrict the applica-tion of dependent functions to make sure that it is type safe. The simplest possible approach consists in only allowing syntactic values as arguments of dependent functions, which is another instance of the value restriction. It is however not satisfactory as it considerably weakens the expressiveness of dependent functions. For example, add_n_z cannot be used to prove add (add Zero Zero) Zero add Zero Zero. Indeed, the term add Zero Zero is not a value, which means that it cannot be used as argument of a dependent function. This problem arises very oùten as proofs rely heavily on dependent functions. As a consequence, the value restriction breaks the modularity of our proof system.

Surprisingly, our equality types provide a solution to the problem. Indeed, they allow us to identify terms having the same observable computational behaviour. We can then relax the restriction to terms that are equivalent to some value. In other words, we consider that a term u is a value if we can {nd a value v such that u v. This idea can be applied whenever value restriction was previously required. Moreover, the obtained system is (strictly) more expressive that the one with the syntactic restriction. Indeed, {nding a value that is equivalent to a term that is already a value can always be achieved using reöexivity. Although this new idea seems simple, establishing the soundness of the obtained system is relatively subtle [Lepigre 2016].

In practice, using a term as a value is not always immediate. For example, the system is not able to directly prove that add n m is a value, provided that n and m are two natural numbers. It is however possible to establish this fact internally as follows.

val rec add_total : n m nat, v: , add n m v =

fun n m {

case n {

Zero {}

Succ[k] let ih = add_total k m; {}

}

}

Here, add_total proves that for any values n and m in the type nat, there is some value v such that add n m v. Note that we did not speci{cally require v to be a natural number as this is usually not necessary in practice. Thanks to add_total, we can give a proof of the associativity of our addition function.

val rec add_asso : n m p nat, add n (add m p) add (add n m) p = fun n m p {

let tot_m_p = add_total m p;

case n {

Zero {}

Succ[k] let tot_k_m = add_total k m;

let ih = add_asso k m p; {}

}

}

Note that the proof requires two calls to add_total. The {rst one is used in both the base case and the induction case. It is required so that the system can unfold the de{nition of add n (add m p) according to the head constructor of n. As we are in call-by-value, we can only reduce the de{nition of a function when it is applied to values. It is the case here as n is a variable and add m p is equivalent to some value, as witnessed by tot_m_p. The second call to add_total is required for a similar reason in the successor case.

#### Handling undecidability

Typing and subtyping are most likely to be undecidable in our system. Indeed, it contains Mitchell’s variant of System F [Mitchell 1991] for which typing and subtyping are both known to be undecidable [Tiuryn 1996, Tiuryn 2002, Wells 1994, Wells 1999]. Moreover, as argued in [Lepigre 2017], we believe that there are no practical, complete semi-algorithms for extensions of System F like ours. Instead, we propose an incomplete semi-algorithm that may fail or even diverge on a typable program. In practice we almost never meet non termination, but even in such an eventuality, the user can always interrupt the program to obtain a relevant error message. This design choice is a very important distinguishing feature of the system. To our knowledge, such ideas have only be used (and implemented) in some unpublished work of Christophe Ra|falli [Ra|falli 1998, Ra|falli 1999] and in [Lepigre 2017].

One of the most important ideas, that makes the system practical and possible to implement, is to only work with syntax-directed typing and subtyping rules. This means that only one of our typing rules can be applied for each di|ferent term constructor. Similarly, we have only two subtyping rules per type constructor: one where it appears on the leùt of the inclusion, and one where it appears on the right. As type-checking can only diverge in subtyping, an error message can be built using the last applied typing rule.

Moreover, all the undecidability of the system is concentrated into the management of uni{cation variables, termination checking and equivalence derivation.

As a proof of concept, we implemented our system in a prototype called PML2. The last version of its source code is available online (http://lepigre.fr/these/). Its implementation mostly follows the typing and subtyping rules of the system given in Chapter 6. Overall, our system provides a similar user experience to statically typed functional languages like OCaml or Haskell. In such languages, type annotations are also required for advanced features like polymorphic recursion.

**Related work and similar systems**

To our knowledge, the combination of call-by-value evaluation, side-e|fects and depen-dent products has never been achieved before. At least not for a dependent product fully compatible with e|fects and call-by-value. For example, the Aura language [Jia 2008] forbids dependency on terms that are not values in dependent applications. Similarly, the F language [Swamy 2011] relies on (partial) let-normal forms to enforce values in argument position. Daniel Licata and Robert Harper have de{ned a notion of positively dependent types [Licata 2009] which only allow dependency over strictly positive types. Finally, in languages like ATS [Xi 2003] and DML [Xi 1999] dependencies are limited to a speci{c index language.

The system that seems the most similar to ours is NuPrl [Constable 1986], although it is inconsistent with classical reasoning and not e|fectful. NuPrl accommodates an obser-vational equivalence relation similar to ours (Howe’s squiggle relation [Howe 1989]). It is partially reöected in the syntax of the system. Being based on a Kleene style realizability model, NuPrl can also be used to reason about untyped terms.

The central part of this paper consists in the construction of a classical realizability model in the style of Jean-Louis Krivine [Krivine 2009]. We rely on a call-by-value presen-tation which yields a model in three layers (values, terms and stacks). Such a technique has already been used to account for classical ML-like polymorphism in call-by-value in the work of Guillaume Munch-Maccagnoni [Munch 2009]. It is here extended to include dependent products. Note that our main result (Theorem 5.4.15) is unrelated to Lemma 9 in Munch-Maccagnoni’s work [Munch 2009].

The most actively developed proof assistants following the Curry-Howard correspon-dence are Coq and Agda [CoqTeam 2004, Norell 2008]. The former is based on Coquand and Huet’s calculus of constructions and the latter on Martin-Lúf’s dependent type theory [Coquand 1988, Martin-Löf 1982]. These two constructive theories provide dependent types, which allow the de{nition of very expressive speci{cations. Coq and Agda do not directly give a computational interpretation to classical logic. Classical reasoning can only be done through a negative translation or the de{nition of axioms such as the law of the excluded middle. In particular, these two languages are not e|fectful.

**Table of contents :**

**1 From Programming to Program Proving**

1.1 Writing functional programs

1.2 Proofs of ML programs

1.3 A brief history of value restriction

1.4 Dependent functions and relaxed restriction

1.5 Handling undecidability

1.6 Related work and similar systems

1.7 Thesis overview

**2 Untyped calculus and abstract machine**

2.1 The pure -calculus

2.2 Evaluation contexts and reduction

2.3 Call-by-value Krivine machine

2.4 Computational e|fects and -calculus

2.5 Full syntax and operational semantics

2.6 Classi{cation of processes

**3 Observational equivalence of programs**

3.1 Equivalence relation and properties

3.2 Compatible equivalence relations

3.3 Equivalences from reduction

3.4 Inequivalences from counter-examples

3.5 Canonical values

**4 Types and realizability semantics**

4.1 Observational equivalence type

4.2 Quanti{cation and membership type

4.3 Sorts and higher-order types

4.4 Typing judgments for values and terms

4.5 Call-by-value realizability semantics

4.6 Adequacy

4.7 Typing stacks

**5 A model for a semantical value restriction**

5.1 Dependent function types

5.2 The limits of the value restriction

5.3 Semantical value restriction

5.4 Semantics for semantical value restriction

5.5 Final instance of our model

5.6 Derived type system

5.7 Understanding our new equivalence

**6 Introducing subtyping into the system**

6.1 Interests of subtyping

6.2 Symbolic witnesses and local subtyping

6.3 Typing and subtyping rules

6.4 Semantics of subtyping

6.5 Completeness on pure data types

6.6 Normalisation, safety and consistency

6.7 Toward (co-)inductive types and recursion

**7 Implementation and examples**

7.1 Concrete syntax and syntactic sugars

7.2 Encoding of strict product types

7.3 Booleans and tautologies

7.4 Unary natural numbers and totality

7.5 Lists and their vector subtypes

7.6 Sorted lists and insertion sort

7.7 Lookup function with an exception

7.8 An in{nite tape lemma on streams

**References**