Pure Type System in Natural Deduction


The purpose of programming languages

Hopefully, we now all agree on the fact that we want to give orders to a computer in a human-friendly manner. Nowadays, computers are quite sophisticated, but they are merely bigger calculators than the ones we all used at school to answer simple questions like “what is the result of 126 divided by 3?”. So programming languages should be about describing the data structures we want to work with and the things we want to compute with them (this is question 1 of the previous section), and also about how to compute “in a safe way” (this is question 2): we want to give orders to a computer, and have fairly good assurances that the computed results are what we actually expected.
What do we have at hand to do such computations? We will keep it simple by stating: a CPU, which will blindly execute the orders we give it, and some space for data storage, commonly called memory. This memory can be seen as a big array of zeros and ones (called bits), most of the time stored in packs of eight (eight bits are called an octet). In our language, we will have to handle data such as natural numbers, strings, lists and so on, but in the end, they will all be stored as chunks of memory. So we need a way to distinguish one chunk of memory from another: let us say we have the plus function that takes two natural numbers and adds them together. What if we call this function on a string and a list of natural numbers? What should happen? What will happen?
If it happens, we can be almost sure that our program will crash. But it should never have been allowed in the first place! Just by taking a look at the types involved in such a program, we can see that it fails to work. This is where questions 1 and 2 face one another: we managed to write down a program for our computer, but it was not safe to execute.
How did we guess that this program was not safe to run? This is an easy question: plus is waiting for two natural numbers, and we gave it something else. But we are clever human beings who saw this issue, while the computer only saw two chunks of memory with octets in them. It is our job, through the programming language, to give more information to the computer, so it can avoid such traps. We need to make the computer understand that a chunk of memory representing a string should not be used when a natural number is expected. This is commonly achieved by adding labels to the data, saying for example “this chunk is a natural number and this other one is a list of strings”. With this information, the computer is aware of the kind of data it is computing over, and will reject a program like plus (3, "foo") just because "foo" is not a natural number.
These labels are called types, and the action of checking whether the type of an input matches the expected type of a function is called type checking. The structure of types inside a programming language is called a type system. Types are there to enforce guard properties in a program, giving some guarantees that the computation will be done in a safe manner. The expressiveness of a type system is directly linked to the kind of guarantees we get: if it is too simple, we will only be able to express simple information, as in the toy example we just saw. With more expressive languages, we can state much more powerful assertions, like pre-conditions to fulfill before being able to use a function, or additional information about the values returned by a program.
A common example to emphasize the power of a type system is Euclid’s division algorithm: extensionally, it can be stated as “given two natural numbers a and b, if b is not equal to 0, we want to find q and r such that a = b ∗ q + r and r < b”. In real programming languages, the type of this program may vary depending on the power of the type system:
• in C-LIKE: void div(int a, int b, int *q, int *r)
• in ML-LIKE: div : int -> int -> (int * int)
• in COQ: div : forall (a b:nat), 0 < b ->
  {q : nat & {r : nat & a = b * q + r /\ r < b}}.
In the first two examples, the programming language is not informative enough to carry information “about” the data, like the fact that b is not 0 or that a = b ∗ q + r. The only information we have is that div expects two natural numbers (or integers) and returns two natural numbers, without any information about the computation itself.
In the last one, there is a pre-condition and two post-conditions: to be able to call this function, we need to give a witness that b is not 0, and this function will give us two natural numbers that verify the two relations that we are interested in. It is pretty nice to have such power in the type system, but there is a drawback: in the code of the third program, we need to build the proof that the resulting q and r enjoy the nice relations with the input. More expressiveness will require more work from the programmer. In this particular example, we already know some automatic ways to prove these arithmetical results, but we have to keep in mind the following statement: the more we want, the more we need to provide first.
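To make the contrast concrete, here is what the ML-LIKE version could look like in OCaml (a minimal sketch of ours, not taken from any of the cited languages): since nothing in the type int -> int -> int * int expresses the precondition on b, the check has to happen at run time.

(* Euclidean division with the ML-LIKE type: the type says nothing
   about b not being 0, nor about the relation a = b * q + r, so the
   precondition must be enforced dynamically. *)
let div (a : int) (b : int) : int * int =
  if b = 0 then invalid_arg "div: the divisor must not be 0"
  else (a / b, a mod b)

(* div 126 3 evaluates to (42, 0). *)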

A practical example

To study a programming language, we first need to define two things: the syntax of the language and the process of computation. Once both are formalized, we can try to prove some nice properties about the language. As an example of the properties a programming language can have, we are going to consider a simple (but still powerful) case study: the λ-calculus.
This language was introduced by Alonzo Church [Chu51, Bar84] in the 1930s as a tool to study the foundations of mathematics. Since then, many extensions of this language have been used as a basis for real programming languages and for logic programming [HAS, SML, Gal, PS]. Here is its syntax:

    M, N ::= x | M N | λx.M
A term of the λ-calculus can be built from three different constructors. The first one is variables (which we will always write as lower-case letters x, y, . . . ). They are names for more complex terms. The second one, M N, is called an application: given a function f and an argument a, f a stands for “let us apply the function f to a”. Back in high school, we would have written f(a); this is just another notation for it. Please remember that this is not the result of computing the function on a particular argument, it is just the syntactic juxtaposition of two terms: for example, if we consider the identity function id that just returns whatever data we give it, id a is syntactically different from a, but it will compute to a. If a function has several arguments (like the plus function for example), we simply use several applications: (plus x) y. By convention, application is left-associative, so we can even get rid of the parentheses and just write plus x y.
Finally, the most awkward of the symbols above, the λ-abstraction, is used to define functions. Until now, we gave names to functions, but names can be complicated to handle in a strict formalization, so we introduce anonymous functions with this abstraction. Here are some simple examples to illustrate this λ construction:
• the identity function is denoted by the term λx.x, which is equivalent to the declarative statement “for any term x, we return x”.
• the plus function¹ is denoted by the term λx.λy.x + y: “given x and y, we return x + y”.
• the app function, which takes a function and an argument and applies the function to the argument, is denoted by the term λf.λx.f x.
In a λ-abstraction λx.M , the variable x is bound by the λ inside the body M . It means that if we apply this function to an argument N , all occurrences of x inside M will be placeholders for N .
Now that we have a simple syntax for terms (without types at the moment), we need to explain how to compute with them: how do we go from id x to x? This process of rewriting a term into another one is called β-reduction, and is informally defined as follows: if an abstraction λx.M is applied to a term N, the application (λx.M) N can be β-reduced to M[N/x], which stands for “replace all the occurrences of x in M by N”. This process will be written →β for a one-step reduction, and ↠β for several consecutive steps (a small implementation sketch follows the examples below):
• id N →β N
• app id a →β (λx.id x) a →β id a
• plus 3 4 ↠β 7
¹ Here we cheated: + is not part of our syntax, but it was just to illustrate with a known example.
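To make this rewriting process concrete, here is a minimal OCaml sketch of ours (it is not part of any formal development mentioned here): the three term constructors, a deliberately naive substitution that does not rename binders to avoid variable capture (which is enough for the closed examples above), and one step of β-reduction at the head of a term.

(* λ-terms: variables, applications and abstractions. *)
type term =
  | Var of string              (* x *)
  | App of term * term         (* M N *)
  | Lam of string * term       (* λx.M *)

(* subst m x n computes M[N/x]: replace the occurrences of x in M by N.
   Naive version: binders are not renamed, so capture can occur; this
   is harmless as long as the argument N is a closed term. *)
let rec subst m x n =
  match m with
  | Var y -> if y = x then n else m
  | App (m1, m2) -> App (subst m1 x n, subst m2 x n)
  | Lam (y, body) -> if y = x then m else Lam (y, subst body x n)

(* One step of β-reduction when the term is a redex (λx.M) N. *)
let beta_step = function
  | App (Lam (x, body), n) -> Some (subst body x n)
  | _ -> None

(* Example: beta_step (App (Lam ("x", Var "x"), Var "a"))
   evaluates to Some (Var "a"), i.e. id a →β a. *)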
This language is rather simple, but it is quite interesting to study its computational behavior, even without type information: we can encode a lot of interesting data structures (natural numbers, pairs, lists, . . . ), but this is not what we are interested in at this point.
What can we say about this language? What properties does it give us on the structure of our programs, or on their computational behavior? In fact, not much at the moment: we need types! Since we only have one function constructor (the λ-abstraction), we only need one type constructor, written →, which behaves much like the mathematical implication for functions:
• id has type A → A: it takes a piece of data of type A and returns it directly, so the type is not changed.
• plus has type nat → nat → nat: it takes two natural numbers and returns a natural number.
• app has type (A → B) → A → B: it takes a function of type A → B, an argument of type A, and returns the result of their application, which is of type B.
We will also need some base types (like nat or string) for our data, but they do not really matter for our study. One last thing: we need to store all the type information we collect in a context, as a reminder of the types that we already know. As soon as we know that plus has type nat → nat → nat, we can put this information in our context, so we do not need to rebuild it the next time we need it.
With all this, we can build a type system for the λ-calculus, called the Simply Typed λ-calculus. This extension of the λ-calculus was also introduced by Church [Chu40] a few years after first presenting the λ-calculus to avoid some paradoxes that were discovered in the untyped calculus. The system is defined according to the following rules:
• a variable x has type A if this information is in our context.
• if, knowing that x has type A, we can prove that M has type B, then λx.M has type A → B.
• if M is a function of type A → B and N has type A, the resulting application M N has type B.
Since we are trying to be a little more formal, we now give the same set of rules written in a much more concise way; this will make it easier to talk about a specific rule. The typing rules for the Simply Typed λ-calculus are detailed in Fig. 1.1. This kind of presentation is called a type system.
    (x : A) ∈ Γ
    ─────────── VAR
     Γ ⊢ x : A

    Γ, x : A ⊢ M : B
    ────────────────── LAM
    Γ ⊢ λx:A.M : A → B

    Γ ⊢ M : A → B    Γ ⊢ N : A
    ────────────────────────── APP
           Γ ⊢ M N : B

Figure 1.1: The Simply Typed λ-Calculus
This presentation is really simple to read: a typing judgment (also called a sequent) Γ ⊢ M : A means that in the context Γ, the term M has type A, and the horizontal lines can be read as a simple implication:
    If all the conditions on top are true
    ───────────────────────────────────── NAME OF THE RULE
    the conclusion at the bottom is also true
You may have noticed that, in the LAM rule, we wrote λx : A.M instead of λx.M . This additional annotation is quite handy to reason about typing judgments, but it is not mandatory: both presentations, with or without the annotation on λ-abstractions, are commonly used. For the application rule, we take care to check that the type of the argument N matches the type expected by the function M : they are both the same A.
A well-typed term is a term for which we can compute a type using only the previous rules: plus 3 4 is a well-typed term because 3 and 4 are natural numbers, but plus 3 "foo" is not, because "foo" is a string, so we will not be able to correctly apply the APP rule.
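To make Fig. 1.1 concrete, here is a direct transcription of the three rules into a small type checker, again as an OCaml sketch of ours. We use annotated λ-abstractions, as in the LAM rule, and the context is a plain association list.

(* Simple types: base types and the arrow constructor. *)
type ty = Base of string | Arrow of ty * ty

(* λ-terms, this time with annotated abstractions. *)
type term =
  | Var of string
  | App of term * term
  | Lam of string * ty * term   (* λx:A.M *)

(* typecheck ctx m returns Some a when ctx ⊢ m : a, and None otherwise. *)
let rec typecheck (ctx : (string * ty) list) (m : term) : ty option =
  match m with
  | Var x -> List.assoc_opt x ctx                     (* VAR *)
  | Lam (x, a, body) ->                               (* LAM *)
      (match typecheck ((x, a) :: ctx) body with
       | Some b -> Some (Arrow (a, b))
       | None -> None)
  | App (f, n) ->                                     (* APP *)
      (match typecheck ctx f, typecheck ctx n with
       | Some (Arrow (a, b)), Some a' when a = a' -> Some b
       | _ -> None)

For instance, typecheck [] (Lam ("x", Base "A", Var "x")) returns Some (Arrow (Base "A", Base "A")): the annotated identity gets the type A → A we announced for id above.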
With this nice syntax and its simple type system, we can start proving properties about our programming language. We are going to focus on two properties which are, in our opinion, the most important ones: Termination and Preservation.
A term is said to be terminating if its computation terminates, that is, if we can only apply a finite number of β-reduction steps: our running example plus 3 4 is terminating, because 7 cannot be reduced any further. This property is used to ensure there is no infinite loop inside a program. As you may guess, not all terms are terminating (otherwise we would not have to state such a property). However, all well-typed terms are. Take a look at the term λx.x x, which we will call δ. Without any typing information, we can do some reductions:
    δ δ = (λx.x x) δ
        →β (x x)[δ/x] = δ δ
        →β δ δ
        →β …
Having such an infinite reduction sequence can be a bit annoying inside a program, especially when we want to compute a result: it will loop forever! But what if we try to attach types to this term? Let us try to guess the type of δ, using the previous rules:
1. δ is a λ-abstraction, so its type is of the shape T1 → T2, and in a context where x has type T1, x x has to have type T2.
2. the first occurrence of x forces the shape of T1: T1 = A → B.
3. the second one forces the type of x to be equal to the domain of T1, so we get another equation: T1 = A.
All this leads to a single equation A = A → B which is unsolvable: δ is not typable in the Simply Typed λ-Calculus. This example shows how typing can reject non-terminating programs, without having to run them.
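With the typecheck sketch from the previous section, we can watch this failure happen: whatever annotation we pick for x, the self-application x x is rejected by the APP rule.

(* δ with an explicit annotation: λx:a. x x, for an arbitrary type a. *)
let delta a = Lam ("x", a, App (Var "x", Var "x"))

(* typecheck [] (delta (Base "A"))
     evaluates to None: x : A is not a function.
   typecheck [] (delta (Arrow (Base "A", Base "B")))
     evaluates to None: x expects an argument of type A,
     but x itself has type A → B, and A ≠ A → B. *)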
The second property we want to emphasize is Preservation (also known as Subject Reduction). It states that computation does not change the type of a term: if M →β N and M is of type A, then N is also of type A. In a more formal presentation:
    Γ ⊢ M : A    M →β N
    ───────────────────
         Γ ⊢ N : A
This property ensures that computation does not mess with the content of the data. For example, just from the typing information we have so far, we know that plus 3 4 is a natural number, because we already know the type of plus, and because 3 and 4 are natural numbers; preservation guarantees that the result of the computation, 7, is also a natural number, even if we do not perform the computation. In this case, it was easy to check that plus 3 4 computes to a natural number, but it is not always so simple to verify.


Computation on types

A nice property of the Simply Typed λ-Calculus is that it literally has simple types: the world of types and the world of terms are completely distinct from one another. With this kind of system, it is impossible to have complicated information in the types: if you recall the example about Euclid’s algorithm, the third version was the most informative one, but there were actually terms (namely a, b, q and r) inside the type of the function.
In order to achieve such expressiveness, we need to add a dependency between types and terms. There are several ways to do this, at several levels (types depending on types, types depending on terms, terms depending on types, . . . ), and we do not want to get into all of them at the moment. These systems are significantly more difficult to understand than the simple λ-calculus, so we are not going to describe a particular one; instead, we are just going to think about one question: why would we want to compute inside a type?
The usual running example at this point is a particular kind of lists. The basic type for lists has only one parameter: the type of the data they contain:
• the list l1 = [1; 2; 3; 4; 5] has type list nat.
• the list l2 = ['a'; 'b'; 'z'] has type list char.
• a list containing data of type A has type list A.
We are going to extend this definition by adding a second piece of information to the type: we want to know the size of the list just from its type, and we will call this new type nlist:
• l1 would have type nlist 5 nat.
• l2 would have type nlist 3 char.
• a list of length n containing data of type A has type nlist n A.
This new type allows us to be more informative when writing a function. For example, if we want to extract the first element of a list, we need to check that this list is not empty: with nlist, this can be done by typing!
head : forall (A:Type) (n:nat), nlist (n+1) A -> A.
The head function expects a list of length at least 1, so its execution will never involve the empty list. This is a pretty interesting feature, but what if we move to more complicated functions? Let us consider the concatenation of two lists:
concat : forall (A:Type) (n m:nat), nlist n A -> nlist m A -> nlist (n+m) A.
The concat function takes two lists of arbitrary sizes n and m, and returns a list whose size is n + m, obtained by gluing the second list to the end of the first one. A practical example: what are the result and the type of concat [1;2] [3;4;5]?
concat [1;2] [3;4;5] = [1;2;3;4;5] : nlist (2+3) nat.
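As an aside, a language without computation in types can only approximate nlist. The following OCaml sketch of ours uses GADTs: the length lives at the type level as a unary number, and since the type checker cannot compute n + m by itself, concat has to take an explicit witness of the addition.

(* Type-level unary natural numbers: z is 0, 'n s is n + 1. *)
type z
type 'n s

(* Length-indexed lists: ('n, 'a) nlist has exactly n elements of type 'a. *)
type (_, _) nlist =
  | Nil  : (z, 'a) nlist
  | Cons : 'a * ('n, 'a) nlist -> ('n s, 'a) nlist

(* head only accepts lists whose length is some n + 1. *)
let head : type n a. (n s, a) nlist -> a = function Cons (x, _) -> x

(* An explicit, constructor-by-constructor proof that n + m = p. *)
type (_, _, _) plus =
  | PZ : (z, 'm, 'm) plus
  | PS : ('n, 'm, 'p) plus -> ('n s, 'm, 'p s) plus

let rec concat : type n m p a.
    (n, m, p) plus -> (n, a) nlist -> (m, a) nlist -> (p, a) nlist =
  fun w l1 l2 ->
    match w, l1 with
    | PZ, Nil -> l2
    | PS w', Cons (x, xs) -> Cons (x, concat w' xs l2)

(* concat (PS (PS PZ)) (Cons (1, Cons (2, Nil)))
                       (Cons (3, Cons (4, Cons (5, Nil))))
   has type (z s s s s s, int) nlist: the 2 + 3 = 5 is carried by the
   witness PS (PS PZ), not computed by the type checker. *)

In Coq, no such witness is needed: the n + m in the type of concat is an ordinary term, and the conversion rule we are about to discuss is exactly what lets the type checker identify nlist (2+3) nat with nlist 5 nat.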
Back to the Coq version: why is it 2+3 and not 5? Simply because, as we said earlier, 2+3 is not syntactically equal to 5, but it computes to 5. If we want to embed terms into types, we also need a way to embed computation at the level of types. The usual way to do this is to add a conversion rule to our type system, which looks like:
    Γ ⊢ M : A    A computes to B
    ────────────────────────────
            Γ ⊢ M : B
The critical notion here is how to define the “computes to” relation. Several different presentations have been proposed so far, each designed for a particular purpose:
• If one is only concerned with program evaluation, one only needs an untyped notion of reduction, and can rely on preservation to type the result of the computation.
• If one is concerned with consistency, or if the computation itself needs types (as is the case with η-expansion, for example), one may need a typed notion of reduction, but will then have trouble proving preservation.
The following chapters are a technical investigation of this conversion rule, whose final outcome is that all the definitions studied in this dissertation are actually just different ways to state the very same thing: all these presentations are equivalent.

Where to go next?

In this first chapter, we wanted to explain as simply as possible the motivations underlying our work: why we are interested in type theory, and what the possible applications of this field are. From now on, we will set computer programming aside a bit and focus on the meta-theory of a particular family of type systems called Pure Type Systems, which are used as a basis for the underlying theory of proof assistants and proof-search engines.
Our work mainly focuses on the conversion rule of those systems, which is the main reason why there are several different presentations of them, depending on the kind of guarantees one wants about the computation.
This investigation aims to improve our understanding of the theories behind proof assistants, in order to improve that software. As a practical application of this concept of “proof assisted by computer”, this dissertation has been completely formalized within the Coq proof assistant [Coq], which served as a supporting tool: it helped build some complex parts of the proofs, and it ensured that we did not forget anything in the formalization, so that everything was correctly proved. The whole development can be found here [Sila, Silb] and has been tested with both the trunk version of July 2010 (revision 13303) and the 8.3 version of Coq.

Table of contents:

I Introduction 
1 Type systems in a nutshell
1.1 The general setting
1.2 The purpose of programming languages
1.3 A practical example
1.4 Computation on types
1.5 Where to go next?
II Untyped equality 
2 Pure Type Systems
2.1 Pure Type System in Natural Deduction
2.1.1 Terms and Untyped Reductions
2.1.2 Presentation of Pure Type Systems
2.2 Pure Type Systems in Sequent Calculus
2.2.1 Terms and Reduction
2.2.2 Confluence of β-reduction in λ
2.2.3 Typing Rules
2.2.4 Properties of the system
2.3 Delayed Pure Type System in Sequent Calculus
2.3.1 Typing Rules
2.3.2 Properties of the system
2.4 Expansion Postponement in Delayed System
2.4.1 Postponement in Sequent Calculus
2.4.2 Postponement in Natural Deduction
2.5 Sequent Calculus and Type Inference
2.6 A brief look back at syntactical Pure Type Systems
III Typed equality 
3 Judgmental Equality
3.1 PTSs with Judgmental Equality
3.1.1 Typing Rules
3.1.2 Subject Reduction and Equivalence
3.2 Basic meta-theory of the annotated system
3.2.1 Definition of PTSs with Annotated Type Reduction
3.2.2 General properties of PTSatr
3.2.3 The Church-Rosser Property in PTSatr
3.2.4 Consequences of the Church-Rosser property
3.3 Equivalence of PTSatr and PTSs
3.3.1 Confluence of the annotation process
3.3.2 Consequences of the Erased Confluence
3.3.3 Consequences of the equivalence
4 Formalization in Coq
4.1 Formal proof: paper or computer?
4.1.1 What is a formal proof?
4.1.2 Automatic resolution and induction schemes
4.2 Encoding PTSs in a proof assistant
4.2.1 Questions about encodings of binders
4.2.2 Higher order encodings
4.2.3 Our final choice: de Bruijn indices
IV Conclusion and Further Research 
5 Extensions of PTS
5.1 Sorts, order and subtyping
5.2 Toward a typed conversion for CCω
5.2.1 The straightforward approach
5.2.2 Other attempts and possible leads
5.3 Other leads for future investigations

