30.5.20

Intuitionistic Propositional Logic and Natural Deduction

Here is a primer on intuitionistic propositional logic and natural deduction, along with a brief account of their historical context. The bits of historical context may explain why formal logic is somewhat entangled with the discourse on "foundations of mathematics". I frankly wanted to research and type up my notes on something else, but then started "writing backwards" and thought I'd make this the introductory notes I wish I had before learning about these things.

The topics here can be seen as useful background knowledge for anyone learning about sequent calculus, proof theory, Martin-Löf type theory (aka intuitionistic type theory aka constructive type theory), the Curry-Howard-Lambek correspondence with category theory (particularly cartesian closed categories), and categorical logic. I may write about these connections at another time, but this is supposed to be an accessible overview.

The focus on propositional logic is for simplicity, and to build a bridge to my last post, where I briefly described the Curry-Howard correspondence. One can add symbols and rules for quantifiers to obtain first-order predicate logic, and the inclined reader should consult more detailed texts on those topics; however, the proof terms would then no longer be simply-typed $\lambda$-calculus, and with higher-order logic there is also a different way to talk about sets of individual objects.

The beginnings of formal logic

Mathematics has involved theorems and proofs for ages. Yet the beginning of formal logic is marked by the works of Gottlob Frege and Bertrand Russell.

Gottlob Frege described a graphical notation that captured sound reasoning in precise rules. It was the beginning of the logicist program of demonstrating that all mathematics could be carried out in formal logic. Bertrand Russell advanced this program considerably, also inventing type theory in the process.

Formal logic (or symbolic logic) refers to logical statements and reasoning being expressed in a formal language, with symbols representing logical statements and rules of symbol manipulation describing how to obtain new statements from existing ones while preserving validity. Frege's and Russell's deductive systems of reasoning are today presented as Hilbert systems (more on those below).

Georg Cantor had proposed set theory as a foundational system earlier, but a formal account of set theory had to wait until Zermelo published his axioms. People (including Cantor himself) had found paradoxes in set theory, or in what they understood set theory to be. The search for the "right" foundations continued, while alternative systems of formal logic were proposed (for example, Schönfinkel's combinatory logic, which does not use any variables). The period that followed these beginnings saw what is today called the foundational crisis of mathematics.

Axiomatic proof systems

David Hilbert and his school systematized mathematical logic, the application of formal logic to formalizing mathematics, and "metamathematics".

Around 1920, the “Hilbert style” axiomatic approach, as it is often called, was known to everyone and dominated the logical scene [1]

Hilbert systems are a uniform presentation of formal deductive reasoning and can be treated as synonymous with axiomatic proof systems. There are "logical axioms" that capture the process of sound reasoning, and there are "mathematical axioms" that capture the mathematical theory we want to formalize. The only inference rules are the rule of substitution and modus ponens. A proof is a sequence of formulas, with each formula being either an axiom or derived from previous formulas by application of the inference rules. If a formula $A$ is derivable this way, we write $\vdash A$.

Later, John von Neumann refined axiomatic proof systems using axiom schemes: axiom schemes contain metavariables, all instances of a scheme count as axioms, and modus ponens becomes the only rule. Here is such a Hilbert-style axiomatic proof system for propositional logic, due to Jan Łukasiewicz: $$ \begin{array}{l} 1.~ A \rightarrow (B \rightarrow A) \\ 2.~ (A \rightarrow (B \rightarrow C)) \rightarrow ((A \rightarrow B) \rightarrow (A \rightarrow C)) \\ 3.~ (\neg B \rightarrow \neg A) \rightarrow (A \rightarrow B) \\ \end{array} $$

The rule of modus ponens lets us infer from $A \rightarrow B$ and $A$ the new formula $B$. Note that the other operators $\vee$ and $\wedge$ are considered abbreviations; this is referred to as Ockham's razor, since it uses a minimum of primitives: $A \vee B$ is turned into $\neg A \rightarrow B$, and $A \wedge B$ into $\neg (A \rightarrow \neg B)$.
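To get a feel for how rigid such proofs are, here is the classic derivation of the innocuous theorem $A \rightarrow A$ from schemes 1 and 2 (a standard exercise; the chosen instantiations of the metavariables are noted on the right): $$ \begin{array}{lll} 1. & (A \rightarrow ((A \rightarrow A) \rightarrow A)) \rightarrow ((A \rightarrow (A \rightarrow A)) \rightarrow (A \rightarrow A)) & \text{scheme 2, } B := A \rightarrow A,\ C := A \\ 2. & A \rightarrow ((A \rightarrow A) \rightarrow A) & \text{scheme 1, } B := A \rightarrow A \\ 3. & (A \rightarrow (A \rightarrow A)) \rightarrow (A \rightarrow A) & \text{modus ponens, 1 and 2} \\ 4. & A \rightarrow (A \rightarrow A) & \text{scheme 1, } B := A \\ 5. & A \rightarrow A & \text{modus ponens, 3 and 4} \\ \end{array} $$ Already this tiny theorem requires picking just the right instances of the schemes, which hints at why one wants the deduction theorem mentioned below.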

It is amazing that this axiomatic proof system is complete: all tautologies can be proven from a few axiom schemes. Many texts on formal logic still use axiomatic proof systems, presumably because they permit talking about formal logic with a minimal amount of machinery for syntax and logical axioms; appealing to the deduction theorem (if $\Gamma, A \vdash B$, then $\Gamma \vdash A \rightarrow B$) and to admissible rules hides much of the structure of a proof, which is not considered interesting to begin with: all that matters is the fact that something can be proven.

[1] Stanford Encyclopedia of Philosophy, The Development of Proof Theory.

[2] Stanford Encyclopedia of Philosophy, The Emergence of First-Order Logic.

What is intuitionistic logic?

L.E.J. Brouwer rejected the idea that logic is prior to mathematics and proposed intuitionism as a philosophy of mathematics; he understood mathematics as mental constructions that come first, before any rendering in logic.

We summarize the intuitionist positions as follows:

  • An intuitionist rejects the law of the excluded middle $A \vee \neg A$ when $A$ is a statement about infinite sets.
  • An intuitionist rejects the idea of an "actual infinite" altogether and regards infinity as "potential infinity": approximable by repeating a process, but never completed.
  • An intuitionist aims to base all mathematics on effective constructions that can be verified based on intuition (such as the intuition of being able to carry out a countable number of steps).

These restrictions avoid paradoxes of naive set theory, at the cost of rejecting large parts of established mathematical practice.

Brouwer's student Heyting [3] and also Kolmogorov [4] formulated axioms of intuitionistic logic as axiomatic proof systems.

These efforts are what enabled further discussion (beyond philosophical disagreement) and defined intuitionistic logic [5]. Today, maybe due to the multitude of available proof systems for formal logic, many sources define intuitionistic logic informally by reproducing the "Brouwer-Heyting-Kolmogorov (BHK) interpretation" [Wikipedia], a description of what counts as an acceptable proof. We skip a discussion of it because we will be giving natural deduction rules below. The key point is that the structure of the proof matters now; the structure contains the constructions that the intuitionist cares about.

Intuitionistic logic is considered only one form of constructive logic and constructive mathematics: there are other logics compatible with the BHK interpretation, there are multiple possible interpretations of the words construction and function, and I think most mathematicians do not bother with formal logic at all (while still caring about precise language and rigour). A different system of logic that is motivated by enabling constructive reasoning is Martin-Löf type theory (also called intuitionistic type theory or constructive type theory).

[3] Arend Heyting (1930). Die formalen Regeln der intuitionistischen Logik. 3 Teile. In: Sitzungsberichte der preußischen Akademie der Wissenschaften, phys.-math. Klasse, 1930, 42–56, 57–71, 158–169.

[4] Thierry Coquand. Kolmogorov's contribution to intuitionistic logic. doi:10.1007/978-3-540-36351-4_2.

[5] Stanford Encyclopedia of Philosophy, The Development of Intuitionistic Logic.

Gentzen's natural deduction and sequent calculus

Gerhard Gentzen, who worked as Hilbert's assistant, sparked the entire field of structural proof theory with his seminal article and thesis that introduced both natural deduction and sequent calculus, each in an intuitionistic and a classical variant [6].

Gentzen was not the first to invent natural deduction. Jan Łukasiewicz observed that the axiomatic systems of Frege, Russell and Hilbert do not reflect mathematical practice: mathematicians start their proofs from assumptions, applying "proof figures" to derive new results. Łukasiewicz's student Stanisław Jaśkowski came up with a calculus of natural deduction, where derivations from assumptions are drawn in boxes [7].

Gentzen pointed out that proofs come with structure: a proof may contain a "detour", and such detours can always be removed, yielding a normal form. In order to demonstrate this, he reformulated natural deduction as the sequent calculus and established normalization there through his celebrated cut-elimination theorem. In the sequent calculus, lists of formulae are manipulated (the word "sequent" was coined by Kleene as a translation of Gentzen's "Sequenz"). Much later, Dag Prawitz proved the corresponding normalization result directly for natural deduction [8].

Natural deduction formally and precisely captures how a mathematician proves things, while still admitting a "natural" understanding of what is going on - unlike axiomatic proof systems, where the structure of a proof is artificial and ultimately does not matter, since everything is obtained from axioms and modus ponens.

[6] Gentzen, Gerhard Karl Erich (1935). "Untersuchungen über das logische Schließen. I". Mathematische Zeitschrift. 39 (2): 176–210. doi:10.1007/BF01201353.

[7] Stanisław Jaśkowski (1934). On the Rules of Suppositions in Formal Logic. Studia Logica 1, pp. 5–32. (Reprinted in: Storrs McCall (ed.), Polish Logic 1920–1939, Oxford University Press, 1967, pp. 232–258.)

[8] Dag Prawitz (1965). Natural Deduction: A Proof-Theoretical Study. (Reissued 2006 by Dover.)

\(\newcommand{\andintro}[0]{\color{blue}{(\wedge{\mathrm{\small INTRO}})}}\) \(\newcommand{\andelim}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM}})}}\) \(\newcommand{\andelimleft}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM/1}})}}\) \(\newcommand{\andelimright}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM/2}})}}\) \(\newcommand{\orintro}[0]{\color{blue}{(\vee{\mathrm{\small INTRO}})}}\) \(\newcommand{\orelim}[0]{\color{blue}{(\vee{\mathrm{\small ELIM}})}}\) \(\newcommand{\implintro}[0]{\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}}\) \(\newcommand{\implelim}[0]{\color{blue}{(\rightarrow{\mathrm{\small ELIM}})}}\) \(\newcommand{\notelim}[0]{\color{blue}{(\neg{\mathrm{\small ELIM}})}}\) \(\newcommand{\botelim}[0]{\color{blue}{(\bot{\mathrm{\small ELIM}})}}\)

Intuitionistic natural deduction NJ

We now look at the calculus Gentzen calls NJ, restricted here to propositional logic. For notation, we use $\wedge$ instead of & for conjunction and $\rightarrow$ instead of $\supset$ for implication. Our formulae are defined as follows:

  • $\bot$ is a formula ('falsehood'),
  • a propositional variable $p$ is a formula,
  • if $A$ is a formula, so is $\neg A$,
  • if $A, B$ are formulae, so are $A \vee B$, $A \wedge B$ and $A \rightarrow B$.
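For readers who prefer code, here is the same grammar rendered as a Haskell data type (a sketch; the constructor names are mine):

```haskell
-- Formulae of intuitionistic propositional logic.
data Formula
  = Falsehood                  -- ⊥
  | Var String                 -- a propositional variable p
  | Neg Formula                -- ¬A
  | Or  Formula Formula        -- A ∨ B
  | And Formula Formula        -- A ∧ B
  | Imp Formula Formula        -- A → B
  deriving (Eq, Show)
```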

We shall need to use inference rules and derivations. An inference rule has the form: $$ \frac{X_1 \quad \ldots \quad X_n}{Y} $$

with the meaning that the $X_i$ are the premises and $Y$ is the conclusion of the rule. In natural deduction, each premise and each conclusion of an inference rule is simply a formula. A derivation is an arrangement of inference rules in tree form: the derived formula is at the bottom, and inference rules connect each conclusion to its premises above the bar. A derivation built up from (instances of) inference rules this way gives a proof of the formula at its root.

The rules are as follows (the notation will be explained below): $$ \cfrac{A\quad B}{A \wedge B}~\andintro \quad \cfrac{A \wedge B}{A}~\andelimleft \quad \cfrac{A \wedge B}{B}~\andelimright $$ $$ \cfrac{A}{A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/1}})} \quad \quad \cfrac{B}{A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/2}})} $$ $$ \cfrac{\begin{array}{c} {} \\ {} \\ A \vee B \end{array} \quad \begin{array}{c} [A] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array} \quad \begin{array}{c} [B] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array} } {C}~\orelim $$ $$ \cfrac{\begin{array}{c} [A] \\ ~~\vdots \\ \bot \end{array} }{\neg A}~\color{blue}{(\neg{\mathrm{\small INTRO}})} \quad \cfrac{A \quad \neg{A}}{\bot}~\notelim $$ $$ \cfrac{ \begin{array}{c} [A] \\ ~~\vdots \\ \phantom{[}B\phantom{]} \end{array}}{A \rightarrow B}~\implintro \quad \quad \cfrac{A \rightarrow B \quad A}{B}~\implelim $$ $$ \cfrac{\bot}{A}~\botelim $$

Each rule preserves validity, which ensures that derivations produce correct results (soundness). If one does not take the proof-theoretical explanation of what an inference rule is to be sufficient, one needs to talk about models and interpretations.

Moreover, the collection of rules is systematic in a different way: if we ignore that the $\andelim$ and $\orintro$ rules come in two variants, we can say: for every logical operator $\neg, \wedge, \vee, \rightarrow$ there is exactly one introduction rule and one elimination rule. In a sense, the introduction rule defines the meaning of the operator, and the elimination rule is ultimately a consequence of this definition: it removes precisely what was introduced! These are the "detours" in the structure of proofs, which can always be removed.

In order to avoid unnecessary complications, we adopt the convention that in rule $\botelim$, the conclusion $A$ is always different from $\bot$. If we omit the rule $\botelim$ altogether, we end up with minimal logic. Let us now take two rules and read what they are saying:

  • $\andelimleft$ is the (left) elimination rule for $\wedge$; it says: if you have a proof of $A \wedge B$, then you also have a proof of $A$.
  • $\andintro$ is the introduction rule for $\wedge$; it says: if you have a proof of $A$ and a proof of $B$, you may combine them into a proof of $A \wedge B$.

Assumptions. A derivation may start with any formula $A$ at its leaves; these formulas are the assumptions. In the course of the proof, assumptions may be discharged, as seen in the inference rules $\implintro$, $\color{blue}{(\neg{\mathrm{\small INTRO}})}$ and $\orelim$. The discharged assumption is then marked $[A]$ to indicate that the conclusion does not depend on it. The rule $\implintro$ says: if you can prove $B$ from assumption $A$, then we have proven that $B$ follows from $A$, i.e. $A \rightarrow B$. Note that several other, undischarged assumptions may still be present in the derivation, and that the discharged assumption may have occurred multiple times in the subderivation, with all those occurrences being discharged at once.

Let's go through an example proof of $X \vee (Y \wedge Z) \rightarrow (X \vee Y) \wedge (X \vee Z)$. For this, we will need a derivation that ends in $\implintro$. $$ \cfrac{ \begin{array}{c} [X \vee (Y \wedge Z)] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} }{ X \vee (Y \wedge Z) \rightarrow (X \vee Y) \wedge (X \vee Z) }~\implintro $$ For the upper part, we need $\orelim$. $$ \cfrac{ \begin{array}{c} {\phantom{X}} \\ {\phantom{\vdots}} \\ X \vee (Y \wedge Z) \end{array} \quad \begin{array}{c} [X] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} \quad \begin{array}{c} [Y \wedge Z] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} }{ (X \vee Y) \wedge (X \vee Z) }~\orelim $$ The two branches are now easy: $$ \cfrac{\cfrac{X}{(X \vee Y)}~\orintro\quad\cfrac{X}{(X \vee Z)}~\orintro} {(X \vee Y) \wedge (X \vee Z)}~\andintro $$ $$ \cfrac{\cfrac{\cfrac{Y \wedge Z}{Y}~\andelimleft}{(X \vee Y)}~\orintro\quad\cfrac{\cfrac{Y \wedge Z}{Z}~\andelimright}{(X \vee Z)}~\orintro} {(X \vee Y) \wedge (X \vee Z)}~\andintro $$

Intuitionistic treatment of negation. What if we treated $\neg A$ as an abbreviation for $A \rightarrow \bot$? Then the role of $\color{blue}{(\neg{\mathrm{\small INTRO}})}$ is played by $\implintro$ and the role of $\notelim$ is played by $\implelim$. In fact, every derivation in the language with primitive $\neg$ can be rewritten into a derivation for the language where $\neg$ is an abbreviation. Neat!
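Jumping ahead a little to the Curry-Howard view discussed below, this treatment of negation can be tried out directly in a typed functional language. A sketch in Haskell, using the empty type Void from Data.Void in the role of $\bot$; the names Not, notElim and lemIrrefutable are mine:

```haskell
import Data.Void (Void)

-- Negation as an abbreviation: a proof of "not A" is a function
-- turning any (hypothetical) proof of A into a proof of falsehood.
type Not a = a -> Void

-- The role of (¬ ELIM) is played by function application, i.e. (→ ELIM).
notElim :: a -> Not a -> Void
notElim a notA = notA a

-- A small intuitionistic theorem: the double negation of the excluded
-- middle holds, even though A ∨ ¬A itself is not derivable in NJ.
lemIrrefutable :: Not (Not (Either a (Not a)))
lemIrrefutable k = k (Right (\a -> k (Left a)))
```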

Normal form. A natural deduction proof is said to be in normal form if it contains no detour, i.e. no formula occurrence that is both the conclusion of an introduction rule and at the same time the major premise of an elimination rule. That detours can always be removed is due to what Prawitz calls the inversion principle: whenever an elimination rule operates on the result of an introduction rule above it, the results of that elimination rule were already present among the premises of the introduction rule. A normal derivation will - roughly - take apart the assumptions at the leaves with elimination rules and then build up the conclusion with introduction rules. (The $\orelim$ rule complicates this a little, since it passes through a formula $C$ from its minor premises; the proof by induction on the structure of derivations transforms such proofs by rewriting them, pushing applications of $\orelim$ downwards.)
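As a concrete instance of a detour and its removal, consider a proof that introduces a conjunction only to take it apart again; writing $\mathcal{D}_1, \mathcal{D}_2$ for arbitrary subderivations, the proof on the left rewrites to the simpler proof on the right: $$ \cfrac{\cfrac{\begin{array}{c}\mathcal{D}_1 \\ A\end{array} \quad \begin{array}{c}\mathcal{D}_2 \\ B\end{array}}{A \wedge B}~\andintro}{A}~\andelimleft \qquad\rightsquigarrow\qquad \begin{array}{c}\mathcal{D}_1 \\ A\end{array} $$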

Curry-Howard correspondence. As discussed last time, we can assign terms of simply-typed $\lambda$-calculus to these derivations. Propositions of intuitionistic propositional logic are then exactly the types of simply-typed $\lambda$-calculus, with $\wedge$ for product types and $\vee$ for sum (coproduct) types. Or, from the vantage point of logic, $\lambda$-terms are proof terms. Normalization of proofs corresponds to normalization of $\lambda$-terms; $\beta$-reduction corresponds to the removal of a detour involving $\implintro$ and $\implelim$.
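To make this concrete, the example proof from above can be written down as a program. Here is a sketch in Haskell, with Either as the sum type for $\vee$ and pairs as the product type for $\wedge$; the name distrib is mine:

```haskell
-- The proof of X ∨ (Y ∧ Z) → (X ∨ Y) ∧ (X ∨ Z) as a program:
-- (∨ ELIM) becomes case analysis on Either, (∧ INTRO) becomes pairing,
-- and (∧ ELIM/1), (∧ ELIM/2) become the pattern match on the pair.
distrib :: Either x (y, z) -> (Either x y, Either x z)
distrib (Left x)       = (Left x, Left x)
distrib (Right (y, z)) = (Right y, Right z)
```

Note how the two equations correspond exactly to the two branches of the $\orelim$ step in the derivation.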

Still on Curry-Howard, it's a bit less obvious how to interpret $\botelim$ as a typing rule; the answer is that it needs an $\mathsf{abort}$ constant which (similar to throwing an exception) passes control to an imaginary top-level function that never terminates. This and more is explained succinctly in Zena Ariola's slides [9], but I do want to point out how this yields a neat programmer's intuition for the convention regarding $\botelim$: obviously, a program $\mathsf{abort}(\ldots(\mathsf{abort}(t))\ldots)$ can always be replaced by a simple $\mathsf{abort}(t)$.
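In an intuitionistic setting, the closest Haskell analogue of $\mathsf{abort}$ is the function absurd from Data.Void, which inhabits the type $\bot \rightarrow A$ by an empty case analysis. A minimal sketch (absurd itself is exported by Data.Void; the primed definition below merely re-derives it):

```haskell
{-# LANGUAGE EmptyCase #-}

import Data.Void (Void)

-- (⊥ ELIM) as a typing rule: given a (hypothetical) value of the
-- empty type Void, we may conclude anything; there are no cases to handle.
absurd' :: Void -> a
absurd' v = case v of {}
```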

[9] Zena Ariola. The Interplay Between Logic and Computation. Slides from a talk at "Women in Logic 2019".

Natural deduction in sequent calculus style

There is something odd about the above presentation: we have propositions at the leaves of a derivation which are temporarily treated as assumptions and then cease to be assumptions when they are discharged. It is also possible to not discharge them, or discharge only some of them when making a proof step. Here, we present an alternative formulation that makes it clear what happens with assumptions at every inference step.

(I thought for a long time that it was type theorists or programming language researchers who invented this presentation, suspecting typographical reasons, but to my surprise I learned that it was in fact Gentzen [10] himself who introduced this alternative in 1936, in the paper where he proved the consistency of Peano arithmetic using transfinite induction.)

We shall now present intuitionistic natural deduction "in sequent calculus style". Derivations are still tree-shaped, but instead of plain formulas, premises and conclusions are now sequents of the form $\Gamma \vdash A$, where $\Gamma$ is a (possibly empty) set of formulae. The meaning of a sequent is "from assumptions $\Gamma$, one can (intuitionistically) prove $A$". We write the union of such sets with a comma, omit braces around singleton sets and treat $\neg A$ as an abbreviation for $A \rightarrow \bot$ so we don't have to give any additional rules. Derivations now start from initial sequents $\Gamma, A \vdash A$ (from assumptions that include $A$, one can certainly prove $A$); we label this rule $\color{blue}{(\mathrm{\small ASSUME})}$. Then, the rules look like this: $$ \cfrac{}{\Gamma, A \vdash A}~\color{blue}{(\mathrm{\small ASSUME})} $$ $$ \cfrac{\Gamma \vdash A\quad \Delta \vdash B}{\Gamma, \Delta \vdash A \wedge B}~\andintro \quad \cfrac{\Gamma \vdash A \wedge B}{\Gamma \vdash A}~\andelimleft \quad \cfrac{\Gamma \vdash A \wedge B}{\Gamma \vdash B}~\andelimright $$ $$ \cfrac{\Gamma \vdash A}{\Gamma \vdash A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/1}})} \quad \quad \cfrac{\Gamma \vdash B}{\Gamma \vdash A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/2}})} $$ $$ \cfrac{\Gamma \vdash A \vee B \quad\quad \Delta, A \vdash C \quad\quad \Theta, B \vdash C}{\Gamma, \Delta, \Theta \vdash C}~\orelim $$ $$ \cfrac{\Gamma, A \vdash B}{\Gamma \vdash A \rightarrow B}~\implintro \quad \quad \cfrac{\Gamma \vdash A \rightarrow B \quad \Delta \vdash A}{\Gamma, \Delta \vdash B}~\implelim $$ $$ \cfrac{\Gamma \vdash \bot}{\Gamma \vdash A}~\botelim $$

These rules make it clear that it is now "compulsory" to discharge assumptions. When a rule has multiple premises, the assumption sets are combined. By choosing a broad enough set of assumptions ahead of time (weakening), the different sets of assumptions can be assumed to be the same. When moving on towards the sequent calculus, one does away with the treatment of assumptions as sets and treats them as ordered sequences instead, together with structural rules like exchange and contraction; removing one or both of these structural rules then yields substructural logics and substructural proof theory.
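As a small example in this style, here is a derivation of axiom scheme 1 of the Hilbert system from earlier, $A \rightarrow (B \rightarrow A)$; the initial sequent at the top relies on the fact that extra assumptions are allowed: $$ \cfrac{\cfrac{\cfrac{}{A, B \vdash A}~\color{blue}{(\mathrm{\small ASSUME})}}{A \vdash B \rightarrow A}~\implintro}{\vdash A \rightarrow (B \rightarrow A)}~\implintro $$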

[10] Gerhard Gentzen (1936). "Die Widerspruchsfreiheit der reinen Zahlentheorie". Mathematische Annalen 112, 493–565.

Conclusion

We are at the end of our tour; I hope this can provide a good starting point for studying the multiple connections between $\lambda$-calculus, logic and proof theory.

Sound reasoning - preferably automated - and some form of formal logic and proof are going to be key when modeling a domain and constructing correct software for it. Programmers are used to dealing with precise formulations (programming languages, specs), but only a subset will be interested in the "formalization of all mathematics"; a programmer with a basic understanding of simply-typed $\lambda$-calculus should ideally have a very clear path to understanding logic, proof assistants and type theory - I don't think that this is the case today.

The activities of constructing models and software may be similar to mathematics, but they are certainly not the same. Each domain is its own mathematical universe, with random and often contradictory pieces of information waiting to be organized into "islands of consistency". Hopefully we will one day reach a widespread understanding of, and usable tool support for, formal reasoning that can serve practical informatics as well as the formalization of mathematics.

[If you want to discuss, the HN thread is a great place to leave comments]