26.12.20

Kripke semantics and Tableaux for Intuitionistic Logic

Simply-typed $\lambda$ calculus terms can be seen as encodings of natural deduction proofs in intuitionistic propositional logic (IPC - the C is for calculus; IPL is also common). This is the Curry-Howard correspondence.

So we can prove an IPC formula by finding a suitable term of simply-typed $\lambda$ calculus. Sounds great - but there are other ways to represent proofs and I wanted to understand better how structure of these natural deduction proofs relates to the structure of those other proofs. \(\newcommand{\qfalse}[0]{\color{blue} {\bot}}\) \(\newcommand{\qp}[0]{\color{blue} p}\) \(\newcommand{\qq}[0]{\color{blue} q}\) \(\newcommand{\qX}[0]{\color{blue} X}\) \(\newcommand{\qY}[0]{\color{blue} Y}\) \(\newcommand{\qZ}[0]{\color{blue} Z}\) \(\newcommand{\qand}[0]{\color{blue} \wedge}\) \(\newcommand{\qor}[0]{\color{blue} \vee}\) \(\newcommand{\qimplies}[0]{\rightarrow}\) \(\newcommand{\qneg}[1]{{\neg #1}}\) \(\newcommand{\andintro}[0]{\color{blue}{(\wedge{\mathrm{\small INTRO}})}}\) \(\newcommand{\andelim}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM}})}}\) \(\newcommand{\andelimleft}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM/1}})}}\) \(\newcommand{\andelimright}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM/2}})}}\) \(\newcommand{\orintro}[0]{\color{blue}{(\vee{\mathrm{\small INTRO}})}}\) \(\newcommand{\orelim}[0]{\color{blue}{(\vee{\mathrm{\small ELIM}})}}\) \(\newcommand{\implintro}[0]{\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}}\) \(\newcommand{\implelim}[0]{\color{blue}{(\rightarrow{\mathrm{\small ELIM}})}}\) \(\newcommand{\notelim}[0]{\color{blue}{(\neg{\mathrm{\small ELIM}})}}\) \(\newcommand{\botelim}[0]{\color{blue}{(\bot{\mathrm{\small ELIM}})}}\)

As an example, consider the formula $\qneg{\qneg{(\qp \qor \qneg{\qp})}}$. We take $\qneg{\qX}$ as an abbreviation for $\qX \qimplies \bot$. Is it "obvious" how to find a term of type $((\qp \qor \qneg{\qp}) \qimplies \bot) \qimplies \bot$? Here is a natural deduction proof of this formula (the rule names are from the list of intuitionistic natural deduction rules in the post above; I can't be bothered with typesetting a tree). $$\begin{array}{lll} 1. & \qneg{(\qp \qor \qneg{\qp})} & \mathrm{(assumption)} \\ 2. & \qp & \mathrm{(assumption)} \\ 3. & \qp \qor \qneg{\qp} & \mathrm{(via}~\orintro/1~\mathrm{from~2.)}\\ 4. & \bot & \mathrm{(via}~\implelim~\mathrm{apply~1.~to~3.)} \\ 5. & \qneg{\qp} & \mathrm{(via}~\implintro~\mathrm{from~[2.]~to~4.)} \\ 6. & \qp \qor \qneg{\qp} & \mathrm{(via}~\orintro/2~\mathrm{from~5.)} \\ 7. & \bot & \mathrm{(via}~\implelim~\mathrm{apply~1.~to~6.)} \\ 8. & \qneg{\qneg{(\qp \qor \qneg{\qp})}}& \mathrm{(via}~\implintro~\mathrm{from~[1.]~to~7.)} \\ \end{array} $$
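Read through Curry-Howard, the proof above is a $\lambda$ term. Here is a sketch in Haskell (my rendering; Either plays the role of $\qor$ and an empty type Bot plays the role of $\bot$). The structure of the term mirrors the proof: k is assumption 1., x is assumption 2., and the two applications of k correspond to steps 4. and 7.

    -- a minimal Curry-Howard sketch: the proof above as a Haskell term
    data Bot                 -- empty type, playing the role of falsehood
    type Not a = a -> Bot

    -- a term of type ((p \/ not p) -> Bot) -> Bot
    nnem :: Not (Not (Either p (Not p)))
    nnem k = k (Right (\x -> k (Left x)))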

Maybe a better question to ask is: why does this proof look the way it does? How, starting from the formula, can we arrive at such a proof, with steps so simple that we can leave them to a machine?

Intuitionistic Tableaux

Kripke semantics and tableaux give us an alternative view of proving IPC formulae. The tableau calculus we discuss here is not very different from sequent calculus: it uses "block tableaux", where each node is a set of (signed) formulae. It will be a refutation calculus - despite proving intuitionistic logic!

The definitions here follow Melvin Fitting's book [1] and Smullyan's unified presentation [2].

In contrast to block tableaux, "analytic tableaux" have one formula per node, and we consider the set of all formulae on a branch instead of looking at a single node. These and connexion methods are out of scope today, just like modal logic or the algebraic view (denotational semantics). The only differences from the aforementioned texts are that instead of using negation as primitive, I use $\qfalse$ as primitive, and that I make it explicit when a principal $T \qX$ formula has to be preserved in a tableau extension.

[1] Melvin Fitting. Intuitionistic Logic, Model Theory and Forcing [researchgate pdf].

[2] Raymond M. Smullyan. A generalization of intuitionistic and modal logics. https://doi.org/10.1016/S0049-237X(08)71543-X

Syntax for Intuitionistic and Minimal Propositional Logic

Once more, we define the well-formed formulae (wff):

  • a propositional variable $\qp$ is a wff
  • $\qfalse$ is a wff
  • if $\qX$ and $\qY$ are wff, then $\qX \qand \qY$ is a wff.
  • if $\qX$ and $\qY$ are wff, then $\qX \qor \qY$ is a wff.
  • if $\qX$ and $\qY$ are wff, then $\qX \qimplies \qY$ is a wff.

We use $\qq,\qp$ for propositional variables and $\qX, \qY, \qZ$ as metavariables for formulae. The symbol $\qfalse$ stands for "falsehood", "falsity" or "absurdity" and is also written $\bot$. We use $\qfalse$ and $\qimplies$ as primitive (built-in) symbols, and define negation $\qneg{\qX}$ as an abbreviation for $\qX \qimplies \qfalse$. This goes well with the intuitionist notion that "a proof of $\qneg{\qX}$ amounts to a construction that transforms a proof of $\qX$ into a proof of falsehood."
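For later reference, this grammar is small enough to write down as a data type. A sketch in Haskell (the names are mine, not part of the formal development):

    -- well-formed formulae, with falsehood as primitive
    data Formula = Var String                 -- propositional variable
                 | Falsum                     -- the primitive falsehood
                 | And Formula Formula
                 | Or Formula Formula
                 | Implies Formula Formula
      deriving (Eq, Show)

    -- negation is an abbreviation, not a constructor
    neg :: Formula -> Formula
    neg x = Implies x Falsum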

Kripke semantics

A (Kripke) model for intuitionistic propositional logic is a triple $\langle \mathcal{G}, \mathcal{R}, \Vdash \rangle$ where $\mathcal{G}$ is a set of states, $\mathcal{R}$ an accessibility relation between states that is reflexive and transitive, and $\Vdash$ a relation between states and formulae that satisfies conditions $K_0 - K_4$ below. We use $\Gamma, \Delta$ for elements of $\mathcal{G}$, read $\Gamma \Vdash \qX$ as "$\Gamma$ forces $\qX$" and write $\Gamma^\star$ for "any $\Delta$ with $\Gamma \mathcal{R} \Delta$".

($K_0$) If $\Gamma \Vdash \qX$ then $\Gamma^\star \Vdash \qX$

($K_1$) $\Gamma \Vdash \qX \qand \qY$ iff $\Gamma \Vdash \qX$ and $\Gamma \Vdash \qY$

($K_2$) $\Gamma \Vdash \qX \qor \qY$ iff $\Gamma \Vdash \qX$ or $\Gamma \Vdash \qY$

($K_3$) $\Gamma \Vdash \qX \qimplies \qY$ iff for every $\Gamma^\star$, either $\Gamma^\star \not\Vdash \qX$ or $\Gamma^\star \Vdash \qY$.

($K_4$) $\Gamma \not\Vdash \qfalse$.

A formula is valid in a model if $\Gamma \Vdash \qX$ for all $\Gamma \in \mathcal{G}$. A formula is valid if it is valid in all models. Note that with $K_0$ in mind, we could also have written conditions $K_1$ and $K_2$ differently:

($K_1'$) $\Gamma \Vdash \qX \qand \qY$ iff for every $\Gamma^\star$, $\Gamma^\star \Vdash \qX$ and $\Gamma^\star \Vdash \qY$

($K_2'$) $\Gamma \Vdash \qX \qor \qY$ iff for every $\Gamma^\star$, $\Gamma^\star \Vdash \qX$ or $\Gamma^\star \Vdash \qY$

In the above definition, it is understood that $\Vdash$ is "seeded" with a relation $\Vdash^{0}$ between states and propositional variables $\qp$. This $\Vdash^{0}$ is then extended to a relation between states and whole formulae. It is also common to define states directly as subsets of the "available" propositional variables (the ones forced in that state). Here is an example with two states $\Gamma = \{\qq\}$ and $\Delta = \{\qp, \qq\}$, where $\Gamma \mathcal{R} \Delta$:

The relation $\Vdash$ takes all reachable states into consideration, and once a state forces a variable, say $\Gamma \Vdash \qq$, any subsequent $\Gamma^\star$ must force it, too. "Truth in a state" is rather more dynamic than the absolute "truth" we are used to from classical logic. One can give this a temporal, epistemic interpretation and say states represent subjective knowledge of true facts, and that this knowledge "grows over time" (as we access subsequent states).

As an exercise: convince yourself that $\Delta \Vdash \qp \qand \qq$, that $\Gamma \not\Vdash \qp$, and that the formula $\qp \qimplies \qq$ is valid in this model. Note further that $\qp \qor \qneg{\qp}$ is not valid. We have $\Gamma \not\Vdash \qp$, which is very different from saying $\Gamma \Vdash \qneg{\qp}$.
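The forcing conditions are easy to turn into a small program. Here is a sketch in Haskell of the two-state example and the clauses $K_1$-$K_4$ (the Formula type is the one from the sketch above; monotonicity $K_0$ holds because the seed relation is monotone along $\mathcal{R}$; everything else is my naming):

    -- the two states and the (reflexive, transitive) accessibility relation
    data State = Gamma | Delta deriving (Eq, Show)

    reachable :: State -> [State]
    reachable Gamma = [Gamma, Delta]
    reachable Delta = [Delta]

    -- the seed relation between states and propositional variables
    seed :: State -> String -> Bool
    seed Gamma v = v `elem` ["q"]
    seed Delta v = v `elem` ["p", "q"]

    -- forcing, following the clauses K1-K4
    forces :: State -> Formula -> Bool
    forces s (Var v)       = seed s v
    forces _ Falsum        = False
    forces s (And x y)     = forces s x && forces s y
    forces s (Or x y)      = forces s x || forces s y
    forces s (Implies x y) = all (\t -> not (forces t x) || forces t y) (reachable s)

    valid :: Formula -> Bool
    valid x = all (`forces` x) [Gamma, Delta]

    -- valid (Implies (Var "p") (Var "q"))   ==  True
    -- valid (Or (Var "p") (neg (Var "p")))  ==  False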

[When Kripke models are used for modal logics, people consider accessibility relations with different properties, leading to different modal logics; there is an explicit symbol $\square$ to talk about accessibility, and our reflexive and (in particular) transitive $\mathcal{R}$ corresponds to S4 modal logic, with the S4 axiom $\square \qp \rightarrow \square\square \qp$. We will just mention here that intuitionistic logic can be interpreted in S4 modal logic.] \(\newcommand{\qT}[0]{T}\) \(\newcommand{\qF}[0]{F}\) \(\newcommand{\qS}[0]{\mathrm{S}}\)

Kripke semantics for Signed Formulae

On towards a proof system for IPC based on an intuitionistic tableau calculus (or just intuitionistic tableaux). To this end, we introduce signed formulae, using symbols $T$ and $F$ and writing $T\qX, F\qX$ for the signed versions of a formula $\qX$. We extend forcing to signed formulae:
  • $\Gamma \Vdash T\qX$ means $\Gamma \Vdash \qX$ and
  • $\Gamma \Vdash F\qX$ means $\Gamma \not\Vdash \qX$.

    In classical logic, $T\qX$ would mean $\qX$ is true and $F\qX$ would mean $\qX$ is false (or $\qneg{\qX}$ is true). The way to understand $\Gamma \Vdash F\qX$ non-classically is to read it as "it is not known that $\Gamma \Vdash \qX$." Note that $\Gamma$ is merely a state, not a context or set of formulae - we are using these assertions as building blocks to compare proof situations. For example, the assertions $T\, \qX \qand \qY$ and $F\qX$ cannot both hold at the same state.

    We follow Smullyan's uniform notation, and classify our formulae into different types (whether they are "conjunctive" or "disjunctive" in nature):

    • type $\alpha$ is any signed formula of the shape $T\qX \qand \qY, F\qX \qor \qY, F\qX \qimplies \qY, F\qfalse$
    • type $\beta$ is any formula $F\qX \qand \qY, T\qX \qor \qY, T\qX \qimplies \qY, T\qfalse$.

    Since we added $\qfalse$ as primitive, we need to say something about its components, too. We treat $\qfalse$ like we treat an empty disjunction.

    For a signed formula $\alpha$ (respectively $\beta$), we call certain signed subformulae its components $\alpha_1$ and $\alpha_2$ (respectively $\beta_1$ and $\beta_2$), according to this table: $$ \begin{array}{c|c|c}% \alpha & \alpha_1 & \alpha_2 \\ \hline T\qX \qand \qY & T\qX & T\qY \\ \hline F\qX \qor \qY & F\qX & F\qY \\ \hline F\qX \qimplies \qY & T\qX & F\qY \\ \hline F \qfalse & F \qfalse & F \qfalse \end{array} \quad \quad \begin{array}{c|c|c}% \beta & \beta_1 & \beta_2 \\ \hline F\qX \qand \qY & F\qX & F\qY \\ \hline T\qX \qor \qY & T\qX & T\qY \\ \hline T\qX \qimplies \qY & F\qX & T\qY \\ \hline T \qfalse & T \qfalse & T \qfalse \end{array} $$

    Again, for $\qfalse$ the components are not proper subformulae, but this is not going to be a concern.

    Let us call $F\qX$ the conjugate of $T\qX$, written $\overline{T\qX}$, and vice versa. Note that the conjugate of a formula of type $\alpha$ is of type $\beta$ and vice versa. In classical logic, a set of signed formulae $\{T\qX_1,\ldots,T\qX_n, F\qY_1, \ldots, F\qY_m\}$ would correspond to a sequent $\qX_1, \ldots, \qX_n \longrightarrow \qY_1, \ldots, \qY_m$ of sequent calculus. The sequent would be provable iff the set of signed formulae is unsatisfiable. The distinction between types $\alpha$ and $\beta$ captures whether a formula can simply be split into its components (conjunctive formulae $\alpha$) or the proof needs to branch into two possibilities (disjunctive formulae $\beta$).

    For intuitionistic logic, we are going to call all formulae $T \qX$ permanent and all formulae $F \qX$ co-permanent. With all this setup, we can remix our definition of Kripke model and $\Vdash$ into the following conditions using signed formulae $\qX$:

    $(U_0(a))$ Either $\Gamma \Vdash \qX$ or $\Gamma \Vdash \overline{\qX}$, but not both.

    $(U_0(b))$ If $\Gamma \Vdash \qX$ and $\qX$ is permanent, then $\Gamma^\star \Vdash \qX$.

    $(U_1)$ For any permanent $\alpha$, $\Gamma \Vdash \alpha$ iff for every $\Gamma^\star$, $\Gamma^\star \Vdash \alpha_1$ and $\Gamma^\star \Vdash \alpha_2$

    $(U_2)$ For any permanent $\beta$, $\Gamma \Vdash \beta$ iff for every $\Gamma^\star$, $\Gamma^\star \Vdash \beta_1$ or $\Gamma^\star \Vdash \beta_2$

    The behavior on co-permanent $\alpha$ and $\beta$ is determined by $U_1$ and $U_2$ too, by taking $U_0$ into consideration. Observe:

    $(U_1')$ For any co-permanent $\alpha$, $\Gamma \Vdash \alpha$ iff for some $\Gamma^\star$, $\Gamma^\star \Vdash \alpha_1$ and $\Gamma^\star \Vdash \alpha_2$

    $(U_2')$ For any co-permanent $\beta$, $\Gamma \Vdash \beta$ iff for some $\Gamma^\star$, $\Gamma^\star \Vdash \beta_1$ or $\Gamma^\star \Vdash \beta_2$

    As an exercise: go back to the example above, and convince yourself that $\Delta \Vdash T\, \qp \qand \qq$ and $\Gamma \Vdash F\qp$. Check that the formula $T (\qp \qimplies \qq)$ is forced in every state, using the conditions $U_\star$ above. Now do the same for $T (\qp \qor \qneg{\qp})$, remembering that it stands for $T (\qp \qor (\qp \qimplies \qfalse))$. Now, to keep it interesting, what about $\Gamma \Vdash F (\qp \qor \qneg{\qp})$? Is $F (\qp \qor \qneg{\qp})$ valid?

    A final definition is to call a formula special if it is co-permanent but has a permanent component, otherwise it is called ordinary. The only special formula we have is $F\qX \qimplies \qY$, of type $\alpha$. If we had negation as a built-in operator, that would be a second special $\alpha$. Check out Smullyan's paper for predicate logic and other generalizations that require special $\beta$.

    Tableau Proofs

    A tableau proof is like a sequent calculus proof written upside-down: we draw it with the root at the top, growing towards the bottom. Instead of $S \cup \{\qX\}$, we write $S, \qX$. A vertical bar $\vert$ indicates that there are two children. $\qS_P$ means $\{ \qT\qX \mid \qT\qX \in \qS\}$, the permanent part of $\qS$, and $\alpha_P$ means: repeat $\alpha$ if it is permanent (same for $\beta_P$). This is just to emphasize that a permanent $T \qX$ (possibly) needs to be preserved; in sequent calculus, this would surface as a structural operation. $$ \begin{array}{ll} (A1) & \mathrm{for~ordinary} ~ \alpha: ~ \LARGE \frac{\qS,~ \alpha }{\qS,~ \alpha_P,~ \alpha_1,~ \alpha_2 } \\ (A2) & \mathrm{for~special} ~ \alpha: ~ \LARGE \frac{\qS,~ \alpha }{\qS_P,~ \alpha_1,~ \alpha_2 } \\ (B1) & \mathrm{for~ordinary} ~ \beta: ~ \LARGE \frac{\qS,~ \beta }{\qS,~ \beta_P,~ \beta_1 ~ \vert ~ \qS,~ \beta_P,~ \beta_2 } \\ \end{array} $$

    Now, a configuration is a collection $[~\qS_1, \qS_2, \ldots, \qS_n~]$ of sets of signed formulae. A rule takes a configuration and replaces one $\qS_i$ with its result (or its two results). A tableau is a finite sequence of configurations $C_1, \ldots, C_n$ in which each configuration except the first results from its predecessor by the application of a rule.

    A set of signed formulae is closed if it contains both $\qT \qX$ and $\qF \qX$ for some formula $\qX$ or if it contains $\qT \qfalse$. A configuration $[~\qS_1, \qS_2, \ldots, \qS_n~]$ is closed if all sets $\qS_i$ are closed. A tableau $C_1, \ldots C_n$ is closed if some $C_i$ is closed.

    A tableau for a set of signed formulae $\qS$ is a tableau $C_1, \ldots, C_n$ with $C_1 = [~\qS~]$. A finite set of signed formulae $\qS$ is inconsistent if some tableau for $\qS$ is closed. Otherwise, $\qS$ is consistent. A formula $\qX$ is a theorem, written $\vdash_I \qX$, if $\{\qF \qX\}$ is inconsistent, and a closed tableau for $\{\qF\qX\}$ is called a proof of $\qX$.

    What we have done here is to define provability of $\qX$ as inconsistency of $\qF\qX$. This refutation calculus may not be appealing to an intuitionist.

    Let's look at a proof of $\qneg{\qneg{(\qX \qor \qneg{\qX})}}$ in our $\qfalse$-primitive logic. First in abbreviated form, with $\qneg{}$ left unexpanded and the $\qfalse$ bookkeeping suppressed: $$ \begin{array}{l} [~\{\qF\, \qneg{\qneg{(\qX \qor \qneg{\qX})}}\}~] \\ [~\{\qT\, \qneg{(\qX \qor \qneg{\qX})}\}~] \\ [~\{\qT\, \qneg{(\qX \qor \qneg{\qX})}, \qF\, (\qX \qor \qneg{\qX})\}~] \\ [~\{\qT\, \qneg{(\qX \qor \qneg{\qX})}, \qF \qX, \qF\, \qneg{\qX}\}~] \\ [~\{\qT\, \qneg{(\qX \qor \qneg{\qX})}, \qT \qX\}~] \\ [~\{\qT\, \qneg{(\qX \qor \qneg{\qX})}, \qF\, (\qX \qor \qneg{\qX}), \qT \qX\}~] \\ [~\{\qT\, \qneg{(\qX \qor \qneg{\qX})}, \qF \qX, \qF\, \qneg{\qX}, \qT \qX\}~] \\ \end{array} $$ The same proof, a bit more verbose: $$ \begin{array}{l} [~\{\qF\, (((\qX \qor (\qX \qimplies \qfalse)) \qimplies \qfalse) \qimplies \qfalse)\}~] \\ \alpha (F\qimplies) \\ [~\{\qT\, ((\qX \qor (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF \qfalse\}~] \\ \beta (T\qimplies) \\ [~\{\qT\, ((\qX \qor (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF\, (\qX \qor (\qX \qimplies \qfalse)), \qF \qfalse\}~, \{\ldots, \qT \qfalse\}~] \\ \alpha (F\qor)\\ [~\{\qT\, ((\qX \qor (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF \qX, \qF\, (\qX \qimplies \qfalse), \qF \qfalse\}~, \{\ldots, \qT \qfalse\}~] \\ \alpha (F\qimplies)\\ [~\{\qT\, ((\qX \qor (\qX \qimplies \qfalse)) \qimplies \qfalse), \qT \qX, \qF \qfalse\}~, \{\ldots, \qT \qfalse\}~] \\ \beta (T\qimplies)\\ [~\{\qT\, ((\qX \qor (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF\, (\qX \qor (\qX \qimplies \qfalse)), \qT \qX, \qF \qfalse \}, \\ ~ \{\ldots, \qT \qfalse\}, \{\ldots, \qT \qfalse\}~] \\ \alpha (F\qor)\\ [~\{\qT\, ((\qX \qor (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF \qX, \qF\, (\qX \qimplies \qfalse), \qT \qX, \qF \qfalse \}, \\ ~ \{\ldots, \qT \qfalse\}, \{\ldots, \qT \qfalse\}~] \\ \end{array} $$ The final configuration is closed: its first set contains both $\qT \qX$ and $\qF \qX$, and the other sets contain $\qT \qfalse$.

    What is convenient about this?

    In applying the rules of the tableau calculus, we are taking the formula we want to prove apart and dealing with its subformulae until we can close every set in the configuration. However, it is not an entirely deterministic procedure, in the sense that we still need to pick which formula to take apart next, and if we pick the wrong one, we may have to backtrack. Automated reasoning! And it should be easy to convert the result into a natural deduction proof (similar to Prawitz's view that a sequent calculus proof can be seen as a description of how to construct a natural deduction proof).
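    To make this concrete, here is a naive proof-search sketch in Haskell for the rules $(A1)$, $(A2)$, $(B1)$ above. It is depth-bounded rather than a proper decision procedure, and the formula to decompose is chosen by brute force; the Formula type is the one from the earlier sketches, and all other names are mine.

        -- signed formulae over the Formula type from the earlier sketches
        data Signed = T Formula | F Formula deriving (Eq, Show)

        conjugate :: Signed -> Signed
        conjugate (T x) = F x
        conjugate (F x) = T x

        -- a set of signed formulae is closed if it contains T X and F X, or T Falsum
        closedSet :: [Signed] -> Bool
        closedSet s = T Falsum `elem` s || any (\x -> conjugate x `elem` s) s

        -- the permanent part S_P of a set: keep only the T-signed formulae
        permanent :: [Signed] -> [Signed]
        permanent s = [x | x@(T _) <- s]

        -- bounded search for a closed tableau starting from a single set
        prove :: Int -> [Signed] -> Bool
        prove 0 _ = False
        prove n s = closedSet s || any step s
          where
            rest f = filter (/= f) s
            step f = case f of
              -- ordinary alpha (A1); the permanent principal formula is repeated
              T (And x y)     -> prove (n-1) (f : T x : T y : rest f)
              F (Or x y)      -> prove (n-1) (F x : F y : rest f)
              -- special alpha (A2): keep only the permanent part of the rest
              F (Implies x y) -> prove (n-1) (T x : F y : permanent (rest f))
              -- ordinary beta (B1); a permanent principal formula is repeated
              F (And x y)     -> prove (n-1) (F x : rest f) && prove (n-1) (F y : rest f)
              T (Or x y)      -> prove (n-1) (f : T x : rest f) && prove (n-1) (f : T y : rest f)
              T (Implies x y) -> prove (n-1) (f : F x : rest f) && prove (n-1) (f : T y : rest f)
              _               -> False

        theorem :: Formula -> Bool
        theorem x = prove 20 [F x]

    With p = Var "p", the call theorem (neg (neg (Or p (neg p)))) returns True, retracing the closed tableau above.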

    Another thing that is great about the tableau method is that if, after taking everything apart, you end up with open branches (that cannot be closed), you can read models off them that provide counter-examples. The application of a "special" rule indicates that we transitioned to another state.

    What is special about intuitionistic tableaux?

    Let's note that signed formulae $\qF \qX$ correspond to the right-hand side of a sequent (the succedent). So, unlike in Gentzen's LJ, it is quite fine to have multiple formulae mingle on the right-hand side and stay intuitionistic, as long as the right-$\rightarrow$ rule (i.e. the rule for $\qF\, \qX \qimplies \qY$) keeps only one of them. This has also been explained here:

    [3] Valeria de Paiva, Luiz Carlos Pereira. A short note on intuitionistic propositional logic with multiple conclusions [researchgate pdf]

    The assertion $\qF\, \qX \qimplies \qY$ says: "at present, it is conceivable that we can prove $\qX$ without at the same time proving $\qY$". In considering such a possible proof $\qT \qX$, we need to do bookkeeping to ensure that anything we prove along the way is kept separate from the other formulae $\qF \qZ$ that we labeled as not-yet-known. This is why in the special $\alpha$ rule we use $\qS_P$ instead of $\qS$. This looks like the essence of intuitionistic reasoning as far as proof rules are concerned. It also makes intuitionistic tableaux a bit more complicated than classical ones.

    Coda

    I hope the above sheds some light on the connection between tableau calculi, Kripke semantics and intuitionistic logic. Soundness and completeness can be found in Fitting's book, as well as generalizations to modal and predicate logic. Of course, for stronger logics, carrying over soundness and completeness does not mean we retain decidability; stronger logics require not only taking formulae apart, but also instantiating them, which increases non-determinism and the search space. (Slightly updated 2022-01-22, for clarity.)

    28.9.20

    Intuitionistic Higher-Order Logic and Equality

    Intro

    I have posted plenty of motivation for considering equality and higher-order logic elsewhere on this blog by now. Let me finally get to the intuitionistic version of higher-order logic, since I talked about the classical version previously and then also talked at length about "Intuitionistic Logic and Natural Deduction" without really getting closer to the matter. Intuitionistic higher-order logic is what lets you talk about an elementary topos, and using equality to characterize the logical connectives gives an insightful perspective on those commutative diagrams for people who are familiar with $\lambda$ calculus. \( \newcommand{\qeq}[0]{\color{blue} {\mathrm{eq}}} \newcommand{\qSigma}[0]{\color{blue} \Sigma} \newcommand{\qPi}[0]{\color{blue} \Pi} \newcommand{\qor}[0]{\color{blue} {\mathrm{or}}} \newcommand{\tprop}[0]{\omicron} \newcommand{\qtrue}[0]{{\color{blue} {\mathrm{true}}}} \newcommand{\qand}[0]{\color{blue} {\mathrm{and}}} \newcommand{\qimplies}[0]{\color{blue} {\mathrm{implies}}} \newcommand{\qfalse}[0]{{\color{blue} {\mathrm{false}}}} \newcommand{\qnot}[0]{{\color{blue} {\mathrm{not}}}} \)

    I am going to refer to material in "Higher order logic and equality" often. Like in that post, there is actually no original material here; I am just gathering definitions that are probably well-known to some grad students and experts (yet nobody calls them standard, since neither higher-order logic nor intuitionistic logic is known very well outside certain circles that deal with formal logic, proof assistants and the like). This is my feeble attempt at an exposition.

    The Logic of a Topos

    We take a $\lambda$ calculus with surjective pairing and a type $\omicron$ which we take as a substitute for "truth values" (topos theory texts use $\Omega$ but we want easy comparison with the classical version linked above, and we spell out $\qPi \lambda x. t$ without setting up an abbreviation $\forall x. t$ and we also spell out $\qSigma \lambda x. t$ without setting up an abbreviation $\exists x. t$). We want to set this up in an intuitionistic manner, which means doing away with the idea that $\omicron$ stands for two truth values, and not using classical logical equivalences to define $\qor$ and $\qSigma$. The terms would then correspond to a particular cartesian-closed category, the free topos.

    We again want to define everything in terms of equality $\qeq_\alpha: \alpha \rightarrow \alpha \rightarrow \omicron$. What is this equality doing? In our language, it talks about equality of terms, and in categorical logic, terms correspond to arrows, so equality should talk about equality of arrows. (There is some detail hidden here, as our language will also let us talk about "global elements" $1 \stackrel{c}{\longrightarrow} C$, which are written as constant terms $c: C$). We take the following elements of our logical language: $$ \begin{align*} \qtrue:\ & \omicron\\ \qand:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qimplies:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qPi_\alpha:\ & (\alpha \rightarrow \omicron) \rightarrow \omicron\\ \end{align*} $$

    ... and define them using the only primitive $\qeq$: $$ \begin{align*} \qtrue &:= \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \qeq_\tprop \qeq_\tprop\\ \qand &:= \lambda x:\tprop. \lambda y:\tprop. \\ &\quad \quad \qeq_{(\tprop \rightarrow \tprop \rightarrow \tprop) \rightarrow \tprop} \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~\qtrue~\qtrue) \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~x~y) \\ \qimplies &:= \lambda x:\tprop. \lambda y:\tprop. \qeq_\tprop~x~(\qand~x~y)\\ \qPi_\alpha &:= \qeq_{\alpha \rightarrow \tprop}~(\lambda x : \alpha. \qtrue) \end{align*} $$

    These definitions are the same as for classical logic. An intro to topos theory will typically describe $\qand$ and $\qimplies$ using commutative diagrams, but here we are defining the language first. Moreover, the language has only one primitive $\qeq$, though that primitive is instantiated at different types.
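    Since the definitions coincide with the classical ones, a quick sanity check is to interpret them in the two-valued model. Here is a sketch in Haskell that takes $\omicron$ to be Bool and reads $\qeq$ as extensional equality at each type; it only confirms the classical special case and says nothing about the intuitionistic reading (all names are mine):

        -- omicron = Bool; eq at each type is extensional equality
        bools :: [Bool]
        bools = [False, True]

        -- equality at type o
        eqO :: Bool -> Bool -> Bool
        eqO = (==)

        -- equality at type o -> o -> o, by checking all arguments
        eqOOO :: (Bool -> Bool -> Bool) -> (Bool -> Bool -> Bool) -> Bool
        eqOOO f g = and [f x y == g x y | x <- bools, y <- bools]

        -- equality at type (o -> o -> o) -> o, by checking all 16 binary operations
        eqFO :: ((Bool -> Bool -> Bool) -> Bool) -> ((Bool -> Bool -> Bool) -> Bool) -> Bool
        eqFO f g = and [f z == g z | z <- binOps]
          where binOps = [\x y -> pick (x, y) a b c d | a <- bools, b <- bools, c <- bools, d <- bools]
                pick (False, False) a _ _ _ = a
                pick (False, True)  _ b _ _ = b
                pick (True, False)  _ _ c _ = c
                pick (True, True)   _ _ _ d = d

        qtrue :: Bool
        qtrue = eqOOO eqO eqO                                -- comes out True

        qand :: Bool -> Bool -> Bool
        qand x y = eqFO (\z -> z qtrue qtrue) (\z -> z x y)  -- the usual conjunction

        qimplies :: Bool -> Bool -> Bool
        qimplies x y = eqO x (qand x y)                      -- the usual material implication

    Enumerating [(x, y, qand x y, qimplies x y) | x <- bools, y <- bools] reproduces the familiar truth tables for conjunction and material implication.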

    False

    Now we look at $\qfalse$, which is also defined the same as in classical case: $$ \begin{align*} \qfalse &: \omicron\\ \qfalse &:= \qeq_{\tprop \rightarrow \tprop} (\lambda x:\tprop. \qtrue) (\lambda x:\tprop. x)\\ \end{align*} $$

    In a topos theory text, one encounters $\forall v:\omicron. v$, which is shorthand for $\qPi\lambda v:\omicron. v$ and actually expands to the very same definition above. The definition that mentions $\qeq$ explicitly has the advantage that it explains what is going on: it morally says "if there is anything other than $\qtrue$ in $\omicron$, then $\lambda x:\omicron. \qtrue$ and $\mathsf{id}_\omicron$ are not the same." So we have a value $\qfalse$, but it need not be the only thing distinct from $\qtrue$ in $\omicron$.

    Not, Or and Exists

    On to intuitionistic negation $\qnot$. In intuitionistic logic, we would define negation like this: $$ \begin{align*} \qnot &: \omicron \rightarrow \omicron\\ \qnot &:= \lambda x: \tprop. \qimplies(x, \qfalse) \end{align*} $$

    If we consult the classical version $\qnot := \qeq_{\tprop} \qfalse$, we again find that this is not that different.

    Finally, let's look at $\qor$ and $\qSigma$. We define these using $\qPi$: $$ \begin{align*} \qor &: \omicron \rightarrow \omicron \rightarrow \omicron\\ \qor &:= \lambda x: \omicron. \lambda y: \omicron. \qPi \lambda r:\omicron. \\ & \quad \quad \qimplies(\qand(\qimplies(x, r), \qimplies(y,r)), r) \\ \\ \qSigma_\alpha &: (\alpha \rightarrow \omicron) \rightarrow \omicron \\ \qSigma_\alpha &:= \lambda x: \alpha \rightarrow \omicron. \qPi \lambda r:\omicron. \\ &\quad \quad \qimplies(\qPi \lambda z:\alpha. \qimplies(x\,z, r), r) \end{align*} $$

    We can get an intuitive understanding of these definitions: for $\qor$, if we have a way of obtaining a result $r$ from $x$ and a way of obtaining the same from $y$, then we also obtain it from $(\qor~x~y)$. Similarly for $\qSigma$: if we have a way of obtaining any result $r$ from the "instantiated" predicate $x\,z$, for arbitrary $z$, then we also obtain it from $\qSigma x$.

    The definition of $\qor$ in terms of $\qand$ and $\qPi$ is not very different from the natural deduction rule for "or elimination": $$ \cfrac{\begin{array}{c} {} \\ {} \\ A \vee B \end{array} \quad \begin{array}{c} [A] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array}\quad \begin{array}{c} [B] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array} } {C}~\color{blue}{(\vee{\mathrm{\small ELIM}})} $$

    Checking the natural deduction rule for existential elimination and comparing it with the encoding above is left as an exercise.

    The end of the beginning

    Higher order logic seems to deserve more attention. We have set up a logical system based on $\lambda$ calculus without making use of the Curry-Howard correspondence. While the previous discussion focused on classical higher order logic as introduced by Church in 1940, in this post we looked at intuitionistic higher order logic. The interest in the latter is very heavily influenced by topos theory.

    It is slightly confusing that when discussing elementary toposes, one talks of first-order logic when in reality the logic is inherently higher-order (because a topos is cartesian closed). I guess the charitable interpretation is that what people mean is "you can do everything that you would want to do in formal set theory using first-order logic."

    The Lambek and Scott book "Introduction to higher order categorical logic" dwells a bit on the fact that it is always possible to get a formulation of the internal language of a topos using equality. Andrews $Q_0$ develops simple type theory / higher order logic using a single equality operator $Q$. Various theorem proving systems have been based on this (TPS and HOL).

    One could say that the way higher-order logic admits the definition of logical connectives in terms of others is an "encoding" of natural deduction rules. Could one use this idea to build a logical framework, a representation of derivations? Indeed, this is what underlies the Isabelle/Pure framework and $\lambda$Prolog.

    It would seem worthwhile to write a proper book that takes a logical perspective on the $\lambda$ calculus, developing higher order logic as language, category theory as semantics, and arriving at elementary toposes as models. It would probably feel very elementary and pedestrian in the academic world, but I do think it would be useful. It feels to me that gathering the relevant "background knowledge" on logic and computation takes so much energy that most practitioners will not have it (and this says something about the potential for adoption of formal methods). Until such a book arrives (I'm not sure if I would have the energy), I hope my out-of-order blog posts provide some help on the reader's journey. There are certainly a lot of connections between logic, language, types and semantics that are left to be explored, besides Curry-Howard and dependent types.

    27.9.20

    Programmers love Higher-Order Logic

    My previous take on higher-order logic was classical, channeling Church's simple theory of types and Andrews' $Q_0$. In that last post, I was hinting at motivation. So I want to expand a bit now on how this stuff can be useful for building software at large. I am posting some ramblings here.

    Equality is Fundamental

    There are two different "levels" at which a programmer deals with some kind of reasoning that involves equality:

    • while writing code: need to be able to tell when two objects are the same.
    • when designing or arguing that code is correct: reasoning about whether behavior (such as output values) conforms to a (however implicit) specification

    In both cases, one needs to be able to identify whether objects (or behaviors or input-output pairs) are "the same" as expected ones. But the latter activity, reasoning about program correctness, happens in some form of logic. Much that has been written and is being written on formal logic is about mathematics and mathematical convention. This is true whether we talk about semantics of programming languages or program specification.

    But for programmers, there are benefits to knowing formal reasoning even if one is not interested in the foundations of math, axiomatizations of set theory etc. It is no secret that in mathematics, as in programming, you run into situations where there are multiple different ways to represent some data, and one needs to choose between, or translate between, those representations. While programmers resign themselves to writing "proto-to-proto" translations over and over (which are often not as trivial as they seem), in mathematics it is customary to leave out the awkward details and introduce "abuse of notation" and quotienting to keep arguments informal, but concise.

    The heart and soul of much mathematics consists of the fact that the “same” object can be presented to us in different ways. [1]

    [1] Barry Mazur. When is one thing equal to some other thing?

    Equality in First Order Logic

    We take a leap towards "formal systems", where objects are represented using (strings of) symbols, along with strict rules to form statements and derive new true statements from existing ones.

    A typical definition of a first-order logic language contains constant and function symbols, predicate symbols, logical operators.

    There is a "design choice" here whether equality is considered as built-in, magically denoting exactly the equality in the underlying sets, or whether equality is a predicate with certain special properties. Wikipedia" talks about "first-order logic with equality" (when built-in) and first-order logic without equality (when it is not).

    When equality does not come built-in, one needs to add axioms. The equality axioms are special, in the sense that they have to be preserved by all the function and relation symbols. This looks like:

    • $\mathsf{eq}$ is reflexive, symmetric, transitive
    • if $\mathsf{eq}(x_1, y_1), \ldots, \mathsf{eq}(x_n, y_n)$, then $\mathsf{eq}(f(x_1,\ldots,x_n),f(y_1,\ldots,y_n))$ for all function symbols $f$ of arity $n$
    • if $\mathsf{eq}(x_1, y_1), \ldots, \mathsf{eq}(x_n, y_n)$, then $R(x_1,\ldots,x_n) \Leftrightarrow R(y_1,\ldots,y_n)$ for all relation symbols $R$ of arity $n$

    With a "defined" equality, is it is well possible to interpret logical statements involving $\mathsf{eq}$ in such a way that two distinct elements are regarded as equal according to $\mathsf{eq}$.

    Coding our own equality

    When we are talking about practical software development, there is really no hope of defining platonic ideals; yet when it comes to techniques of reasoning, informaticians are in various ways confined to results and methods that come from formalizing mathematics.

    Take this example: we'd like our terms to represent the free monoid on $X$, so we'd like these terms to satisfy the following equations (or identities): $$s \cdot 1 = s$$ $$t = 1 \cdot t$$ $$(s \cdot t) \cdot u = s \cdot (t \cdot u)$$

    Consider definitions in some fictional language to express the "interface" of a monoid: $$ \begin{array}{ll} \mathsf{interface}~ M[X] \\ \quad \mathsf{unit}(): M[X]\\ \quad \mathsf{emb}(x: X): M[X]\\ \quad \mathsf{dot}(s: M[X], t: M[X]): M[X]\\ \quad \mathsf{eq}(s: M[X], t: M[X]): \mathsf{bool} \end{array}$$

    Here, $\mathsf{unit}$ stands for $1$, $\mathsf{emb}(x)$ embeds $x$, and $\mathsf{dot}$ is the combine operation of $M[X]$. If our fictional language has algebraic data types, we can use those to build terms which represent the algebraic theory of monoids: $$ \begin{array}{ll} \mathsf{data}~ \mathsf{Mon}[X] \\ \quad \mathsf{Unit}(): \mathsf{Mon}[X]\\ \quad \mathsf{Emb}(x: X): \mathsf{Mon}[X]\\ \quad \mathsf{Dot}(s: \mathsf{Mon}[X], t: \mathsf{Mon}[X]): \mathsf{Mon}[X]\\ \quad \mathsf{eqMon}(s: \mathsf{Mon}[X], t: \mathsf{Mon}[X]): \mathsf{bool} \end{array}$$

    It is now really obvious what these constructors stand for, but we note that while the equality $\mathsf{eqMon}$ returns true whenever two terms are equal when considered as elements of the free monoid, there is at least one more equality relation on terms that is different ("structural equality" or "syntactic equality", which considers $\mathsf{Dot}(\mathsf{Dot}(s, t), u)$ as distinct from $\mathsf{Dot}(s, \mathsf{Dot}(t, u))$).

    Mathematically speaking, we want to consider the quotient set $\mathsf{Mon}[X] / \mathsf{eqMon}$, and work with equivalence classes.
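    For concreteness, here is one way to implement this in Haskell (my rendering of the fictional interface): eqMon compares terms by first normalizing them to the list of embedded generators, which amounts to choosing a canonical representative for each equivalence class.

        -- terms of the algebraic theory of monoids over a type x
        data Mon x = Unit
                   | Emb x
                   | Dot (Mon x) (Mon x)

        -- the free monoid on x is (isomorphic to) finite lists of x;
        -- normalize picks the canonical representative of a term
        normalize :: Mon x -> [x]
        normalize Unit      = []
        normalize (Emb x)   = [x]
        normalize (Dot s t) = normalize s ++ normalize t

        -- equality in the free monoid: same normal form
        eqMon :: Eq x => Mon x -> Mon x -> Bool
        eqMon s t = normalize s == normalize t

        -- structural equality would distinguish these two terms; eqMon does not:
        -- eqMon (Dot (Dot (Emb 'a') (Emb 'b')) Unit) (Dot (Emb 'a') (Emb 'b'))  ==  True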

    Formulas = Functions to Truth Values

    After all this talk about equality in algebraic theories, we want to turn equality in the context of logic, specifically higher-order logic.

    In classical logic, a proposition (logical statement) stands for a truth value, and there are exactly two of them (the two-value type "bool"). Boole was the first to associate propositional connectives with algebraic operations. In this perspective, the truth values are part of the language, even though this was later abandoned in Tarski-style semantics.

    Programmers are quite used to having functions return $\mathsf{bool}$. In terms of mathematical logic, our custom-made equality "relation" $\mathsf{eqMon}$ above is in fact a function to the set of truth values. Could we not do away with relations and talk in terms of functions?

    Yes, we can; using truth values and functions in a formal language of propositions is the essential summary of classical higher order logic (or Church's simple theory of types). There is a more compelling example: if you consider "set builder notation" $$ \{ x \in A \ |\ \phi(x) \} $$ then this describes a subset of $A$. We can choose to describe $\phi$ as defining a (unary) relation, but viewing it as a function to $\mathsf{bool}$ is no doubt more natural in a context like SELECT x FROM A WHERE $\phi$, where "relations" are given by their extensions (as tables) and computation in the form of expressions is available.
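    In a programming language, this shift in view is a one-liner. A sketch in Haskell (my names): the "relation" is just a predicate, a function to Bool, and set-builder notation becomes filtering.

        -- { x in a | phi(x) }, with phi a function to Bool
        select :: (a -> Bool) -> [a] -> [a]
        select phi a = [x | x <- a, phi x]

        -- select even [1..10]  ==  [2,4,6,8,10]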

    The point of (classical) higher-order logic is that using the $\lambda$ calculus and truth values, we have everything we need to define the meaning of all logical connectives, and the types prevent logical paradoxes of "naively-untyped set theory". Since the truth values are part of the language, it is possible to define the meaning purely through definitions that involve equality.

    More on this and classical higher order logic in that last post on Higher Order Logic and Equality and the next one I'm going to write in a second, where I talk about the intuitionistic version.

    I will just end with a note that there is no distinction between "predicate", "relation" and "set" in higher-order logic, and this is a good thing. An expression of type $\iota \rightarrow \omicron$ (where $\iota$ are the individuals and $\omicron$ the truth values) defines a (sub)set of individuals, a predicate on the set of individuals, and a unary relation, all at the same time. $$\mathsf{even} := \lambda x:\iota. \Sigma \lambda y:\iota. \mathsf{eq}(x, \mathsf{times}(2, y))$$ $$\mathsf{greater} := \lambda x:\iota.\lambda y:\iota. \Sigma \lambda z:\iota. \mathsf{eq}(x, \mathsf{plus}(y,z))$$ $$\mathsf{lowerbound} := \lambda x:\iota.\lambda y:\iota \rightarrow \omicron. \Pi \lambda z:\iota. \mathsf{implies}(y(z), \mathsf{greater}(z, x))$$

    The confusing bit for programmers may be that this thing is called simple type theory when in fact there is only one "sort" of individuals. This makes more sense when considering that it is no problem to come up with multiple "sorts" of individuals, but that alone wouldn't give sets ($\iota \rightarrow \omicron$), sets of sets $(\iota \rightarrow \omicron) \rightarrow \omicron$, sets of sets of sets (and so on) of individuals. It is this hierarchy of sets/predicates/relations that earns Church's higher order logic the name "simple type theory": only terms with the right types are considered well-formed and meaningful.

    19.7.20

    Modules, components and their limits

    I read Clear explanation of Rust's module system. The TL;DR: the module system in Rust is a way to (somewhat) break/hide the link between what things are defined and what source files the things are defined in. A Rust module provides control over visibility across source files and makes explicit how references reach across "modules".

    The post starts with a request to ignore what "module" means in other languages, and this plunged me into thoughts on all the things that are called modules. It is somewhat sad but to be expected that a basic word like "module" (or "package") could never have a standard meaning, despite all these programming languages supposedly built while we have all this PL theory.

    The more academic vocabulary has "module" as a language concept that comes with a scope and visibility, a way to bundle types and named values. It's somewhat independent of installation and deployment (the only touch point being support for separate compilation), but - to my knowledge - there is no talk of an "installed-at-a-version" module. So the Rust module concept is pretty consistent with that. Arranging a set of modules that fit together is the programmer's problem, not the researcher's.

    Alas, there is this thing that happens at the level of Rust crates/npm packages, and it is really a different problem. Nevertheless, those things are also called modules in some languages (C++20 modules, Java modules, go modules).

    In Java, source files in the same directory make a Java package and they come with a way to control visibility. The association between files and packages is rigid. More recently, Java modules were added to express dependencies between modules, visibility of packages - something like a "packages of packages", but really more on the level of Rust crates.

    I am reminded of "components" (if anybody remembers component-oriented programming) because deployment is part of this game - the notion of dependencies only makes sense in the context of some installation. But then components used to be about runtime services (in addition to code), object brokers and such and here we would actually like to talk about less dynamic things, just code and functionality in the form of source and/or libraries. Still, components are things that get assembled and exist at versions, so I'd say it's an appropriate, if old-fashioned name.

    Now, such a (static) component would abstractly describe what npm packages or Rust crates achieve, and also what Debian development packages (sigh) achieve, where you install a C library in binary form along with its header files. That gives us even more of a twist now, which is that C++20 modules are really modules in the sense of achieving grouping for visibility and scope (check), but at the same time the whole notion has a huge impact on C++ since it should actually make it easier to deploy C++ components. A relevant quote from the C++20 link above:

    After a module is compiled once, the results are stored in a binary file that describes all the exported types, functions and templates. That file can be processed much faster than a header file, and can be reused by the compiler every place where the module is imported in a project.

    So while a C++20 module is a language concept, it is foreseeable that one day all self-respecting newly developed libraries will be deployed as (installed-at-a-version) modules. Oh well. There was really no reason to hope that a standard meaning for "modules" and the concept "above" could be justified in any way, across languages. At least, there remains the insight that it really makes sense to deal with the two levels separately.

    I sure wish we could all call the installed-at-a-version thing "component", and the language concept "modules". Just two paragraphs on how that might sound:

    Every language needs a module system: a way to deal with grouping and interfaces (in terms of making things visible - "exporting" only the interface and keeping the rest internal). Including files was always bad and indicates the lack of a module system. It's when one needs to deal with independently developed, deployed and released units that the problems of dependencies, evolution and versions arise, and these things are the components. Components are related to modules in that they need to contain at least one of them, for talking about the things that are provided and brought into scope.

    If there is a dependency between versions of components and there are backwards incompatible (!) changes in the depended-upon component (or a left-pad situation), then this is something that needs to be dealt with through changes to the depending component. That isn't something that module systems (or any other language design) can help much with.

    18.7.20

    Relating partial evaluation, multi-stage programming and macros

    Here is a quick look at two metaprogramming techniques, starting from the recursive power function below. $$ \begin{array}{l} \mathsf{letrec}\ \mathtt{pow}\ =\ \lambda n : \mathsf{Int}. \lambda x : \mathsf{Int}. \\ \quad \mathsf{if}\ (\mathsf{eqInt}\ n\ 0)\ 1\ (x * (\mathtt{pow}\ (n - 1)\ x)) \end{array} $$

    What can we do with an expression $\mathtt{pow}\ 3$?

    Partial evaluation

    The "partial evaluation" answer is to inline the definition, and evaluate what can be evaluated, yielding a residual program. $$ \begin{array}{l} \lambda x : \mathsf{Int}. \mathsf{if}\ (\mathsf{eqInt}\ 3\ 0)\ 1\ (x * (\mathtt{pow}\ (3 - 1)\ x)\\ \lambda x : \mathsf{Int}. x * (\mathtt{pow}\ 2\ x)\\ \ldots\\ \lambda x : \mathsf{Int}. x * (x * (x * 1)) \end{array} $$

    Staging

    The multi-stage programming answer (or staging) is to rethink $\mathtt{pow}$ as a transformation of code. $$ \newcommand{\qquote}[1]{\color{blue}{[|}#1\color{blue}{|]}} \begin{array}{l} \mathsf{letrec}\ \mathtt{pow}\ =\ \lambda n : \mathsf{Int}. \lambda x : \mathsf{\color{blue}{Code[Int]}}. \\ \quad \mathsf{if}\ (\mathsf{eqInt}\ n\ 0)\ \qquote{1}\ \qquote{($x * $(\mathtt{pow}\ (n - 1)\ x))} \\ \mathsf{let}\ \mathtt{mkpow}\ =\ \lambda n : \mathsf{Int}. \qquote{\lambda y : \mathsf{Int}. $(\mathtt{pow}\ n\ \qquote{y})}\\ \end{array} $$

    The notation was (probably) not designed to scare you:

    • $\qquote{\ldots}$ is quasi-quotation. What is between these brackets is code.
    • $\$(\ldots)$ is used within a quasi-quoted expression to splice code.

    Now $\mathtt{mkpow}\ 3$ is going to give us the code of the desired power function: $$ \begin{array}{l} \qquote{\lambda y : \mathsf{Int}. $(\mathtt{pow}\ 3\ \qquote{y})}\\ \qquote{\lambda y : \mathsf{Int}. $(\mathsf{if}\ (\mathsf{eqInt}\ 3\ 0)\ \qquote{1}\ \qquote{($\qquote{y} * $(\mathtt{pow}\ (3 - 1)\ \qquote{y}))})}\\ \qquote{\lambda y : \mathsf{Int}. $\qquote{($\qquote{y} * $(\mathtt{pow}\ 2\ \qquote{y}))}}\\ \qquote{\lambda y : \mathsf{Int}. ($\qquote{y} * $(\mathtt{pow}\ 2\ \qquote{y}))}\\ \qquote{\lambda y : \mathsf{Int}. (y * $(\mathtt{pow}\ 2\ \qquote{y}))}\\ \qquote{\ldots}\\ \qquote{\lambda y : \mathsf{Int}. (y * (y * (y * 1)))} \end{array} $$

    There can be nested levels of $\color{blue}{\mathsf{Code}}$ and this explains the term multi-stage programming. For objects of this type to be really useful, there should be a method $\mathsf{run}: \color{blue}{\mathsf{Code}[T]} \rightarrow T$.
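    Here is a minimal model of the idea in Haskell, using a small expression AST in place of a built-in Code[Int] type (Expr, spow and mkpow are my names; a real multi-stage language would provide quotation, splicing and run for you):

        -- code fragments as a small expression AST
        data Expr = Lit Int
                  | VarE String
                  | Mul Expr Expr
                  | Lam String Expr
          deriving Show

        -- pow as a code transformer: n is known now, x is a fragment of later-stage code
        spow :: Int -> Expr -> Expr
        spow 0 _ = Lit 1
        spow n x = Mul x (spow (n - 1) x)

        -- build the code of the specialized power function
        mkpow :: Int -> Expr
        mkpow n = Lam "y" (spow n (VarE "y"))

        -- mkpow 3 ==> Lam "y" (Mul (VarE "y") (Mul (VarE "y") (Mul (VarE "y") (Lit 1))))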

    Macros are closely related to the above: a macro gives the programmer a way to give a definition that is elaborated at compile time. This is similar to what is going on in splicing. The special case is that it is the compiler that calls $\mathsf{run}$; in other words, the context in which a macro is elaborated is the "compile-time stage". The design of Scala's metaprogramming facilities is consistent with all this, and I picked this up from "A practical unification of multi-stage programming and macros" by Stucki, Biboudis, Odersky.

    That's all; this is as shallow as all my posts and there would be more to say; "Modal analysis of staged computation" by Davies and Pfenning would be a good place to continue for the curious and/or impatient. I'd like to connect these views on staging and macros with modal logic, but that's for another day.

    4.7.20

    Language, Logic and Modeling

    Language is the parrot of thought, and a hard one to teach, nothing more. („Sprache ist der Papagei des Gedankens, und ein schwer gelehriger, nichts weiter.“) ― Friedrich Hebbel

    Here's a short note about a perspective on logic for informatics, and decidedly not mathematics.

    Formal logic has its roots in formalizing mathematics. Expressing the steps of reasoning in a formal system would assure us that there are no "gaps". I wrote a brief account on the beginnings of formal logic elsewhere.

    The 20th century brought with it the advent of machines for information processing. This has gone way beyond mathematics already. In informatics (a better word for "computer science") one deals with information processing, and its practical kin, software engineering, deals with systems of all kinds where information gets in contact with reality.

    There is a view that holds that we can only understand reality through models (see constructivist epistemology). Now, whether one subscribes to this or not, it is obvious that informatics relies on models of reality; information itself is always part of some model of reality. If you have worked with customers or product managers, you know it is a shaky and always incomplete series of models of reality, always adapting to new circumstances.

    All I am saying is: formal logic can and should help us as a tool to formulate these models precisely. Like in mathematics, we do not want to have "gaps" in our reasoning. This is a somewhat ambitious direction, as software systems are not always built according to specification, and specifications are usually prose. And it is not only about modeling the domain.

    "Functional programming" has gathered a large following in recent years, and I think this is because less because "software should be mathematics", but programmers benefit from constructing their code according to models that leaves no "gaps". The tool at hand, a (statically typed) functional programming languages, forces one to think thoroughly about data and the operations that are going to be carried out.

    One does not need to resort to philosophy to challenge this. Large software systems are distributed over many machines, and this gives rise to challenges that require methods different from those of mathematics (and different from those for programs that run on a single machine). While functional requirements lead us to modeling domain concepts, designing a system that meets those is constrained by non-functional requirements. We need to consider performance and machine capacity, and evaluate which underlying technologies to build on and what guarantees they provide. It is here that engineering can be meaningfully distinguished from programming, and work is influenced less by theorems, more by best practices and experience.

    Engineering and domain modeling are full of abstract models (the favorite tool being: diagrams of boxes with their meaning provided by handwaving). Formal logic and actual, manifest models where people could experiment are not encountered very often, yet expressing these models, reasoning with them and testing/verifying them would be critical since mistakes are costly. Moreover, a large number of people have to understand and act in accordance to these models. Often, the only time where this becomes obvious are the APIs between subsystems, a point where much modeling has already happened and cannot be changed.

    I will go further and claim that Conway's Law is a consequence of each team forming their own "unit of modeling". It would be a lot better if we learnt to communicate explicitly and precisely what our models are, to explain our need to model things differently, and how we can translate between our different models, instead of insisting on some "right" way of seeing something. I place my hopes in logic as a language and tool to do just that. Mathematics is a discipline where the practice of modeling a mathematical "universe" has advanced very far, and it is no wonder that people seeking to use precise methods turn to it. It is a mistake to interpret this field of application of formal logic as some sort of mathematical monopoly. Language is a tool to communicate, and a logic is a style of language.

    2.7.20

    I don't know how to Semantics

    I have missed out on my daily writing practice of bad, unfinished, unedited thoughts. So why not combine it with idle thoughts on logic. The title alludes to the friendly LOGO error message, and this exchange on Twitter:

    Andrej is not wrong, but axiomatic semantics is a thing in PL. So I'm now collecting the two different contexts for the word and idea of "semantics": one in mathematical logic, one in programming languages. I am generally trying to embrace a multitude of meanings in different contexts, but given the strong connections between PL and logic, it is worth pondering what is going on here.

    Linguistic meaning: the writing on the wall

    The word "semantics" comes from σῆμα "sign", "marking" and refers to the study of meaning (somewhat associated with linguistics, the study of language). Now meaning a very broad thing to study, and how we create meaning and interpret signs proper (beyond language) leads us to semiotics. I promise I am not going to spiral into ever higher level of abstractions and get back to the more mundane divergences of academic disciplines; still I want to share a story to show the depth of the question "how we make meaning of something" and what meaning really is.

    Here is an example of linguistic and semiotic meaning. I believe I had heard and understood the English idiom "the writing is on the wall" long before I heard the biblical story. It is hard to tell, since I do not remember. Maybe I looked up the German word "Menetekel" – which is the same but doesn't sound as cool and portentous – and then forgot about it. Maybe I learned and chose to forget the biblical reference. English is full of references to stories that become idioms and detach from their origins (I also learned very recently that "drinking the Kool-Aid" originates from the Jonestown massacre), yet somehow they convey meaning.

    Conveying meaning in written language can only happen through more written language; however, language is definitely more than what is written. I claim it's possible that even if you never heard this story from the Bible and have no access to it, you can nevertheless learn the full meaning of the saying "the writing is on the wall" when you witness it being used in some situation that discusses the terrible ending to come. Whether that comes from the textual, narrative context, or rather from the faces and gestures of the participants, or from your understanding of the reality of the thing it is said in reference to, that's hard to tell.

    Kamus writes how many meanings each word has, then, I wondered and tried to write down a Kamus showing, to how many words a meaning corresponds. (Bediuzzaman Said Nursi)
    Wisdom is a tree which sprouts in the heart and fruits on the tongue. (Ali ibn Abi Talib)

    Now, back to our divergent use of "semantics" in mathematical logic and programming languages.

    Semantics in mathematical logic

    The predominant view in mathematical logic today is Tarski's conception of semantics and truth, adapted to model theory. This is what is taught to undergrads, and I should like to deconstruct the assumptions that come with it:

    • Statements are either true or false, or otherwise put, there is an objective notion of truth that we try to capture with logic languages.
    • The meaning of a logical language is given by describing the interpretation of the language in a model
    • The way we communicate or describe a model and the interpretation are well understood.

    All issues one may have are subdued by the last point. You cannot really question why models are given in "informal set theory", because the whole meaning exercise requires some common ground. If one ever wonders how to give meaning to those things that are supposed to provide the well-understood meaning, one finds that formal set theory is relatively disconnected from what one is trying to do (and turns to type theory).

    I find it very striking that at the time Tarski formulated his original criterion for defining truth, the lingua franca was higher-order logic (which is equivalent to simple type theory), not set theory. I do feel something was lost there. In classical higher-order logic, we are concerned with mappings to truth values, the point being the mappings (not the "sets"). So rather than being an "informal set", a grouping of objects by some property, or a formal set described by an obscure string of formal set theory, a "higher-order logic" set is defined by its mapping. A "woman" is defined by a mapping

    woman: Individuals -> {true, false}

    ... and you can argue when it returns true. I find this closer to discussions on meaning in the linguistic sense.

    Semantics in programming languages

    In programming languages, "semantics" refers to the "meaning" to programming language (this is actually a surprising and confusing statement). Now, "meaning" is something a lot narrower than linguistic meaning.

    • Something does not have to be universally and objectively true to give "meaning"
    • The conventions we can rely on when giving meaning are entirely different.
    • The way we communicate or describe a model and the interpretation are well enough understood.
    • "Meaning" can be something like a specification that clarifies what statements and expressions of the language do. In practice, this is all informal. In research, there are three "styles":

      • Denotational semantics. We choose to embrace the conventions of mathematics as our medium for conveying meaning as objective truth
      • Operational semantics. The meaning is in the behavior, the execution of the programs. See what this interpreter does? That is the meaning.
      • Axiomatic semantics. The meaning of a program is properties you can state.

      This distinction is entirely standard and helpful; I learnt it as an undergrad and from e.g. Benjamin Pierce's "Types and Programming Languages". It is also largely ignored in industry, where a natural language specification relying on conventions is all you get for "meaning" (and stackoverflow may cover the gaps).

      Finale

      Programming actually has its roots in formal logic. It is not a coincidence that the act of setting up formal languages and the burden of having to define meaning are shared by people who do mathematical logic (ok, I should say formal logic, to not exclude the philosophers) and people who do programming languages, and that informaticians take an interest in proof assistants and machine-checked proofs.

      Observe that all styles leave open a path to full formalization, but for axiomatic semantics it is somewhat odd, since we'd have to formalize logic in logic. The boundaries are also not as clear-cut, since one can also describe an interpreter by some mathematical structure (labeled transition systems), and there are also formal methods, verification and the "theory of a program", which needs some semantics to build on (not necessarily "the" semantics). The taxonomy is still very useful, though, since it says something about the intention.

      My personal hobby is studying how systems of logic and reasoning can be stacked "compositionally". Category theory is useful here, precisely because it permits one to consistently ignore what the foundation is and to describe everything in terms of mappings (arrows), just like higher-order logic. I shall leave this topic for another post.

    29.6.20

    Go Karma Schizophrenia

    Today, I tried to busy myself with doing some programming exercises in go. Not the board game, the language. Here are some idle thoughts on programming languages.

    I had forgotten how good I can be at procrastination. I drank about a dozen cups of tea (sore throat), checked Twitter and mail... it was terrible.

    If you have studied programming languages, there are many things that seem off about golang. I could even understand if the lack of generics had been justified as "we value type inference and structural subtyping higher than having generics" - but no, they are on the record as "we never needed them". Ok... Opinions on programming language design are a dime a dozen.

    I do not aspire to exercise my golang skills because it is a "great programming language", or because of its appeal to simplicity. Following a change of role at work, I simply expect to be dealing with a lot of go code in my job. That, and "modern C++". It's as simple as that. And sure enough, you can get used to anything.

    When I was a grad student working on a certain programming language, I remember how convinced we were that we'd be changing the world. And indeed we have, but then a whole bunch of others have, too. Golang did not exist back then, and neither did "modern C++". I did have some experience in C++ programming, from peeking at programming courses for math students during uni (absolutely nothing related to actual C++ in my informatics curriculum, though there was a lecture on object-oriented programming) and my two internships. I also learned some C from a friend, one seminar and then many hours of toying around. This changed with my job of course, but the point is that I perceived this dichotomy between programming as taught and researched in academia and programming as done in industry or applied research contexts "with an industrial flavor". As grad students working on a language, we were convinced we had the best language, and determined to make it bestest, but I had seen "the other side".

    I continued to dabble in C++ in those times, because I was sure I'd need it someday; it's the reality that cannot be ignored when studying programming languages. I described this as "professional schizophrenia": you split yourself into the part of you that accepts the "industry" programming languages, and the part that supports progress based on research, which mostly gets ignored. I was then quite surprised when C# 2.0 got generics and Java 1.5 quickly followed suit. It looks like pattern matching is making it into the mainstream, too.

    Another way to view this discrepancy is pragmatism: you use what's there; things are the way they are due to path dependence, economic pressure or some other reason that was of concern to somebody. Languages are tools, no need for judgment, just move towards the objective. This is my modus operandi, but it feels good to know that a domain-specific language using elements of an advanced type system is sometimes the thing that moves you towards that objective. Still, I am surprised how much one can get stuck in "my programming language is my identity"; I see it in myself in hindsight, but also in all the fan discussions.

    Today, after procrastination, I think there might also be some karma. While I had not been too vocal, I'm sure that in a weak moment I must have said something like "golang is not a good language" aloud. I remember looking at source code that I considered to look like line noise (the inline & turned me off); surely I must have expressed my disaffection a few times. After a whole day of pain-avoidance behavior – the other name for procrastination – I realize now that maybe this is what has been haunting me. I have been paying for my remarks, dearly. Wake up, Burak, it's time to do your exercise and show how much of a pragmatist you really are.

    On the bright side, after all this suffering, I am hoping to be born again as a happy gopher tomorrow.

    28.6.20

    Nothing is real

    We are the children of concrete and steel
    This is the place where the truth is concealed
    This is the time when the lie is revealed
    Everything is possible, but nothing is real

    Living Colour - Type

    Ryuichi had to get out. The kids were running around the house and screaming, their juvenile energy levels blending with the boredom of a cloudy Sunday morning to form an explosive mixture. He checked the viral pollution levels and left unceremoniously. It wasn't ok to leave Lingzhao alone with the kids, but still he had to get out, or the noise and shouting would get to him and erase all possibility of having a thought. He needed space to think, and silence.

    The streets had not yet filled with people; only a few families strolled, their shining beautiful Black faces gleaming in the African sunlight. The Chromatic Cure had worked its wonders, he thought. After the first "baby pandemic" of 2020 and the 2nd civil war for racial justice came the Dinosaur diseases, caused by multiple strains of airborne virus out of the remnants of Siberian permafrost. They affected people with light skin color disproportionately, which had eventually led to a treatment. The skin's pigments were modified to transform UV light into virus-defeating proteins, but it needed Black pigments to work, a lot of them. Ryuichi vaguely remembered this from his medical biochemistry class in kindergarten. "And that is why everybody is Black now", the teacher had said. The Unified Society logo on her uniform caught his eye; it was one of his earliest memories.

    Weekend or not, those remaining pesky white supremacists were still trouble, Ryuichi thought. Their enclaves were on life support from the Unified Society: large compounds, former prison complexes that had been sealed airtight, determined to hold out, fueled by conspiracy theories and delusion. Endless debates had ensued over whether the move to reactivate the old underground supply channels to keep them alive was a justified use of scarce resources. Compassion won out, and the prison enclaves became the only refuge for white people who preferred living with other fellow white male supremacists.

    Ryuichi worked on the team that intercepted electromagnetic signals and gathered intelligence. Two individuals, bugmold and GDJ, had emerged as leaders, one based on ideology and the other seemingly controlling the software that kept the modified prison complex working. Very little was known about the fate of women in the prison complexes; there had been many who chose to give up freedom to retain their skin color. But violent fights had erased all but a few, who willingly accepted bugmold's tales of superior intelligence and the day of revenge, and GDJ's programming school of thought, in order to keep the automated manufacturing going and launch InfoSec and fake news attacks on the Unified Society. The brutal fights in the White Supremacy enclaves had become a pattern that repeated itself once a decade. They even had a name for this: Natural Selection.

    Ryuichi found a bench and took a break from his walk. It would soon be time.

    Compassion was also the argument that led to the Plan being accepted. The intelligence gathered by Ryuichi's group revealed more and more signs that the crooked ideology and its implementation through software protocols led to the prisons being a place of crimes against humanity and intolerable suffering, and human rights groups revolted. The cure had been perfected and encapsulated; no complicated week-long procedure was necessary anymore: the biochemists of the Unified Society had created a new airborne virus that would set off the process of transforming the skin color. Everybody has a right to keep their skin color, the chairman of the assembly said, as long as it is Black. Even the most ardent supporters of "tolerance" could not bear the tales of natural selection.

    Ryuichi checked his watch: it was noon. The sealed gray metal boxes with today's supplies to the prisons, delivered synchronously on a global automated schedule, just opened with a click. He smiled and thought, now that White is history, the team will be asking for a new project tomorrow. But it was the weekend, after all. Time to get back and see what the kids are up to.

    27.6.20

    Gotta not-do what you gotta not-do

    Today's writing is about wu wei - "effortless action".

    Wu wei comes from Taoism. I don't know very much about Taoism. I do not have a plan to become a thought leader on Taoism. My only personal connection to Chinese philosophical tradition is that I learned Tae Kwon Do as a teenager and I got curious about the teachers' description of the "way": that the roots of these are not "martial arts" but a "philosophy (or way) of movement".

    I was intrigued by this idea that "attaining perfection (of movement) through practice" was not for becoming a perfect fighter, but for attaining wisdom. I went on to read the Tao Te Ching and I Ching, and while I did not understand much, I became sensitive to these topics. [A friend at work, a big fan of martial arts, laughed at me for taking this philosophy stuff seriously - but then, "If the Tao would not be laughed at this way, it would not be the Tao!"]

    The point is, you don't have to be a philosopher to appreciate that the idea of effortless action is one that is found in many forms and cultures. It's a pattern that is easy to observe, if you have the eye for it.

    Take "flow or "being in the zone". There are three conditions for flow to occur:

    • Goals are clear
    • Feedback is immediate
    • A balance exists between opportunity and capacity

    This resonates with the programmer working on a software artifact, with the amateur musician playing a piece on the piano, or when doing sports (I am going to stop here). It also resonates indirectly: think of a manager who wants to create a productive environment for their team.

    The analytical, self-conscious and doubtful mind takes a back seat and you "become one" with the thing you are doing. There is a view of Taoism where this applies not only at the individual level, but also at the level of society; this is what is described in the Tao Te Ching and I Ching (and discussed in the Wikipedia article on wu wei).

    It should be intuitive (if somewhat illogical in the realm of words) that "effortless action" requires putting in some effort. The programmer had to learn many things in order to reach a flow state while hacking away at their code. The musician had to practice countless hours in order to be able to play the piece. That effort is not felt in the moment of flow. You are building up capacity.

    Building up capacity, if done right, is subject to compounding effects. You don't want to only learn, but also watch your rate of learning. Is there a better method? Am I still working towards the same goal?

    Now, when we think of "goals" we are slightly at odds with "effortless action" and the Taoist ideal, since setting a goal can already be "forced." "Effortless action" is not forced, and also not goal-oriented.

    Effortless action and taoism in general are not concerned with values or morality. Taoism features the theme of man's inability to force changes upon society. This is a bit revolting at first sight, since today's society is full of injustice. Black lives matter!

    Things start making a bit more sense when one thinks of attempts to shape a society or the world by ideology: this is bound to fail or become unsustainable (let's count capitalism as the ideology of $$$ here). You can have the best law; it is not by itself going to guarantee justice. Or, when you think of some software to help with some process, it is helpful to look at those things that do not need to be specified - because it is hopeless to specify them and you need to rely on "common sense" or some norm or culture. Also related: "There is no way to peace, peace is the way".

    The effort on the society level could be framed as "building the capacity and opportunity for justice". So an abstract, non-descript goal, direction or objective like "justice" or "I want to be good at X" is ok and not a goal in the sense of forcing some direction; we call it a goal since we live in times where we have an extraordinary freedom in choosing what we want to do with our lives. Choices do not preclude wu wei. I'm inclined to think that wu wei is related to authenticity and focus: not getting distracted by futile matters, not doing something because you feel you have to, not doing something for purely "political" (tactical) reasons, but focusing on the essence. Another quote that comes to mind here: "Never mistake activity for progress."

    There is a lot more to find out about this. I am not an expert on Taoism and I still do not intend to become one; but maybe there's a reminder here for looking at one's own patterns of doing and not doing.

    26.6.20

    Life has a limit but knowledge has none

    “Your life has a limit, but knowledge has none. If you use what is limited to pursue what has no limit, you will be in danger.” Zhuangzi

    If DRMacIver in Subsetting life can draw an analogy between the limited nature of consciousness / the human condition and how C++ programmers effectively know and use only a fraction of the C++ standard — a little subset of the possibilities — then I can do the same with Zhuangzi's warning and ... something. Thank you, David.

    I have written on intuitionist mathematics and logic before, so today it's not going to be about effective computations being terminating ones, potential infinity vs actual infinity. No, let's talk about models.

    I found this quote in a little notebook of mine, an entry dated 20 years ago. I had an actual habit of writing notes, long-winded streams of consciousness, full of philosophy and reflection; a note on the fleeting nature of consciousness next to a bitter account of unrequited love; a seemingly never-ending search. I threw away most of those notebooks, but for some reason I kept this one and found it the other day while clearing out the basement.

    A model is always a simplified image of something. That something is not necessarily "reality"; sometimes the thing that is simplified is just another model, or construction. When you make a model of anything, you leave out some parts (attributes) and retain others (Herbert Stachowiak calls this Verkürzungsmerkmal, or attribute of abstraction). Constructing models is a key activity for the informatician, and I am glad that undergrads get exposed to Stachowiak's model theory (slides in German).

    Any programmer engages in the business of creating models. Every data structure is part of a model, and data structures serve different purposes. What is right in one context is not appropriate in another. Premature optimization is the adaptation of code (and frequently also data) for the purpose of optimizing the performance of some operation, often losing flexibility in the process and making other operations slower or more expensive. "Premature" means it's not good, and the implicit recommendation is that it is good to keep things general until you know what operations you need.

    Let's not forget about consciousness and life. As a human, you need to make models, too. You need to understand and interact with your fellow humans, understand the consequences of your actions. This is not easy. For goodness' sake, you need to understand yourself. What was it that made me throw those notebooks away?

    There is this theory that there is only one consciousness. Carl Gustav Jung writes about this in Typologie, as a key pitfall that makes some introverts insufferable (I felt seen) since they tend to confuse their self and that universal consciousness. "One love" stands for universal love and respect for all beings, no distinctions. Religion literally means connecting your self with the One. The vedic religion talks about consciousness. You don't need to reach for religion for accepting a universality of the basic conditions and limitations in which consciousness and thinking occurs, though.

    But not only is life limited; what each individual knows, and is able to do with the knowledge they have, is limited, too. Every human is absolutely unique, not only in the various social or economic conditions they are born into, but also in their continued perspective. I like this opening phrase from Jean Piaget's genetic epistemology:

    GENETIC EPISTEMOLOGY attempts to explain knowledge, and in particular scientific knowledge, on the basis of its history, its sociogenesis, and especially the psychological origins of the notions and operations upon which it is based.

    Piaget set out to study the development of children, and ended up with a theory of knowledge. Is it constructed models all the way down? Piaget classifies knowledge into three kinds: physical, social and "relationships", which he calls logico-mathematical models. Why does he call the latter kind "logico-mathematical"? I think because mathematics, and to some extent logical descriptions of mathematics, are used here as synonyms for a kind of knowledge where we are completely free to choose what we are modeling and why; at the same time the models are supposed to be consistent. We want to be sure that "they work", that they are good for something.

    Now, the human condition isn't only about the subject. Different people have different models, so it is not unusual that programmers end up spending 50% or more of their time resolving their disagreements about models. True progress is when people learn something from each other, bridge differences, teach each other and enable others to overcome their limitations.

    I don't know how I ended up on Taoism, but these are the "ten thousand things". Consciousness is not limited, but each individual slice of it is. And there are challenges and limits to collaboration, many of them stemming from social or economic limitations or preconceptions. Every individual does feel the need to belong; this is where rationality itself reaches its limits.

    I am just going to end here. A lot of disagreements would be easily bridged if we made more of an effort to understand each other: what models, what limitations, what perspectives makes the people say what they say, do what they do. It is not consciousness that is limited; it is its manifestation in the individual — and sometimes just the willingness to take the perspective of the other.

    30.5.20

    Intuitionistic Propositional Logic and Natural Deduction

    Here is a primer on intuitionistic propositional logic and natural deduction, along with a brief account of their historical context. The bits of historical context may explain why formal logic is somewhat entangled with the discourse on "foundations of mathematics". I frankly wanted to research and type up my notes on something else, but then started "writing backwards" and thought I'd make this the introductory notes I wish I had before learning about these things.

    The topics here can be seen as useful background knowledge for anyone learning about sequent calculus, proof theory, Martin-Löf type theory (aka intuitionistic type theory aka constructive type theory), the Curry-Howard-Lambek correspondence with category theory and particularly cartesian closed categories, and categorical logic; I may write about these connections at another time, but this is supposed to be an accessible overview. We proceed according to this plan: the beginnings of formal logic, axiomatic proof systems, intuitionistic logic, and finally Gentzen's natural deduction, both in its standard presentation and in sequent calculus style.

    The focus on propositional logic is for simplicity, and to build a bridge to my last post, where I briefly described the Curry-Howard correspondence. One can add symbols and rules for quantifiers to obtain first-order predicate logic, and the inclined reader should consult more detailed texts on these topics; but the proof terms would then no longer be terms of simply-typed $\lambda$-calculus, and with higher-order logic there is also a different way to talk about sets of individual objects.

    The beginnings of formal logic

    For ages, mathematics involved theorems and proofs. Yet, the beginning of formal logic is marked by Gottlob Frege's and Bertrand Russell's works.

    Gottlob Frege described a graphical notation that captured sound reasoning in precise rules. It was the beginning of the logicist program of demonstrating that all mathematics could be carried out in formal logic. Bertrand Russell advanced this program considerably, also inventing type theory in the process.

    Formal logic, or symbolic logic, refers to logical statements and reasoning being expressed in a formal language, with symbols representing logical statements and rules of symbol manipulation describing how to obtain new statements from existing ones, preserving validity. Frege's and Russell's deductive systems of reasoning are today presented as Hilbert systems (more on those below).

    Georg Cantor had proposed set theory as a foundational system earlier, but a formal account of set theory had to wait until Zermelo published his axioms. People (including Cantor himself) had found paradoxes in set theory or what they understood to be set theory. The search for the "right" foundations continued, while alternative systems of formal logic were proposed (for example, Schönfinkel's combinatory logic, which does not use any variables). The period that followed the beginnings saw what is today called the foundational crisis of mathematics.

    Axiomatic proof systems

    David Hilbert and his school systematized mathematical logic, the application of formal logic to formalizing mathematics, and "metamathematics".

    Around 1920, the “Hilbert style” axiomatic approach, as it is often called, was known to everyone and dominated the logical scene [1]

    Hilbert systems are a uniform presentation of formal deductive reasoning and can be treated as synonymous with axiomatic proof systems. There are "logical axioms" that capture the process of sound reasoning, and there are "mathematical axioms" which capture the mathematical theory we want to formalize. The only inference rules are the rule of substitution and modus ponens. A proof is a sequence of formulas, with each formula being either an axiom or derived from previous formulas by application of the inference rules. If a formula $A$ is derivable this way, this is written $\vdash A$.

    Later, John von Neumann refined axiomatic proof systems using axiom schemes: axiom schemes contain metavariables, all instances count as axioms, and modus ponens becomes the only rule. Here is such a Hilbert-style axiomatic proof system for propositional logic, due to Jan Łukasiewicz: $$ \begin{array}{l} 1.~ A \rightarrow (B \rightarrow A) \\ 2.~ (A \rightarrow (B \rightarrow C)) \rightarrow ((A \rightarrow B) \rightarrow (A \rightarrow C)) \\ 3.~ (\neg B \rightarrow \neg A) \rightarrow (A \rightarrow B) \\ \end{array} $$

    The rule of modus ponens lets us infer from $A \rightarrow B$ and $A$ the new formula $B$. Note that the other operators $\vee$ and $\wedge$ are considered abbreviations; this is referred to as Ockham's razor, since it uses a minimum of primitives: $A \vee B$ is turned into $\neg A \rightarrow B$, and $A \wedge B$ is $\neg (A \rightarrow \neg B)$.
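
    As a small worked example (standard, and not taken from the sources cited here), this is how $A \rightarrow A$ is derived from schemes 1 and 2 with modus ponens; in step 1 take $B := A \rightarrow A$, in step 2 take $B := A \rightarrow A$ and $C := A$, in step 4 take $B := A$: $$ \begin{array}{lll} 1. & A \rightarrow ((A \rightarrow A) \rightarrow A) & \mathrm{(scheme~1)} \\ 2. & (A \rightarrow ((A \rightarrow A) \rightarrow A)) \rightarrow ((A \rightarrow (A \rightarrow A)) \rightarrow (A \rightarrow A)) & \mathrm{(scheme~2)} \\ 3. & (A \rightarrow (A \rightarrow A)) \rightarrow (A \rightarrow A) & \mathrm{(modus~ponens,~1~and~2)} \\ 4. & A \rightarrow (A \rightarrow A) & \mathrm{(scheme~1)} \\ 5. & A \rightarrow A & \mathrm{(modus~ponens,~4~and~3)} \\ \end{array} $$ Even this trivial theorem requires a clever choice of instances, and the derivation's structure tells us little about why $A \rightarrow A$ holds.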

    It is amazing that this axiomatic proof system is complete: all tautologies can be proven from a few axiom schemes. Many texts on formal logic still use axiomatic proof systems, presumably because these permit one to talk about formal logic with a minimal amount of machinery for syntax and logical axioms; appealing to the deduction theorem and admissible rules hides much of the structure of the proof, which is not considered interesting to begin with; all that matters is the fact that something can be proven.

    [1] Stanford Encyclopedia of Philosophy The Development of Proof Theory

    [2] Stanford Encyclopedia of Philosophy The Emergence of First-Order Logic

    What is intuitionistic logic?

    L.E.J. Brouwer rejected the idea that logic would be prior to mathematics and proposed intuitionism as a philosophy of mathematics; he understood mathematics as mental constructions that come first, before any rendering in logic.

    We summarize the intuitionist positions as follows:

    • An intuitionist rejects the law of the excluded middle $A \vee \neg A$ when $A$ is a statement about infinite sets
    • An intuitionist rejects the idea of an "actual infinite" altogether and regards infinity as "potential infinity", approximable by repeating a process, but never completed.
    • An intuitionist aims to base all mathematics on effective constructions that can be verified based on intuition (such as the intuition of being able to carry out a countable number of steps)

    These restrictions avoid paradoxes of naive set theory, at the cost of rejecting large parts of established mathematical practice.

    Brouwer's student Heyting [3] and also Kolmogorov [4] formulated axioms of intuitionistic logic as axiomatic proof systems.

    Their efforts are what enabled further discussion (beyond philosophical disagreement) and defined intuitionistic logic [5]. Today, maybe due to the multiple available proof systems for formal logic, many sources define intuitionistic logic informally by reproducing the "Brouwer-Heyting-Kolmogorov (BHK) interpretation" [wikipedia]. This is a description of what counts as an acceptable proof. We skip a discussion of it because we will be giving natural deduction rules below. The key point is that the structure of the proof matters now; the structure contains the constructions that the intuitionist cares about.

    Intuitionistic logic is considered only one form of constructive logic and constructive mathematics; there are other logics that are compatible with the BHK interpretation, and multiple possible interpretations of the words construction and function, and then I think most mathematicians do not bother with formal logic (while still caring about precise language and rigour). A different system of logic that is motivated by enabling constructive reasoning is Martin-Löf type theory (also called intuitionistic type theory or constructive type theory).

    [3] Arend Heyting (1930). Die formalen Regeln der intuitionistischen Logik., 3 Teile, In: Sitzungsberichte der preußischen Akademie der Wissenschaften. phys.-math. Klasse, 1930, 42–56, 57–71, 158–169.

    [4] Thierry Coquand. Kolmogorov's contribution to intuitionistic logic. 10.1007/978-3-540-36351-4_2.

    [5] Stanford Encyclopedia of Philosophy The Development of Intuitionistic Logic.

    Gentzen's natural deduction and sequent calculus

    Gerhard Gentzen was David Hilbert's student and sparked the entire field of structural proof theory with his seminal article and thesis that introduced both natural deduction and sequent calculus - in both intuitionistic and classical variants [6].

    Gentzen was not the first to invent natural deduction. Jan Łukasiewicz observed that the axiomatic systems of Frege, Russell and Hilbert do not reflect mathematical practice; mathematicians start their proofs from assumptions, applying "proof figures" to derive new results. His student Stanisław Jaśkowski came up with a calculus of natural deduction, where derivations from assumptions are drawn in boxes [7].

    Gentzen pointed out that proofs come with structure: a proof may contain a "detour", and such detours can always be removed, yielding a normal form. In order to demonstrate this, he reformulated natural deduction as the sequent calculus and proved normalization through his celebrated cut-elimination theorem. In the sequent calculus, lists of formulae are manipulated (the word sequent was chosen by Kleene as a rendering of Gentzen's "Sequenz"). Much later, Dag Prawitz proved the corresponding normalization result directly for natural deduction [8].

    Natural deduction formally and precisely reflects how a mathematician proves things, while still admitting a "natural" understanding of what is going on - unlike axiomatic proof systems, where the structure is artificial and ultimately does not matter, as everything is obtained by modus ponens from axioms.

    [6] Gentzen, Gerhard Karl Erich (1935). "Untersuchungen über das logische Schließen. I". Mathematische Zeitschrift. 39 (2): 176–210. doi:10.1007/BF01201353. (link)

    [7] Stanisław Jaśkowski (1934) On the Rules of Suppositions in Formal Logic Studia Logica 1, pp. 5–32 (reprinted in: Storrs McCall (ed.), Polish Logic 1920-1939, Oxford University Press, 1967 pp. 232–258)

    [8] Dag Prawitz (1965). "Natural Deduction : A proof theoretical study" (reissued 2006 Dover) \(\newcommand{\andintro}[0]{\color{blue}{(\wedge{\mathrm{\small INTRO}})}}\) \(\newcommand{\andelim}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM}})}}\) \(\newcommand{\andelimleft}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM/1}})}}\) \(\newcommand{\andelimright}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM/2}})}}\) \(\newcommand{\orintro}[0]{\color{blue}{(\vee{\mathrm{\small INTRO}})}}\) \(\newcommand{\orelim}[0]{\color{blue}{(\vee{\mathrm{\small ELIM}})}}\) \(\newcommand{\implintro}[0]{\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}}\) \(\newcommand{\implelim}[0]{\color{blue}{(\rightarrow{\mathrm{\small ELIM}})}}\) \(\newcommand{\notelim}[0]{\color{blue}{(\neg{\mathrm{\small ELIM}})}}\) \(\newcommand{\botelim}[0]{\color{blue}{(\bot{\mathrm{\small ELIM}})}}\)

    Intuitionistic natural deduction NJ

    We now look at the calculus Gentzen calls NJ, but for propositional logic. For notation, we use $\wedge$ instead of & for conjunction and $\rightarrow$ instead of $\supset$ for implication. Our formulae are defined as:

    • $\bot$ is a formula ('falsehood')
    • a propositional variable $p$ is a formula,
    • if $A$ is a formula, so is $\neg A$
    • if $A, B$ are formulae, so are $A \vee B$, $A \wedge B$ and $A \rightarrow B$.
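
    If we wanted to manipulate such formulae in a program, a direct transcription of this grammar could look as follows (a sketch in Haskell; the constructor names are my own):

    -- Formulae of intuitionistic propositional logic, following the grammar above.
    data Formula
      = Bot                   -- falsehood
      | Var String            -- propositional variable p
      | Not Formula           -- negation
      | Or  Formula Formula   -- disjunction
      | And Formula Formula   -- conjunction
      | Imp Formula Formula   -- implication
      deriving (Eq, Show)

    -- Negation could instead be treated as an abbreviation, as discussed later:
    neg :: Formula -> Formula
    neg a = Imp a Bot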

    We shall need to use inference rules and derivations. An inference rule has the form: $$ \frac{X_1 \quad \ldots \quad X_n}{Y} $$

    with the meaning that the $X_i$ are the premises and $Y$ is the conclusion of the rule. In natural deduction, each premise and each conclusion of an inference rule is simply a formula. A derivation is an arrangement of inference rules in tree form; the derived formula is at the bottom, and inference rules connect each conclusion to its premises above the bar. A derivation built up from (instances of) inference rules this way gives a proof of the formula at the root.

    The rules are as follows (the notation will be explained below): $$ \cfrac{A\quad B}{A \wedge B}~\color{blue}{(\wedge\mathrm{\small INTRO})} \quad \cfrac{A \wedge B}{A}~\color{blue}{(\wedge\mathrm{\small ELIM/1}}) \quad \cfrac{A \wedge B}{B}~\color{blue}{(\wedge\mathrm{\small ELIM/2})}$$ $$ \cfrac{A}{A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/1}}}) \quad \quad \frac{B}{A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/2}})} $$ $$ \cfrac{\begin{array}[cc]{} \\ {} \\ A \vee B \end{array} \quad \begin{array}[cc]{} [A] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array}\quad \begin{array}[cc]{} [B] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array} } {C}~\color{blue}{(\vee{\mathrm{\small ELIM}})} $$ $$ \cfrac{\begin{array}[cc]{} [A] \\ ~~\vdots \\ \bot \end{array} }{\neg A}~\color{blue}{(\neg{\mathrm{\small INTRO}})} \quad \cfrac{A \quad \neg{A}}{\bot}~\color{blue}{(\neg{\mathrm{\small ELIM}})} $$ $$ \cfrac{ \begin{array}[cc]{} [A] \\ ~~\vdots \\ \phantom{[}B\phantom{]} \end{array}}{A \rightarrow B} ~\implintro % % -> ELIM % \quad \quad \frac{A \rightarrow B \quad A}{B}~\implelim $$ $$ \cfrac{\bot}{A}~\botelim $$

    Each rule preserves validity, which ensures that derivations will produce correct results (soundness). If one does not consider the proof-theoretical explanation of what an inference rule is as sufficient, one needs to talk about models and interpretations.

    Moreover, the collection of rules is systematic in a different way: if we ignore the fact that the $\color{blue}{(\wedge{\mathrm{\small ELIM}})}$ and $\color{blue}{(\vee{\mathrm{\small INTRO}})}$ rules come in two variants each, we can say: for every logical operator $\neg, \wedge, \vee, \rightarrow$ there is one introduction rule and one elimination rule. In a sense, the introduction rule defines the meaning of the operator, and the elimination rule is ultimately a consequence of this definition: it removes precisely what was introduced! These are the "detours" in the structure of proofs, which can be removed.

    In order to avoid unnecessary complication, we need to convene that in rule $\botelim$, the conclusion $A$ is always different from $\bot$. If we omit the rule $\botelim$, we end up with minimal logic. Let us now take two rules and read what they are saying:

    • $\andelimleft$ is the (left) elimination rule for $\wedge$, and says: if you have a proof of $A \wedge B$, then you also have a proof of $A$
    • $\andintro$ is the introduction rule for $\wedge$, and says: if you have a proof of $A$ and a proof of $B$, you may combine them into a proof of $A \wedge B$.

    Assumptions. As mentioned, a derivation may start from any formula $A$ at the leaves. In the course of the proof, assumptions may be discharged, as seen in the inference rules $\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}$, $\color{blue}{(\neg{\mathrm{\small INTRO}})}$ and $\color{blue}{(\vee{\mathrm{\small ELIM}})}$. The discharged assumption is then marked $[A]$ to indicate that the conclusion does not depend on it. The rule $\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}$ says: if you can prove $B$ from assumption $A$, then you have proven that $B$ follows from $A$, i.e. $A \rightarrow B$. Note that it is possible that several other, undischarged assumptions are still present in the derivation, and that the discharged assumption may have occurred multiple times in the subderivation, with all those occurrences being discharged at once.

    Let's go through an example proof of $X \vee (Y \wedge Z) \rightarrow (X \vee Y) \wedge (X \vee Z)$. For this, we will need a derivation that ends in $\implintro$. $$ \cfrac{ \begin{array}[cc]{} [X \vee (Y \wedge Z)] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} }{ X \vee (Y \wedge Z) \rightarrow (X \vee Y) \wedge (X \vee Z) }{\implintro} $$ For the upper part, we need $\orelim$. $$ \cfrac{ \begin{array}[cc]{} {\phantom{X}} \\ {\phantom{\vdots}} \\ X \vee (Y \wedge Z) \end{array} \quad \begin{array}[cc]{} [X] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} \quad \begin{array}[cc]{} [Y \wedge Z] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} }{ (X \vee Y) \wedge (X \vee Z) }{\orelim} $$ The two branches are now easy: $$ \cfrac{\cfrac{X}{(X \vee Y)}\orintro\quad\cfrac{X}{(X \vee Z)}\orintro} {(X \vee Y) \wedge (X \vee Z)}{\andintro} $$ $$ \cfrac{\cfrac{\cfrac{Y \wedge Z}{Y}\andelimleft}{(X \vee Y)}\orintro\quad\cfrac{\cfrac{Y \wedge Z}{Z}\andelimright}{(X \vee Z)}\orintro} {(X \vee Y) \wedge (X \vee Z)}{\andintro} $$

    Intuitionistic treatment of negation. What if we treated $\neg A$ as an abbreviation for $A \rightarrow \bot$? Then the role of $\color{blue}{(\neg{\mathrm{\small INTRO}})}$ is played by $\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}$ and the role of $\color{blue}{(\neg{\mathrm{\small ELIM}})}$ is played by $\color{blue}{(\rightarrow{\mathrm{\small ELIM}})}$. In fact, every derivation in the language with primitive $\neg$ can be rewritten into a derivation for the language where $\neg$ is an abbreviation. Neat!

    Normal form. A natural deduction proof is said to be in normal form if there is no detour, i.e. if there is no formula that is both the conclusion of an introduction rule and at the same time the major premise of an elimination rule. This is due to what Prawitz calls the inversion principle: whenever an elimination rule operates on the result of an introduction rule above it, the results of that elimination rule were already present among the premises of the introduction rule. A normal derivation will - roughly - take apart the assumptions at the leaves with elimination rules and then build up the conclusion with introduction rules. (The $\orelim$ rule complicates this a little, since it passes through a formula $C$ from its minor premises; the proof by induction on the structure of derivations will transform such proofs by rewriting them, pushing the application of $\orelim$ downwards.)

    Curry-Howard correspondence. As discussed last time, we can assign terms of simply-typed $\lambda$-calculus to these derivations. Propositions of IPC are then exactly the types of simply-typed $\lambda$-calculus, with $\wedge$ for product types and $\vee$ for sum (coproduct) types. Or, from the vantage point of logic, $\lambda$-terms are proof terms. Normalization of proofs corresponds to normalization of $\lambda$-terms; $\beta$-reduction corresponds to the removal of a detour involving $\implintro$ and $\implelim$.
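
    As an illustration, here is a proof term for the distributivity formula proved above, written in Haskell rather than raw $\lambda$-calculus (the function name is my own): Either is the sum type ($\vee$), pairs are the product type ($\wedge$), abstraction over the argument corresponds to $\implintro$, the pattern match on the argument to $\orelim$, and the constructors to the introduction rules.

    -- X ∨ (Y ∧ Z) → (X ∨ Y) ∧ (X ∨ Z), as a program.
    distrib :: Either x (y, z) -> (Either x y, Either x z)
    distrib (Left x)       = (Left x, Left x)       -- the [X] branch
    distrib (Right (y, z)) = (Right y, Right z)     -- the [Y ∧ Z] branch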

    Still on Curry-Howard, it's a bit less obvious how to interpret $\botelim$ as a typing rule; the answer is that it needs an $\mathsf{abort}$ constant which (similar to throwing an exception) passes control to an imaginary top-level function that never terminates. This and more is explained succinctly in Zena Ariola's slides [9], but I do want to point out how this yields a neat programmer's intuition for the convention regarding $\botelim$: obviously, a program $\mathsf{abort}(\ldots\mathsf{abort}(t)\ldots)$ could always be replaced by a simple $\mathsf{abort}(t)$.
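
    For a concrete rendering of this in Haskell (my own sketch, using the standard Data.Void module rather than Ariola's $\mathsf{abort}$ constant): negation becomes a function into the empty type, and absurd plays the role of $\botelim$.

    import Data.Void (Void, absurd)

    -- Negation as "implies falsehood": not A becomes A -> Void.
    type Not a = a -> Void

    -- The typing-rule reading of bot-ELIM: from a proof of falsehood, conclude anything.
    exFalso :: Void -> a
    exFalso = absurd

    -- The convention that the conclusion of bot-ELIM differs from bot matches the
    -- observation that nesting is pointless: exFalso (exFalso v) proves nothing
    -- more than exFalso v.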

    [9] Zena Ariola. The interplay between logic and computation. Slides from a talk at "Women in Logic 2019".

    Natural deduction in sequent calculus style

    There is something odd about the above presentation: we have propositions at the leaves of a derivation which are temporarily treated as assumptions and then cease to be assumptions when they are discharged. It is also possible to not discharge them, or discharge only some of them when making a proof step. Here, we present an alternative formulation that makes it clear what happens with assumptions at every inference step.

    (I thought for a long time that it was type theorists or programming language researchers who invented this, suspecting typographical reasons, but to my surprise I learned that it was in fact Gentzen [10] himself who introduced this alternative in 1936, in the paper where he proved the consistency of Peano arithmetic using transfinite induction.)

    We now present intuitionistic natural deduction "in sequent calculus style". Derivations are still tree-shaped; however, instead of plain formulas, premises and conclusions are now sequents of the form $\Gamma \vdash A$, where $\Gamma$ is a (possibly empty) set of formulae. The meaning of a sequent is "from assumptions $\Gamma$, one can (intuitionistically) prove $A$". We write the union of such sets with a comma, omit braces around singleton sets and treat $\neg A$ as an abbreviation for $A \rightarrow \bot$ so we don't have to give any additional rules. Then, the rules look like this: $$ \cfrac{\Gamma \vdash A\quad \Delta \vdash B}{\Gamma, \Delta \vdash A \wedge B}~\color{blue}{(\wedge\mathrm{\small INTRO})} \quad \cfrac{\Gamma \vdash A \wedge B}{\Gamma \vdash A}~\color{blue}{(\wedge\mathrm{\small ELIM/1}}) \quad \cfrac{\Gamma \vdash A \wedge B}{\Gamma \vdash B}~\color{blue}{(\wedge\mathrm{\small ELIM/2})}$$ $$ \cfrac{\Gamma \vdash A}{\Gamma \vdash A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/1}}}) \quad \quad \frac{\Gamma \vdash B}{\Gamma \vdash A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/2}})} $$ $$ \cfrac{\Gamma \vdash A \vee B \quad\quad \Delta, A \vdash C \quad\quad \Theta, B \vdash C}{\Gamma, \Delta, \Theta \vdash C}~\color{blue}{(\vee{\mathrm{\small ELIM}})} $$ $$ \cfrac{\Gamma, A \vdash B}{\Gamma \vdash A \rightarrow B}~\implintro % % -> ELIM % \quad \quad \cfrac{\Gamma \vdash A \rightarrow B \quad \Delta \vdash A}{\Gamma, \Delta \vdash B}~\implelim $$ $$ \cfrac{\Gamma \vdash \bot}{\Gamma \vdash A}~\botelim $$

    These rules make it clear that it is now "compulsory" to discharge assumptions. When a rule has multiple premises, the assumptions are combined. By choosing a broad enough set of assumptions ahead of time (weakening), the different sets of assumptions can be assumed to be the same. When moving towards the sequent calculus, one can do away with the treatment of assumptions as sets and treat them as ordered sequences, together with structural rules like exchange and contraction; removing one or more of the structural rules then yields substructural logics and their proof theory.
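
    As a tiny example (my own, starting from the assumption sequent $A \vdash A$, which the rules above leave implicit as the base case of derivations), here is a derivation of $\vdash A \wedge B \rightarrow B \wedge A$: $$ \cfrac{ \cfrac{ \cfrac{A \wedge B \vdash A \wedge B}{A \wedge B \vdash B}~\andelimright \quad\quad \cfrac{A \wedge B \vdash A \wedge B}{A \wedge B \vdash A}~\andelimleft }{A \wedge B \vdash B \wedge A}~\andintro }{\vdash A \wedge B \rightarrow B \wedge A}~\implintro $$ At the $\andintro$ step the two assumption sets are combined, and their union is again just $\{A \wedge B\}$; the final $\implintro$ step discharges it.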

    [10] Gerhard Gentzen (1936). "Die Widerspruchsfreiheit der reinen Zahlentheorie".

    Conclusion

    We are at the end of our tour; I hope this can provide a good starting point for studying the multiple connections between $\lambda$-calculus, logic and proof theory.

    Sound reasoning - preferably automated - and some form of formal logic and proof are going to be key when modeling some domain and constructing correct software for it. Programmers are used to dealing with precise formulations (programming languages, specs), but only a subset will be interested in the "formalization of all mathematics"; a programmer with a basic understanding of simply-typed $\lambda$-calculus should ideally have a very clear path to understanding logic, proof assistants, type theory - I don't think that this is the case today.

    The activities in constructing models and software may be similar to mathematics, but they are certainly not the same. Each domain is a mathematical universe, with random and often contradictory pieces of information waiting to be organized in "islands of consistency". Hopefully we will one day reach some widespread understanding and usable tool support for formal reasoning that can serve both practical informatics as well as formalizing mathematics.

    [If you want to discuss, the HN thread is a great place to leave comments]