Logika v računalništvu (Simpson)
Warning: The notes are not optimized for HTML (yet). Without warranty.
Lecture 1: Propositional logic
Elimination is by case analysis: given \(\phi \lor \psi\), if both \(\phi \vdash \theta\) and \(\psi \vdash \theta\), conclude \(\theta\).
\begin{gather*} \begin{bprooftree} \AxiomC{\(\phi \lor \psi\)} \AxiomC{\(\phi\)} \noLine \def\extraVskip{0pt} \UnaryInfC{\(\vdots\)} \noLine \UnaryInfC{\(\theta\)} \def\extraVskip{2pt} \AxiomC{\(\psi\)} \noLine \def\extraVskip{0pt} \UnaryInfC{\(\vdots\)} \noLine \UnaryInfC{\(\theta\)} \def\extraVskip{2pt} \TrinaryInfC{\(\theta\)} \end{bprooftree} \end{gather*}\(\rightarrow i\) discharges the assumption \(\phi\). \(\rightarrow e\) is modus ponens.
\(\neg i\) discharges \(\phi\). \(\bot e\) is ex falso quodlibet: from falsehood, anything follows.
The rule that distinguishes classical from intuitionistic logic:
\begin{prooftree} \AxiomC{\(\neg \phi\)} \noLine \def\extraVskip{0pt} \UnaryInfC{\(\vdots\)} \noLine \UnaryInfC{\(\bot\)} \def\extraVskip{2pt} \RightLabel{raa} \UnaryInfC{\(\phi\)} \end{prooftree}If assuming \(\neg \phi\) leads to \(\bot\), conclude \(\phi\). Compare with \(\neg i\): there, assuming \(\phi\) and deriving \(\bot\) gives \(\neg \phi\). RAA goes the other direction.
\[ p \land (q \lor r) \rightarrow (p \land q) \lor (p \land r) \]
\begin{logicproof}{3} \begin{subproof} p \land (q \lor r) & \(\text{assumption} \) \\ p & \(\land e(1) \) \\ q \lor r & \(\land e(1) \) \\ \begin{subproof} q & \(\text{assumption} \) \\ p \land q & \(\land i(2)(4) \) \\ (p \land q) \lor (p \land r) & \(\lor i(5)\) \end{subproof} \begin{subproof} r & \(\text{assumption} \) \\ p \land r & \(\land i(2)(7) \) \\ (p \land q) \lor (p \land r) & \(\lor i(8)\) \end{subproof} (p \land q) \lor (p \land r) & \(\lor e(3)(4{-}6)(7{-}9)\) \end{subproof} p \land (q \lor r) \rightarrow (p \land q) \lor (p \land r) & \(\rightarrow i (1{-}10)\) \end{logicproof}A proof of a formula \(\phi\) from assumptions \(\Gamma\) is a sequence of formulas following the inference rules such that all assumptions used in the proof belong to \(\Gamma\). We write \(\Gamma \vdash \phi\).
\[ \Gamma \vdash \phi \iff \bigwedge \Gamma \rightarrow \phi \text{ is a tautology} \]
Soundness (\(\Rightarrow\)): If \(\Gamma \vdash \phi\), then \(\phi\) is true under every truth assignment making all formulas in \(\Gamma\) true.
Completeness (\(\Leftarrow\)): If \(\phi\) is true under every truth assignment making all formulas in \(\Gamma\) true, then \(\Gamma \vdash \phi\).
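Validity claims like the one for the distributivity formula above can be checked mechanically by brute force over all truth assignments; a minimal Python sketch:

```python
from itertools import product

# Brute-force tautology check for p ∧ (q ∨ r) → (p ∧ q) ∨ (p ∧ r),
# reading the implication a → b as (not a) or b.
def phi(p, q, r):
    return (not (p and (q or r))) or ((p and q) or (p and r))

# A formula is a tautology iff it is true under every truth assignment.
assert all(phi(p, q, r) for p, q, r in product([False, True], repeat=3))
print("tautology")
```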
Propositional formulas correspond to types, and proofs correspond to terms (judgements \(t : A\)):
| Formula | Type |
|---|---|
| \(p\) (atomic) | \(\alpha\) (atomic type) |
| \(\phi \land \psi\) | \(A \times B\) (product) |
| \(\phi \lor \psi\) | \(A + B\) (coproduct/sum) |
| \(\phi \rightarrow \psi\) | \(A \rightarrow B\) (function) |
| \(\bot\) | \(0\) (empty type) |
Computation rules: \(\pi_1(s, t) \longrightarrow s\) and \(\pi_2(s, t) \longrightarrow t\).
Computation rule (beta reduction): \((\lambda x : A . t)\, u \longrightarrow t[x := u]\).
Substitution requires first renaming bound variables in \(t\) so they are distinct from free variables in \(u\).
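Capture-avoiding substitution and beta reduction can be sketched in Python on a small term representation (the tuple encoding and the helper names `free_vars`, `subst`, `beta` are illustrative, not from the notes):

```python
# A minimal untyped lambda-calculus sketch.
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a)

def free_vars(t):
    tag = t[0]
    if tag == 'var':
        return {t[1]}
    if tag == 'lam':
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

fresh_counter = 0
def fresh(x):
    """Generate a variable name not used before."""
    global fresh_counter
    fresh_counter += 1
    return f"{x}_{fresh_counter}"

def subst(t, x, u):
    """t[x := u], renaming bound variables to avoid capture."""
    tag = t[0]
    if tag == 'var':
        return u if t[1] == x else t
    if tag == 'app':
        return ('app', subst(t[1], x, u), subst(t[2], x, u))
    y, body = t[1], t[2]
    if y == x:
        return t                          # x is shadowed under this binder
    if y in free_vars(u):                 # rename the binder first
        z = fresh(y)
        body = subst(body, y, ('var', z))
        y = z
    return ('lam', y, subst(body, x, u))

def beta(t):
    """One beta step at the root: (λx. t) u → t[x := u]."""
    if t[0] == 'app' and t[1][0] == 'lam':
        _, x, body = t[1]
        return subst(body, x, t[2])
    return t

# (λx. λy. x) y reduces to λy'. y: the bound y is renamed, not captured.
r = beta(('app', ('lam', 'x', ('lam', 'y', ('var', 'x'))), ('var', 'y')))
assert r[0] == 'lam' and r[1] != 'y' and r[2] == ('var', 'y')
```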
Construct a term of type \(A \times B \rightarrow B \times A\).
\begin{logicproof}{2} \begin{subproof} x : A \times B & assumption \\ \pi_1(x) : A & \(\Pi e_1 (1)\) \\ \pi_2(x) : B & \(\Pi e_2 (1)\) \\ (\pi_2(x), \pi_1(x)) : B \times A & \(\Pi i (3)(2)\) \end{subproof} \lambda x : A \times B . (\pi_2(x), \pi_1(x)) : A \times B \rightarrow B \times A & \(\rightarrow i (1{-}4)\) \end{logicproof}Elimination by case analysis:
\begin{prooftree} \AxiomC{\(s : A + B\)} \AxiomC{\(x : A\)} \noLine \def\extraVskip{0pt} \UnaryInfC{\(\vdots\)} \noLine \UnaryInfC{\(t : C\)} \def\extraVskip{2pt} \AxiomC{\(y : B\)} \noLine \def\extraVskip{0pt} \UnaryInfC{\(\vdots\)} \noLine \UnaryInfC{\(u : C\)} \def\extraVskip{2pt} \TrinaryInfC{\(\mathbf{case}\; s\; (\iota_1(x) \mapsto t \mid \iota_2(y) \mapsto u) : C\)} \end{prooftree}Computation rules:
\begin{align*} \mathbf{case}\; \iota_1(s)\; (\iota_1(x) \mapsto t \mid \iota_2(y) \mapsto u) &\longrightarrow t[x := s] \\ \mathbf{case}\; \iota_2(s)\; (\iota_1(x) \mapsto t \mid \iota_2(y) \mapsto u) &\longrightarrow u[y := s] \end{align*}The empty type \(0\) has no introduction rule (no closed terms of type \(0\) can be constructed).
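The computation rules for products and sums can be modelled directly in Python (the tags and the `case` helper are illustrative names; pairs model \(A \times B\), tagged values model \(A + B\)):

```python
# Projections for the product type A × B, modelled as Python tuples.
def pi1(p): return p[0]
def pi2(p): return p[1]

# Injections for the sum type A + B, modelled as tagged values.
def iota1(a): return ('iota1', a)
def iota2(b): return ('iota2', b)

def case(s, f, g):
    """case s (iota1(x) ↦ f(x) | iota2(y) ↦ g(y))"""
    tag, v = s
    return f(v) if tag == 'iota1' else g(v)

# The computation rules, as executable checks:
assert pi1((3, 'b')) == 3 and pi2((3, 'b')) == 'b'
assert case(iota1(5), lambda x: x + 1, lambda y: 0) == 6
assert case(iota2(5), lambda x: x + 1, lambda y: 0) == 0
```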
\begin{prooftree} \AxiomC{\(t : 0\)} \UnaryInfC{\(\mathbf{empty}(t) : A\)} \end{prooftree}This is the type-theoretic counterpart of \(\bot e\) (ex falso quodlibet).
Lecture 2: First order logic and dependent type theory
Extends propositional logic with predicates. \[ \phi ::= p(t_1, \dots, t_n) \mid \phi \land \phi \mid \phi \lor \phi \mid \phi \rightarrow \phi \mid \bot \mid \forall x . \phi \mid \exists x . \phi \] We add predicate symbols expressing relations, and constants and function symbols expressing elements. Every function symbol has an arity \(\geq 1\), while predicate symbols have an arity \(\geq 0\).
Terms express elements: \[ t ::= x \mid c \mid f(t_1, \dots, t_n) \] where \(c\) represents constants, \(f\) a function symbol of arity \(n\).
- Constants: \(0, 1\)
- Function symbols: \(+, \cdot\) both of arity 2.
Examples of terms: \(z\), \(x + y\), \(x \cdot (y + (z + 1))\).
Predicates: \(=\) and \(\leq\), both of arity 2.
Example formula (with one free variable \(x\)): \[ \exists y . \exists z . \exists w . \exists v . \; x = y \cdot y + z \cdot z + w \cdot w + v \cdot v \]
To integrate predicate logic and type theory, introduce a special type \(\Prop\) — the type of propositions, called the propositional universe.
Often we make the context explicit, e.g.: \[ p : \Prop,\; q : \Prop \vdash p \land q \rightarrow q \land p : \Prop \]
An alternative is to include connectives and quantifiers as typed constants: \[ \rightarrow,\; \land,\; \lor : \Prop \rightarrow \Prop \rightarrow \Prop \] \[ \exists,\; \forall : (A \rightarrow \Prop) \rightarrow \Prop \]
- \(t : A\) — \(t\) is of type \(A\)
- \(\phi : \Prop\) — \(\phi\) is a proposition
- \(\phi\) — \(\phi\) is true
- In \(\forall i\), \(z\) must not appear free outside the box (i.e. not free in the context).
- In \(\forall e\), every free occurrence of \(x\) in \(\phi\) is replaced by \(t\), with renaming of bound variables if necessary.
In \(\exists e\): the conclusion \(\psi : \Prop\) is asserted outside the box, so \(\psi\) must not contain \(x\) free.
Thus \(\underbrace{p : A \rightarrow \Prop}_{p \text{ is a unary predicate}} \vdash \forall x : A . px : \Prop\).
A unary predicate is a function from a type into propositions. For binary predicates we may use currying \(q : A \rightarrow A \rightarrow \Prop\) or pairs \(q : A \times A \rightarrow \Prop\).
\[ p : A \rightarrow \Prop,\; q : A \rightarrow \Prop \;\vdash\; (\forall x : A . px) \rightarrow (\forall x : A . qx) \rightarrow (\forall x : A . px \land qx) \] So to speak, the universal quantifier commutes with conjunction. Note this is an instance of RAPL (right adjoints preserve limits).
Replacing the grammar for types with a universe \(\Type\), our judgements are now:
- \(\phi\) — \(\phi\) is true
- \(t : A\) — \(t\) is of type \(A\)
- \(\phi : \Prop\) — \(\phi\) is a proposition
- \(A : \Type\) — \(A\) is a type
Formation rules for type constructors:
\begin{gather*} \begin{bprooftree} \AxiomC{\(A : \Type\)} \AxiomC{\(B : \Type\)} \BinaryInfC{\(A \times B : \Type\)} \noLine \UnaryInfC{\(A + B : \Type\)} \noLine \UnaryInfC{\(A \rightarrow B : \Type\)} \end{bprooftree} \quad \begin{bprooftree} \AxiomC{} \UnaryInfC{\(0 : \Type\)} \end{bprooftree} \end{gather*}Thus when we introduce a concept, we give it four kinds of rule:
- Formation rules
- Introduction rules
- Elimination rules
- Computation rules
\[ \Pi (n : \mathbb{N}) . \operatorname{Vector}_{\mathbb{N}}\, n \] is the type of functions \(f\) such that for each \(n : \mathbb{N}\), \(f(n) : \operatorname{Vector}_{\mathbb{N}}\, n\). Examples:
- \(n \mapsto \overbrace{0 \cdots 0}^{n}\) — the zero sequence of length \(n\)
- \(n \mapsto 1\, 2 \cdots n\) — the counting sequence of length \(n\)
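Modelling \(\operatorname{Vector}_{\mathbb{N}}\, n\) as Python lists of length \(n\), the two examples become ordinary functions whose result length tracks the argument (a sketch only; Python does not check the dependency):

```python
def zeros(n):
    """n ↦ 0 ⋯ 0, the zero sequence of length n."""
    return [0] * n

def counting(n):
    """n ↦ 1 2 ⋯ n, the counting sequence of length n."""
    return list(range(1, n + 1))

assert zeros(3) == [0, 0, 0]
assert counting(4) == [1, 2, 3, 4]
# The "dependent" part: the length of the output is the input.
assert all(len(zeros(n)) == n and len(counting(n)) == n for n in range(10))
```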
Lecture 3: Dependent type theory 2
The type \(\operatorname{nat}\) is given by
\begin{gather*} \begin{bprooftree} \AxiomC{} \RightLabel{formation rule} \UnaryInfC{\(\operatorname{nat} : \Type\) } \end{bprooftree} \end{gather*} \begin{gather*} \begin{bprooftree} \AxiomC{} \UnaryInfC{\(0 : \operatorname{nat}\) } \end{bprooftree} \quad \begin{bprooftree} \AxiomC{\(t : \operatorname{nat}\)} \RightLabel{Introduction rules} \UnaryInfC{\(\operatorname{succ} t : \operatorname{nat}\) } \end{bprooftree} \end{gather*}Similarly, the type \(\operatorname{List} A\) of lists over a type \(A\) is given by
- A constant \(\operatorname{nil} : \operatorname{List} A\)
- A map \(\operatorname{cons} : A \rightarrow \operatorname{List} A \rightarrow \operatorname{List} A\)
Similarly, the type \(\operatorname{Vec} A \, n\) of length-\(n\) vectors over \(A\) is given by
- A constant \(\operatorname{nil} : \operatorname{Vec} A \, 0\)
- A map \(\operatorname{cons} : A \rightarrow \operatorname{Vec} A \, n \rightarrow \operatorname{Vec} A \, (\operatorname{succ} n)\)
The map including the dependent \(n\) means we really have a family of maps, which we give as a dependent product: \[ \operatorname{cons} : \prod(n : \operatorname{nat}) .\, A \rightarrow \operatorname{Vec} A \, n \rightarrow \operatorname{Vec} A \, (\operatorname{succ} n) \] Hence why \(\operatorname{cons}\) takes the additional argument \(n\). Vectors form a dependent type.
We also have the computation rule \[ (\lambda x : A . t) u \longrightarrow t[u/x]. \]
Note that these rules match the rules for \(\rightarrow\). In type theory, once we have dependent products, we no longer need the \(\rightarrow\) type as a primitive, because it is definable: \(\Pi(x : A) . B\) reduces to the ordinary function type \(A \rightarrow B\) when \(B\) does not depend on \(x\).
We also define the computation rules
\begin{align*} \pi_1(s, t) &\longrightarrow s \\ \pi_2(s, t) &\longrightarrow t \end{align*}Intuitively, \(\Sigma(x : A). B\) is the type where elements are tagged elements \[ \operatorname{in}_a(b) \] where \(a : A\) and \(b : B[a/x]\). Equivalently, we can give the tagging with pairs \((a, b)\) for \(a : A\) and \(b : B[a/x]\).
We can define product types using dependent sums via \[ A \times B := \Sigma(x : A) . \, B \] where \(B\) does not depend on \(x\).
- \(0\) has no elements
- \(A \times B\) has pairs \((a, b)\) where \(a : A\) and \(b : B\)
- \(A + B\) has tagged elements \(\iota_1(a)\) where \(a : A\) or \(\iota_2(b)\) where \(b : B\)
- \(A \rightarrow B\) has functions of domain \(A\) and codomain \(B\)
- \(\Pi (x : A).\, B\) has functions mapping elements \(a : A\) to elements of type \(B[a/x]\).
- \(\Sigma (x : A). \, B\) has pairs \((a, b)\) where \(a : A\) and \(b : B[a/x]\)
We can also consider a primitive unit type.
- \(\bot\) has no proof.
- A proof of \(\phi \land \psi\) is a pair \((a, b)\) where \(a\) proves \(\phi\) and \(b\) proves \(\psi\).
- A proof of \(\phi \lor \psi\) is either \(\iota_1(a)\) where \(a\) is a proof of \(\phi\) or \(\iota_2(b)\) where \(b\) is a proof of \(\psi\).
- A proof of \(\phi \rightarrow \psi\) is a function that maps a proof of \(\phi\) to a proof of \(\psi\).
- A proof of \(\forall x : A . \phi\) is a function that takes an element \(a : A\) to a proof of \(\phi[a/x]\).
- A proof of \(\exists x : A . \phi\) is a pair \((a, b)\) where \(a : A\) and \(b\) is a proof of \(\phi[a/x]\).
We replace logical formulas (propositions) with types whose elements are the proofs of those propositions.
\begin{align*} \bot &\longrightarrow 0 \\ \phi \land \psi &\longrightarrow A \times B \\ \phi \lor \psi &\longrightarrow A + B \\ \phi \rightarrow \psi &\longrightarrow A \rightarrow B \\ \forall x : A .\, \phi &\longrightarrow \Pi(x : A) . \, B \\ \exists x : A .\, \phi &\longrightarrow \Sigma(x : A) . \, B \end{align*}With this approach we do not need an explicit propositional universe \(\Prop\). This works, but there is a better way.
This is the modern approach to embedding logic in type theory.
- If \(\phi : \Prop\), then \(\phi : \Type\), i.e. \(\Prop\) is a subuniverse of \(\Type\)
\(\Prop\) is closed under \(\Pi\) types, meaning
\begin{prooftree} \AxiomC{\(A : \Type\)} \noLine \UnaryInfC{\(x : A\)} \noLine \UnaryInfC{\(\phi : \Prop\)} \UnaryInfC{\(\Pi(x : A). \, \phi : \Prop\) } \end{prooftree}- If \(\phi : \Prop\), then \(\phi\) is subterminal: if \(a, b : \phi\), then \(a = b\).
- We also take \(\Prop : \Type\) - impredicativity.
Type theory with \(\Pi\) as the only type constructor is enough for logic; adding inductive types allows us to do mathematics and computer science. The connectives and quantifiers are definable:
- \(\bot\): \(\Pi(p : \Prop) . \,p\)
- \(\phi \land \psi\): \( \Pi (p : \Prop) . (\phi \rightarrow \psi \rightarrow p) \rightarrow p \)
- \(\phi \lor \psi\):\( \Pi (p : \Prop) . (\phi \rightarrow p) \rightarrow (\psi \rightarrow p) \rightarrow p \)
- \(\phi \rightarrow \psi\) : \(\phi \rightarrow \psi\)
- \(\forall x : A . \phi\) :\( \Pi (x : A) . \phi\)
- \(\exists x : A . \phi\): \( \Pi (p : \Prop) .\, \left( \Pi(x : A) . (\phi \rightarrow p) \right) \rightarrow p\)
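These impredicative encodings can be written down directly in a proof assistant; a sketch in Lean 4 (the primed names are ours, chosen to avoid clashing with Lean's built-in inductive `False`, `And`, `Or`, `Exists`, which are equivalent):

```lean
def False'  : Prop := ∀ p : Prop, p
def And' (φ ψ : Prop) : Prop := ∀ p : Prop, (φ → ψ → p) → p
def Or'  (φ ψ : Prop) : Prop := ∀ p : Prop, (φ → p) → (ψ → p) → p
def Exists' {A : Type} (φ : A → Prop) : Prop :=
  ∀ p : Prop, (∀ x : A, φ x → p) → p

-- Introduction and elimination are definable from the encoding alone:
theorem and'_intro {φ ψ : Prop} (h₁ : φ) (h₂ : ψ) : And' φ ψ :=
  fun _ k => k h₁ h₂

theorem and'_left {φ ψ : Prop} (h : And' φ ψ) : φ :=
  h φ (fun a _ => a)
```

Note how `and'_left` instantiates the quantified `p` at \(\phi\) itself, which is exactly where impredicativity (\(\Prop : \Type\) with \(\Prop\) closed under \(\Pi\)) is used.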
Lecture 4: Dependent type theory 3
We have the basic judgement forms
- \(A : \Type\)
- \(t : A\) wherein we had to prove \(A : \Type\)
- \(s \equiv t : A\) meaning \(s\) and \(t\) are judgementally equal.
and the derived judgment forms
- \(\phi : \Prop\) is given by an instance of \(t : A\) and \(A : \Type\)
- \(\phi\) true means \(\_ : \phi\), so \(\phi\) must be inhabited. We say the term is a witness for \(\phi\).
We will look at judgemental equality today. It turns out to be crucial.
We want to produce an elimination rule for the naturals.
Instead of a rule-based formulation, we assume typed constants that are "equivalent" to the rules. Our introduction rules are terms \(0 : \operatorname{nat}\) and \(\operatorname{succ} : \operatorname{nat} \rightarrow \operatorname{nat}\). Note that we cannot derive that \(\operatorname{succ}\) is a function by our previous definition, but we can define \(\lambda x : \operatorname{nat} . \operatorname{succ} x : \operatorname{nat} \rightarrow \operatorname{nat}\).
Likewise, rather than defining the induction rule as above, we can formulate it as an axiom in logic.
\[ \operatorname{Ind} : \forall P : \operatorname{nat} \rightarrow\Prop .\, \left( P0 \rightarrow (\forall x : \operatorname{nat} .\, Px \rightarrow P (\operatorname{succ} x)) \rightarrow \forall x : \operatorname{nat} .\, Px \right) \]
A mathematical proof in set theory derives the recursion principle from this: for any set \(A\) with \(a \in A\) and \(f : A \rightarrow A\), there exists a unique function \(r : \mathbb{N} \rightarrow A\) such that \(r(0) = a\) and \(r(n + 1) = f(r(n))\).
Building on the classical intuition, we generalize the induction axiom by replacing the \(\forall\) with the dependent product, constructing an explicit recursor function.
\[ R_{\operatorname{nat}} : \prod P : \operatorname{nat} \rightarrow \Type .\, P0 \rightarrow \left( \prod x : \operatorname{nat} .\, Px \rightarrow P(\operatorname{succ} x) \right) \rightarrow \prod x : \operatorname{nat}.\, Px \]
We can use the recursor to formulate a computation rule.
Say we have \(P : \operatorname{nat} \rightarrow \Type\), \(b : P0\), \(f : \prod (x : \operatorname{nat}) .\, Px \rightarrow P(\operatorname{succ} x)\). Then we have a term \[ R_{\operatorname{nat}}P\, b\, f\, : \prod(x : \operatorname{nat}) .\, Px \] which we can apply as
\begin{align*} R_{\operatorname{nat}} P\, b\, f\, 0 &\longrightarrow b : P0\\ R_{\operatorname{nat}} P\, b\, f\, (\operatorname{succ} n) &\longrightarrow f \, n\, \left( R_{\operatorname{nat}} P\, b\, f\, n\right) : P(\operatorname{succ} n) \end{align*}We define the function
\begin{align*} \operatorname{add} : \operatorname{nat} &\rightarrow \operatorname{nat}\rightarrow \operatorname{nat}\\ \operatorname{add} &:= \lambda m : \operatorname{nat} .\, \lambda n : \operatorname{nat} .\, R_{\operatorname{nat}} (\lambda \_ : \operatorname{nat} .\, \operatorname{nat}) \,m\, (\lambda\_ : \operatorname{nat} .\, \lambda x : \operatorname{nat} .\, \operatorname{succ} x) \, n \end{align*}Let us prove \(1 + 1 = 2\).
\begin{align*} \operatorname{add} \, (\operatorname{succ} 0)\, (\operatorname{succ} 0) &\longrightarrow^{ * } R_{\operatorname{nat}}\, (\lambda \_ : \operatorname{nat} .\, \operatorname{nat})\, (\operatorname{succ} 0)\, (\lambda \_ : \operatorname{nat} .\, \lambda x : \operatorname{nat} .\, \operatorname{succ} x)\, (\operatorname{succ} 0) \\ &\longrightarrow (\lambda \_ : \operatorname{nat} .\, \lambda x : \operatorname{nat}.\, \operatorname{succ} x) \, 0 \, (R_{\operatorname{nat}}\, (\lambda \_ : \operatorname{nat} .\, \operatorname{nat}) \, (\operatorname{succ} 0) \, (\lambda \_ : \operatorname{nat} .\, \lambda x : \operatorname{nat} .\, \operatorname{succ} x)\, 0) \\ &\longrightarrow^{ * } \operatorname{succ} (R_{\operatorname{nat}} \, (\lambda \_ : \operatorname{nat} .\, \operatorname{nat}) \, (\operatorname{succ} 0)\, (\lambda \_ : \operatorname{nat} .\, \lambda x : \operatorname{nat} .\, \operatorname{succ} x)\, 0) \\ &\longrightarrow \operatorname{succ}(\operatorname{succ} 0) \end{align*}Note that a vector is given by \(\operatorname{nil} : \operatorname{Vec} \, A \, 0\) and \[ \operatorname{cons} : \prod (n : \operatorname{nat}) .\, A \rightarrow \operatorname{Vec} A\, n \rightarrow \operatorname{Vec}\, A\, (\operatorname{succ} n). \]
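The recursor for \(\operatorname{nat}\) and the definition of \(\operatorname{add}\) can be sketched in Python, representing naturals as nonnegative ints (the motive \(P\) is omitted, since Python is untyped; function names are illustrative):

```python
def R_nat(b, f, n):
    """Recursor for nat:
    R b f 0 → b;  R b f (succ k) → f k (R b f k)."""
    if n == 0:
        return b
    return f(n - 1, R_nat(b, f, n - 1))

def add(m, n):
    # add m n := R_nat with base m and step (λ_. λx. succ x)
    return R_nat(m, lambda _, x: x + 1, n)

assert add(1, 1) == 2          # the 1 + 1 = 2 computation above
assert add(3, 4) == 7
```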
We define the recursor for vectors by
\begin{align*} R_{\operatorname{Vec}} : &\prod P : \left(\prod n : \operatorname{nat} .\, \operatorname{Vec} A\,n \rightarrow \Type\right) .\\ & P\, 0\, \operatorname{nil} \rightarrow \\ &\left(\prod a : A.\, \prod n : \operatorname{nat} .\, \prod as : \operatorname{Vec} A\, n .\, P \,n\,as \rightarrow P\,(\operatorname{succ} n)\, (\operatorname{cons}\,n\,a\,as)\right) \rightarrow \\ &\prod n : \operatorname{nat} .\, \prod as : \operatorname{Vec} \, A\, n .\, P \,n\, as \end{align*}The computation rules are analogous: one case for \(\operatorname{nil}\) and one for \(\operatorname{cons}\).
We also specify equality as an inductive type.
- Formation: \begin{prooftree} \AxiomC{\(A : \Type\)} \AxiomC{\(s : A\)} \AxiomC{\(t : A\)} \TrinaryInfC{\(s =_A t : \Prop\)} \end{prooftree}
- Introduction: \begin{prooftree} \AxiomC{\(t : A\)} \UnaryInfC{\(\operatorname{refl}_{A, t}: t =_A t\) } \end{prooftree}
Eliminator:
\begin{align*} J_A : &\prod C : (\prod (x : A) \, \prod (y : A)\, ((x =_A y) \rightarrow \Type) ) .\\ &(\prod (x : A).\, C \, x\, x \, (\operatorname{refl}_{A, x})) \rightarrow \\ &\prod (x : A)\, \prod (y : A)\, \prod (p : (x =_A y)) .\, C\,x\,y\,p \end{align*}- Computation: \[ J_A \, C\, f\,t\,t\, \operatorname{refl}_{A, t} \rightarrow f\, t \]
Martin-Löf's version has equality types, where \(s =_A t : \Type\) instead of merely being a proposition. This leads to Homotopy Type Theory.
Then we have a rule
\begin{prooftree} \AxiomC{\(s : A\)} \AxiomC{\(t : A\)} \AxiomC{\(s \rightarrow^{ * } u\)} \AxiomC{\(t \rightarrow^{ * } u\)} \QuaternaryInfC{\(s\equiv t : A\)} \end{prooftree}
Lecture 5: SAT Solving
We will use the following connectives as primitive.
\[ \phi ::= x \mid \neg \phi \mid \bot \mid \phi \lor \phi \mid \top \mid \phi \land \phi \] Other connectives can be defined.
We view a formula \(\phi\) with (at most) \(k\) propositional variables \(x_1,\dots, x_k\) as a function \[ \llbracket \phi \rrbracket : 2^k \rightarrow 2 \] encoding the truth table. The formula is enclosed in so-called semantic brackets.
- A formula \(\phi\) in variables \(x_1,\dots, x_k\) is satisfiable if the true fiber is nonempty, i.e. if there exists \(\vec{b} \in 2^k\) such that
\[ \llbracket \phi \rrbracket(\vec{b}) = 1. \]
- A formula is a tautology (valid) if the false fiber is empty, i.e. it is true for all assignments of truth values.
- Formulae \(\phi\) and \(\psi\) are equivalent if \(\forall \vec{b} \in 2^k .\, \llbracket \phi \rrbracket(\vec{b}) = \llbracket \psi \rrbracket (\vec{b})\).
If \(\vec{b}\) satisfies \(\phi\) it is called a satisfying assignment. More generally, we might refer to it as a certificate.
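The semantic notions above can be checked by brute force over the fibers; a Python sketch (the function names `satisfiable` and `valid` are illustrative):

```python
from itertools import product

def satisfiable(phi, k):
    """Return a certificate b ∈ 2^k with ⟦φ⟧(b) = 1, or None."""
    for b in product([0, 1], repeat=k):
        if phi(*b):
            return b
    return None

def valid(phi, k):
    """Tautology check: the false fiber is empty."""
    return all(phi(*b) for b in product([0, 1], repeat=k))

# x1 ∨ ¬x1 is valid; x1 ∧ ¬x1 is unsatisfiable.
assert valid(lambda x: x or not x, 1)
assert satisfiable(lambda x: x and not x, 1) is None

# Validity of φ corresponds to unsatisfiability of ¬φ:
imp = lambda x, y: (not x) or y          # x → y, not valid
assert not valid(imp, 2)
assert satisfiable(lambda x, y: not imp(x, y), 2) == (1, 0)
```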
The connection with the proof system is given by a pair of theorems.
If we can prove \(\phi\) from no assumptions in natural deduction, then \(\phi\) is valid.
If \(\phi\) is valid, then we can prove \(\phi\) in natural deduction.
Note that we need LEM for this.
\(\phi\) is valid \(\iff \neg \phi\) is not satisfiable.
The definitions of validity and satisfiability give rise to truth table algorithms, which simply check the property in question for all \(\vec{b} \in 2^k\). This takes exponential time, however.
The soundness theorem suggests another avenue of attack: searching for proofs of the formula in natural deduction. This is provided by proof systems. Due to the equivalence between satisfiability and validity, correctly bounding the size of proofs also allows us to exhaust the proof space and show unsatisfiability.
It turns out that many search problems can be efficiently encoded as satisfiability problems.
A formula \(\phi\) is in negation normal form (NNF) if negations appear only directly in front of propositional variables. This is given by the grammar \[ \phi ::= x \mid \neg x \mid \bot \mid \phi \lor \phi \mid \top \mid \phi \land \phi \]
Using De Morgan's laws we do rewrites of the form
\begin{align*} \neg (\phi \lor \psi) &\longrightarrow (\neg \phi) \land (\neg \psi) \\ \neg (\phi \land \psi) &\longrightarrow (\neg \phi) \lor (\neg \psi) \\ \neg\neg \phi &\longrightarrow \phi \\ \neg \bot &\longrightarrow \top \\ \neg \top &\longrightarrow \bot \end{align*}we arrive at the expected form. As stated, this is a nondeterministic algorithm.
For any formula \(\phi\), the rewrite algorithm terminates with a formula \(\phi^{\text{nnf}}\) that is equivalent to \(\phi\).
The time complexity is \(O(\abs{\phi})\). Also \(\abs{\phi^{\text{nnf}}} = O(\abs{\phi})\).
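The rewriting can be implemented as one recursive pass; a Python sketch (formulas as nested tuples — an illustrative representation, not from the notes):

```python
# Formulas: ('var', x) | ('not', φ) | ('and', φ, ψ) | ('or', φ, ψ)
#         | ('bot',) | ('top',)
def nnf(phi):
    tag = phi[0]
    if tag in ('var', 'bot', 'top'):
        return phi
    if tag in ('and', 'or'):
        return (tag, nnf(phi[1]), nnf(phi[2]))
    # tag == 'not': push the negation inward
    sub = phi[1]
    if sub[0] == 'var':
        return phi                      # negation of a variable: a literal
    if sub[0] == 'not':
        return nnf(sub[1])              # ¬¬φ → φ
    if sub[0] == 'bot':
        return ('top',)
    if sub[0] == 'top':
        return ('bot',)
    dual = 'and' if sub[0] == 'or' else 'or'   # De Morgan
    return (dual, nnf(('not', sub[1])), nnf(('not', sub[2])))

# ¬(x ∨ ¬y) rewrites to ¬x ∧ y
expected = ('and', ('not', ('var', 'x')), ('var', 'y'))
assert nnf(('not', ('or', ('var', 'x'), ('not', ('var', 'y'))))) == expected
```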
- A literal is a propositional variable \(x\) or the negation of one \(\neg x\).
- A clause is a finite disjunction of literals \(L_1 \lor \cdots \lor L_m \) or \(\bot\).
- A coclause is a finite conjunction of literals \(L_1 \land \cdots \land L_m \) or \(\top\).
A formula is in conjunctive normal form (CNF) if it is a finite conjunction of clauses: \[ D_1 \land \cdots \land D_n \text{ (or \(\top\))} \] where each \(D_i\) is a clause. A formula is in \(n\)-CNF if each clause contains at most \(n\) literals.
Dually, it is in disjunctive normal form if it is a finite disjunction of coclauses: \[ C_1 \lor \cdots \lor C_n. \]
We have an algorithm (see notes) that converts any formula \(\phi\) into an equivalent \(\phi^{\text{cnf}}\) in conjunctive normal form. A different algorithm works for DNF.
The SAT problem is the following computational problem:
- Input: A propositional formula \(\phi\)
- Output: yes if \(\phi\) is satisfiable, no if \(\phi\) is not satisfiable.
In practice, we often care about the slightly more powerful "useful" SAT problem:
- Input: A propositional formula \(\phi\)
- Output: yes and a satisfying assignment if \(\phi\) is satisfiable, no if \(\phi\) is not satisfiable.
Famously, SAT is NP-complete, so no algorithm is known that is efficient in theory. Nevertheless, there are practical SAT solvers.
We can look at SAT for formulae in normal form. First, DNF.
- Input: A formula in DNF.
- Output: yes if satisfiable, no if not.
This is very easy to solve: we need to check whether some disjunct is satisfiable, and each disjunct is a coclause, which is satisfiable iff it does not contain both a variable and its negation. Checking this takes at most quadratic time.
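The quadratic check, sketched in Python (a coclause is a list of literals; we use the common convention of nonzero integers with negation as sign flip, which is not from the notes):

```python
def coclause_satisfiable(coclause):
    """A conjunction of literals is satisfiable iff no variable
    occurs both positively and negatively."""
    return not any(-lit in coclause for lit in coclause)

def dnf_satisfiable(disjuncts):
    """A DNF formula is satisfiable iff some disjunct is."""
    return any(coclause_satisfiable(c) for c in disjuncts)

# (x1 ∧ ¬x1) ∨ (x2 ∧ ¬x3): the second disjunct is satisfiable.
assert dnf_satisfiable([[1, -1], [2, -3]])
assert not dnf_satisfiable([[1, -1]])
```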
However, the conversion into DNF (likewise for CNF) involves an exponential blowup in the size of the formula. The problem is not interesting.
CNF is the more interesting normal form.
- Input: A formula \(\phi\) in CNF.
- Output: yes + a satisfying assignment if \(\phi\) is satisfiable, no otherwise.
This is what SAT solvers implement. Even though conversion into an equivalent CNF formula is hard, we can give a transformation into a CNF formula which, while not equivalent, gives us a one-to-one mapping of satisfying assignments.
Let \(\phi(x_1, \dots, x_k)\) be a propositional formula. The Tseytin transformation maps it to a formula \[ \phi^{T}(x_1, \dots, x_K), \] where \(K \geq k\) such that
- \(\phi^{T}\) is in 3-CNF,
- \((b_1, \dots, b_K) \mapsto (b_1, \dots, b_k) : 2^K \rightarrow 2^k\) is a bijection between satisfying assignments for \(\phi^T\) and \(\phi\).
\(\phi^T\) is computable in \(O(\abs{\phi})\) time.
We will understand the algorithm and its validity through an example. NB this is an IŠRM lecture so this is justified.
Start with \[ \phi \equiv (x_1 \land x_2) \lor \neg (x_3 \lor x_1) \] We write the formula as its abstract syntax tree.
\begin{forest} [{$\lor$} [{$\land$} [$x_1$] [$x_2$] ] [{$\neg$} [{$\lor$} [$x_3$] [$x_1$] ] ] ] \end{forest}We introduce new variables to name the internal nodes of the tree: \(x_4\) is the root of the tree, \(x_5\) corresponds to the first \(\land\), \(x_6\) to the negation, \(x_7\) to the inner \(\lor\). Then we introduce the new formulas
\begin{align*} x_4 &\leftrightarrow x_5 \lor x_6 & & (\neg x_4 \lor x_5 \lor x_6) \land (\neg x_5 \lor x_4) \land (\neg x_6 \lor x_4) \\ x_5 & \leftrightarrow x_1 \land x_2 & \land\, & (\neg x_5 \lor x_1) \land (\neg x_5 \lor x_2) \land (\neg x_1 \lor \neg x_2 \lor x_5) \\ x_6 & \leftrightarrow \neg x_7 & \land\, & (\neg x_6 \lor \neg x_7) \land (x_7 \lor x_6) \\ x_7 & \leftrightarrow x_3 \lor x_1 & \land\, & (\neg x_7 \lor x_3 \lor x_1) \land (\neg x_3 \lor x_7) \land (\neg x_1 \lor x_7) \\ & & \land\, & x_4 \end{align*}where the whole formula on the right is \(\phi^T\).
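We can verify the bijection on satisfying assignments by brute force, for this specific example (a Python sketch; the variable numbering follows the text):

```python
from itertools import product

def phi(x1, x2, x3):
    return (x1 and x2) or not (x3 or x1)

def phi_T(x1, x2, x3, x4, x5, x6, x7):
    # The defining equivalences of the fresh variables, plus x4.
    return ((x4 == (x5 or x6)) and (x5 == (x1 and x2))
            and (x6 == (not x7)) and (x7 == (x3 or x1)) and x4)

sat_phi = {b for b in product([False, True], repeat=3) if phi(*b)}
sat_phi_T = {b for b in product([False, True], repeat=7) if phi_T(*b)}

# Projecting the 7-variable assignments to (x1, x2, x3) is a bijection:
assert {b[:3] for b in sat_phi_T} == sat_phi
assert len(sat_phi_T) == len(sat_phi) == 4
```

The count is equal because each satisfying assignment of \(\phi\) determines \(x_5, x_6, x_7\) (and forces \(x_4\) to be true) uniquely.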
In practice we often need not even convert the formula, as many problems can be stated in CNF directly.
Let \((V, E)\) be a graph, where \(V\) is a finite set of vertices and \(E \subseteq V \times V\) is symmetric and irreflexive, i.e. without self-loops.
Then \((V, E)\) is 3-colorable if there exists a coloring function \(c : V \rightarrow \left\{ R, G, B \right\}\) such that for any edge \((v, v')\), we have \(c(v) \neq c(v')\).
We can see this as a SAT problem thus. Construct a formula \(\phi_{G}\) with propositional variables \[ \left\{ r_v, g_v, b_v \mid v \in V\right\} \] indicating the color of each vertex \(v\), giving us \(3 \abs{V}\) propositional variables. Then \(\phi_{G}\) is constructed as the CNF formed as a conjunction of the following clauses.
- For every \(v \in V\), we include the clause \(r_v \lor g_v \lor b_v\) (the vertex has at least one color) and the clauses \(\neg r_v \lor \neg g_v\), \(\neg g_v \lor \neg b_v\), \(\neg r_v \lor \neg b_v\) (it has at most one color).
- For every \(\left\{ v, v' \right\} \in E\) we include the clauses \(\neg r_v \lor \neg r_{v'}\), \(\neg g_v \lor \neg g_{v'}\), \(\neg b_v \lor \neg b_{v'}\), i.e. the endpoints of an edge do not share a color.
Satisfying assignments for \(\phi_{G}\) are in obvious bijective correspondence with 3-colorings of the graph \(G\).
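A Python sketch building \(\phi_G\) for a concrete graph, the triangle \(K_3\), and counting its satisfying assignments by brute force (they are exactly the \(3! = 6\) proper colorings):

```python
from itertools import product

V = [0, 1, 2]
E = [(0, 1), (1, 2), (0, 2)]             # the triangle K3
colors = ['R', 'G', 'B']

# A literal is ((v, c), polarity): variable (v, c) means "v has color c".
clauses = []
for v in V:
    clauses.append([((v, c), True) for c in colors])      # at least one color
    for c1, c2 in [('R', 'G'), ('G', 'B'), ('R', 'B')]:   # at most one color
        clauses.append([((v, c1), False), ((v, c2), False)])
for v, w in E:
    for c in colors:                                      # endpoints differ
        clauses.append([((v, c), False), ((w, c), False)])

variables = [(v, c) for v in V for c in colors]

def satisfies(assignment, clauses):
    return all(any(assignment[var] == pol for var, pol in cl)
               for cl in clauses)

count = sum(satisfies(dict(zip(variables, bits)), clauses)
            for bits in product([False, True], repeat=len(variables)))
assert count == 6
```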
We will specify a recursive algorithm \(\operatorname{DPLL}(\phi, \rho)\), where \(\rho\) is an assignment of truth values to variables not occurring in \(\phi\). We repeat the following steps until none is applicable.
- (Empty conjunction elimination): If \(\phi\) is the empty conjunction \([]\), return \(\rho\) as the satisfying assignment.
- (Empty clause elimination): If \(\phi\) contains an empty clause, return no.
- (Unit clause elimination): If \(\phi\) contains a unit clause (or a pure literal) \(L\), extend the assignment to \(\rho' = \rho, L\) and continue with DPLL on \(\rho'\) and \(\phi' = \operatorname{simplify}(\phi, L)\).
If none of the above apply, we choose some literal \(L\) appearing in \(\phi\) and continue with DPLL on \(\rho' = \rho, L\) and \(\phi' = \operatorname{simplify}(\phi, L)\).
If this call to DPLL returns an assignment \(\rho''\), then we return \(\rho''\). Otherwise, we perform DPLL on \(\rho, \neg L\) and \(\phi' = \operatorname{simplify}(\phi, \neg L)\).
It turns out that explicitly checking for pure literals tends to take more time than it saves, so implementations often skip it. A literal is pure if its variable appears only negated or only unnegated throughout the formula.
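A minimal DPLL sketch in Python (literals are nonzero integers with negation as sign flip — a common convention, not from the notes; `simplify` drops satisfied clauses and falsified literals; pure-literal elimination is omitted, as discussed):

```python
def simplify(clauses, lit):
    out = []
    for cl in clauses:
        if lit in cl:
            continue                      # clause satisfied, drop it
        out.append([l for l in cl if l != -lit])
    return out

def dpll(clauses, rho=()):
    if not clauses:
        return rho                        # empty conjunction: satisfied
    if any(not cl for cl in clauses):
        return None                       # empty clause: contradiction
    for cl in clauses:
        if len(cl) == 1:                  # unit clause elimination
            return dpll(simplify(clauses, cl[0]), rho + (cl[0],))
    lit = clauses[0][0]                   # branch on some literal
    result = dpll(simplify(clauses, lit), rho + (lit,))
    if result is not None:
        return result
    return dpll(simplify(clauses, -lit), rho + (-lit,))

# The worked example below, with p, q, r encoded as 1, 2, 3:
clauses = [[1, 2], [-1, 3], [-2, -3], [-1, 2, -3]]
rho = dpll(clauses)
assert rho == (-1, 2, -3)                 # i.e. ¬p, q, ¬r
```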
\[ \left[ (p \lor q) \land (\neg p \lor r) \land (\neg q \lor \neg r) \land (\neg p \lor q \lor \neg r) \right] \] At the start, \(\rho = \emptyset\).
- Choose \(p\) to be true. The simplification turns the formula into \[ \left[ (r) \land (\neg q \lor \neg r) \land (q \lor \neg r) \right] \] with \(\rho' = (p)\).
- \((r)\) is a unit clause, so we eliminate it. The simplification becomes \[ (\neg q) \land (q) \] with \(\rho' = (p, r)\)
- \((\neg q)\) is a unit clause, so we eliminate it. We get \[ [()] \] i.e. the singleton conjunction of the empty disjunction, i.e. falsehood.
- We backtrack. Choosing \(p\) didn't work, so we try \(\neg p\). Our formula becomes \[ \left[ (q) \land (\neg q \lor \neg r) \right] \] with \(\rho' = (\neg p)\).
- We have unit clause \(q\). \[ \left[ (\neg r) \right] \] with \(\rho' = (\neg p, q)\).
- We have unit clause \(\neg r\), giving us the empty conjunction \[ \left[ \right] \] with \(\rho' = (\neg p, q, \neg r)\), so we return this satisfying assignment.