Axiom Systems

In an attempt to make logical proofs rigorous, logicians at the end of the nineteenth century and the beginning of the twentieth developed axiom systems for formal logic in the spirit of Euclid’s axiom system for geometry. Typical characteristics of axiom systems are:

1.  The logical expressions are defined over a language with a small number of symbols.

2.  A small number of logical assertions, that are seen as expressing ‘obvious truths’, are indicated as axioms. These axioms can be introduced at any point during a proof.

3.  A small number of inference rules are defined over the logical assertions.

4.  There are no subproofs.

Inference rules can be written as {j1 , ¼ ,jn} ⊢ y, indicating that statement y can be inferred from the set of statements j1 , ¼ ,jn. A proof in an axiom system S of is a sequence of assertions, starting with a set of assertions G and ending with a assertion y, where each assertion f in the sequence after G is either an axiom, or is derived from a subset of the assertions in the sequence before f by the application of some inference rule. If there exists a formal proof in S from G to y, we write G ⊢S y.

Example:

The axiom system R.S. (Rosser System) for truth-functional logic is defined over assertions containing only the truth-functional connectives Ø and Ù.

Other connectives are used as shorthand expressions following the following definitions:

j Ú y =df Ø(Øj Ù Øy)

j ® y =df Ø(j Ù Øy)

Its axioms are:

R.S.1: j ® (j Ù j) (so really Ø(j Ù Ø(j Ù j)) )

R.S.2: (j Ù y) ® j (so really …)

R.S.3: (j ® y) ® (Ø(y Ù c) ® Ø(c Ù j)) (so really …)

Its only inference rule is:

Modus Ponens: {j ® y, j}⊢ y (which is really {Ø(j Ù Øy), j}⊢ y )

Here is a proof in R.S. that shows that {P ® Q, Ø(Q Ù ØR)} ⊢R.S. Ø(ØR Ù P):

1. P ® Q Premise

2. Ø(Q Ù ØR) Premise

3. (P ® Q) ® (Ø(Q Ù ØR) ® Ø(ØR Ù P)) R.S. 3

4. Ø(Q Ù ØR) ® Ø(ØR Ù P) MP 1,3

5. Ø(ØR Ù P) MP 2,4

Axiom Systems and Natural Deduction Systems

Especially when starting, it is much harder to find proofs in an axiom system than in a natural deduction system. This is because the goal of axiom systems is a theoretical one, not a practical one. Thus, where natural deduction systems try to provide rules that reflect inferences (as defined over expressions) that we use in daily reasoning, axiom systems are simply out to find a very small number of ‘logical atoms’ (be they symbols, axioms, or inferences) from which all other logical principles can be built up.

There are two obvious technical differences between axiom systems and natural deduction systems as well: axiom systems don’t use subproofs, and natural deduction systems don’t use axioms. However, consider the following system R.S.’, which has no axioms, but 4 inference rules:

{j ® y, j}⊢ y

⊢ j ® (j Ù j)

⊢ (j Ù y) ® j

⊢ (j ® y) ® (Ø(y Ù c) ® Ø(c Ù j))

Obviously, proofs in R.S.’ can completely mirror the proofs in R.S., showing that axiom systems can be captured as natural deduction systems that happen to not have any inference rules defined over subproofs.

In fact, one could make axiom systems very much like natural deduction systems by adding axioms reflecting typical inferences for natural deduction systems. However, since axioms systems don’t use subproofs, they can’t capture such basic inference patterns as Conditional Proof, Indirect Proof, or Proof by Cases.

Derived Rules and Theorems

The nice thing about axioms systems is that one can build proofs on other proofs. For example, the proof given earlier of {P ® Q, Ø(Q Ù ØR)} ⊢R.S. Ø(ØR Ù P) can be regarded as a proof of the more general {j ® y, Ø(y Ù Øc)} ⊢R.S. Ø(Øc Ù j). Hence, whenever in some later proof one has two statements of the form j ® y and Ø(y Ù Øc) respectively, then one can simply appeal to this result to infer the statement of the form Ø(Øc Ù j). The result {j ® y, Ø(y Ù Øc)} ⊢R.S. Ø(Øc Ù j) can thus be treated as a derived inference rule {j ® y, Ø(y Ù Øc)} ⊢ Ø(Øc Ù j) (call it DR1) that can be used in other proofs.

Example:

1. P ® (P Ù P) R.S.1

2. (P Ù P) ® P R.S.2

3. Ø(ØP Ù P) DR1

(remember, (P Ù P) ® P is really Ø((P Ù P) Ù ØP) )

Technically, the sequence above is not a proof as defined earlier, so we can either call it something else (e.g. a demonstration), or we can extend our definition of proof. We’ll do the latter without giving any explicit definition. The above sequence can thus be seen as a proof of ⊢R.S. Ø(ØP Ù P). Once again, though, this can be treated as a derived inference rule ⊢ Ø(Øj Ù j), but since there are no supporting statements, we’ll say that we have proven Ø(Øj Ù j) to be a theorem of R.S (let’s call it T1). Like axioms, theorems can be introduced in future proofs.

Example:

1. ØØP ® P T1

(remember, ØØP ® P is really Ø(ØØØP Ù P), which is indeed of the form Ø(Øj Ù j) ).

Independence

Suppose that some axiom of some axiom system can be derived from the other axioms and inference rules of that axiom system. You can understand that for logicians trying to find a set of ‘logical atoms’ from which everything else can be built up, this is a kind of ‘no no’; if an axiom is derivable from others, then it is not a ‘logical atom’, and should not be included in the axioms.

The technical term for this is independence: An axiom that is not derivable from the other axioms (and inference rules) is said to be independent. For logicians developing axiom systems, then, it is important to show that the postulated axioms are independent from each other. This can be a difficult task however.

There are various ways to try and show that an axiom is independent from the others. Below I’ll describe one of those ways.

Suppose that I interpret the atomic claims (P, Q, R, etc) in system R.S. not as assertions, but as variables denoting one of three possible objects: A, B, and C. Moreover, the connectives are not seen as truth-functional connectives, but as algebraic operations that map those objects to other objects in accordance to the following tables (which are not ‘truth-tables’, but function similarly):

P / ØP
A / C
B / B
C / A
P / Q / P Ù Q / P ® Q (= Ø(P Ù ØQ)
A / A / A / A
A / B / B / B
A / C / C / C
B / A / B / A
B / B / C / A
B / C / C / B
C / A / C / A
C / B / C / A
C / C / C / A

If we now generate the ‘truth-tables’ for the three axioms of R.S., you’ll find that any expression of the form of R.S.2 or R.S.3 will always map to A (we can call it a A-tautology). Moreover, the (derived) table for ® reveals that whenever you have an expression j ® y that denotes A, and an expression j that denotes to A, then the expression y will denote A as well. Therefore, if j ® y is an A-tautology, and j is an A-tautology, then the expression y will be an A-tautology as well. Hence, if you make a sequence of expressions of R.S. where each expression is either an instance of R.S.2 or R.S.3, or is derived from any two of the previous expressions in that sequence using Modus Ponens, then by induction it immediately follows that every expression in that sequence is an A-tautology.

However, not every expression of the form R.S.1 denotes A, and hence there are expressions of the form R.S.1 that are not A-tautologies. Those expressions are therefore not derivable from R.S.2 and R.S.3 and Modus Ponens, meaning that R.S.1 is independent.

To show the independence of R.S.2, use the same table for Ø, but use the following table for Ù:

P / Q / P Ù Q
A / A / A
A / B / A
A / C / C
B / A / A
B / B / A
B / C / C
C / A / C
C / B / C
C / C / C

With this table, every instance of R.S.1 and R.S.3 will be an A-tautology, and again the derived table for ® will reveal that whenever you have an expression j ® y that denotes A, and an expression j that denotes to A, then the expression y will denote A as well. So, since not every instance of R.S.2 denotes an A-tautology, R.S.2 is independent.

Homework Question: Show that R.S.3 is independent.

Completeness

Another obvious property that one would like axiom systems to have is that they are complete. R.S. turns out to be complete, but the proof is heinous, and that’s all I’ll have to say about that.

Famous Axiom Systems

Below are some famous axiom systems proposed over the ages. As far as I know, they are all complete.

Frege’s System (F.S.) (as defined over Ø and ®)

F.S.1: j ® (y ® j)

F.S.2: (j ® (y ® c)) ® ((j ® y) ® (j ® c))

F.S.3: (j ® (y ® c)) ® (y ® (j ® c))

F.S.4: (j ® y) ® (Øy ® Øj)

F.S.5: ØØj ® j

F.S.6: j ® ØØj

Lukasiewicz showed that F.S.3 can be derived from F.S.1 and F.S.2, so F.S.3 is not independent. He also replaced F.S.4 through F.S.6 with a single axiom:

L.S.1: j ® (y ® j)

L.S.2: (j ® (y ® c)) ® ((j ® y) ® (j ® c))

L.S.3: (Øj ® Øy) ® (y ® j)

Metamath (http://www.metamath.org) uses L.S. (check it out!)

Here is an example proof in L.S. that shows that ⊢L.S. P ® P:

1. (P ® ((P ® P) ® P) ® ((P ® (P ® P)) ® (P ® P)) L.S.2

2. P ® ((P ® P) ® P L.S.1

3. (P ® (P ® P)) ® (P ® P) MP 1,2

4. P ® (P ® P) L.S.1

5. P ® P MP 3,4

Russell and Whitehead’s system as laid out in Principia Mathematica (PM) (defined over Ø and Ú, j ® y short for Øj Ú y, and the (single) rule being {Øj Ú y, j} ⊢ y):

PM.1: (j Ú j) ® j

PM.2: y ® (j Ú y)

PM.3: (j Ú y) ® (y Ú j)

PM.4: (j Ú (y Ú c)) ® (y Ú (j Ú c))

PM.5: (y® c) ® ((j Ú y) ® (j Ú c))

In 1926, Paul Bernays showed PM.4 to be redundant.

Meredith’s single axiom system (defined over ® and Ø. Again, Modus Ponens is the only inference rule)

M.1: ((((j ® y) ® (Øc ® Øθ)) ® c) ® τ) ® ((τ ® j) ® (θ ® j))

Nicod’s single connective (the ‘|’, which is the NAND, also called the Sheffer stroke, first used by Peirce), single axiom, single rule system:

N. 1: (j | (y | c)) | ((θ | (θ | θ)) | ((τ | y) | ((j | τ) | (j | τ))))

The single rule is: {j | (y | c), j} ⊢ c