§0 Introduction
This essay is concerned with the role of ‘self-reference’, and especially the problems arising from it, in mathematics and logic. I have also tried to point out some connections with recursive definitions. Although I have tried to express my own thoughts in various places, I am not an expert on the philosophy of mathematics, and so I could hardly avoid giving this account a historical character.[1]
§1 Self-reference and recursive definitions
Self-reference is the phenomenon of something referring to itself. When we say ‘refer’ we automatically presuppose an underlying medium, a suitable means of expression which enables us to make references. It is natural to choose a ‘language’ from our experience as such a medium. So we can use English or Greek, a formal language (e.g. the language ∈ of set theory or L_PA = ⟨+, ·, s, 0⟩ of Peano Arithmetic) or the informal language of mathematics, the language which mathematicians use in their daily work. In this way we can construct sentences which refer to themselves. For example:
(a) This sentence is written in Chinese.
(b) This sentence contains thirty-eight letters.
(c) This sentence contains exactly three A’s.
These sentences, although not in common use, are clear enough to understand, so that we can decide whether they are true or false.
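Deciding the truth of (b) and (c) is a matter of plain counting, which the following minimal Python sketch makes explicit. The helper names `count_letters` and `count_char` are hypothetical, and the sketch assumes that hyphens, apostrophes, spaces and the full stop do not count as letters and that upper and lower case are not distinguished.

```python
def count_letters(sentence: str) -> int:
    """Count alphabetic characters only (assumption: hyphens, spaces
    and punctuation are not letters)."""
    return sum(1 for ch in sentence if ch.isalpha())


def count_char(sentence: str, letter: str) -> int:
    """Count occurrences of one letter, ignoring case (assumption)."""
    return sentence.lower().count(letter.lower())


sentence_b = "This sentence contains thirty-eight letters."
sentence_c = "This sentence contains exactly three A's."

# Each sentence is true exactly when the count matches its own claim.
b_is_true = count_letters(sentence_b) == 38
c_is_true = count_char(sentence_c, "a") == 3

print(b_is_true, c_is_true)  # True True
```

Under these counting conventions both sentences come out true; a different convention (for instance, counting the hyphen) would change the verdict for (b), which shows that even harmless self-reference depends on how the sentence is read.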
Self-reference also occurs where we make inductive (or recursive) definitions. Take this simple example:
f(0)=0
f(n+1)=f(n)
This defines a function f in terms of its own previous values. To compute the value of f at, say, n=2, the definition refers us back to f itself, but at a previous argument. We will return to the meaning of such definitions later on.
Finally, an example of self-reference in a formal language can be constructed via coding. The most famous example is Gödel’s undecidable proposition, constructed in the proof of his first incompleteness theorem[2].
The above examples do not seem to cause any problem for our understanding. Their meaning is determinate and there is no ambiguity in what they say. It is worth noticing that, in order to understand the sentences (a)-(c) (and perhaps assess their validity), the first thing we do is to transform them into a more common form that is semantically equivalent to the original: for example, (a) means
The sentence ‘This sentence is written in Chinese’ is written in Chinese.
This transformation takes place in our minds, even if we do not realise it immediately (in which case it happens unconsciously). In fact, a similar operation, or indication, takes place under a particular interpretation of a recursive definition of a function. Take this simple form of such a definition:
f(0)=c          (1)
f(n+1)=h(f(n))
where h is a given function. All the arguments below apply equally to all kinds of primitive recursive definitions.
From a Platonist’s point of view, this ‘definition’ is an indication of a function: an infinite correspondence thought of as a completed totality, which exists independently of the ‘real’ world and our sense perceptions[3]. To be more specific:
A Platonist, faced with such a ‘definition’, seeks a justification that there is a unique function in his ideal universe which fulfils these equations. Such a justification can be found in books on axiomatic set theory or the foundations of mathematics[4], and it consists of:
· Defining a function f which fulfils (1). This is done essentially by means of a sequence of finite sequences[5]. In practice it is the function
f(n) = h(h(…h(f(0))…)), with h applied n times          (2)
· Justifying its uniqueness by mathematical induction[6].
The Platonist, having such a justification available for every recursive definition, interprets the latter as an indication of the function given by (2), which is given explicitly (and not constructively) in his justification[7].
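The relation between the recursive equations (1) and the explicit form (2) can be illustrated computationally. The sketch below is only an illustration for finitely many arguments, not the set-theoretic justification itself; the names `f_recursive`, `f_iterated` and `double`, and the sample choices c = 1 and h(x) = 2x, are assumptions introduced for the example.

```python
from typing import Callable


def f_recursive(n: int, c: int, h: Callable[[int], int]) -> int:
    """Equations (1): f(0) = c and f(n+1) = h(f(n))."""
    if n == 0:
        return c
    return h(f_recursive(n - 1, c, h))


def f_iterated(n: int, c: int, h: Callable[[int], int]) -> int:
    """Form (2): start from f(0) = c and apply h exactly n times."""
    value = c
    for _ in range(n):
        value = h(value)
    return value


def double(x: int) -> int:
    """Sample choice of h (an assumption made for this illustration)."""
    return 2 * x


values = [f_recursive(n, 1, double) for n in range(6)]
print(values)  # [1, 2, 4, 8, 16, 32]

# The two readings agree on every argument that was checked.
assert values == [f_iterated(n, 1, double) for n in range(6)]
```

The recursive version is referred back to itself at n-1 until it reaches 0, while the iterated version never mentions f at all; that they agree on each tested argument is the finite shadow of the uniqueness claim proved by induction.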
§2 Problems which arise from self-reference
In contrast to the sentences (a)-(c) in the preceding section, there are cases where self-reference causes problems or, at least, seems to cause problems; and this was already known in antiquity. A very popular example is the liar’s paradox. Its simplest form is this:
This sentence is not true (3)
By logical inference we can conclude that (3) is true iff (3) is not true. Indeed, if it is true, then, since what it asserts is true, it is not true; and conversely, if it is not true, then its assertion is false and thus it is true. Another form of this antinomy (via some sort of coding) is the following:
The statement made in this essay, page 3 and line 10, is false[8].
Some paradoxes of this kind can be criticised as being meaningless propositions[9]. However, the liar’s paradox is genuine, because there is no easy way to escape from the circle it involves. It is a sentence which refers to something given, namely itself. Moreover, it connects a linguistic object, the sentence, with a non-linguistic one, the concept of truth: a legitimate connection, since truth is a concept which applies to sentences. The problem here seems to be semantic; by that we mean that it has to do with semantics, as Tarski defined the term:
‘…the totality of considerations concerning those concepts which … express certain connections between the expressions of a language and the objects and states of affairs referred to by these expressions.’ [Tarski 1936]
And as examples of semantical notions, Tarski mentions (among others) the concepts of truth, satisfaction and definition.
The liar’s antinomy belongs to a group of paradoxes called semantical[10], in accordance with their nature. But before discussing this category as a whole, we will try to demonstrate the vicious circle that occurs in the liar’s paradox, and point out some common features with circular, improper recursive definitions of functions. This treatment is based on a remark made by Goodstein in [Goodstein 1965, chapter II].
Consider the sentences:
4. 0=0
5. Sentence 4 is false
6. Sentence 5 is true
The bold letters indicate the places where names (variables) occur, instead of individuals (e.g. the variable ‘sentence 4’ is a name of the sentence ‘0=0’).
In order to assess the validity of sentence 6 we try to decode it, i.e. to eliminate all the variables occurring in it. So sentence 6 means:
7. ‘Sentence 4 is false’ is true, or
8. ‘ “0=0” is false’ is true
And now, having the proposition in this form, we are in a position to understand it fully and assert its truth or falsehood. Note that, once we have established an assignment of individuals (sentences) to the variables, the sentences 6, 7 and 8 have the same meaning. So they are just different ways of expressing the same thing, namely the sentence 8. Moreover, we have a directed procedure for eliminating the variables: in each step we eliminate the current variable, which does not appear in any of the following steps.
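The elimination procedure just described can be sketched as a small substitution routine. This is a toy model under stated assumptions: square brackets stand in for quotation marks so that nesting stays readable, the names `SENTENCES`, `eliminate_names` and `truth_value` are hypothetical, and the evaluator only understands the forms ‘… is true’, ‘… is false’ and simple equations such as 0=0.

```python
import re
from typing import List

# Assignment of sentences to names, following sentences 4-6 in the text.
SENTENCES = {
    4: "0=0",
    5: "Sentence 4 is false",
    6: "Sentence 5 is true",
}

NAME = re.compile(r"Sentence (\d+)")


def eliminate_names(text: str, max_steps: int = 10) -> List[str]:
    """Replace each name by the bracketed sentence it names, one step at a time."""
    steps = [text]
    for _ in range(max_steps):
        match = NAME.search(text)
        if match is None:
            return steps  # no names left: the sentence is fully decoded
        quoted = "[" + SENTENCES[int(match.group(1))] + "]"
        text = text[:match.start()] + quoted + text[match.end():]
        steps.append(text)
    raise RuntimeError("names were not eliminated within max_steps")


def truth_value(text: str) -> bool:
    """Evaluate a fully decoded sentence of the toy language."""
    if text.endswith("] is true"):
        return truth_value(text[1:-len("] is true")])
    if text.endswith("] is false"):
        return not truth_value(text[1:-len("] is false")])
    left, right = text.split("=")  # base case: an equation such as 0=0
    return left.strip() == right.strip()


steps = eliminate_names(SENTENCES[6])
for step in steps:
    print(step)
# Sentence 5 is true
# [Sentence 4 is false] is true
# [[0=0] is false] is true

print(truth_value(steps[-1]))  # False
```

The loop stops after two substitutions because each step removes a name that never reappears, which is exactly the directedness noted above; only then can the truth value be computed.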
Consider now, for the sake of clarity, the following form of the liar’s paradox:
9. Sentence No.9 in this essay is false
The part of the sentence in boldface letters clearly serves as a name, a variable.
Trying to eliminate the variable via substitutions, as in the preceding example, we obtain the following infinite regress:
‘Sentence no.9 in this essay is false’ is false
“ ‘Sentence no.9 in this essay is false’ is false” is false
…etc…
Now this is, as Goodstein says, ‘as if I sought to define a function by saying for any n,
f(n-1)=f(n)+n
so that f(0)=f(1)+1=f(2)+2+1=f(3)+3+2+1
and so on, and we never reach the value of f(0).’[Goodstein 1965, chap. II]
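Both regresses can be watched directly. The following minimal sketch makes two assumptions: square brackets are used in place of quotation marks, and Goodstein's equation f(n-1)=f(n)+n is read as the equivalent rule f(n)=f(n+1)+(n+1). The names `liar_unfold` and `f` are introduced only for this illustration.

```python
from typing import List

NAME = "Sentence no.9 in this essay"
SENTENCE_9 = NAME + " is false"


def liar_unfold(steps: int) -> List[str]:
    """Substitute sentence 9 for its own name `steps` times."""
    text = SENTENCE_9
    history = [text]
    for _ in range(steps):
        # Replace the (single) remaining name by the bracketed sentence.
        text = text.replace(NAME, "[" + SENTENCE_9 + "]", 1)
        history.append(text)
    return history


def f(n: int) -> int:
    """Improper 'definition': f(n) = f(n+1) + (n+1); there is no base case."""
    return f(n + 1) + (n + 1)


for line in liar_unfold(2):
    print(line)
# Sentence no.9 in this essay is false
# [Sentence no.9 in this essay is false] is false
# [[Sentence no.9 in this essay is false] is false] is false

try:
    f(0)
except RecursionError:
    print("f(0) is never reached: the recursion has no base case")
```

However many substitutions are made, the name is still present in the result, and however deep the interpreter is allowed to recurse, `f` keeps asking for a later value. In both cases the procedure is directed away from any ground instead of towards it.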
§3 Semantical paradoxes and their nature: ways of resolution
We have examined a typical example of this kind of antinomy. There are other well-known antinomies in this category (those of Berry, Richard, Zermelo-König and Grelling are perhaps the most popular[11]), but this is not the place for an extensive account.
Ramsey in [1925] says that semantical paradoxes [his Group B, see footnote 9]
‘… are not purely logical, and cannot be stated in logical terms alone, for they all contain some reference to thought, language or symbolism, which are not formal but empirical terms.’
In contrast, Tarski demonstrated via his theory of metalanguages that no empirical notions are involved in this category of paradoxes and he obtained a resolution[12].
Beth writes:
‘Tarski shows that the semantical paradoxes arise from the fact that semantical notions referring to certain linguistic objects are expressed by means of the very same language to which those objects belong. So mathematical notions referring to linguistic objects that belong to a certain language L (the object language) should be expressed, not in L, but by means of a suitably chosen metalanguage ML.’[Beth 1965 (b) p.112]
Tarski, pointing to the need of a metalanguage, in order to talk about truth in a given language, says
‘…semantical concepts have a relative character...they must always be related to a particular language. People have not been aware that the language about which we speak need by no means coincide with the language in which we speak…
…the semantical concepts simply have no place in the language to which they relate…the language which contains its own semantics, and within which the usual logical laws hold, must inevitably be inconsistent.’
For a long time this theory was regarded as a satisfactory resolution of all the classical semantical paradoxes[13]. However, Saul Kripke showed[14] in 1975 that empirical notions are, in some way, involved in this kind of antinomy. Barwise and Moss write:
‘For many years, Tarski’s …approach…was, by and large, the accepted wisdom on the semantical paradoxes. Things changed in 1975, though, with…Kripke’s article…Kripke convincingly demonstrated that circularity of reference is much more common than had been supposed, and that whether or not something is paradoxical may well depend on non-linguistic, empirical facts.’ [Barwise & Moss 1996]
§4 Self-reference and logical paradoxes
In the preceding section we examined self-reference mainly in situations which involve semantical notions, and we saw that the problems that arose there are due to, as Beth describes, ‘an inadequate manner of handling semantical notions such as “meaning”, “truth” and “definition” which play an important role in metalogical discussions.’ [Beth 1965 (b) p.503]. Now we will see that a similar phenomenon occurs in another, more logical or mathematical, setting. We will examine Ramsey’s other group of antinomies[15], often called logical paradoxes. Ramsey was the first to state clearly the need for these two classes of problems to be studied separately. In his original paper he gives the following description, which has met no significant objection so far:
‘Group A [the class of logical paradoxes] consists of contradictions which, were there no provision made against them, would occur in a logical or mathematical system itself[16]. They involve only logical or mathematical terms such as class and number, and show that there must be something wrong with our logic or mathematics.’ [Ramsey 1925]
The most famous example in this group is Russell’s paradox of classes. Consider the class r defined as follows:
x ∈ r ↔ x ∉ x          (10)
or, in words, the class r (called Russell’s class) consists of the classes which do not contain themselves.
Using this definition, and taking x to be r itself, we can immediately infer
r ∈ r ↔ r ∉ r
which is obviously a contradiction. Russell’s paradox is formulated here in set-theoretical terms, but it can also be expressed in other ways (e.g. in terms of pure logic[17]). It is this form, however, which makes it most striking, since it is the one closest to mathematics.
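The diagonal step behind (10) can be checked exhaustively in a finite setting. The sketch below is an illustration only and makes its own assumptions: a ‘universe’ is a finite range of integers, `member` is an arbitrary binary relation standing in for ∈, and the function names are hypothetical. It enumerates every such relation on a three-element universe and counts those in which some element r satisfies x ∈ r ↔ x ∉ x for all x.

```python
from itertools import product


def has_russell_element(universe, member) -> bool:
    """True if some r satisfies: for every x, (x in r) iff (x not in x)."""
    return any(
        all(member[(x, r)] == (not member[(x, x)]) for x in universe)
        for r in universe
    )


def count_relations_with_russell_element(size: int) -> int:
    """Count membership relations on a `size`-element universe that
    contain an element behaving like Russell's class."""
    universe = range(size)
    pairs = [(x, y) for x in universe for y in universe]
    count = 0
    for bits in product([False, True], repeat=len(pairs)):
        member = dict(zip(pairs, bits))  # one candidate membership relation
        if has_russell_element(universe, member):
            count += 1
    return count


print(count_relations_with_russell_element(3))  # 0
```

None of the 512 relations admits such an element, because the condition already fails at x = r, where it would require r ∈ r and r ∉ r to have the same truth value. The search fails for the same reason in every universe, finite or not, which is why the fault must lie in the principle that promises r and not in the particular predicate.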
Since we arrived at a contradiction (and our means of inference are correct) there must be something wrong with the definition (10). But what?
It is true that at first glance it is the part ‘x ∉ x’ which seems odd. However, the definition of r is based on something much more general, namely the so-called comprehension axiom[18]. This principle arises naturally from a Platonist’s point of view. Having available the (pre-existing) ‘mathematical entities’ in the Platonic universe and a predicate or property p which may apply to any mathematical entity, we are allowed (via the comprehension axiom) to create a unity, namely a set which consists of all the mathematical entities to which the predicate applies. This set is called the extension of p[19]. So, to put it briefly, using the words of Boolos, the main assertion of the comprehension axiom (assuming the axiom of extensionality) is that ‘any predicate has an extension’[20] (this is often called the naïve conception of set).
The comprehension axiom seems to be the source not only of Russell’s paradox but of a whole class of paradoxes, namely the logical paradoxes. This principle allows the following circular phenomenon to occur: the definition of a set in terms of a totality that contains this very set. This circle seems to be connected with H. Behmann’s observation (1931) that the definitions which are responsible for logical paradoxes do not satisfy Pascal’s condition. According to that condition, a definition should allow the term it defines to be replaced by its definiens wherever the term occurs. We can verify that the definition (10) does not satisfy this condition: if we try to replace r by its definiens in the expression r ∈ r, we get r ∉ r, which is by no means what we mean[21].
As the comprehension axiom is the main reason for all these problems, it is natural to try to restrict it; and indeed all the successful attempts to eliminate the logical paradoxes have consisted of suitable restrictions of this axiom. Examples of such attempts are Russell’s theory of (simple) types[22] and the so-called theory of limitation of size, which is expressed in various axiomatizations of set theory, e.g. ZF.