LINGUISTIC PALAEONTOLOGY: SCIENCE OR FICTION?

A case study within Uralic

Dr. Angela Marcantonio

1. Introduction

I am a linguist, specialising in Uralic studies. My recent book (Marcantonio 2002a) carefully examines the evidence in favour of the theory that the Uralic languages are genetically related. In the extensive literature on this subject, I find that there is no scientific evidence at all in favour of the Uralic theory. Instead there is an extensive interlocking network of self-consistent assumptions and circular reconstructions. I conclude that the Uralic languages do not form a language family.

The purpose of this paper is to examine the methods of analysis that have been employed to build up the standard Uralic theory, and how the use of these methods has, I believe, so misled researchers. I believe this examination will be relevant to scientists in all disciplines that base their work on these reconstructions, as well as to the linguists who are responsible for establishing them. I hope to begin the process of a quantitative re-examination of other language families, including perhaps Indo-European.

To establish the unity of the Uralic language family, scholars have mainly used the so-called ‘Method of Historical Linguistics’. By comparing attested languages which are assumed to be related, and assuming a high degree of regularity in the way the languages have evolved in the past, it is believed that one can reconstruct much of the language, location, culture and antiquity of a supposed ancient community. This process of reconstruction is referred to as ‘Palaeolinguistics’.

In the past, palaeolinguistics has enjoyed such high scientific credibility amongst authors and peer-reviewers that many authors who report counter-evidence to the model tend to minimise or ‘re-interpret’ their data, rather than present a paper that clearly contradicts the model. Thus, one can observe papers in linguistics, archaeology, history and genetics that present evidence contradicting the theory, but whose conclusions either minimise the importance of their results, or re-interpret their data so that it fits the model better. This minimisation or re-interpretation reinforces the interlocking network of assumptions and interpretations, so that even counter-evidence ultimately appears to reinforce the model.

One of the grossest distortions of this nature is found in the historical text that supposedly goes a long way towards establishing the Uralic origin of the Hungarians. We shall see that the original text of Constantine Porphyrogenitus refers to a population of Turks, and it clearly contradicts the supposed Uralic model. Historians describe this contradiction as ‘ridiculous’ because it contradicts the accepted linguistic model, and they simply assume that the original record was in error. The record is ‘corrected’ or ‘re-interpreted’ in most translations, so that it now appears to support the theory. Most textbooks do not mention that any re-interpretation is involved, and indeed many specialist papers fall into the same trap. One now finds this very text quoted in linguistic textbooks in support of the theory. A true circularity.

My central theme will be that I seek to invite authors – with the support of peer-reviewers – to have the courage to report their evidence as it stands. When authors discover evidence that is at variance with the linguistic models, this evidence must not be ‘re-interpreted’ in order to be consistent with the accepted model, but rather it should be stated clearly that the evidence contradicts the accepted model.

1.1 What is wrong with the standard Uralic theory?

According to the standard Uralic theory, the Hungarians, Finns, Samoyed, Lapp and so on all descend from an ancient community that lived somewhere near the Ural Mountains about 8,000 years ago.

Recent evidence from archaeology, anthropology and genetics appears to contradict this theory. Several authors have drawn attention to this, including Julku (1997 and 2000); Dolukhanov (2000a & b); Nuñez (1987, 1997a & b, 2000) and Niskanen (1997, 2000a & b). Compare also the recently published volume of ‘Root IV’, edited by Julku (Julku 2002). The principal items of counter-evidence are as follows:

· The results from genetic analysis are at variance with the conventional assumption that genetic inheritance is the dominant factor in language transmission. The Samoyed and Ob-Ugric people have largely ‘Mongoloid’ genetic character, whilst the rest of the (traditionally classified) Uralic populations are largely ‘Europoid’. In fact, there is no evidence for a “Uralic gene”, other than as a linguistic definition of the gene characteristics near the Ural Mountains.

· There are no archaeological traces of migrations from the Ural Mountains toward the West, contrary to the predictions of the standard model. Indeed, populations and technology (such as arrow-heads, ice picks and ceramic technology) appear to have spread generally from the Southwest to the Northeast, that is, in the opposite direction to the one predicted by the conventional model.

· The supposed migration from the Ural Mountains into empty European areas is contradicted by evidence that North-eastern Europe has been inhabited, without interruption, by local populations throughout this period.

This evidence has given rise to many different models being proposed, such as the ‘Uralic lingua franca’ model as formulated by Wiik and Künnap (Künnap 1995, 1997a, 1997b, 1998, 2000/01, 2001; Wiik 1995, 1996, 1997a, b & c, 1999, 2000, 2000/01a, 2000/01b; see also Taagepera 1994, 1997, 2000 and Sutrop 2000a & b and 2001), or the chain model as proposed by Pusztay (1995, 1997, 2001).

All these new models appear to have a common thread. Despite their “revolutionary” or “revisionist” approaches (see Janhunen 2001), many of them still implicitly assume that there was in some sense a Uralic linguistic area, distinct from, for example, the Altaic or Siberian linguistic area. In fact, linguists as well as anthropologists and archaeologists[1] generally assume that the original, local populations who lived in north-eastern Europe were the ancestors of the modern ‘Finno-Ugric’ and/or ‘Uralic’ populations (see for example Wiik 1996, 1997a, 2000; Künnap 1996, 2000/01; Dolukhanov 1998; Julku 1997; Nuñez 1997a & b; Pusztay 2001; Parpola 1999).

I believe this central assumption, that linguistic studies have established the uniqueness of the Uralic family, is fundamentally flawed. Rather than being based on scientific evidence, the standard Uralic theory is founded on an extensive interlocking network of self-consistent assumptions and circular reconstructions. There is space here to outline only some of the linguistic evidence; for more information see Marcantonio (2002a & b):

· The key Ugric node, on which the family was historically based, has never been reconstructed, and it is widely recognised that Hungarian is radically different in morphology, lexicon and phonology from its supposed siblings in the Ugric node.

· The Uralic node has likewise never actually been reconstructed. What is normally referred to as a reconstruction of the Uralic node in fact omits any systematic consideration of the key Ugric node. Statistical analysis of this corpus shows that it has the statistics of a set of accidental look-alikes.

· There are a number of linguistic correlations that are shared by the Uralic languages; but these are also shared with the Altaic languages and Yukaghir. In fact, one can observe isoglosses that clearly cross the traditionally established language families.

1.2 What is wrong with the linguistic method of analysis?

More generally, there are severe problems with the methods that have been used to build up language families, including the Uralic family. In this section I shall briefly examine the linguistic methods. However, the problems that become evident appear to have infected other areas of study, such as the interpretation of historical texts. It is this interaction with other disciplines that will be the main focus of my talk, and it will be described in the next section.

It is generally assumed that the use of the so-called “Comparative Method” of linguistic analysis yields results which are statistically significant and which can therefore be relied upon to establish language families. Indeed, if one finds many words in the various languages, all related to one another through the same regular rules of sound-correspondence, then it is unlikely that the words are similar by chance, and the results therefore have statistical significance.
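
The statistical reasoning behind this assumption can be made concrete with a minimal sketch. It assumes, purely for illustration, that each candidate word pair satisfies a fixed set of sound-rules by accident with the same probability, independently of the other pairs. The function name and all the figures are hypothetical and are not taken from any Uralic corpus.

```python
from math import comb


def chance_match_tail(n_pairs: int, k_matches: int, p_chance: float) -> float:
    """Probability of at least k_matches accidental matches among n_pairs.

    Assumes (hypothetically) that every pair satisfies the fixed sound-rules
    by accident with the same probability p_chance, independently.
    """
    return sum(
        comb(n_pairs, i) * p_chance**i * (1 - p_chance) ** (n_pairs - i)
        for i in range(k_matches, n_pairs + 1)
    )


# Hypothetical figures, chosen only to show the shape of the argument.
few = chance_match_tail(n_pairs=100, k_matches=5, p_chance=0.05)
many = chance_match_tail(n_pairs=100, k_matches=30, p_chance=0.05)
print(f"P(>= 5 chance matches)  = {few:.3f}")
print(f"P(>= 30 chance matches) = {many:.3e}")
```

Under these assumptions a handful of matches is unremarkable, whereas a large number of matches is very unlikely to arise by accident. The calculation is meaningful only if the sound-rules are fixed in advance: every rule added to accommodate a further word pair raises the effective chance probability.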

However, the central problem is that I have not found any instance where such a corpus of regularly related words can be found. Most studies of the Uralic languages, including the main Uralic dictionary UEW (Rédei (ed.) 1986-91), do not state the sound-rules on which the correlations are supposed to be based. The principal exception is the Uralic corpus of Janhunen (1981), which clearly states the sound-rules (at least for vowels) joining the identified words. However, this corpus contains more sound-rules than regular correspondences, so that it too has no statistical significance.

For other issues related to the Comparative Method, including the problems with the basic regularity principle on which it is founded, see for example Fox (1995); Belardi (2002, I: 147ff.); Weinreich (1953); Weinreich, Labov & Herzog (1968); Labov (1963, 1972, 1980, 1981, 1994); Wang (1969, 1979).

In a layer on top of the results of the Comparative Method, one finds the use of the methods of Palaeo-linguistics, in which one reconstructs the homeland and way of life of an assumed ancient community based on the reconstructed words. Putting aside the problem of the lack of statistical significance of the reconstructed words, there is recognised to be a further problem with this method (see for example Renfrew (1987: 77ff.)). The meaning of words may change through time, some crucial cognate-words may disappear from some languages, cognate-words may not refer to the same object, and the spreading of technological innovations may diffuse new names throughout a vast area. These factors mean that, even if one could demonstrate that the reconstructions have statistical significance, it would still be debatable whether the method is capable of producing a window on the pre-historical past that is anything more than speculative.

In order to illustrate this situation, one can consider the reconstruction of the ancient Uralic words for flora and fauna, which have been used to help establish the location of the ancient Uralic homeland. Typically there are several reconstructed names for each relevant term, each with a variety of alternative meanings, so that it is unclear which of the words are supposed to have been used. For example, Table 1 shows the various reconstructed meanings of the eight reconstructed words for ‘reindeer’:

Table 1: The reconstructed words for ‘reindeer’

1. dog/drone/male reindeer

2. reindeer/elk

3. ox/leading reindeer

4. reindeer

5. male elk/deer/reindeer/sacrificial animal

6. domesticated reindeer

7. domesticated reindeer/sheep/cow

8. male elk/reindeer/camel

Finally, one finds that most of the relevant reconstructed words are shared with non-Uralic languages, mainly the Altaic languages and Yukaghir. In fact, the reconstructed terms for body-parts and flora & fauna are present, on average, in 2.1 non-Uralic languages, contrary to the assumptions of the model.

If one accepts the state of affairs outlined above, it becomes evident that relying on the method of Palaeo-linguistics can be dangerous in general, and in the Uralic context in particular. In the next section I am going to illustrate what represents, in my opinion, one of the most misleading instances of linguistic and extra-linguistic reconstruction within Uralic: the reconstruction of the name ‘magyar’, the self-denomination of the Hungarians, and the consequent historical and ethnic reconstruction of their origin. This example in turn will illustrate one of those interlocking networks of self-consistent reconstructions and interpretations upon which the standard Uralic theory is based, as claimed above and in Marcantonio (2002a).

2. The reconstruction of magyar and the associated re-interpretations of historical evidence

2.1. Introduction

The reconstruction of the ethnonym magyar, which has played a central role in the historical formation of the standard Uralic theory, is a paradigmatic example of the interlocking network of self-consistent reconstructions and interpretations upon which the Uralic theory appears to be founded.

All the available historical records (including Greek, Latin and Arabic sources of the 9th/10th centuries AD) that mention names similar to magyar clearly and consistently refer to Turkic tribes. They therefore contradict the Uralic theory, in which linguists claim that the Hungarian language and people originate not from the Turkic, but from the Uralic group of languages. In order to square this evidence with the dominant model, massive re-interpretation is required, as described in detail below. Commonly, no mention is made that any re-interpretation is involved, not even in the specialist literature (see for a recent example Rédei 1998: 57), so that the re-interpretation/correction, being passed on from textbook to textbook through generation after generation of scholarship, acquires the status of a ‘pseudo-fact’.

Linguistically, there are clear Turkic etymological correspondences with the term magyar, dating from early Arabic records. These correspondences also contradict the dominant linguistic model, but they usually go unmentioned in textbooks. Indeed, even in the specialist literature they are usually referred to as forming part of the unsolved “Hungaro-Bashkir complex”, as if this were an arcane detail rather than a major element of counter-evidence to the theory.

As we shall see below, linguists prefer an etymology which connects magyar to another ‘Uralic’ proper name, Mansi, the self-denomination of the Voguls. Unfortunately this etymology differs from the historically attested forms and is linguistically ad hoc. Linguists and dictionaries recognise that the etymological connection magyar-Mansi is ‘problematic’, but it is nevertheless accepted on the grounds that such a connection is ‘supported’ by the historical ‘data’, thus giving rise to a true circularity.

2.2. Magyar: the historical background

As mentioned, the (presumed) etymology of the Hungarians’ self-denomination has been central to the emergence and establishment of the conventional paradigm. In fact, it had long been known that the Hungarian Chronicles [2] indicated an unspecified Eastern homeland for the Hungarians. Between the 15th and the 17th centuries it came to be taken for granted that this Eastern homeland could be identified with an area near the Ural Mountains, called ‘Yugria’ (hence the terms ‘Uralic’ and ‘Ugric’). This belief was in turn based on the apparent similarity between the toponym Yugria (by which the area was indicated in Russian and Western European sources) and the ethnonym ‘hungarus’, the Hungarians’ external denomination. This connection was later reinforced by the discovery that one of the populations living in that area, the Voguls, called themselves ‘Mansi’, which ‘to the lay ear slightly resembles the name magyar’ (to use Kálmán’s (1988: 395) words). In other words, one of the cornerstones of the traditional paradigm (the belief that the closest relatives of the Hungarians are the Vogul/Mansi peoples) was originally based not so much on scientific arguments as on a superficial, ‘accidental’ similarity between proper names: magyar vs Mansi and hungarus vs Yugria. In the meantime, linguists and historians believed they had found early occurrences of the term magyar in the text ‘De Administrando Imperio’, written in Greek by the Byzantine emperor Constantine Porphyrogenitus between 947 and 952 AD. The testimony of this historical text was held to lend support to the Uralic origin of the Hungarians, as established by linguists.