Mobility of inventors and the geography of knowledge spillovers. New evidence on US data

Stefano Breschi*, Francesco Lissoni*†

* CESPRI, Università “L. Bocconi”, Milan (Italy)

† Dept. of Mechanical Engineering, Università di Brescia (Italy)

Abstract

In this paper we exploit new data on US inventors in Organic Chemistry, Pharmaceuticals, and Biotechnology to revisit the JTH test of the localization of knowledge spillovers (Jaffe, Trajtenberg, and Henderson; 1993). We find that inventors who patent across different companies contribute extensively to the observed citation patterns, both directly (through personal self-citations) and indirectly, by linking the various companies via a social network conducive to more citations. To the extent that the geographical mobility of these “cross-firm” inventors is quite limited, the resulting social networks and citation patterns are found to be bounded in space. We conclude that spatial distance, as measured in the JTH experiment, is just a proxy for a much more important variable, such as social distance between inventors. In a similar vein, we show that technological distance, introduced by Thompson and Fox-Kean (2005) to criticize the JTH experiment, is also a proxy for social distance.

1. Introduction

In the past 20 years, research on the geography of innovation has revolved largely around the concept of “localized knowledge spillovers” (hereafter LKS). LKSs exist insofar as scientific and technological knowledge may escape its producer’s control, and yet diffuse only locally. LKSs may explain why innovation activities are often found to be spatially clustered (Feldman, 1999).

Long supported only by circumstantial evidence, the LKS hypothesis was first tested by Jaffe, Trajtenberg and Henderson (1993; hereafter JTH). The three authors argued that knowledge spillovers may be measured by the “citations to prior art” contained in most patent documents, and produced a statistical experiment showing that such citations come disproportionately from the same geographical area as the cited patents. The experiment requires matching each citing patent to a control one, with the same technological classification, in order to compare their location in space.

The basic JTH methodology has become a classical reference for any empirical work on the geography of innovation. Very recently, however, it has been criticized by Thompson and Fox-Kean (2005; hereafter TF-K) on the basis of what the two authors regard as a technical detail: the technological classification chosen by JTH to produce the control patent sample is so loose that their experimental evidence should be discarded as spurious.

In this paper we argue that the point raised by the TF-K criticism is not merely a technical one, but the reflection of a deep conceptual problem at the roots of the entire literature on LKSs.

Proponents of the LKS concept build it on the assumption that scientific and technological knowledge is largely tacit, so that face-to-face contacts are the necessary vehicle for its diffusion; and conclude that geographical proximity is in turn a necessary condition for those contacts to take place. This line of reasoning ignores the obvious remark that if knowledge is tacit it is also private (Callon, 1984). It is certainly true that ideas walk largely on their producers’ legs, and move across space accordingly: but it is up to the producers to decide with whom to share them, or to whom to sell them (Breschi and Lissoni, 2001). Therefore, the geographic extent of knowledge spillovers is by and large controlled by inventors, up to the point that the very notion of “spillover” (which recalls the economic concept of externality) may no longer apply.

In this paper we show that the key variable affecting knowledge diffusion is not the geographical, but the social distance between patent inventors. Knowledge passes through social contacts built by scientists’ and technologists’ cross-firm mobility patterns and/or market activity, which may or may not be localized (Moen, 2000; Lamoreaux and Sokoloff, 1999).

In section 2 we sum up the key details of the JTH experiment and the TF-K criticism. Then, in section 3, we show that patents contain enough information to measure social distance quite accurately, we recall a few notions of social network analysis, and apply them to all patent applications by US inventors registered at the European Patent Office in three inter-connected fields, namely Organic Chemistry, Pharmaceuticals, and Biotechnology, from 1991 to 1999.

In section 4 we provide descriptive evidence on the extent of inventors’ mobility across firms and space, and on the resulting shape and geographical features of the social network of inventors.

In section 5 we use our patent sample to reproduce the JTH experiment; we show that inventors’ mobility across firms and social ties between inventors largely explain the original JTH results on the localization of spillovers. We also show that technological distance, as measured by TF-K, is just a proxy for social distance, so that the two authors’ argument against JTH can be read as further proof of the importance of our point.

In the Conclusions we suggest that our evidence casts some doubts on the common interpretation of citation-measured knowledge flows as spillovers.

2. The JTH experiment: methodology and criticism

2.1 JTH methodology

The JTH experiment starts with the selection of a sample of originating (cited) patents. For each originating patent, all subsequent patents citing it as prior art are then collected, with the exclusion of company self-citations, i.e. pairs of citing-cited patents assigned to the same company[1]. The addresses of inventors recorded in patent documents are then used to assign patents to a geographical area, in order to compare the locations of citing and cited patents[2].

A control sample of patents is also built. Each citing patent is matched to a randomly drawn patent, with the same technology class and application date, but no citation link to the corresponding originating patent.
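The matching rule just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually coded by JTH: the `Patent` record, its field names, the function `draw_control`, and the use of the application year to stand in for the application date are all hypothetical, and the patent records at the end are made up.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Patent:
    pid: str                        # patent identifier (hypothetical)
    tech_class: str                 # technology class used for matching
    app_year: int                   # application year (assumed proxy for the date)
    cites: frozenset = frozenset()  # identifiers of patents cited as prior art


def draw_control(citing, originating_pid, pool, rng):
    """Randomly draw a control patent for `citing`; return None if none qualifies."""
    candidates = [
        p for p in pool
        if p.pid not in (citing.pid, originating_pid)
        and p.tech_class == citing.tech_class      # same technology class
        and p.app_year == citing.app_year          # same application period
        and originating_pid not in p.cites         # no citation link to the origin
    ]
    return rng.choice(candidates) if candidates else None


# Made-up records: "c1" cites the originating patent "o1".
pool = [
    Patent("c1", "514", 1995, frozenset({"o1"})),
    Patent("x1", "514", 1995),                       # eligible control
    Patent("x2", "514", 1996),                       # wrong year
    Patent("x3", "435", 1995),                       # wrong class
    Patent("x4", "514", 1995, frozenset({"o1"})),    # also cites "o1"
]
control = draw_control(pool[0], "o1", pool, random.Random(0))
```

In this toy pool only one record satisfies all three conditions, so the draw is deterministic; with real data the random draw matters, and citing patents for which no candidate exists have to be dropped or matched more loosely.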

A test follows, which consists in comparing the frequency with which citing-cited patent pairs match geographically (at the city or state level) to the corresponding frequency for control-cited patent pairs. If the former turns out to be significantly greater than the latter, this should be interpreted as evidence of localisation effects of spillovers over and above the agglomeration effects arising from other sources[3].
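The comparison of matching frequencies can be made concrete as follows. The sketch assumes a pooled two-proportion z statistic, which is one standard way of testing a difference between two frequencies and not necessarily the exact statistic used by JTH; the function names and the area codes are illustrative, and the four pairs per sample are made up (far too few for the normal approximation to be reliable).

```python
import math


def match_rate(pairs):
    """Share of (cited_area, other_area) pairs located in the same area."""
    return sum(cited == other for cited, other in pairs) / len(pairs)


def two_proportion_z(p1, n1, p2, n2):
    """Pooled z statistic for H0: p1 == p2 (assumed form of the test)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se if se > 0 else float("nan")


# Made-up (cited area, citing area) and (cited area, control area) pairs.
citing_pairs = [("CA", "CA"), ("CA", "CA"), ("NY", "NY"), ("NY", "CA")]
control_pairs = [("CA", "NY"), ("CA", "CA"), ("NY", "CA"), ("NY", "TX")]

p_citing = match_rate(citing_pairs)     # frequency of citing-cited matches
p_control = match_rate(control_pairs)   # frequency of control-cited matches
z = two_proportion_z(p_citing, len(citing_pairs), p_control, len(control_pairs))
```

A large positive `z` would indicate that citing patents are co-located with the cited ones more often than the controls are, which is the pattern JTH read as evidence of localisation.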

The evidence reported by JTH shows indeed that citations are highly localised. Citing patents are up to two times more likely than the control patents to come from the same state, and up to six times more likely to come from the same metropolitan area.

2.2 Interpretation and the role of social networks

The mobility of R&D scientists and engineers within a localized labour market and the existence of localized markets for technologies have both been reported by various authors as potential explanations of JTH results[4].

As for labour mobility, Almeida and Kogut (1999) have replicated the JTH exercise for each US state. They find evidence of localised knowledge flows only in those few regions (most notably, the Silicon Valley) where the intra-regional mobility of inventors across companies is high.

Markets for technologies may also explain the JTH results to the extent that technology users need to consult with suppliers so frequently as to encourage co-location. Research contracts signed by the same independent inventor with different companies may produce patents that appear to be unrelated in terms of ownership, but very close in terms of technological contents and geographical distance. Evidence in this direction exists for the case of technology licensing (Mowery and Ziedonis, 2001).

The above-mentioned studies suggest that patents linked by a citation may also be personally or socially linked. A personal tie occurs whenever the same inventor is responsible for two patents from two different companies, either because he moved from one to another, or because he is an independent inventor who has sold ideas to both. A social tie exists whenever two inventors responsible for as many patents for two different companies have a common acquaintance with whom both have shared some professional experience (either a mobile inventor who has been a colleague of both; or an independent inventor with whom both have worked on a joint project). More indirect social ties may also exist, produced by similar mechanisms. These ties may or may not be concentrated in geographical space: robust social links between inventors may convey tacit information even when the inventors have just a few chances to meet personally (witness many fruitful academic cooperation experiences), or well after the inventors have last met (Agrawal, Cockburn and McHale, 2003). They are also very different from the serendipitous encounters two inventors can have thanks only to their geographical proximity, as when they meet at workshops or through friends and other non-professional acquaintances.

The original JTH experiment made use of a database with limited information on inventors. Therefore it could not tell these two major categories of social and personal ties apart, so that geographical distances turned out, in the experiment, to be a proxy for both.

Having reclassified patents according to inventors, we will be able to show the importance of personal and professional-social ties, as well as their degree of geographical localization.

2.3 Technological vs. social distance

More recently, not just the interpretation but the very methodology of the JTH experiment has come under criticism. Thompson and Fox-Kean (2005) have criticised the sample selection process followed by JTH. In order to match citing and control patents according to their technological content, JTH relied on the USCS 3-digit patent main classification[5]. TF-K observe that such a classification is too loose: randomly chosen control patents may turn out to bear little relationship to either the citing or the originating patents. If this is the case, the JTH results may simply be explained by the fact that the control patents and originating patents come from different industries with different localization patterns, while citing and originating patents (whose technological proximity is ensured by the citation link) come from the same industries. In fact, when much narrower classification schemes than the 3-digit one are chosen, and citing and control patents are made to match also at the secondary class level, the JTH experiment no longer produces the expected results.

In a reply to the TF-K remarks, Jaffe, Trajtenberg and Henderson (2005) observe that too strict a technological match has the undesirable side effect of limiting too much the size of the patent samples. They also observe that knowledge spillovers captured by patents necessarily imply an act of invention, since pure imitation does not give rise to any patent. Therefore, one must allow for some technological distance between patents: spillovers generate inventions, which are original enough not to be classified in the same main and secondary more-than-3-digit class as the originating patent.

We remark here that this dispute is not merely a technical one, as the authors involved seem to presume.

As long as individual inventors and teams are necessarily specialized, a narrow technological focus directs our attention only towards a relatively small and self-contained technological community. The closer two patents are (technologically), the more likely it is that they come from the same people, whether the two patents are linked by a citation or not. Therefore, very similar citing and control patents are very likely to come from the same inventors, or from closely associated inventors, who share mutual professional acquaintances, having worked with the same colleagues.

A broader technological focus allows more inventors to come to our attention, whose social ties may be looser. The underlying technological community will be broader, more heterogeneous, and more loosely connected. Therefore we may expect that, by extracting control patents from such a broad sample, the number of patents personally or socially connected to the originating ones will be relatively low. To the extent that the labour markets and markets for technologies are localized in space, this will result in a relatively low level of geographical co-location.

In other words, technological distance is also a proxy for social distance: by allowing it to increase we miss out on crucial personal and social ties; by cutting it too short we end up dealing always with the same people.

3. Methodology and data

3.1 Social networks: methodology and definitions

Our methodology exploits information recorded in patent documents regarding the names, surnames, addresses and company affiliation of each inventor (Breschi and Lissoni, 2004 and 2006). The following hypothetical example illustrates the main idea (see Figure 1). Consider five patents (1 to 5) and four assignees. One assignee owns two patents (1 and 2), while the other three own one each. The patents have been produced by thirteen distinct inventors (A to M). For example, patent 1 has been produced by a team comprising inventors A, B, C, D and E. It is reasonable to assume that, due to the collaboration in a common research project, these five inventors are socially linked by some kind of knowledge sharing. The existence of such a linkage can be graphically represented by drawing an undirected edge between each pair of inventors, as in the bottom part of Figure 1.

Figure 1 – Bipartite graph of patents and inventors

Source: Breschi and Lissoni (2004, 2006)

Repeating the same exercise for each team of inventors, we end up with a map representing the network of all inventors. Using the graph just described, we can measure how connected two patents are. In order to see how, we first give a few definitions:

i) For any pair of inventors, one can measure the distance between the two by calculating the so-called geodesic distance. The geodesic distance is defined as the minimum number of edges that separate two distinct inventors in the network[6]. In Figure 1, for example, the geodesic distance between inventors A and C is equal to 1, whereas the same distance for inventors A and H is 3. While A and C shared their knowledge directly while working on patent 1, A and H are more likely to have exchanged some word-of-mouth technical information through the mediation of other actors (such as B and F).

ii) Inventors may belong to the same social component or they may be located in socially disconnected components. A component of a graph can be defined as a subset of the entire graph, such that all nodes included in the subset are connected through some path. In Figure 1, for example, inventors A to K belong to the same component, whereas inventors L and M belong to a different component. A pair of inventors belonging to two distinct components has a distance equal to infinity (i.e. there is no path connecting them).

iii) We define as a cross-firm inventor any inventor whose name has been reported in patent documents assigned to different organizations. Inventors of this kind play a fundamental role in connecting teams of inventors belonging to different organizations. For example, in Figure 1, inventor F worked for two different companies, thus connecting the team of inventors (B,D) with the team of inventors (H,I). Similarly, inventor G worked for two different companies, thus connecting the team (B,D,F) with the team (I,J,K).
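The three definitions can be illustrated with a short Python sketch. Only the team of patent 1, the two components, the roles of F and G, and the two quoted distances come from the text; the remaining team compositions and the assignee labels (`firm_1` to `firm_4`) are an assumed reconstruction chosen to be consistent with those facts, and may differ from the actual Figure 1. Function names are hypothetical.

```python
import math
from collections import deque
from itertools import combinations

# patent -> (assignee, team). Assumed reconstruction, see the note above.
TEAMS = {
    1: ("firm_1", ["A", "B", "C", "D", "E"]),
    2: ("firm_1", ["B", "D", "F", "G"]),
    3: ("firm_2", ["F", "H", "I"]),
    4: ("firm_3", ["G", "I", "J", "K"]),
    5: ("firm_4", ["L", "M"]),
}


def build_graph(teams):
    """Undirected co-inventorship graph: one edge per pair of team mates."""
    graph = {}
    for _, team in teams.values():
        for inventor in team:
            graph.setdefault(inventor, set())
        for a, b in combinations(team, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def distances_from(graph, source):
    """Breadth-first search: geodesic distance from `source` to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def geodesic(graph, a, b):
    """Minimum number of edges between a and b; infinity if disconnected."""
    return distances_from(graph, a).get(b, math.inf)


def components(graph):
    """List of components, each a set of mutually reachable inventors."""
    seen, result = set(), []
    for node in graph:
        if node not in seen:
            component = set(distances_from(graph, node))
            seen |= component
            result.append(component)
    return result


def cross_firm_inventors(teams):
    """Inventors listed on patents assigned to more than one organization."""
    firms = {}
    for assignee, team in teams.values():
        for inventor in team:
            firms.setdefault(inventor, set()).add(assignee)
    return {inventor for inventor, owners in firms.items() if len(owners) > 1}


graph = build_graph(TEAMS)
```

Under this reconstruction `geodesic(graph, "A", "C")` and `geodesic(graph, "A", "H")` reproduce the distances quoted in definition i), and `cross_firm_inventors(TEAMS)` returns F and G together with I, who in the assumed teams also appears on patents of two different assignees.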

Using these definitions, we may now turn to illustrate how the existence of a linkage between patents can be ascertained. Three possible relations exist between any pair of patents from different firms:

1) The two patents exhibit no social connection, such as when the inventors behind them belong to socially disconnected components[7].

2) The two patents are linked by a social connection, such as when their inventors belong to the same social component. We also calculate the social distance between patents as the geodesic distance between the two closest individuals from the two teams of inventors (minimum geodesic distance)[8]. As such, the social distance between two socially connected patents may take any positive integer value, starting from 1.

3) The two patents are linked by a personal connection, such as when at least one inventor belongs to both patents’ teams. The social distance between two personally connected patents is zero.[9]
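The three relations can be computed with a single breadth-first search started simultaneously from all members of one team, which returns the minimum geodesic distance to the other team. The sketch below is only illustrative: the teams are the same assumed reconstruction of the hypothetical example (not necessarily the actual Figure 1), the function names are hypothetical, and the network is built from all patents at once, without regard to their dates.

```python
import math
from collections import deque
from itertools import combinations

# patent -> team of inventors. Assumed reconstruction of the hypothetical example.
TEAMS = {
    1: ["A", "B", "C", "D", "E"],
    2: ["B", "D", "F", "G"],
    3: ["F", "H", "I"],
    4: ["G", "I", "J", "K"],
    5: ["L", "M"],
}


def co_inventor_graph(teams):
    """Undirected graph with an edge between every pair of co-inventors."""
    graph = {}
    for team in teams.values():
        for inventor in team:
            graph.setdefault(inventor, set())
        for a, b in combinations(team, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def social_distance(graph, team_a, team_b):
    """Minimum geodesic distance between any member of team_a and of team_b."""
    targets = set(team_b)
    dist = {inventor: 0 for inventor in team_a}   # multi-source BFS
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if node in targets:        # nodes leave the queue in order of distance,
            return dist[node]      # so the first target reached is the closest
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return math.inf                # the teams lie in disconnected components


def relation(distance):
    """Map a social distance onto the three relations defined in the text."""
    if distance == 0:
        return "personal"
    if distance == math.inf:
        return "none"
    return "social"


graph = co_inventor_graph(TEAMS)
d_13 = social_distance(graph, TEAMS[1], TEAMS[3])   # B and F are co-inventors
d_23 = social_distance(graph, TEAMS[2], TEAMS[3])   # F is on both teams
d_15 = social_distance(graph, TEAMS[1], TEAMS[5])   # L and M are isolated
```

Because a shared inventor is both a source and a target of the search, personally connected patents come out at distance zero without any special case.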

A limitation of this approach relates to the absence of rules to establish the decay of social links. In fact, we know for sure when two inventors come into contact, namely when they work together on the same patent for the first time. But we cannot be sure they keep in touch (and exchange information) after that common experience, unless we find them working on more joint patents in the following years. Some contacts established through co-inventorship may be dropped by one or both parties, but we do not know which ones. In addition, a network whose social ties are never cancelled grows cumulatively over time, up to the point that its size and complexity overwhelm our computational resources.