Methods of Social Network Analysis Explained with the Help of Collaboration Networks in COLLNET

Hildrun Kretschmer

Department of Library and Information Science, Humboldt-University of Berlin, Germany

Abstract

The use of network analysis has increased rapidly in several scientific disciplines over the past few decades. Social network analysis (SNA) has been developed mainly in sociology and social psychology, in collaboration with mathematics, statistics and computer science.

Social network analysis can also be used successfully in the information sciences, as well as in studies of collaboration in science. Several methods of social network analysis are explained here with the help of collaboration networks in COLLNET.

The growing importance of collaboration in research and the still underdeveloped state-of-the-art of research on collaboration have encouraged scientists from more than 20 countries to establish in 2000 a global interdisciplinary research network under the title “Collaboration in Science and in Technology” (COLLNET) with Berlin as its virtual centre.

The intention is to work together in co-operation both on theoretical and applied aspects.

Introduction

The increase in scientific-technical collaboration in the course of history has been vividly documented through a number of analytical studies.

For example, it has been shown that between 1650 and 1800 not more than 2.2% of scientific papers were published in co-authorship.

By contrast, the second half of the 20th century was characterized worldwide by teamwork and co-authorship in the natural sciences and in medicine: about 60-70% of the scientific papers published during this period were co-authored (deB. Beaver & Rosen 1978; 1979a & b).

With the importance of collaboration in research and technology growing worldwide, it has become necessary to examine the processes involved in order to understand the implications for the future organization of research and for science and technology policy. This has led to an international increase in the number of scientific studies of this topic (Glänzel 2002; Borgman & Furner 2002).

The outstanding works of Donald deB. Beaver (1978), Derek John de Solla Price (1963) and others on the topic of collaboration in science have, over a number of years, encouraged a number of scientists working in the field of quantitative scientific research to concentrate their research in this field.

This has led both to an increase in the number of relevant publications on this topic in international journals and to an increase in the number of lectures at international conferences (Basu 2001, Braun et al. 2001, Davis 2001, Havemann 2001, Wagner-Döbler 2001, Kundra & Tomov 2001).

By all accounts, this field of research is required to be a comprehensive and diversified area ranging from small group research in social psychology/sociology to large network analyses conducted into international co-authorship or citation networks, including the concomitant observation of informal communication via interviews or interrogative surveys on bibliometrical analyses.

A common bibliometric method for measuring cooperation is the analysis of co-authorship networks. A suitable webometric method has to be developed in the future.

There are various references to the positive effect of "multi-authored papers" in the co-authorship network: for example, several studies show that international cooperation is linked with a higher "citation impact" (Glänzel 2002).

The investigation of these processes can be made by analyses at the micro level (individuals), at the meso level (institutions) or at the macro level (countries) (Glänzel 2002).

In the field of science studies one most frequently comes across investigations on international cooperation in science, followed by cooperation relationships between institutions.

The last few years have seen growing attention to these international issues. However, this trend has still failed to produce a fundamental and interrelated theory on the theme "Collaboration in science and in technology". The different approaches taken so far have revealed a lack of integration. On account of the diversity of these issues, promising results can be obtained only against the backdrop of an interdisciplinary approach and from an intercultural viewpoint.

Both aspects are of basic importance in COLLNET.

In summary:

The worldwide rise in collaboration in science and technology, at both national and international level, has assumed such importance that there is now an urgent need to study these processes with a view to acquiring fundamental knowledge for organizing future research and applying it to science and technology policy.

Foundation of COLLNET

Therefore, in the year 2000 the time had come to create a global interdisciplinary research network, COLLNET, on the topic "Collaboration in Science and in Technology", with 64 members from 20 countries on all continents.

The members intended to work in cooperation on both theoretical and applied aspects of this topic.

The focus of this group is to examine the phenomena of collaboration in science, its effect on productivity, innovation and quality, and the benefits and outcomes accruing to individuals, institutions and nations from collaborative work and co-authorship in science.

Web site:

Journal:

Journal of Information Management and Scientometrics

(Incorporating the COLLNET Journal)

COLLNET Meetings (2000-2006):

- First COLLNET Meeting, September 2000, Berlin, Germany

- Second COLLNET Meeting, February 2001, New Delhi, India

- Third COLLNET Meeting, July 2001, Sydney, Australia

- Fourth COLLNET Meeting, August 2003, Beijing, China

- Fifth COLLNET Meeting, March 2004, Roorkee, India

- Sixth COLLNET Meeting, July 2005, Stockholm, Sweden

- Seventh COLLNET Meeting, May 2006, Nancy, France

Papers in Co-authorship between COLLNET Members:

223 co-authored papers (lifetime, starting before official foundation of COLLNET)

The establishment of COLLNET was reported in a special issue of the international journal Scientometrics. This report outlined the work of both the first and second meetings (Kretschmer, H., L. Liang and R. Kundra, 2001).

The areas of expertise represented by member scientists in COLLNET are varied: mathematics, physics, chemistry, biology, medicine, history of science, social sciences and psychology. The team includes many senior scientists such as directors and/or deputy directors of large establishments, organizers and/or deputy organizers of world conferences in the field of scientometrics and informetrics, as well as winners of the Derek John de Solla Price Medal.

Among these are board members of the International Society for Scientometrics and Informetrics (ISSI), members of the German Society for Psychology and advisors to the international journal, Scientometrics. Current principal investigators, mainly from the field of quantitative scientific research (scientometrics and informetrics), engage in teamwork on the nature, characteristics, growth and policy relevance of collaboration and co-author networks. It is proposed to include in future more experts from other fields of scientific research and particularly from the social sciences, such as psychology and sociology.

COLLNET has been an important catalyst for research on collaboration and has provided opportunities for members to meet face to face at various international conferences such as at ISSI conferences (held every two years since 1987).

However, none of these international conferences is focussed solely on issues relating to collaboration or collaborative networks. The establishment of COLLNET in 2000 has therefore opened an important forum in which ideas and work on these issues are exchanged. Closer personal contact between members inevitably leads to formal and informal agreements on collaborative projects on these crucial issues in research production.

Growth of Collaboration/Communication Structures in COLLNET Since 2000

Two studies are presented:

- Development of informal and formal contacts between COLLNET members studied by questionnaires

- Social Network Analysis of COLLNET

Development of informal and formal contacts between COLLNET members studied by questionnaires

The questionnaire distributed to all of the COLLNET members asked for the following details:

- Names of those COLLNET members with whom informal (loose) contacts exist in some form (either as e-mail or exchange of reprints).

- Names of those with whom formal (intensive) contacts exist in the form of discussions on common projects with definitive titles or in the form of co-authorship of joint papers.

The development of collaborative growth within the framework of COLLNET has been illustrated in Figures 2, 3 and 4.

Fig. 2 shows the number of informal (loose) contacts among the COLLNET-members at the time of the Second COLLNET Meeting in February 2001.

All COLLNET members are grouped by country. Sixteen countries participated in COLLNET in February 2001. The line joining the front corner of Fig. 2 (1/1) to the opposite rear corner (16/16) represents the main diagonal, on which the contacts among COLLNET members of the same country are plotted. As Fig. 2 shows, in February 2001 the largest numbers of informal (loose) contacts were among COLLNET members within Germany (1/1) and between Germany and India (1/2). Informal contacts between other countries can also be observed.
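The country-wise compilation of contacts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the member names, country indices and contact list are hypothetical, and the function name `country_contact_matrix` is not taken from the study. Cells (i, i) correspond to the main diagonal, i.e. contacts within one country.

```python
from collections import Counter

# Hypothetical data: country index of each member (1-based, like the figure axes).
member_country = {"m1": 1, "m2": 1, "m3": 2, "m4": 3}

# Hypothetical questionnaire result: undirected contacts between members.
contacts = [("m1", "m2"), ("m1", "m3"), ("m2", "m3"), ("m3", "m4")]

def country_contact_matrix(contacts, member_country):
    """Count contacts per unordered country pair (i, j) with i <= j."""
    counts = Counter()
    for a, b in contacts:
        i, j = sorted((member_country[a], member_country[b]))
        counts[(i, j)] += 1
    return counts

matrix = country_contact_matrix(contacts, member_country)

# Contacts on the main diagonal are those within the same country.
within = sum(n for (i, j), n in matrix.items() if i == j)
share_within = within / len(contacts)
print(matrix[(1, 1)], matrix[(1, 2)], share_within)
```

A high value of `share_within` would correspond to a figure in which almost all contacts lie on the main diagonal.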

Fig. 3 shows the number of formal (intensive) contacts (joint projects or papers with definitive titles) as of the date of establishment of COLLNET, 1 January 2000.

Fig. 4 shows the increase in these formal contacts over the one and a half years preceding the 3rd COLLNET Meeting.

[Fig. 2] [Fig. 3] [Fig. 4]

It can be seen from the main diagonal in Fig. 3 that at the time when COLLNET was established, almost all the formal (intensive) contacts existed only among members belonging to the same country of origin.

However, Fig. 4 shows that during the subsequent period, the intensive contacts had expanded across the different countries. Fig. 4 resembles Fig. 2 in the graphical structural representation of informal (loose) contact.

This observation gives rise to the assumption that thanks to the development of a stronger COLLNET network, the loose contacts introduced through COLLNET have been progressively transformed into intensive contacts, thus fostering the development of a truly international research network.

Social Network Analysis of COLLNET

Sample Set

The bibliographic data of the 64 COLLNET members were examined. Among these members were:

- 26 female and 38 male scientists

- 30 members from the European Union (EU) and 34 from non-European Union countries (N)

The 34 members from non-European Union countries (N) comprise:

- 3 from Australia

- 7 from America (4 of them from North America)

- 19 from Asia

- 4 from Eastern Europe

- 1 from South Africa

The most recent COLLNET data are from June 2003.

Data and Methods

Assuming that collaboration is not reflected only in articles indexed in the SCI or other databases, all 64 COLLNET members were asked for their complete bibliographies, regardless of the type of publication and regardless of the date of publication.

From these bibliographies all publications were selected that appeared in co-authorship between at least two COLLNET members.

Thus, the data set comprises 223 multi-authored publications.

From this, the respective number of common publications between two members was determined as the basis for the analysis of the co-authorship network (SNA).
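Counting the common publications of each pair of members can be sketched as follows. The author names and the publication list are hypothetical, and `copublication_counts` is an illustrative helper, not the procedure actually used in the study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: for each publication, the authors who are network members.
publications = [
    ["Alice", "Bob"],
    ["Alice", "Bob", "Carol"],
    ["Carol", "Dave"],
]

def copublication_counts(publications):
    """Return the number of common publications for every pair of members."""
    counts = Counter()
    for authors in publications:
        # Each unordered pair of distinct co-authors shares this publication.
        for pair in combinations(sorted(set(authors)), 2):
            counts[pair] += 1
    return counts

weights = copublication_counts(publications)
print(weights[("Alice", "Bob")])
```

The resulting pair counts can serve as link weights in the co-authorship network; a pair that never published together simply has no entry.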

The co-authorship network developed according to this method covers the entire lifetime collaboration between the COLLNET members.

Developmental and structural formation processes in the bibliographic networks are studied.

For information and as a brief overview, the classification of the 223 multi-authored publications by type is shown below:

CATEGORIES                                  NUMBER
1. Articles in Scientometrics                   55
2. Articles in JASIS                            13
3. Papers in monographs                         68
4. Papers from conference proceedings           77
5. Books                                        10
Total Sum                                      223

Social Network Analysis (SNA): Methods

Otte and Rousseau (2002) recently showed that social network analysis (SNA) can be used successfully in the information sciences, as well as in studies of collaboration in science.

The authors presented interesting results by way of an example: the co-authorship network of those scientists who work in the area of social network analysis.

Otte and Rousseau refer in their paper to the variety of the application possibilities of SNA, as well as to the applicability of SNA to the analysis of social networks in the Internet (webometrics, cybermetrics).

Introduction to SNA

(copied partly from the paper by Otte and Rousseau, 2002)

Network studies are a topic that has gained increasing importance in recent years. The fact that the Internet is one large network is not foreign to this. Social network theory directly influences the way researchers nowadays think and formulate ideas on the Web and other network structures such as those shown in enterprise interactions. Even within the field of sociology or social psychology network studies are becoming increasingly important.

In their article, Otte and Rousseau study social network analysis and show how this topic may be linked to the information sciences. It goes without saying that Internet studies are also to be mentioned, as the WWW represents a social network of a scale unprecedented in history.

Interest in networks, and in particular in social network analysis, has only recently bloomed in sociology and in social psychology.

There are, however, many related disciplines where networks play an important role. Examples are computer science and artificial intelligence (neural networks), recent theories concerning the Web and free market economy, geography and transport networks.

In informetrics researchers study citation networks, co-citation networks, collaboration structures and other forms of social interaction networks.

What is social network analysis?

(copied partly from the paper by Otte and Rousseau 2002)

Social network analysis (SNA), sometimes also referred to as ‘structural analysis’, is not a formal theory, but rather a broad strategy of methods for investigating social structures.

The traditional individualistic social theory and data analysis considers individual actors making choices without taking the behaviour of others into consideration.

This traditional individualistic approach ignores the social context of the actor. One could say that properties of actors are the prime concern here.

In SNA, however, the relations between actors become the first priority, and individual properties are only secondary.

Social network analysis conceptualises social structure as a network with ties connecting members and focuses on the characteristics of ties rather than on the characteristics of the individual members.

One distinguishes two main forms of SNA: the ego-network analysis, and the global network analysis. In ‘ego’ studies the network of one person is analysed. An example in the information sciences is White’s description of the research network centred on Eugene Garfield. In global network analyses one tries to find all relations between the participants in the network.
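The difference between the two forms can be illustrated with a short sketch that extracts an ego network from a global network. The edge list and the function name `ego_network` are hypothetical and serve only as an example.

```python
# Hypothetical global network given as a list of undirected links.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

def ego_network(edges, ego):
    """Return the neighbours of ego and all links among ego and its neighbours."""
    neighbours = {b if a == ego else a for a, b in edges if ego in (a, b)}
    members = neighbours | {ego}
    ego_edges = [(a, b) for a, b in edges if a in members and b in members]
    return neighbours, ego_edges

neighbours, ego_edges = ego_network(edges, "A")
print(sorted(neighbours), ego_edges)
```

A global analysis would work with the full list `edges`, whereas the ego analysis for A keeps only A, its direct contacts and the links among them.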

Growth in the number of published articles in the field of SNA

The figure below clearly shows the fast growth of the field in recent years. More specifically, the real growth began around 1981, and there is no sign of decline.

Growth of social network analysis by Otte and Rousseau

Some notions from graph theory

(copied partly from papers by Otte and Rousseau 2002):

Directed and undirected graphs

A directed graph G, in short: digraph, consists of a set of nodes, denoted as N(G), and a set of links (also called arcs or edges), denoted as L(G). In this text the words ‘network’ and ‘graph’ are synonymous.

In sociological research nodes are often referred to as 'actors'. A link e is an ordered pair (X, Y) representing a connection from node X to node Y.

Node X is called the initial node of link e, X = init(e), and node Y is called the final node of the link, Y = fin(e). If the direction of a link is not important or, equivalently, if the existence of a link from X to Y necessarily implies the existence of a link from Y to X, we say that this network is an undirected graph.

A path from node X to node Y is a sequence of distinct links (X, u_1), (u_1, u_2), …, (u_k, Y).

[Figure: example graph with nodes A, B, C and D]

The length of this path is the number of links.

The length of the path from A to D can be 1 or 2 or 3.
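To make the notion of path length concrete, the following sketch lists the lengths of all paths between two nodes that visit no node twice. The edge set is an assumption: it is a hypothetical four-node graph chosen only so that paths of length 1, 2 and 3 exist between A and D, since the links of the original figure are not given in the text.

```python
# Hypothetical undirected graph as an adjacency list (edge set is an assumption).
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["A", "C"],
}

def simple_path_lengths(adj, start, goal):
    """Return the sorted lengths of all paths from start to goal
    that do not visit any node twice."""
    lengths = set()

    def walk(node, visited, length):
        if node == goal:
            lengths.add(length)
            return
        for nxt in adj[node]:
            if nxt not in visited:
                walk(nxt, visited | {nxt}, length + 1)

    walk(start, {start}, 0)
    return sorted(lengths)

print(simple_path_lengths(adj, "A", "D"))
```

In this assumed graph the three paths are A-D, A-C-D and A-B-C-D. The shortest of them defines the distance between A and D.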

In this article we only use undirected graphs. Consequently, the following definitions are only formulated for that case.

A co-authorship network is an example of an undirected graph: if author A co-authored an article with author B, then author B automatically co-authored an article with A. An undirected graph can be represented by a symmetric matrix M = (m_XY), where m_XY is equal to 1 if there is an edge between nodes X and Y, and m_XY is equal to 0 if there is no direct link between nodes X and Y.
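The matrix representation can be sketched as follows. The node and link lists are hypothetical, and `adjacency_matrix` is an illustrative helper name.

```python
def adjacency_matrix(nodes, edges):
    """Build the symmetric 0/1 matrix M of an undirected graph."""
    index = {name: k for k, name in enumerate(nodes)}
    m = [[0] * len(nodes) for _ in nodes]
    for x, y in edges:
        # An undirected link sets both m_XY and m_YX.
        m[index[x]][index[y]] = 1
        m[index[y]][index[x]] = 1
    return m

# Hypothetical co-authorship links between four authors.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("C", "D")]
M = adjacency_matrix(nodes, edges)
for row in M:
    print(row)
```

Because every link is entered in both directions, M always equals its own transpose, which is what "symmetric" means here.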

Components

A component of a graph is a subset with the characteristic that there is a path between any node and any other one of this subset. If the whole graph forms one component it is said to be totally connected.

[Figure: example graph with nodes A, B, C, D, E and F]

There are 2 components above.
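Components can be found with a breadth-first search that starts from each node not yet visited. The sketch below assumes a hypothetical six-node graph in which A, B, C and D form one component and E and F the other; the individual links are an assumption, as they are not stated in the text.

```python
from collections import deque

# Hypothetical undirected graph with two components (links are assumed).
adj = {
    "A": ["B", "C"],
    "B": ["A"],
    "C": ["A", "D"],
    "D": ["C"],
    "E": ["F"],
    "F": ["E"],
}

def connected_components(adj):
    """Return the components of an undirected graph as sorted node lists."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return components

components = connected_components(adj)
print(len(components), components)
```

A graph is connected in the sense used above exactly when this function returns a single component.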

Next we define some indicators describing the structure (cohesion) of networks and the role played by particular nodes.

Many more are described in the literature, but we will restrict ourselves to these elementary ones.

The Degree Centrality of a node A is equal to the number of his/her collaborators or co-authors. An actor (node) with a high degree centrality is active in collaboration. He/she has collaborated with many scientists.

The Degree Centrality in a V-node network can be standardised by dividing by V-1:

DC_A^s = DC_A / (V - 1)

Example above: DC_A^s = 3/3 = 1
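A minimal sketch of both variants of the Degree Centrality follows. The graph is an assumption: a hypothetical four-node network in which A has collaborated with B, C and D, chosen because it reproduces the value 3/3 = 1 of the example.

```python
# Hypothetical co-authorship network: A has collaborated with B, C and D.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A"],
}

def degree_centrality(adj, node, standardised=False):
    """Number of collaborators of node, optionally divided by V - 1."""
    degree = len(set(adj[node]))
    if standardised:
        return degree / (len(adj) - 1)
    return degree

print(degree_centrality(adj, "A"), degree_centrality(adj, "A", standardised=True))
```

The standardised value lies between 0 and 1, which makes nodes from networks of different size comparable.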

Mean Degree Centrality (MDC) of the network is the ratio of the sum of the Degree Centralities of all the nodes in the network to the total number of nodes:

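MDC = 2L/V, where L is the number of links and V is the number of nodes. Each link adds one to the degree of both of its end nodes, so the Degree Centralities of all nodes sum to 2L.

Example above: MDC = 2*3/4 = 1.5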
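The two ways of obtaining the Mean Degree Centrality, averaging the node degrees or computing 2L/V from the number of links L and the number of nodes V, can be checked against each other. The link list is an assumption: a hypothetical network with V = 4 and L = 3 that matches the numbers of the example.

```python
# Hypothetical network with V = 4 nodes and L = 3 links.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("A", "D")]

def mean_degree_from_links(nodes, edges):
    """MDC computed as 2L / V."""
    return 2 * len(edges) / len(nodes)

def mean_degree_from_degrees(nodes, edges):
    """MDC computed as the sum of all node degrees divided by V."""
    degree = {n: 0 for n in nodes}
    for x, y in edges:
        # Every link contributes to the degree of both end nodes.
        degree[x] += 1
        degree[y] += 1
    return sum(degree.values()) / len(nodes)

mdc = mean_degree_from_links(nodes, edges)
print(mdc, mean_degree_from_degrees(nodes, edges))
```

Both functions return the same value for any undirected graph without self-loops, because each link is counted once at each of its two ends.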

Betweenness Centrality BC_A is the number of shortest paths (distance d_XY) between other nodes that pass through A.

Otte and Rousseau mention that actors (nodes) with a high betweenness play the role of connecting different groups or are 'middlemen'.

Wasserman and Faust (1994, p. 188) mention: “Interactions between two nonadjacent actors might depend on the other actors in the set of actors who lie on the paths between the two. These ‘other’ actors potentially might have some control over the interactions between the two nonadjacent actors.” A particular “other” actor in the middle, the one between the others, has some control over paths in the network.
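Betweenness can be sketched with breadth-first searches that count shortest paths. The version below is one possible reading, not necessarily the exact indicator used in the study: for every pair of other nodes it adds the fraction of their shortest paths that pass through the node in question, which is the usual convention when several shortest paths exist and reduces to a plain count when the shortest path of each pair is unique. The graph and the function names are hypothetical.

```python
from collections import deque

# Hypothetical network: A links B and C to D, and D links E to the rest.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A", "E"],
    "E": ["D"],
}

def shortest_path_counts(adj, source):
    """Distances from source and number of shortest paths to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
    return dist, sigma

def betweenness(adj, node):
    """Sum over pairs (s, t) of the share of shortest s-t paths through node."""
    dist_n, sigma_n = shortest_path_counts(adj, node)
    others = [v for v in adj if v != node]
    total = 0.0
    for i, s in enumerate(others):
        dist_s, sigma_s = shortest_path_counts(adj, s)
        for t in others[i + 1:]:
            if node not in dist_s or t not in dist_s:
                continue  # s cannot reach node or t, so no path to consider
            if dist_s[node] + dist_n[t] == dist_s[t]:
                total += sigma_s[node] * sigma_n[t] / sigma_s[t]
    return total

print(betweenness(adj, "A"), betweenness(adj, "D"), betweenness(adj, "B"))
```

In this assumed network A lies on the shortest paths of five pairs and D on those of three, while B lies on none: A and D are the "middlemen" in the sense described above.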