Intelligent Systems: The Most Cited (“H-Index”) Papers1

Daniel E. O’Leary

This paper summarizes the most cited papers from IEEE Intelligent Systems, based on citation information gathered from ISI’s Web of Knowledge (WOK) and Google Scholar (Google). I use the so-called “H-Index” as the basis for determining how many of the most cited papers to list.

Why Study Number of Citations?

Diamond (1986) and others have noted that citations have economic value to the author. As a result, citation studies can be important to the individuals identified by the study. Further, citation studies often provide important information when faculty members are evaluated for tenure, promotion or chaired positions.

Citation studies also have been used to assess the “Scientific Wealth of Nations” (e.g., May 1997). It is therefore not surprising that citation studies also have been used to identify high-impact journals, researchers and leading universities (e.g., Adams 1998 and Garfield and Welljams-Dorof 1992). Further, citation studies have been used in attempts to determine “top” and most influential articles (e.g., Smith 2004). Since IEEE Intelligent Systems has just finished its 10th year, it is important to begin to understand where the field has been through its citations and who some of the key contributors have been.

Bibliographic Metrics

The H-Index is often used to identify frequently cited researchers. For example, there is a web page of computer scientists with H-factors greater than or equal to forty.2 To compute the “H-Index,” the number of citations of each paper by an author, in a journal, etc. is determined, and the papers are ranked, with the paper with the largest number of citations ranked first, the paper with the second largest number ranked second, etc. The H-Index is the largest rank at which the number of citations is greater than or equal to the rank. For example, if the 30th ranked paper has at least 30 citations and the 31st ranked paper has fewer than 31, then the H-Index would be 30.
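The ranking rule just described can be sketched in a few lines of Python. This is only an illustration of the definition; the function name `h_index` and the sample citation counts are made up for the example and are not part of the study.

```python
def h_index(citations):
    """Largest rank at which the citation count is >= the rank."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            # Counts only decrease from here, so no later rank can qualify.
            break
    return h


# Hypothetical citation counts for five papers.
sample = [10, 8, 5, 4, 3]
print(h_index(sample))
```

In the sample, the fourth ranked paper has 4 citations and the fifth has only 3, so the loop stops at rank 4.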

The H-Index can be generalized. For example, the “H/j-Index” can be defined as the H-Index (the largest rank at which the number of citations is greater than or equal to the rank) divided by j. Since the H-Index does not decrease over time (unless the set of papers being indexed changes), the H/j-Index would be useful for comparing newer journals or newer authors, for H > 1.
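One way to read the generalization is as the H-Index divided by a positive number j. The sketch below takes that reading as an assumption; the text does not say what j measures, so treating it as something like the age of a journal in years is purely illustrative, and the names `h_index` and `h_over_j` are hypothetical.

```python
def h_index(citations):
    """Largest rank at which the citation count is >= the rank."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def h_over_j(citations, j):
    """H-Index divided by j (assumed here to be a positive normalizer)."""
    if j <= 0:
        raise ValueError("j must be positive")
    return h_index(citations) / j


# Hypothetical example: the same citation counts, normalized by j = 2.
print(h_over_j([10, 8, 5, 4, 3], 2))
```

Because the ranked counts are non-increasing while the ranks increase, counting the ranks that satisfy the condition gives the same result as finding the largest such rank.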

ISI Web of Knowledge (WOK)

The analysis was based on data generated from Thomson’s “ISI WOK” (e.g., Howitt 1998) and employed three citation databases: “Science Citation Index Expanded,” “Social Sciences Citation Index” and “Arts & Humanities Citation Index.” Thomson’s “ISI WOK” is well known and can influence author publication choice, journal prestige and other factors through its “bibliometrics,” including the so-called “impact factor” and “citation half-life.” IEEE Intelligent Systems is among the journals indexed by ISI.

Google Scholar

However, ISI’s WOK is not the only source of citation information. This paper also analyzed the most cited papers using Google, in an effort to better understand the relationship between the number of citations in ISI and in Google. Although Google is still only in beta form, it is increasingly being used to assess citations of scholarly papers.2

Approach

This research used a systematic approach to determine the set of papers to be considered. First I found those papers that had the most ISI WOK citations and then I found the number of citations for those papers using Google.

Using the ISI WOK, I searched for all citations for “IEEE Intell*” where “*” is a wild card. I removed those entries with no authors or whose journal names indicated that they were not from IEEE Intelligent Systems. Accordingly, I removed names ending with descriptors such as “Veh” that clearly were not “IEEE Intelligent Systems.”
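The screening step above can be pictured as a small filter. The sketch below is not the procedure actually used (the screening was done by inspection); it assumes each search result has already been reduced to an author string and a journal abbreviation, and the sample entries other than the McIlraith one are invented for illustration.

```python
from fnmatch import fnmatchcase

# The text names "Veh" as one descriptor to drop; any others would be
# added here after inspecting the search results by hand.
EXCLUDED_SUFFIXES = ("VEH",)


def keep_entry(author, journal):
    """Return True if the entry plausibly belongs to IEEE Intelligent Systems."""
    name = journal.upper().strip()
    if not author.strip():
        return False  # entries with no author are dropped
    if not fnmatchcase(name, "IEEE INTELL*"):
        return False  # does not match the wild-card search at all
    return not name.endswith(EXCLUDED_SUFFIXES)


# Hypothetical search results: (author, journal abbreviation).
results = [
    ("MCILRAITH SA", "IEEE INTELL SYST APP"),
    ("", "IEEE INTELL SYST"),
    ("DOE J", "IEEE INTELL VEH"),
]
kept = [entry for entry in results if keep_entry(*entry)]
print(kept)
```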

ISI WOK has two types of citation entries. “Indexed” entries correspond to papers from the journals that ISI WOK indexes. In general, there is a unique abbreviation for each indexed journal. For those entries all authors are captured, and complete bibliographic information is provided. In the example below, “McIlraith, S.A.” is the (lead) author, “IEEE Intell Syst App” is the journal abbreviation, the publication year is 2001, the volume is 16 and the first page is 46. There were 144 citations indexed to this record. Detailed article and author information is available if “View Record” is clicked. The same information also would be provided under any co-authors.

MCILRAITH SA / IEEE INTELL SYST APP / 2001 / 16 / 46 / 144 / View Record

The other type of entry is “Not Indexed.” For these entries, information is made available only for the lead author, and bibliographic information may be incomplete. No “View Record” is attached to the citation entry, and there is not necessarily a unique abbreviation for the journal. Unfortunately, if the original bibliographic information in the entry being captured is incomplete, then the ISI WOK entry generally will be incomplete, even if the journal is one that is indexed. As a result, a portion of “indexable” entries are not associated with indexed information. Accordingly, in some cases it is arguable that the Not Indexed information can be associated with indexed items. For example, the information below was not included with the indexed record citations. However, as seen in this example, in many cases it appears that Not Indexed information could be disambiguated and included with indexed information.

MCILRAITH S / IEEE INTELL SYST / 2001 / 16 / 4653 / 1
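To make the idea of associating a Not Indexed entry with an Indexed record concrete, the sketch below parses the two sample entries shown above and tries to match them. The matching rule (same lead-author surname, year and volume, narrowed by first page where possible) is an assumption chosen for illustration, not the rule used in this study, where the association was done by inspection. The names `Entry`, `parse_entry` and `match_indexed` are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    author: str
    journal: str
    year: int
    volume: str
    page: str
    cites: int
    indexed: bool


def parse_entry(line):
    """Parse one slash-separated citation entry as laid out in the text."""
    fields = [field.strip() for field in line.split("/")]
    # Only Indexed entries carry a trailing "View Record" field.
    indexed = fields[-1].lower() == "view record"
    if indexed:
        fields = fields[:-1]
    author, journal, year, volume, page, cites = fields
    return Entry(author, journal, int(year), volume, page, int(cites), indexed)


def match_indexed(entry, indexed_entries):
    """Return the single plausible Indexed record, or None if ambiguous."""
    surname = entry.author.split()[0]  # crude: first token of the author field
    candidates = [
        rec for rec in indexed_entries
        if rec.author.split()[0] == surname
        and rec.year == entry.year
        and rec.volume == entry.volume
    ]
    # Narrow by page when the captured page begins with the indexed first page.
    by_page = [rec for rec in candidates if entry.page.startswith(rec.page)]
    if by_page:
        candidates = by_page
    return candidates[0] if len(candidates) == 1 else None


indexed = parse_entry(
    "MCILRAITH SA / IEEE INTELL SYST APP / 2001 / 16 / 46 / 144 / View Record"
)
loose = parse_entry("MCILRAITH S / IEEE INTELL SYST / 2001 / 16 / 4653 / 1")
match = match_indexed(loose, [indexed])
if match is not None:
    print(match.author, match.cites + loose.cites)
```

When two Indexed records share the same author, year and volume and the captured entry carries no usable page, the function returns None rather than guessing.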

This analysis resulted in 555 “Indexed” records with 4087 citations, with the following journal names and numbers of indexed records:

IEEE Intelligent Sys (3)

IEEE Intell Syst App (246)

IEEE Intell Syst (306)

In addition, I found 881 Not Indexed items with 1322 citations.

I tried to associate Not Indexed information with Indexed information for the more frequently cited papers. If it appeared that Not Indexed information could be associated with an Indexed record, I gathered that information. However, in some cases, I could not disambiguate the Not Indexed citations. For example, “Fensel” was particularly difficult. Apparently he had two papers in the same volume, but some captured citations referenced only the volume, making it impossible to uniquely associate a Not Indexed item with an Indexed item, based on ISI WOK information.

To find the number of citations for each paper in Google, I searched using the paper title. This yielded a citation count for all but two papers, which apparently were published under the same title in different settings, including books.

Findings

The results are summarized in Table 1. I found that after I accounted for ISI’s Not Indexed citation information, there were 30 papers with 30 or more citations, leading to an H-Index of 30. However, if Not Indexed information is not used, then the H-Index falls to 26. (One paper not listed here has 27 Indexed citations and 0 Not Indexed citations.)

Over four times as many citations were found using Google as using ISI’s WOK. Further, although the H-Index using Google information was not computed, based on the finding that the lowest and second lowest numbers of Google citations among these 30 references were 34 and 92, I would anticipate that H-Index to be substantially higher.
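The “over four times” comparison can be checked against the totals in Table 1. The short calculation below is only a worked check; it also shows the ratio when the two papers with no Google count are left out of the ISI total, which is an extra comparison not made in the text. The variable names are illustrative.

```python
# Totals copied from Table 1.
isi_total = 1841      # ISI citations (Indexed + Not Indexed), all 30 papers
google_total = 7903   # Google citations, the 28 papers with a Google count

# ISI totals of the two rows that have no Google count in Table 1
# (Maedche and Staab 2001, Fayyad 1996).
isi_without_google = 107 + 48

ratio_all = google_total / isi_total
ratio_matched = google_total / (isi_total - isi_without_google)

print(f"Google / ISI, all 30 ISI rows:   {ratio_all:.2f}")
print(f"Google / ISI, 28 rows in common: {ratio_matched:.2f}")
```

Either way the ratio is above four, roughly 4.3 over all rows and 4.7 over the rows the two sources have in common.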

Although the total number of ISI citations and the number of Google citations were highly correlated (.927), with a correlation statistically significantly different from 0, using one citation source rather than the other would change the ordering. As just one of several examples, the number 2 position would change, as would others. The largest difference, where data was available, was a difference of 15 in rank between the two sources.
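The comparison of the two sources can be sketched from the rows of Table 1 that have a Google count. The code below is an illustration under stated assumptions: it uses a plain Pearson correlation over those 28 rows, and it ranks the Google counts with ties sharing the best rank. The text does not say which correlation measure or which rows produced .927, so the printed correlation need not match that figure exactly. The helper names are hypothetical.

```python
from math import sqrt

# (paper, ISI rank in Table 1, total ISI citations, Google citations)
ROWS = [
    ("McIlraith et al. 2001", 1, 181, 924),
    ("Chandrasekaran 1999", 2, 142, 575),
    ("Hendler and McGuinness 2000", 3, 119, 634),
    ("Noy et al. 2001", 4, 118, 528),
    ("Labrou et al. 1999", 6, 94, 345),
    ("Guarino et al. 1999", 7, 85, 362),
    ("Yang and Honavar 1998", 8, 81, 369),
    ("Fensel et al. 2001", 9, 80, 463),
    ("Oreizy et al. 1999", 10, 64, 371),
    ("Hearst 1998", 11, 61, 225),
    ("Lopez et al. 1999", 12, 52, 206),
    ("Zhuge 2004", 13, 50, 130),
    ("Abecker et al. 1998", 15, 47, 295),
    ("Sabin and Weigel 1998", 15, 47, 193),
    ("Staab et al. 2001", 17, 44, 315),
    ("Lee et al. 2002", 18, 36, 34),
    ("O'Leary 1998", 18, 36, 162),
    ("Franke et al. 1998", 20, 35, 104),
    ("Gratch et al. 2002", 20, 35, 121),
    ("Cook and Holder 2000", 22, 33, 169),
    ("Mladenic 1999", 23, 32, 193),
    ("Weiss et al. 1999", 23, 32, 132),
    ("Chan et al. 1999", 25, 31, 109),
    ("Hendler 2001", 25, 31, 341),
    ("Arisha et al. 1999", 27, 30, 92),
    ("Fischer and Ostwald 2001", 27, 30, 155),
    ("Gomez-Perez and Corcho 2002", 27, 30, 148),
    ("van der Aalst 2003", 27, 30, 208),
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def competition_ranks(values):
    """Rank 1 is the largest value; tied values share the best rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(value) + 1 for value in values]


totals = [row[2] for row in ROWS]
google = [row[3] for row in ROWS]
r = pearson(totals, google)

google_ranks = competition_ranks(google)
gaps = [(abs(row[1] - g_rank), row[0]) for row, g_rank in zip(ROWS, google_ranks)]
largest_gap, paper = max(gaps)

print(f"correlation over {len(ROWS)} papers: {r:.3f}")
print(f"largest rank difference: {largest_gap} ({paper})")
```

Under these assumptions the largest gap is 15, for Hendler 2001, which is ranked 25th by ISI citations in Table 1 and 10th by Google citations among the 28 papers.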

The number of papers, as seen below, varied by year, but the correlation between year and number of citations was not statistically significant.

1996: 1

1998: 6

1999: 9

2000: 2

2001: 7

2002: 3

2003: 1

2004: 1

Contributions

This paper establishes the ISI WOK H-Index and delineates the most cited papers for IEEE Intelligent Systems. I found that without considering the Not Indexed citation information, the H-Index would have been 4 lower, 26 rather than 30, a difference of roughly 13%. In addition, for these most cited papers I found that the numbers of citations from ISI WOK and Google are highly correlated. However, I also found that the most cited order changes based on which citation source is used.

Table 1

Number of Citations for H-Index Papers

Authors / Rank / Total / ISI-Not Indexed / ISI-Indexed / Google
McIlraith et al. 2001 / 1 / 181 / 37 / 144 / 924
Chandrasekaran 1999 / 2 / 142 / 9 / 133 / 575
Hendler and McGuinness 2000 / 3 / 119 / 1 / 118 / 634
Noy et al. 2001 / 4 / 118 / 26 / 92 / 528
Maedche and Staab 2001 / 5 / 107 / 28 / 79 / n/a
Labrou et al. 1999 / 6 / 94 / 18 / 76 / 345
Guarino et al. 1999 / 7 / 85 / 6 / 79 / 362
Yang and Honavar 1998 / 8 / 81 / 2 / 79 / 369
Fensel et al. 2001 / 9 / 80 / 0 / 80 / 463
Oreizy et al. 1999 / 10 / 64 / 11 / 53 / 371
Hearst 1998 / 11 / 61 / 7 / 54 / 225
Lopez et al. 1999 / 12 / 52 / 38 / 14 / 206
Zhuge 2004 / 13 / 50 / 1 / 49 / 130
Fayyad 1996 / 14 / 48 / 0 / 48 / n/a
Abecker et al. 1998 / 15 / 47 / 7 / 40 / 295
Sabin and Weigel 1998 / 15 / 47 / 9 / 38 / 193
Staab et al. 2001 / 17 / 44 / 1 / 43 / 315
Lee et al. 2002 / 18 / 36 / 1 / 35 / 34
O'Leary 1998 / 18 / 36 / 3 / 33 / 162
Franke et al. 1998 / 20 / 35 / 4 / 31 / 104
Gratch et al. 2002 / 20 / 35 / 18 / 17 / 121
Cook and Holder 2000 / 22 / 33 / 0 / 33 / 169
Mladenic 1999 / 23 / 32 / 0 / 32 / 193
Weiss et al. 1999 / 23 / 32 / 1 / 31 / 132
Chan et al. 1999 / 25 / 31 / 2 / 29 / 109
Hendler 2001 / 25 / 31 / 0 / 31 / 341
Arisha et al. 1999 / 27 / 30 / 6 / 24 / 92
Fischer and Ostwald 2001 / 27 / 30 / 7 / 23 / 155
Gomez-Perez and Corcho 2002 / 27 / 30 / 12 / 18 / 148
van der Aalst 2003 / 27 / 30 / 12 / 18 / 208
Totals / / 1841 / 267 / 1574 / 7903
Footnotes
  1. An extended version of this paper is available on line at
References
  1. Abecker A, Bernardi A, Hinkelmann K, Kuhn O, Sintek M, “Toward a technology for organizational memories,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 13 (3): 40-48 MAY-JUN 1998.
  2. Adams, A., “Harvard Tops in Scientific Impact,” Science, 25 September 1998, Volume 281, No 5385, p. 1936.
  3. Arisha KA, Ozcan F, Ross R, Subrahmanian VS, Eiter T, Kraus S, “Impact: A platform for collaborating agents,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (2): 64-72 MAR-APR 1999.
  4. Chan PK, Fan W, Prodromidis AL, Stolfo SJ, “Distributed data mining in credit card fraud detection,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (6): 67-74 NOV-DEC 1999.
  5. Chandrasekaran B, Josephson JR, Benjamins VR, “What are ontologies, and why do we need them?” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (1): 20-26 JAN-FEB 1999.
  6. Cook DJ, Holder LB, “Graph-based data mining,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 15 (2): 32-+ MAR-APR 2000.
  7. Fayyad, U.M., “Data mining and knowledge discovery: Making sense out of data,” IEEE EXPERT-INTELLIGENT SYSTEMS & THEIR APPLICATIONS 11 (5): 20-25 OCT 1996.
  8. Fensel, D., van Harmelen, F., Horrocks, I., McGuinness, D.L., Patel-Schneider, P.F., “OIL: An ontology infrastructure for the Semantic Web,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 16 (2): 38-45 MAR-APR 2001.
  9. Fischer, G. and Ostwald, J., “Knowledge management: Problems, promises, realities, and challenges,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 16 (1): 60-72 JAN-FEB 2001.
  10. Franke, U., Gavrila, D., Gorzig, S., Lindner, F., Paetzold, F., and Wohler, C., “Autonomous driving goes downtown,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 13 (6): 40-48 NOV-DEC 1998.
  11. Garfield, E. and Welljams-Dorof, A., “Citation Data: Their Use as Quantitative Indicators for Science and Technology Evaluation and Policy Making,” Science & Public Policy, Volume 19, Number 5, pp. 321-327, 1992.
  12. Gomez-Perez, A., and Corcho, O., “Ontology languages for the Semantic Web,” IEEE INTELLIGENT SYSTEMS 17 (1): 54-60 JAN-FEB 2002.
  13. Gratch, J., Rickel, J., Andre, E., Cassell, J., Petajan, E., Badler, N., “Creating interactive virtual humans: Some assembly required,” IEEE INTELLIGENT SYSTEMS 17 (4): 54-63 JUL-AUG 2002.
  14. Guarino, N., Masolo, C. and Vetere, G., “OntoSeek: Content-based access to the Web,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (3): 70-80 MAY-JUN 1999.
  15. Hearst, M. A., “Support vector machines,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 13 (4): 18-21 JUL-AUG 1998.
  16. Hendler, J. and McGuinness, D.L., “The DARPA Agent Markup Language,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 15 (6): 72-73 NOV-DEC 2000.
  17. Hendler, J., “Agents and the Semantic Web,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 16 (2): 30-37 MAR-APR 2001.
  18. Howitt, M., “ISI Spins a Web of Science,” DATABASE, April/May 1998, pp. 37-40.
  19. Labrou, Y., Finin, T., Peng, Y., “Agent communication languages: The current landscape,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (2): 45-52 MAR-APR 1999.
  20. Lee, P.Y., Hui, S.C., Fong, ACM, “Neural networks for Web content filtering,” IEEE INTELLIGENT SYSTEMS 17 (5): 48-57 SEP-OCT 2002.
  21. Lopez, M.F., Gomez-Perez, A., Sierra, J.P., Sierra, A.P., “Building a chemical ontology using methontology and the ontology design environment,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (1): 37-46 JAN-FEB 1999.
  22. Maedche, A. and Staab, S., “Ontology learning for the Semantic Web,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 16 (2): 72-79 MAR-APR 2001.
  23. May, R., “The Scientific Wealth of Nations,” Science, 7 February 1997, Volume 275, Number 5301, pp. 793-796.
  24. McIlraith, S.A., Son, T.C., Zeng, H.L., “Semantic Web services,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 16 (2): 46-53 MAR-APR 2001.
  25. Mladenic, D., “Text-learning and related intelligent agents: A survey,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (4): 44-54 JUL-AUG 1999.
  26. Noy, N.F., Sintek, M., Decker, S., Crubezy, M., Fergerson, R.W., Musen, M.A., “Creating Semantic Web contents with Protege-2000,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 16 (2): 60-71 MAR-APR 2001.
  27. O’Leary, D.E., “Using AI in knowledge management: Knowledge bases and ontologies,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 13 (3): 34-39 MAY-JUN 1998.
  28. Oreizy, P., Gorlick, M.M., Taylor, R.N., Heimbigner, D., Johnson, G., Medvidovic, N., Quilici, A., Rosenblum, D.S., Wolf, A.L., “An architecture-based approach to self-adaptive software,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (3): 54-62 MAY-JUN 1999.
  29. Sabin, D. and Weigel, R., “Product configuration frameworks - A survey,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 13 (4): 42-49 JUL-AUG 1998.
  30. Smith, S., “Is an Article in a Top Journal a Top Article?” Financial Management, Winter 2004, pp. 133-149.
  31. Staab, S., Studer, R., Schnurr, H.P., Sure, Y., “Knowledge processes and ontologies,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 16 (1): 26-34 JAN-FEB 2001.
  32. van der Aalst W., “Don’t go with the flow: Web-services composition standards exposed,” IEEE INTELLIGENT SYSTEMS 18 (1): 72-76 JAN-FEB 2003.
  33. Weiss S. M., Apte C., Damerau F.J., Johnson D.E., Oles F.J., Goetz T., Hampp T., “Maximizing text-mining performance,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 14 (4): 63-69 JUL-AUG 1999.
  34. Yang JH, Honavar V, “Feature subset selection using a genetic algorithm,” IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS 13 (2): 44-49 MAR-APR 1998.
  35. Zhuge, H., “China’s E-science knowledge grid environment,” IEEE INTELLIGENT SYSTEMS 19 (1): 13-17 JAN-FEB 2004.
