
Co-occurrence Matrices and their Applications in Information Science: Extending ACA to the Web Environment

Leydesdorff, Loet and Vaughan, Liwen (2006) Co-occurrence Matrices and their Applications in Information Science: Extending ACA to the Web Environment.


Abstract

To be published in Journal of the American Society for Information Science & Technology 57(12) (2006) 1616-1628.

Co-occurrence matrices, such as co-citation, co-word, and co-link matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of these data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This paper discusses the difference between a symmetrical co-citation matrix and an asymmetrical citation matrix, as well as the appropriate statistical techniques that can be applied to each of these matrices. Similarity measures (such as the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical co-citation matrix, but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co-occurrence matrices to the Web environment, where the nature of the available data, and thus the data collection methods, differ from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed using both the traditional methods of multivariate analysis and the new visualization software Pajek, which is based on social network analysis and graph theory.
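The distinction between the two kinds of matrix can be illustrated with a minimal sketch. The data below are invented for illustration and do not come from the paper; the function names are likewise hypothetical. The sketch builds an asymmetrical occurrence matrix (citing documents as rows, cited authors as columns), derives the symmetrical co-occurrence matrix from it, and computes the cosine between the columns of the asymmetrical matrix to obtain a proximity matrix, which is the order of operations the abstract recommends.

```python
import math

# Hypothetical toy data: rows are citing documents, columns are cited authors.
# A 1 means the document cites that author at least once.
occurrence = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]


def column(matrix, j):
    """Return column j of a row-major matrix as a list."""
    return [row[j] for row in matrix]


def cooccurrence(matrix):
    """Symmetrical co-occurrence matrix: entry (i, j) counts the rows
    in which columns i and j are both non-zero (for binary data)."""
    n = len(matrix[0])
    cols = [column(matrix, j) for j in range(n)]
    return [
        [sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
        for i in range(n)
    ]


def cosine(u, v):
    """Cosine between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cosine_proximity(matrix):
    """Proximity matrix from the cosine between columns of the
    asymmetrical occurrence matrix."""
    n = len(matrix[0])
    cols = [column(matrix, j) for j in range(n)]
    return [[cosine(cols[i], cols[j]) for j in range(n)] for i in range(n)]


cooc = cooccurrence(occurrence)        # symmetrical co-citation counts
prox = cosine_proximity(occurrence)    # proximity matrix from the cosine

for row in cooc:
    print(row)
for row in prox:
    print([round(x, 3) for x in row])
```

In this sketch the similarity measure is computed on the asymmetrical matrix only. The co-occurrence matrix already holds pairwise counts, so it is shown for comparison and is not passed to `cosine_proximity`.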

EPrint Type: Preprint
Subjects: Bibliometrics; Information Science; Informetrics; Citation Analysis; Science Technology Studies
ID Code: 1499
Deposited On: 22 September 2006
Alternative Locations: http://www.leydesdorff.net/aca/index.htm

