Classification and Powerlaws: The logarithmic transformation

Leydesdorff, Loet and Bensman, Stephen (2006) Classification and Powerlaws: The logarithmic transformation.

Abstract

Published in Journal of the American Society for Information Science and Technology 57(11) (2006) 1470-1486.

Logarithmic transformation of the data has been recommended by the literature in the case of highly skewed distributions such as those commonly found in information science. The purpose of the transformation is to make the data conform to the lognormal law of error for inferential purposes. How does this transformation affect the analysis? We factor analyze and visualize the citation environment of the Journal of the American Chemical Society (JACS) before and after a logarithmic transformation. The transformation strongly reduces the variance necessary for classificatory purposes and therefore is counterproductive to the purposes of the descriptive statistics. We recommend against the logarithmic transformation when sets cannot be defined unambiguously. The intellectual organization of the sciences is reflected in the curvilinear parts of the citation distributions, while negative powerlaws fit excellently to the tails of the distributions.
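The variance-reducing effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the counts below are made up, the use of ln(1 + x) to keep zero counts defined is an assumption, and the coefficient of variation is used here only as a convenient scale-free measure of spread.

```python
import math
import statistics


def log_transform(counts):
    """Return ln(1 + x) for each count; the +1 (an assumption) keeps zeros defined."""
    return [math.log1p(x) for x in counts]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (scale-free spread)."""
    return statistics.pstdev(values) / statistics.mean(values)


# Hypothetical citation counts with a long right tail (illustrative only).
raw_counts = [0, 1, 2, 3, 5, 8, 20, 50, 200, 5000]
logged = log_transform(raw_counts)

cv_raw = coefficient_of_variation(raw_counts)
cv_log = coefficient_of_variation(logged)

print(f"relative spread before transformation: {cv_raw:.2f}")
print(f"relative spread after transformation:  {cv_log:.2f}")
```

In this toy vector the transformation preserves the rank order of the counts but compresses the differences between them, which is the kind of lost variation the abstract argues is needed for classification.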

EPrint Type: Preprint
Keywords: classification, citation, journal, logarithmic transformation, powerlaw
Subjects: Science Technology Studies
ID Code: 1496
Deposited On: 21 September 2006
Alternative Locations: http://www.leydesdorff.net/log05/
