
Internet Categorization and Search: A Self-Organizing Approach

Chen, Hsinchun and Schuffels, Chris and Orwig, Richard E. (1996) Internet Categorization and Search: A Self-Organizing Approach. Journal of Visual Communication and Image Representation, Special Issue on Digital Libraries 7(1):pp. 88-102.

Abstract

The problems of information overload and vocabulary differences have become more pressing with the emergence of increasingly popular Internet services. The main information retrieval mechanisms provided by the prevailing Internet WWW software are based on either keyword search (e.g., the Lycos server at CMU, the Yahoo server at Stanford) or hypertext browsing (e.g., Mosaic and Netscape). This research aims to provide an alternative concept-based categorization and search capability for WWW servers, based on selected machine learning algorithms. Our proposed approach, which is grounded in automatic textual analysis of Internet documents (homepages), attempts to address the Internet search problem by first categorizing the content of Internet documents. We report results of our recent testing of a multilayered neural network clustering algorithm that employs the Kohonen self-organizing feature map to categorize (classify) Internet homepages according to their content. The category hierarchies created could serve to partition the vast Internet services into subject-specific categories and databases and to improve Internet keyword searching and/or browsing.
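
The abstract names the Kohonen self-organizing feature map as the clustering technique but gives no algorithmic detail. The following is a minimal single-layer sketch of that general technique in Python, not the authors' multilayered implementation. The function names, the grid size, the learning-rate and radius schedules, and the toy term vectors are all assumptions made for illustration; each document is assumed to have already been reduced to a numeric term vector.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid coordinate whose weight vector is closest to `vector`."""
    return min(weights, key=lambda node: math.dist(weights[node], vector))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small Kohonen map on equal-length term vectors (illustrative only)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per map node, initialised randomly in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 0.5)   # assumed shrinking neighbourhood
        order = list(vectors)
        rng.shuffle(order)
        for v in order:
            bmu = best_matching_unit(weights, v)
            for node, w in weights.items():
                grid_dist = math.dist(node, bmu)
                if grid_dist <= radius:
                    # Nodes near the winner on the grid move toward the input.
                    influence = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
                    for i in range(dim):
                        w[i] += lr * influence * (v[i] - w[i])
    return weights


# Hypothetical term vectors for four homepages (columns are made-up terms).
docs = [[1, 1, 0, 0], [1, 0.8, 0, 0], [0, 0, 1, 1], [0, 0, 0.9, 1]]
som = train_som(docs, rows=2, cols=2)
categories = [best_matching_unit(som, d) for d in docs]
```

After training, each document is assigned to the map node returned by `best_matching_unit`, and documents that share a node can be treated as one category. A hierarchy like the one the abstract describes could plausibly be built by training further maps on the documents within each node, though that step is not shown here.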

EPrint Type: Journal Article (Paginated)
Keywords: National Science Digital Library, NSDL, Artificial Intelligence Lab, AI Lab, Information Retrieval
Subjects: Internet; Information Seeking Behaviors; Information Extraction
ID Code: 494
Deposited On: 20 September 2004
Alternative Locations: http://ai.bpa.arizona.edu/go/papers.html