
MetaSpider: Meta-Searching and Categorization on the Web

Chen, Hsinchun and Fan, Haiyan and Chau, Michael and Zeng, Daniel (2001) MetaSpider: Meta-Searching and Categorization on the Web. Journal of the American Society for Information Science & Technology 52(13):pp. 1134-1147.


Abstract

It has become increasingly difficult to locate relevant information on the Web, even with the help of Web search engines. Two approaches to addressing the low precision and poor presentation of search results of current search tools are studied: meta-search and document categorization. Meta-search engines improve precision by selecting and integrating search results from generic or domain-specific Web search engines or other resources. Document categorization promises better organization and presentation of retrieved results. This article introduces MetaSpider, a meta-search engine that has real-time indexing and categorizing functions. We report in this paper the major components of MetaSpider and discuss related technical approaches. Initial results of a user evaluation study comparing MetaSpider, NorthernLight, and MetaCrawler in terms of clustering performance and of time and effort expended show that MetaSpider performed best in precision rate, but disclose no statistically significant differences in recall rate and time requirements. Our experimental study also reveals that MetaSpider exhibited a higher level of automation than the other two systems and facilitated efficient searching by providing the user with an organized, comprehensive view of the retrieved documents.

EPrint Type: Journal Article (Paginated)
Keywords: National Science Digital Library, NSDL, Artificial Intelligence Lab, AI Lab, MetaSpider
Subjects: Web Mining; Internet; Knowledge Management; World Wide Web
ID Code: 422
Deposited On: 16 August 2004
Alternative Locations: http://ai.bpa.arizona.edu/go/papers.html


EPrints dLIST, an open access archive for the Information Sciences, is supported by the School of Information Resources and Library Science and Learning Technologies Center, University of Arizona. Established in 2002, dLIST has a global Advisory Board and is a part of the Information Technology & Society Research Lab.