HelpfulMed: Intelligent Searching for Medical Information over the Internet

Chen, Hsinchun and Lally, Ann M. and Zhu, Bin and Chau, Michael (2003) HelpfulMed: Intelligent Searching for Medical Information over the Internet. Journal of the American Society for Information Science & Technology, 54(7), pp. 683-694.

Abstract

Medical professionals and researchers need information from reputable sources to accomplish their work. Unfortunately, the Web has a large number of documents that are irrelevant to their work, including documents that purport to be “medically-related.” This paper describes an architecture designed to integrate advanced searching and indexing algorithms, an automatic thesaurus (or “concept space”), and Kohonen-based Self-Organizing Map (SOM) technologies to provide searchers with fine-grained results. Initial results indicate that these systems provide complementary retrieval functionalities. HelpfulMed not only allows users to search Web pages and other online databases, but also allows them to build searches through the use of an automatic thesaurus and browse a graphical display of medical-related topics. Evaluation results for each of the different components are included. Our spidering algorithm outperformed both breadth-first search and PageRank spiders on a test collection of 100,000 Web pages. The automatically generated thesaurus performed as well as both MeSH and UMLS, systems which require human mediation for currency. Lastly, a variant of the Kohonen SOM was comparable to MeSH terms in perceived cluster precision and significantly better at perceived cluster recall.

EPrint Type: Journal Article (Paginated)
Keywords: National Science Digital Library, NSDL, Artificial Intelligence Lab, AI Lab, HelpfulMed
Subjects: Web Mining; Medical Libraries
ID Code: 416
Deposited On: 16 August 2004
Alternative Locations: http://ai.bpa.arizona.edu/go/papers.html

EPrints dLIST, an open access archive for the Information Sciences, is supported by the School of Information Resources and Library Science and Learning Technologies Center, University of Arizona. Established in 2002, dLIST has a global Advisory Board and is a part of the Information Technology & Society Research Lab.