Methods Inf Med 1995; 34(01/02): 104-110
DOI: 10.1055/s-0038-1634570
Original article
Schattauer GmbH

An Overview of Statistical Methods for the Classification and Retrieval of Patient Events

C. G. Chute, Y. Yang
Section of Medical Information Resources, Department of Health Sciences Research, Mayo Foundation, Rochester, MN, USA

Publication History

Publication Date:
09 February 2018 (online)

Abstract:

Statistical methods that can support text retrieval are an increasing focus of medical informatics activities. We review our adaptation of existing knowledge sources to create pseudo-documents for concept-based latent semantic indexing. Experience showed this approach to be of limited practical value, since retrieval performance was invariably unsatisfactory. We discovered that this was due in part to the introduction of a vocabulary gap between the queries and the cases we sought to retrieve. In part to address this problem, and to exploit our large body of human-coded text as a knowledge source, we developed a least squares fit alternative for the computer-assisted indexing and retrieval of biomedical texts. This technique demonstrates equivalent or superior retrieval performance when compared to all other textual retrieval techniques. It does not depend upon elaborate knowledge bases, lexicons, or thesauri. It is a promising technique for classifying and retrieving large volumes of clinical text.
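
The least squares fit idea can be illustrated with a minimal sketch. It is not the authors' implementation: the training phrases, category names, binary term weighting, and the helper names `term_vector` and `classify` are all hypothetical, and NumPy's general-purpose `lstsq` solver stands in for whatever numerical method the article used. The sketch learns a term-to-category matrix `W` from human-coded examples by minimizing the squared error of `X @ W` against `Y`, then ranks categories for a new phrase.

```python
import numpy as np

# Hypothetical toy data: short clinical phrases with human-assigned categories.
TRAINING = [
    ("chest pain", "cardiac"),
    ("heart attack", "cardiac"),
    ("hip fracture", "orthopedic"),
    ("hip pain", "orthopedic"),
]

VOCAB = sorted({word for text, _ in TRAINING for word in text.split()})
CATEGORIES = sorted({category for _, category in TRAINING})


def term_vector(text):
    """Binary bag-of-words vector over VOCAB; unknown words are ignored."""
    words = set(text.lower().split())
    return np.array([1.0 if term in words else 0.0 for term in VOCAB])


# X: one row per training phrase, one column per vocabulary term.
X = np.vstack([term_vector(text) for text, _ in TRAINING])
# Y: one row per training phrase, one column per category (one-hot).
Y = np.array(
    [[1.0 if category == c else 0.0 for c in CATEGORIES] for _, category in TRAINING]
)

# Least squares fit: W minimizes ||X @ W - Y||; lstsq returns the
# minimum-norm solution when the system is underdetermined.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)


def classify(text):
    """Return (category, score) pairs for a phrase, best match first."""
    scores = term_vector(text) @ W
    order = np.argsort(-scores)
    return [(CATEGORIES[i], float(scores[i])) for i in order]


if __name__ == "__main__":
    for phrase in ("heart attack", "hip", "chest"):
        print(phrase, "->", classify(phrase))
```

Because the mapping is learned directly from coded examples, the sketch needs no lexicon or thesaurus, only the paired texts and codes. Real collections would be far larger and would likely call for weighted term vectors and a sparse or truncated solver.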

 