Methods Inf Med 1978; 17(04): 227-237
DOI: 10.1055/s-0038-1636442
Original Article
Schattauer GmbH

The Measurement of Performance in Probabilistic Diagnosis

II. Trustworthiness of the Exact Values of the Diagnostic Probabilities
J. Hilden, J. D. F. Habbema, B. Bjerregaard
From the Department of Public Health and Social Medicine, Erasmus University, Rotterdam, The Netherlands, and the Institute of Human Genetics, University of Copenhagen, Denmark

Publication History

Publication Date: 19 February 2018 (online)

This paper focuses on one important aspect of good performance in probabilistic diagnosis, the »reliability« (external validity) of probabilistic assertions: a diagnostic alternative claimed to be 90% certain, say, must occur neither more nor less than nine times out of ten on average. Statistical measures are offered for estimating departures from such perfect reliability, and statistical tests of the hypothesis of perfect reliability are developed. The specific reliability defects looked for include overconfident diagnoses and so-called size bias (common diseases being overdiagnosed). Reliability is contrasted with discriminatory power and other aspects of performance. The illustrative data derive from a study of computer-aided diagnosis of the acute abdomen.
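The notion of reliability described above can be illustrated with a small sketch. It is not the authors' method: the abstract does not give their measures or tests, so the binned comparison and the simple z-statistic below are generic calibration checks chosen for illustration. The function names `reliability_table` and `reliability_z`, the bin count, and the toy data are all assumptions.

```python
from math import sqrt


def reliability_table(probs, outcomes, n_bins=5):
    """Group cases by stated probability and compare, per group,
    the mean stated probability with the observed frequency.

    probs    -- stated probabilities for the asserted diagnosis (0..1)
    outcomes -- 1 if the asserted diagnosis was correct, else 0
    Returns a list of (n_cases, mean_stated_prob, observed_freq).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, y))
    table = []
    for cases in bins:
        if cases:  # skip empty probability ranges
            n = len(cases)
            mean_p = sum(p for p, _ in cases) / n
            freq = sum(y for _, y in cases) / n
            table.append((n, mean_p, freq))
    return table


def reliability_z(probs, outcomes):
    """Illustrative z-statistic for the hypothesis of perfect reliability.

    If each assertion is perfectly reliable, outcome i is Bernoulli(p_i),
    so sum(y_i - p_i) has mean 0 and variance sum(p_i * (1 - p_i)).
    """
    excess = sum(y - p for p, y in zip(probs, outcomes))
    variance = sum(p * (1 - p) for p in probs)
    if variance == 0:
        raise ValueError("all probabilities are 0 or 1; statistic undefined")
    return excess / sqrt(variance)


# Toy data (hypothetical): ten diagnoses each claimed to be 90% certain.
claimed = [0.9] * 10
reliable = [1] * 9 + [0]            # correct nine times out of ten
overconfident = [1] * 6 + [0] * 4   # correct only six times out of ten

z_reliable = reliability_z(claimed, reliable)
z_overconfident = reliability_z(claimed, overconfident)
```

With the toy data, the reliable series gives a statistic close to zero, while the overconfident series gives a clearly negative one, meaning the asserted diagnosis occurred less often than claimed. A real evaluation would need far more cases than ten for the normal approximation to be meaningful.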

