Machine Learning of Motor Vehicle Accident Categories from Narrative Data

M. R. Lehto; G. S. Sorock

doi:10.1055/s-0038-1634680

Methods Inf Med 1996; 35(04/05): 309-316

Original Article

Schattauer GmbH

M. R. Lehto¹, G. S. Sorock²

¹School of Industrial Engineering, Purdue University, West Lafayette, IN, USA
²Liberty Mutual Research Center for Safety and Health, Hopkinton, MA, USA

Publication Date: 20 February 2018 (online)

Abstract:

Bayesian inference was evaluated as a machine learning technique for identifying pre-crash activity and crash type from accident narratives describing 3,686 motor vehicle crashes. It was hypothesized that a Bayesian model could learn from the results of a computer search for 63 keywords related to accident categories. Learning was defined as the ability to accurately classify narratives that the keyword search could not classify because they did not contain the original keywords. When narratives contained keywords, the results obtained using both the Bayesian model and the keyword search corresponded closely to expert ratings (P(detection) ≥ 0.9 and P(false positive) ≤ 0.05). For narratives not containing keywords, as the threshold used by the Bayesian model was raised from p > 0.5 to p > 0.9, the overall probability of detecting a category assigned by the expert fell from 67% to 12%, and false positives fell correspondingly from 32% to 3%. These latter results demonstrated that the Bayesian system learned from the results of the keyword searches.
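
The general idea can be illustrated with a minimal sketch. The abstract does not specify the exact form of the Bayesian model, so the code below assumes a simple naive Bayes classifier over word presence with add-one smoothing; the narratives, the two keywords, and the category names are all invented for illustration and are not taken from the study. Narratives that contain a keyword are labeled by the keyword search and used for training; narratives without keywords are then classified only if the top posterior probability exceeds a threshold.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def keyword_label(text, keywords):
    """Return the category of the first keyword found in the text, else None."""
    words = set(tokenize(text))
    for kw, cat in keywords.items():
        if kw in words:
            return cat
    return None

def train(labeled):
    """Count, per category, how many narratives contain each word."""
    cat_counts, word_counts, vocab = Counter(), defaultdict(Counter), set()
    for text, cat in labeled:
        cat_counts[cat] += 1
        for w in set(tokenize(text)):
            word_counts[cat][w] += 1
            vocab.add(w)
    return cat_counts, word_counts, vocab

def posterior(text, model):
    """Naive Bayes posterior over categories (assumed model form).

    Only words present in the narrative and seen in training contribute;
    add-one smoothing avoids zero probabilities.
    """
    cat_counts, word_counts, vocab = model
    total = sum(cat_counts.values())
    log_scores = {}
    for cat, n in cat_counts.items():
        score = math.log(n / total)
        for w in set(tokenize(text)):
            if w in vocab:
                score += math.log((word_counts[cat][w] + 1) / (n + 2))
        log_scores[cat] = score
    top = max(log_scores.values())
    norm = sum(math.exp(s - top) for s in log_scores.values())
    return {c: math.exp(s - top) / norm for c, s in log_scores.items()}

def classify(text, model, threshold):
    """Assign the most probable category only if its posterior exceeds the threshold."""
    cat, p = max(posterior(text, model).items(), key=lambda kv: kv[1])
    return cat if p > threshold else None

# Hypothetical keywords and narratives, for illustration only.
keywords = {"rear": "rear-end", "backing": "backing"}
narratives = [
    "insured stopped at light and was hit in rear",
    "claimant stopped in traffic hit in rear by other vehicle",
    "insured backing out of driveway hit parked car",
    "insured backing from parking space hit pole",
    "insured stopped at light when other vehicle hit him",
    "insured reversed out of driveway into parked car",
    "insured hit pole",
]

# Step 1: the keyword search labels what it can.
labeled = [(t, keyword_label(t, keywords)) for t in narratives]
training = [(t, c) for t, c in labeled if c is not None]
unlabeled = [t for t, c in labeled if c is None]

# Step 2: the Bayesian model learns from the keyword-labeled narratives.
model = train(training)

# Step 3: classify narratives without keywords at two thresholds.
for threshold in (0.5, 0.9):
    for text in unlabeled:
        print(threshold, classify(text, model, threshold), "|", text)
```

In this toy data, raising the threshold leaves the weakly supported narrative ("insured hit pole") unclassified, which mirrors the trade-off reported in the abstract: a stricter threshold yields fewer detections but also fewer false positives.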

Keywords:

Narrative Text - Bayesian Methods - Epidemiology
