DOI: 10.3414/ME15-01-0076
Evaluation of Large-scale Data to Detect Irregularity in Payment for Medical Services[*]
An Extended Use of Benford’s Law
Publication History
received:
15 June 2015
accepted in revised form:
01 March 2016
Publication Date:
08 January 2018 (online)

Summary
Background: Sophisticated anti-fraud systems for the healthcare sector have been built on several statistical methods. However, these methods consume considerable time and cost, and lack a theoretical basis for handling large-scale data.
Objectives: Based on mathematical theory, this study proposes a new approach to using Benford’s Law: we closely examine individual-level data to identify specific fees for in-depth analysis.
Methods: We extended the mathematical theory to demonstrate the manner in which large-scale data conform to Benford’s Law. Then, we empirically tested its applicability using actual large-scale healthcare data from Korea’s Health Insurance Review and Assessment (HIRA) National Patient Sample (NPS). For Benford’s Law, we considered the mean absolute deviation (MAD) formula to test the large-scale data.
Results: We conducted our study on 32 diseases, comprising 25 representative diseases and 7 DRG-regulated diseases. We performed an empirical test on the 25 representative diseases, showing the applicability of Benford’s Law to large-scale data in the healthcare industry. For the seven DRG-regulated diseases, we examined the individual-level data to identify specific fees for in-depth analysis. Across the eight categories of medical costs, we assessed the strength of irregularities based on the details of each DRG-regulated disease.
Conclusions: Using the degree of abnormality, we propose priority actions to be taken by government health departments and private insurance institutions to bring unnecessary medical expenses under control. However, detecting deviations from Benford’s Law at conventional significance levels requires relatively high contamination ratios.
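The Methods refer to a mean absolute deviation (MAD) test against Benford’s Law. The abstract does not give the formula, so the following is only a minimal sketch of the commonly used first-digit form, assuming the expected proportion of leading digit d is log10(1 + 1/d) and that MAD is the average absolute gap between observed and expected proportions over the nine digits. The function names and the sample data are hypothetical and are not taken from the study or from the HIRA-NPS data.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit proportions under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the first significant digit of a nonzero number."""
    # The first nonzero character of the decimal representation is the
    # leading significant digit; this avoids floating-point log rounding.
    for ch in str(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no significant digit")


def first_digit_mad(values):
    """Mean absolute deviation between observed and Benford proportions."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("no nonzero values supplied")
    counts = Counter(digits)
    expected = benford_expected()
    n = len(digits)
    return sum(abs(counts.get(d, 0) / n - expected[d]) for d in range(1, 10)) / 9


# Hypothetical illustration data (not real claims):
# powers of 2 are a classic sequence whose first digits follow Benford's Law,
# while equally frequent first digits deviate strongly from it.
benford_like = [2 ** k for k in range(1000)]
uniform_digits = list(range(1, 10))

mad_benford_like = first_digit_mad(benford_like)
mad_uniform = first_digit_mad(uniform_digits)
print(f"MAD, powers of 2:     {mad_benford_like:.4f}")
print(f"MAD, uniform digits:  {mad_uniform:.4f}")
```

In this sketch a smaller MAD means closer conformity to Benford’s Law. Because MAD does not grow with the number of records, it is often preferred to chi-square tests for very large samples, which is presumably why it suits large-scale claims data; the cut-off values used to label a fee category as irregular would have to come from the paper itself.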
Keywords
Medical fees - claims analysis - healthcare fraud - diagnosis-related groups (DRG) - Benford’s Law
* Supplementary material published on our website http://dx.doi.org/10.3414/ME15-01-0076