Appl Clin Inform 2015; 06(03): 506-520
DOI: 10.4338/ACI-2015-03-RA-0036
Research Article
Schattauer GmbH

Machine Learning Techniques for Prediction of Early Childhood Obesity

T. M. Dugan (1, 2), S. Mukhopadhyay (2), A. Carroll (1), S. Downs (1)
1   Indiana University, Children's Health Services Research, Indianapolis, Indiana, United States
2   Indiana University Purdue University Indianapolis, Computer Science, Indianapolis, Indiana, United States

Correspondence to:

Tamara M. Dugan
Children’s Health Services Research
410 W 10th Street
Indianapolis, IN 46202
Phone: 317–278–6926   

Publication History

received: 08 April 2015

accepted in revised form: 30 June 2015

Publication Date:
19 December 2017 (online)

 

Summary

Objectives: This paper aims to predict childhood obesity after age two, using only data collected prior to the second birthday by a clinical decision support system called CHICA.

Methods: Six machine learning methods (RandomTree, RandomForest, J48, ID3, Naïve Bayes, and Bayes) were trained on CHICA data and compared. The analyses show that an accurate, sensitive model can be created.
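
For readers unfamiliar with ID3, one of the methods listed above, the following is a minimal sketch of the information-gain criterion that ID3-style decision trees use to choose a split. It is an illustration only, not the authors' implementation (a full ID3 would also recurse on each branch). The feature names, values, and labels are hypothetical and are not taken from the CHICA dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature):
    """Reduction in entropy obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_split(rows, labels, features):
    """ID3 picks the feature with the highest information gain at each node."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Hypothetical toy data: feature names and values are assumptions for illustration.
rows = [
    {"overweight_before_24mo": "yes", "sex": "F"},
    {"overweight_before_24mo": "yes", "sex": "M"},
    {"overweight_before_24mo": "no", "sex": "F"},
    {"overweight_before_24mo": "no", "sex": "M"},
]
labels = ["obese", "obese", "not_obese", "not_obese"]

chosen = best_split(rows, labels, ["overweight_before_24mo", "sex"])
print(chosen)
```

In this toy example the first feature separates the classes perfectly while the second carries no information, so the first is chosen as the root split. The feature at the root of a learned tree is, in the same sense, the strongest single predictor in the training data.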

Results: Of the methods analyzed, the ID3 model trained on the CHICA dataset showed the best overall performance, with an accuracy of 85% and a sensitivity of 89%. Additionally, the ID3 model had a positive predictive value of 84% and a negative predictive value of 88%. The structure of the tree also gives insight into the strongest predictors of future obesity in children. Many of the strongest predictors seen in the ID3 modeling of the CHICA dataset have been independently validated in the literature as correlated with obesity, thereby supporting the validity of the model.
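
The four measures reported above are all derived from a binary confusion matrix. The sketch below shows the standard definitions. The counts used are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp: obese children predicted obese        fp: non-obese predicted obese
    tn: non-obese predicted non-obese         fn: obese predicted non-obese
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of truly obese cases detected
        "specificity": tn / (tn + fp),  # share of non-obese cases correctly cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (not from the paper).
metrics = classification_metrics(tp=8, fp=2, tn=7, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and the predictive values answer different questions: sensitivity describes how many children who become obese are flagged, while the positive and negative predictive values describe how far a given prediction can be trusted, and they depend on how common obesity is in the population studied.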

Conclusions: This study demonstrated that data from a production clinical decision support system can be used to build an accurate machine learning model to predict obesity in children after age two.

Citation: Dugan TM, Mukhopadhyay S, Carroll AE, Downs SM. Machine learning techniques for prediction of early childhood obesity. Appl Clin Inform 2015; 6: 506–520

http://dx.doi.org/10.4338/ACI-2015-03-RA-0036



Conflicts of Interest

The authors declare that they have no conflicts of interest in the research.

