Hybrid Data Mining Techniques for Childhood Obesity Predictions

ABSTRACT

Childhood obesity has become a worrying global epidemic. Evidence shows that childhood obesity persists into adulthood, so predicting obesity at an early age is both useful and important. The objectives of this study are to identify significant risk factors and protective factors of childhood obesity in Malaysia; to select and propose parameters for prediction; to investigate suitable data mining techniques for childhood obesity and overweight prediction; and to propose hybrid data mining techniques that increase the sensitivity of predictions. The study consisted of five stages: risk factor review, data collection, parameter selection, empirical study, and the proposal and evaluation of hybrid techniques. The factors found to be related to Malaysian children are: sex, catch-up growth, premature birth, adiposity rebound, breastfeeding duration, soup, sandwich, snack or chocolate eaten outdoors, excessive television (TV) watching, eating junk food, eating junk food in front of the TV, warm meals for supper, eating fried food and fruits, physical activity, duration of sleep, number of siblings, and parents' BMI. Three hybrid techniques are proposed. The first is a hybrid of Naïve Bayes (NB) and Classification and Regression Tree (CART) in an NB-Tree framework; the second uses CART for variable selection and NB for prediction (CART Selection-NB Classification); and the third integrates mean value calculation and Euclidean distance classification into the second technique (Mean-Euclidean Classification). The proposed parameters for prediction are: adiposity rebound, parent obesity, catch-up growth, father obesity, mother obesity, fried food, snacks in front of the TV, junk food, physical activity, number of siblings, sleep duration, fruits, premature birth, and duration of watching TV.
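The abstract does not spell out how the mean value calculation and Euclidean distance classification are carried out. One plausible reading, offered here only as an assumption, is a nearest-class-mean rule: compute the mean feature vector of each class from the training data, then assign a new case to the class whose mean is closest in Euclidean distance. The minimal Python sketch below illustrates that reading; the function names, the two numeric features, and the toy data are all hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict


def class_means(rows, labels):
    """Return {label: mean feature vector} computed from the training rows."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        for i, value in enumerate(row):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(means, row):
    """Assign the row to the class whose mean vector is nearest (Euclidean)."""
    return min(means, key=lambda label: math.dist(means[label], row))


# Hypothetical toy data: two already-selected, numerically coded features.
train_rows = [[1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [0.0, 1.0]]
train_labels = ["obese", "obese", "normal", "normal"]

means = class_means(train_rows, train_labels)
prediction = predict(means, [0.9, 0.2])
print(prediction)
```

In the pipeline the abstract describes, the features passed to such a classifier would be those retained by the CART selection step; that step is not reproduced in this sketch.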
Obesity predictions using the proposed parameters gave the following results: CART with 66.67% sensitivity; NB with 0% sensitivity; NB-Tree with 0% sensitivity; CART Selection-NB Classification with 40% sensitivity; and Mean-Euclidean Classification with 60% sensitivity. For overweight predictions using the proposed parameters, the results were: CART with 0% sensitivity; NB with 0% sensitivity; NB-Tree with 0% sensitivity; CART Selection-NB Classification with 75% sensitivity; and Mean-Euclidean Classification with 95% sensitivity. Predictions were also made using the simulated parameters: sex, birth weight, premature birth, catch-up growth, and parents' BMI. Obesity predictions using the simulated parameters gave: CART with 33.33% sensitivity; NB with 0% sensitivity; and Mean-Euclidean Classification with 50% sensitivity. For overweight predictions using the simulated parameters, the results were: CART with 0% sensitivity; NB with 0% sensitivity; and Mean-Euclidean Classification with 50% sensitivity. The results show that the proposed hybrid techniques (CART Selection-NB Classification and Mean-Euclidean Classification) and the proposed parameters have increased the performance of CART and NB in obesity and overweight predictions.
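For readers unfamiliar with the metric, sensitivity is the share of truly positive cases (here, obese or overweight children) that the classifier flags as positive: true positives divided by the sum of true positives and false negatives. The short Python sketch below shows the calculation on made-up labels; the function name and the data are hypothetical and do not reproduce the study's test set.

```python
def sensitivity(actual, predicted, positive):
    """Return TP / (TP + FN) for the given positive class label."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive cases among the actual labels")
    return tp / (tp + fn)


# Hypothetical labels, for illustration only.
actual = ["obese", "obese", "obese", "normal", "normal"]
predicted = ["obese", "normal", "obese", "normal", "obese"]
mixed = sensitivity(actual, predicted, "obese")  # 2 of 3 obese cases found

# A classifier that always predicts the majority class finds no positives.
always_normal = ["normal"] * len(actual)
none_found = sensitivity(actual, always_normal, "obese")

print(round(mixed * 100, 2), none_found * 100)
```

The second case shows one way a 0% sensitivity can arise: a model that never predicts the positive class scores zero on this metric regardless of how many negative cases it classifies correctly. Whether that is what happened with NB and NB-Tree in the study is not stated in the abstract.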

