Decision trees in epidemiological research

Overview of attention for article published in Emerging Themes in Epidemiology, September 2017

Citations: 99 (Dimensions)
Readers: 164 (Mendeley)
Title: Decision trees in epidemiological research
Published in: Emerging Themes in Epidemiology, September 2017
DOI: 10.1186/s12982-017-0064-4
Authors

Ashwini Venkatasubramaniam, Julian Wolfson, Nathan Mitchell, Timothy Barnes, Meghan JaKa, Simone French

Abstract

In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods.

We review the literature on decision trees, a family of techniques that partition a population, on the basis of covariates, into distinct subgroups that share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression Tree (CART) technique and the newer Conditional Inference Tree (CTree) technique, assessing their performance in a simulation study and on data from the Box Lunch Study, a randomized controlled trial of a portion size intervention.

Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction is that CTree builds decision trees within a formal statistical hypothesis testing framework, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel graphical visualization that provides a more scientifically meaningful characterization of the subgroups defined by decision trees.

Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.
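The partitioning idea described in the abstract can be illustrated with a minimal sketch of a single CART-style split on one covariate. This is not the authors' code and it does not implement CTree, which chooses splits through hypothesis tests rather than by minimizing an error criterion. The function names (`sse`, `best_split`) and the toy `age` and `outcome` values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, left_mean, right_mean) for the split of x that
    minimizes the total within-subgroup sum of squared errors of y."""
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no threshold can separate identical covariate values
        left, right = ys[:i], ys[i:]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (xs[i - 1] + xs[i]) / 2  # midpoint between neighbors
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("covariate is constant; no split is possible")
    return best[1:]


# Hypothetical toy data: the outcome shifts between the younger and older groups.
age = [22, 25, 31, 38, 45, 52, 58, 63]
outcome = [10.0, 11.0, 9.0, 10.0, 20.0, 21.0, 19.0, 20.0]

threshold, left_mean, right_mean = best_split(age, outcome)
print(threshold, left_mean, right_mean)
```

A full tree would apply the same search recursively within each resulting subgroup and across all covariates, with a stopping or pruning rule to limit its size.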

Mendeley readers

The data shown below were compiled from readership statistics for 164 Mendeley readers of this research output.

Geographical breakdown

| Country | Count | % of readers |
| --- | --- | --- |
| Unknown | 164 | 100% |

Demographic breakdown

| Readers by professional status | Count | % of readers |
| --- | --- | --- |
| Researcher | 29 | 18% |
| Student > Ph.D. Student | 25 | 15% |
| Student > Master | 17 | 10% |
| Student > Bachelor | 12 | 7% |
| Student > Doctoral Student | 10 | 6% |
| Other | 25 | 15% |
| Unknown | 46 | 28% |

| Readers by discipline | Count | % of readers |
| --- | --- | --- |
| Medicine and Dentistry | 27 | 16% |
| Computer Science | 17 | 10% |
| Nursing and Health Professions | 9 | 5% |
| Mathematics | 7 | 4% |
| Unspecified | 7 | 4% |
| Other | 40 | 24% |
| Unknown | 57 | 35% |