Increases in physical activity through active travel have the potential to have large beneficial effects on populations, through both better health outcomes and reduced motorized traffic. However accurately identifying travel mode in large datasets is problematic. Here we provide an open source tool to quantify time spent stationary and in four travel modes(walking, cycling, train, motorised vehicle) from accelerometer measured physical activity data, combined with GPS and GIS data.
The Examining Neighbourhood Activities in Built Living Environments in London study evaluates the effect of the built environment on health behaviours, including physical activity. Participants wore accelerometers and GPS receivers on the hip for 7 days. We time-matched accelerometer and GPS, and then extracted data from the commutes of 326 adult participants, using stated commute times and modes, which were manually checked to confirm stated travel mode. This yielded examples of five travel modes: walking, cycling, motorised vehicle, train and stationary. We used this example data to train a gradient boosted tree, a form of supervised machine learning algorithm, on each data point (131,537 points), rather than on journeys. Accuracy during training was assessed using five-fold cross-validation. We also manually identified the travel behaviour of both 21 participants from ENABLE London (402,749 points), and 10 participants from a separate study (STAMP-2, 210,936 points), who were not included in the training data. We compared our predictions against this manual identification to further test accuracy and test generalisability.
Applying the algorithm, we correctly identified travel mode 97.3% of the time in cross-validation (mean sensitivity 96.3%, mean active travel sensitivity 94.6%). We showed 96.0% agreement between manual identification and prediction of 21 individuals' travel modes (mean sensitivity 92.3%, mean active travel sensitivity 84.9%) and 96.5% agreement between the STAMP-2 study and predictions (mean sensitivity 85.5%, mean active travel sensitivity 78.9%).
We present a generalizable tool that identifies time spent stationary and time spent walking with very high precision, time spent in trains or vehicles with good precision, and time spent cycling with moderate precisionIn studies where both accelerometer and GPS data are available this tool complements analyses of physical activity, showing whether differences in PA may be explained by differences in travel mode. All code necessary to replicate, fit and predict to other datasets is provided to facilitate use by other researchers.