alexis yelton insight week 4 presentation 2
Post on 16-Aug-2015
Runners want to run faster
What goal should you set for a half marathon time?
What training programs do the fastest runners follow?
Data from Strava.com
[Figure. Axis labels: Half marathon time (min); Elevation gain this month; Number of rest days per week; Pace; Age]
Time series, demographic, and aggregated running data on 10,000 runners, 1,000 of whom have half-marathon times and other features.
[Figure. Axis labels: Log half marathon time; Log distance run this month (mi); Pace]
Distance past month
Time past month
Pace past month
Distance past 6 months
Gender
Weight range
Age range
Number of rest days/wk
Number of long days/wk
Sdev pace
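The aggregated features listed above could be derived from a per-run log roughly as follows. This is a minimal sketch, not the presenter's pipeline: the run log values are made up, and the window length `N_DAYS`, the long-run threshold `LONG_RUN_MI`, and the helper `summarize` are all assumptions introduced for illustration.

```python
from statistics import pstdev

# Hypothetical run log: (day_index, distance_mi, duration_min). Values are made up.
runs = [
    (0, 3.0, 27.0),
    (1, 5.0, 45.0),
    (3, 10.0, 95.0),
    (5, 4.0, 34.0),
    (7, 3.0, 27.0),
    (10, 8.0, 76.0),
    (12, 5.0, 44.0),
]
N_DAYS = 14        # length of the observation window in days (assumption)
LONG_RUN_MI = 8.0  # distance that counts as a "long day" (assumption)


def summarize(runs, n_days, long_run_mi):
    """Aggregate a run log into per-runner features."""
    distance = sum(d for _, d, _ in runs)
    time_min = sum(t for _, _, t in runs)
    paces = [t / d for _, d, t in runs]      # minutes per mile, one per run
    run_days = {day for day, _, _ in runs}   # distinct days with at least one run
    weeks = n_days / 7
    return {
        "distance_mi": distance,
        "time_min": time_min,
        "pace_min_per_mi": time_min / distance,
        "sdev_pace": pstdev(paces),
        "rest_days_per_wk": (n_days - len(run_days)) / weeks,
        "long_days_per_wk": sum(1 for _, d, _ in runs if d >= long_run_mi) / weeks,
    }


features = summarize(runs, N_DAYS, LONG_RUN_MI)
print(features)
```

Here the average pace is total time divided by total distance, so long runs weigh more than short ones; averaging the per-run paces instead would be an equally plausible definition.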
[Figure. Axis labels: Half marathon time (min); Pace; Sex; Age range (years); Weight range (lbs)]
Analysis
Regression                                            r2     RMSE
Benchmarking with a linear model                      0.49   10 min
Nonlinear regression modeling
  1. Lasso regression                                 0.48   10 min
  2. Ridge regression                                 0.48   10 min
  3. Random forest regression                         0.66   8.3 min
Validation: 179 runners; 3-fold cross-validation      0.79   6.2 min
The gap between the validation and random forest scores seems to be related to a different distribution in the test set, possibly because of the importance of outliers.
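For readers unfamiliar with the two scores reported above, the following sketch shows how r2 and RMSE are computed from predictions, plus one simple way to form 3-fold splits. It uses only the Python standard library; the half marathon times are made up, and the interleaved fold assignment in `k_fold_indices` is an assumption, not necessarily how the folds were drawn in the presentation.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as y (here, minutes)."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k=3):
    """Yield (train_idx, test_idx); fold i tests every k-th index starting at i."""
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test


# Made-up half marathon times in minutes, for illustration only.
actual = [105.0, 118.0, 126.0, 140.0]
predicted = [110.0, 115.0, 130.0, 135.0]

score_rmse = rmse(actual, predicted)
score_r2 = r2(actual, predicted)
folds = list(k_fold_indices(6, k=3))
print(score_rmse, score_r2, folds)
```

Because r2 is normalized by the variance of the evaluation set, a test set with a different spread of times (for example, more outliers) can produce a noticeably different r2 even for the same model.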
Your average pace over the past month is the most important feature by far.
Results
Variable importance (decrease in node impurities):
Pace past month
Distance past month
Distance past 6 months
Elevation past month
Rest days
SD pace
Weight
Long days
Age
Gender
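The "decrease in node impurities" measure can be illustrated on a single split. In a regression tree, node impurity is commonly the variance of the target within the node, and a split's decrease is the parent variance minus the size-weighted variance of the two children; a random forest's importance for a feature aggregates these decreases over the splits that use it. The sketch below assumes that variance-based definition; the data, thresholds, and function names are hypothetical.

```python
def variance(values):
    """Population variance, used here as the node impurity."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def impurity_decrease(x, y, threshold):
    """Variance reduction from splitting the node on x <= threshold."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty removes no impurity
    n = len(y)
    weighted_children = (len(left) / n) * variance(left) + (len(right) / n) * variance(right)
    return variance(y) - weighted_children


# Made-up values for four runners, for illustration only.
pace = [7.5, 8.0, 9.5, 10.0]              # min/mi over the past month
rest_days = [2, 3, 2, 3]                  # rest days per week
half_time = [100.0, 106.0, 124.0, 130.0]  # half marathon time (min)

gain_pace = impurity_decrease(pace, half_time, threshold=8.75)
gain_rest = impurity_decrease(rest_days, half_time, threshold=2.5)
print(gain_pace, gain_rest)
```

In this toy data a split on pace separates fast from slow finishers almost perfectly, so it removes far more variance than a split on rest days.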
About me: Alexis Yelton, MIT postdoc
Genomics for understanding ecosystems: Discovery of novel organisms, metabolisms, and ecosystem functions such as large organic compound breakdown by marine cyanobacteria (Tara Oceans data set, 7.2 terabases of DNA)
[Figure: Chitinase in marine Synechococcus. Axis label: Chitinase activity]
Wet lab science for validating discoveries
My first half marathon: 1:56:30
Personal best: 1:47:56