When most runners are waiting to find out if they got into the Boston Marathon, all they can do is hazard a guess or read online forums to try to determine their fate. But one of Strava’s data engineers, Dave Hoch, thought he could take a more scientific approach. Even though the B.A.A. lowered the qualifying standards by 5 minutes for 2020, the final cut-off time was 1:39 below the qualifying time. Here’s the story of how Dave got within 5 seconds of the cut-off time before it was announced, and what it meant for his own participation in the race.
First, we chose ages and genders for whom we had a reasonable amount of data to make predictions from. This ended up being Men and Women < 35, 35-39, 40-44, and 45-49 for years 2015-2020. The charts below verify that the distribution of marathon times has remained similar across gender and year.
Some other observations:
- Round times clearly have a psychological effect, with spikes materializing at 3:00, 3:30, and 4:00. The same seems to go for the BQ times (3:05 and 3:35 for the most populous age groups in each gender for all but 2019, and then 3:00 and 3:30, respectively).
- You can also see quite clearly how much Strava has grown in running (especially among women) over time, with the biggest jumps coming in the 2019 and 2020 windows (September 2017-2018 and September 2018-2019).
We also looked at overall stats by age/gender for each year in numerical form for:
- Total marathons on Strava
- Number of would-be-qualifiers
- Number of qualifiers under the eventual cutoff
These stats showed the following:
- The youngest age groups are the most competitive in terms of the % of athletes denied and the % of all athletes who hit the BQ. They are also the largest (both at Boston and on Strava).
- 2019 saw a huge jump in the % of athletes who hit the standard but missed the cutoff, which justifies the 5-minute drop across the board in the standard for 2020.
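Here is a minimal sketch of how those per-cohort numbers could be tallied. The function name, the dictionary keys, and the toy times are all illustrative assumptions, not Strava's actual pipeline or data. It also assumes that a time exactly equal to the standard or the cutoff counts as making it.

```python
def summarize_cohort(times_s, standard_s, cutoff_s):
    """Tally per-cohort counts and the two derived percentages.

    All times are in seconds. A time equal to the standard or cutoff
    is counted as making it (an assumption of this sketch).
    """
    total = len(times_s)
    would_be = sum(1 for t in times_s if t <= standard_s)
    under_cutoff = sum(1 for t in times_s if t <= cutoff_s)
    denied = would_be - under_cutoff  # hit the standard, missed the cutoff
    return {
        "total_marathons": total,
        "would_be_qualifiers": would_be,
        "under_cutoff": under_cutoff,
        "pct_hit_standard": 100 * would_be / total if total else 0.0,
        "pct_denied": 100 * denied / would_be if would_be else 0.0,
    }


# Made-up toy times (seconds), measured against a 3:05:00 standard
# and a 3:00:08 cutoff.
toy_times = [10500, 10790, 10805, 10900, 11200]
stats = summarize_cohort(
    toy_times,
    standard_s=3 * 3600 + 5 * 60,
    cutoff_s=3 * 3600 + 8,
)
print(stats)
```

Running the same function once per age group, gender, and year would give a table that can be compared across cohorts.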
To predict the cutoff for the 2020 Boston Marathon, we looked at the percentage of Strava athletes who qualified for Boston each year, and applied that same percentage to the 2020 Boston qualifying window. The tricky part is that the Strava population changes every year, and the percentage of runners who qualify decreases over time. This analysis mostly ignores that factor and assumes that the Strava representation does not change year over year.
As evidenced in the original dataset, the youngest age group (<35) appears to be the most competitive, so we based our methodology on those cohorts.
For all marathons run by the Men <35 cohort on Strava within the qualifying window for the 2019 Boston Marathon, 13.0% were under the B.A.A. cutoff of 3:00:08.
For all marathons run by the Women <35 cohort on Strava within the qualifying window for the 2019 Boston Marathon, 15.0% were under the B.A.A. cutoff of 3:30:08.
If we then look at marathon times for the 2020 qualifying window and apply the respective percentage, we are able to hazard a guess at what this year’s cutoff will be.
Looking at the 13th percentile of the Men <35 cohort on Strava for the 2020 qualifying window gives us 2:58:30 (90 seconds under the 3:00:00 qualifying standard).
Looking at the 15th percentile of the Women <35 cohort on Strava for the 2020 qualifying window gives us 3:28:26 (94 seconds under the 3:30:00 qualifying standard).
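The two steps above (measure the share of marathons under last year's cutoff, then find the time at that same share in this year's window) can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact code behind the numbers: the helper names are hypothetical, the percentile uses a simple nearest-rank rule, and the toy times are made up so the arithmetic is easy to follow.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def to_hms(total_s):
    """Convert seconds back to an 'H:MM:SS' string."""
    h, rem = divmod(int(total_s), 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"


def share_under(times_s, cutoff_s):
    """Fraction of marathon times at or under the cutoff."""
    return sum(1 for t in times_s if t <= cutoff_s) / len(times_s)


def time_at_share(times_s, share):
    """Nearest-rank percentile: the k-th fastest time, k = round(share * n)."""
    ordered = sorted(times_s)
    k = round(share * len(ordered))
    if k < 1:
        raise ValueError("share is too small for this sample size")
    return ordered[k - 1]


def predict_margin(prev_times, prev_cutoff_s, next_times, next_standard_s):
    """Predicted number of seconds under the new standard."""
    share = share_under(prev_times, prev_cutoff_s)
    predicted_cutoff_s = time_at_share(next_times, share)
    return next_standard_s - predicted_cutoff_s, predicted_cutoff_s


# Made-up toy data (NOT Strava data): ten marathons per window, in seconds.
prev_window = [10000, 10500, 10800, 11000, 11500,
               12000, 12500, 13000, 13500, 14000]
next_window = [9900, 10400, 10710, 10900, 11400,
               11900, 12400, 12900, 13400, 13900]

margin_s, cutoff_s = predict_margin(
    prev_window, to_seconds("3:00:08"), next_window, to_seconds("3:00:00")
)
print(to_hms(cutoff_s), margin_s)
```

With real data, the same calculation would be run once per cohort, and the spread between cohorts gives the range of the prediction.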
Granted, there are many more factors that should be taken into account. Some of them are under the B.A.A.’s control (exactly how many runners they let into the race, the percentage of qualifiers vs. charity runners, how to adjust for unequal gender applications, etc.).
Based on this data, the final prediction is that the Boston cutoff will be between 90 and 94 seconds under the qualifying time.
At the end of the day, no one except the B.A.A. knows exactly how the cutoff times are determined. Coming in with a qualifier of 2:58:37 run at last year’s Boston, I knew that I would be close, and wanted to use Strava data to give myself some peace of mind. Unfortunately, although the estimate was fairly accurate, it left me 16 seconds short. On the bright side, there are two employees at Strava who qualified by scant 10- and 11-second margins.
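For anyone who wants to check the arithmetic, here is a small sketch using only the numbers quoted in this post. It assumes the 3:00:00 standard applied to my qualifier, which follows from the figures above but is an inference rather than something stated outright. The variable names are just for illustration.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' or 'M:SS' string to seconds."""
    total = 0
    for part in hms.split(":"):
        total = total * 60 + int(part)
    return total


standard_s = to_seconds("3:00:00")       # assumed standard for my cohort
actual_margin_s = to_seconds("1:39")     # announced cutoff margin
predicted_margin_s = 94                  # upper end of the 90-94 s prediction
qualifier_s = to_seconds("2:58:37")      # my qualifying time

cutoff_time_s = standard_s - actual_margin_s   # time needed to get in
shortfall_s = qualifier_s - cutoff_time_s      # how far I missed by
prediction_gap_s = actual_margin_s - predicted_margin_s

print(cutoff_time_s, shortfall_s, prediction_gap_s)
```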
Congratulations to everyone who will be toeing the line at Hopkinton this year, and I’ll be cheering you on from Heartbreak Hill!
Dave is a data engineer at Strava. Before working at Strava, he lived in Boston and credits spectating the Boston Marathon as one of the reasons he got into running. After running five marathons to mixed success (eat your breakfast!), he was finally able to qualify and ran his first Boston in 2019. An avid Bruce Springsteen fan, Dave can typically be found running in his Dunkin shoes on his favorite route near the Strava office to Heron’s Head Park.