Saturday, July 2, 2016

Strava Data Analysis

Here I'll apply some simple analyses, at a global level, to Rob Young's Strava data.  We'll demonstrate impossible speeds, done only at night when he was unobserved, and how his performance is so off the charts compared to other high and elite-level efforts that it cannot be real.

Speed Histogram - Rob Young pre/post observation vs. others

Data Acquisition/Tools

The data sources for this analysis are the Strava activities on Rob Young's account, downloaded using a Chrome plugin on 6/26/16.  Also included is one activity which was deleted from Rob Young's account but re-uploaded to a backup/fake account by LetsRun users.  An archive of the evaluated data can be found on Google Drive (TCX files, GPX files).  Neither the original nor the recreated "Joanna" accounts are considered here.

In all, there are 267 activities covering 348 hours, 37 minutes, 16 seconds (292:37:57 moving time) and 1412.8 miles.  Analysis was performed using the latest development build of GoldenCheetah.
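For a rough sense of scale, those totals can be turned into an overall average pace. This is just a quick sketch of the arithmetic using the three figures quoted above; the helper names are my own and have nothing to do with GoldenCheetah.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert 'H:MM:SS' (hours may exceed 24) to total seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_pace(seconds_per_mile: float) -> str:
    """Format seconds per mile as 'M:SS', rounded to the nearest second."""
    total = round(seconds_per_mile)
    return f"{total // 60}:{total % 60:02d}"


elapsed_s = hms_to_seconds("348:37:16")  # elapsed time quoted in the text
moving_s = hms_to_seconds("292:37:57")   # moving time quoted in the text
miles = 1412.8                           # total distance quoted in the text

print("average moving pace: ", format_pace(moving_s / miles), "min/mi")
print("average elapsed pace:", format_pace(elapsed_s / miles), "min/mi")
```

Averaged over everything, the moving pace works out to roughly 12:26 per mile, which is why the distribution of paces, not the average, is the interesting part.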

Speed and Pace

GoldenCheetah can spit out an athlete's best efforts quite easily.  Here are Rob Young's top 10 sustained paces for varying time intervals:

He's sustained a 4:07 pace for 1 minute (a 62s quarter), a couple of 5-flat miles, and 6-minute pace for an hour and a half.  Curiously, his best sustained pace for every interval from 5 minutes through 30 minutes is about the same, in the low 5s.
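If you want to check the pace conversions yourself, here is a small sketch. I'm treating a "quarter" as a quarter mile, and the function names are just illustrative.

```python
def pace_to_seconds(pace: str) -> int:
    """Convert a 'M:SS' per-mile pace to seconds per mile."""
    minutes, seconds = pace.split(":")
    return int(minutes) * 60 + int(seconds)


def quarter_mile_split(pace: str) -> float:
    """Seconds needed to cover a quarter mile at the given per-mile pace."""
    return pace_to_seconds(pace) / 4


def miles_covered(pace: str, minutes: float) -> float:
    """Distance in miles covered by holding the given pace for `minutes`."""
    return minutes * 60 / pace_to_seconds(pace)


print(quarter_mile_split("4:07"))  # quarter-mile split at 4:07 pace
print(miles_covered("6:00", 90))   # miles run in 1.5 hours at 6:00 pace
```

So 4:07 pace is a quarter in just under 62 seconds, and an hour and a half at 6-minute pace is 15 miles.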

Next, let's take a wide view with a speed histogram.  Simply, how much time did Robert Young spend at different speeds?

It's essentially a tri-modal distribution.  Most of his time was spent at a walking pace (17:30), with some time running (8:00 pace), and another "modality" seen at 6 minute pace.  Let's put this into a more familiar running perspective using pace zones:
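The histogram itself came out of GoldenCheetah, but the idea is simple enough to sketch by hand. The following assumes one speed sample per second in m/s (an assumption on my part, real TCX/GPX data may be sampled differently), and the sample values are made up purely for illustration.

```python
from collections import Counter

METERS_PER_MILE = 1609.344


def speed_histogram(speeds_mps, bin_width_mph=0.5):
    """Seconds spent in each speed bin, given one speed sample per second.

    Keys are the lower edge of each bin in mph.
    """
    histogram = Counter()
    for speed in speeds_mps:
        mph = speed * 3600 / METERS_PER_MILE
        # int() floors here because speeds are non-negative.
        bin_start = int(mph / bin_width_mph) * bin_width_mph
        histogram[bin_start] += 1
    return dict(sorted(histogram.items()))


# Hypothetical samples: mostly walking, some running, a little fast running.
samples = [1.5] * 6 + [3.4] * 3 + [4.5] * 2
for bin_start, seconds in speed_histogram(samples).items():
    print(f"{bin_start:>5.1f} mph: {seconds} s")
```

Three separate clusters of bins, like the ones in this toy data, are what a tri-modal distribution looks like in the plot.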

He's mostly in the first 2 zones I defined, slower than 15:00 min/mi.  But there's a substantial amount of running done at a sub-7 pace, over 45 hours' worth in fact.
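Bucketing time into pace zones can be sketched like this. Only the 15:00 and 7:00 boundaries come from the text above; the other boundaries and all the labels are placeholders I picked for the example, not the zones used in the charts.

```python
from bisect import bisect_right

# Zone boundaries in seconds per mile, fastest first.
# 7:00 and 15:00 are from the text; 10:00 and 20:00 are assumed.
BOUNDARIES = [7 * 60, 10 * 60, 15 * 60, 20 * 60]
LABELS = [
    "sub-7:00",
    "7:00-10:00",
    "10:00-15:00",
    "15:00-20:00",
    "slower than 20:00",
]


def zone_for(pace_s: float) -> str:
    """Return the zone label for a pace in seconds per mile.

    A pace exactly on a boundary falls into the slower zone, so 7:00 flat
    does not count as sub-7.
    """
    return LABELS[bisect_right(BOUNDARIES, pace_s)]


def time_in_zones(segments):
    """Sum seconds per zone for (pace_seconds_per_mile, duration_seconds) pairs."""
    totals = {label: 0 for label in LABELS}
    for pace_s, duration_s in segments:
        totals[zone_for(pace_s)] += duration_s
    return totals


# Hypothetical segments: (pace in s/mi, duration in s).
segments = [(1050, 3600), (480, 1200), (360, 900), (1300, 600)]
print(time_in_zones(segments))
```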

By Day

LetsRun and other internet users noticed that Rob seemed to put up his incredible mileages in the early part of his run, when he was in the rural West, and prior to being observed 24/7 by a group of ultra-runners affectionately termed "Team Geezer".  Let's look at the time spent in pace zones by day to see if we can visualize a pattern:

There's a lot of red (sub-7) on day 1, but that run was almost certainly legitimate.  Rob probably burned himself out quickly, and then the RV-assists began on day 4 once they were out of populated areas.  The fast running all but disappears once the internet heat is on and once he is continually observed.  There indeed seem to be two Rob Youngs.  Let's take the onset of 24-hour observation as a break point for further analysis.
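The per-day view boils down to grouping time by day and asking what fraction of it was sub-7. A minimal sketch, assuming each record is a (day number, pace in s/mi, duration in s) tuple; the record layout and the sample values are hypothetical.

```python
from collections import defaultdict

SUB_7 = 7 * 60  # seconds per mile


def sub7_share_by_day(records):
    """Fraction of each day's recorded time spent faster than 7:00/mi."""
    totals = defaultdict(lambda: [0, 0])  # day -> [sub-7 seconds, all seconds]
    for day, pace_s, duration_s in records:
        totals[day][1] += duration_s
        if pace_s < SUB_7:
            totals[day][0] += duration_s
    return {day: fast / total for day, (fast, total) in sorted(totals.items())}


# Hypothetical records: (day number, pace in s/mi, duration in s).
records = [
    (1, 400, 3600),
    (1, 1000, 3600),
    (2, 1000, 7200),
    (4, 380, 5400),
    (4, 1050, 1800),
]
for day, share in sub7_share_by_day(records).items():
    print(f"day {day}: {share:.0%} sub-7")
```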

By Time of Day

At what times of the day is Rob recording high speeds, before and after he is observed?



Before he's observed, he's blistering through the night, 18:00-04:00, with a high proportion of sub-7 miles.  After observation, there's much less running in the wee hours, and hardly any of it at speed.
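Here is a rough sketch of that before/after night comparison. I'm treating the night window as 18:00 up to (but not including) 04:00, which wraps past midnight. The timestamps and the observation start below are placeholders, not the real dates.

```python
from datetime import datetime

SUB_7 = 7 * 60  # seconds per mile


def is_night(hour: int) -> bool:
    """True for hours in the 18:00-04:00 window, which wraps past midnight."""
    return hour >= 18 or hour < 4


def night_sub7_seconds(records, observation_start):
    """Sum sub-7 night-time seconds before and after observation began.

    records: iterable of (start datetime, pace in s/mi, duration in s).
    """
    totals = {"before": 0, "after": 0}
    for when, pace_s, duration_s in records:
        if is_night(when.hour) and pace_s < SUB_7:
            key = "before" if when < observation_start else "after"
            totals[key] += duration_s
    return totals


# Placeholder data: dates are arbitrary and only illustrate the split.
observation_start = datetime(2000, 1, 4)
records = [
    (datetime(2000, 1, 1, 23, 0), 360, 600),   # fast, night, before
    (datetime(2000, 1, 2, 3, 30), 400, 300),   # fast, night, before
    (datetime(2000, 1, 2, 12, 0), 380, 900),   # fast, but daytime
    (datetime(2000, 1, 5, 22, 0), 1000, 600),  # night, but walking pace
    (datetime(2000, 1, 5, 4, 0), 360, 100),    # 04:00 is outside the window
]
print(night_sub7_seconds(records, observation_start))
```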

Revisiting Speed/Pace

Let's look at some before-and-afters for those histograms.

Rob Young's third running modality disappears once he is observed.

As for the pace zones:

The time spent below 7 min/mi drops drastically.

Comparison to Other Ultrarunners

The paces and circumstances shown above ought to be enough to raise most eyebrows, but for good measure let's compare Rob Young's statistics to a few other ultra-runs for reference.

First, consider Joe Fejes, a prolific multi-day ultrarunner who last year set the American Record for six-day racing, running a staggering 606 miles.  That race was done on a 1km loop in Hungary, and there's a recorded split time for each and every lap, all 975 of them.  Here is the distribution of paces Joe ran for each lap:


Only a single lap was sub-8, and none of them were run below a 7 min/mi pace.
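Converting 1km lap splits to per-mile pace is a one-liner, so here's a small sketch of how laps can be counted against a pace threshold. The lap times below are invented for the example; they are not Joe's actual splits.

```python
KM_PER_MILE = 1.609344


def lap_pace_per_mile(lap_seconds: float, lap_km: float = 1.0) -> float:
    """Per-mile pace in seconds for a lap of `lap_km` kilometres."""
    return lap_seconds / lap_km * KM_PER_MILE


def count_laps_faster_than(lap_times, pace_s: float) -> int:
    """Number of laps run faster than `pace_s` seconds per mile."""
    return sum(1 for lap in lap_times if lap_pace_per_mile(lap) < pace_s)


# Sanity check on the totals from the text: 975 laps of 1 km, in miles.
print(round(975 / KM_PER_MILE, 1))

# Hypothetical lap times in seconds.
laps = [255, 290, 330, 400]
print(count_laps_faster_than(laps, 7 * 60))  # laps under 7:00/mi
print(count_laps_faster_than(laps, 8 * 60))  # laps under 8:00/mi
```

On a 1km loop, a sub-7 mile means a lap under about 4:21, and a sub-8 mile means a lap under about 4:58.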

Next up is Adam Kimble.  He's won a 250km race across the Gobi desert, his marathon PR is 2:56 (pretty good, and probably soft since he's a trail/ultra guy), and earlier this year he completed a transcontinental run in 60 days.  The data for his transcon run is available on his Strava.  Here are Adam's histograms for the run:

He spends basically no time below 7 min/mi (a few minutes here and there is probably an artifact).  Like post-observation Rob, he has a walking-dominant bimodal distribution, but he's missing the 6 min/mi bump seen in pre-observation Rob.

Next, consider Jason Romero.  He's a blind ultra-athlete, speaker, and humanitarian.  Really motivational story.  He holds 11 world records and earlier this year, he too completed a transcontinental run in just over 59 days.  Data is available on his Strava.

Jason, like Adam, spent essentially zero time below 7-minute pace.  He also has a bimodal walk/run distribution, though he seems more run-dominant, preferring to trudge along at 10-11 minute pace rather than walk more and run faster when actually running.

Next, Patrick Malandain.  He's a French ultra runner, so I can't read most of the stuff about him...  But I do know that this year he ran 100km per day for 100 days, breaking a world record.  His data is available on Garmin Connect.

Similar to everyone else, there are no blistering fast paces at all.  Like Jason, Patrick impressively ran quite a bit, and he has the expected bimodal distribution.


This data analysis has shown:

  • Rob Young recorded a highly suspicious amount of time at high running speeds
  • His suspiciously fast efforts were almost all at night and prior to being observed
  • His pace distributions do not come close to matching other high- and elite-level multiday ultrarunning performances.  They are simply impossible.

I'll leave you with some fancy GIFs animating the above comparisons.

Speed Histogram - Rob Young pre/post observation vs. others
Pace Zones - Rob Young pre/post observation vs. others
