In 2012, I ran the Fargo Marathon. When I got my results, I was surprised to see two extra columns that I wasn’t expecting. One had the number of runners I had passed on the course. The other had the number who had passed me.
The race organizers used a fairly straightforward calculation to figure these two things out:
- Anyone with a start time earlier than mine but a later finishing time was someone I passed.
- Anyone with a start time later than mine but with an earlier finishing time was someone who passed me.
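For the programmers in the audience, here is roughly how those two rules could look in Python. This is only my own minimal sketch, not the code the Fargo organizers actually used. The `Result` record, its field names, and the five sample runners are all made up for illustration, and I'm assuming each runner's chip gives us a start-mat time and a finish-mat time measured in seconds after the gun.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical chip record; times are seconds after the gun."""
    bib: int
    start: float   # when the runner crossed the start mat
    finish: float  # when the runner crossed the finish mat


def pass_counts(me, field):
    """Return (runners I passed, runners who passed me)."""
    others = [r for r in field if r.bib != me.bib]
    # Started before me but finished after me: I passed them.
    passed = sum(1 for r in others if r.start < me.start and r.finish > me.finish)
    # Started after me but finished before me: they passed me.
    passed_by = sum(1 for r in others if r.start > me.start and r.finish < me.finish)
    return passed, passed_by


# Made-up sample field, purely for illustration.
field = [
    Result(bib=1, start=0, finish=12000),
    Result(bib=2, start=30, finish=11000),
    Result(bib=3, start=60, finish=13000),
    Result(bib=4, start=90, finish=10500),
    Result(bib=5, start=120, finish=12500),
]

print(pass_counts(field[1], field))  # counts for bib 2
```

In this made-up field, bib 2 passes bib 1 and gets passed by bib 4. Runners with identical start or finish times count in neither column, which seems like the safest way to handle ties.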
Of course, races are a little more complex and those calculations don’t account for things like runners passing each other multiple times throughout the course. But kudos to the Fargo Marathon race organizers for thinking outside the box and giving their runners some extra information about how they did on the course. Seeing those two calculations made me realize that race results probably hold more information than we realize. Back then, I thought mining race result data would become a new trend. I wondered what other information I would start to see in future result sets. Maybe these new analytics would help with my training.
It’s been a few years since I ran Fargo. How many new insights have I been able to glean from race results I’ve gotten since then?
That’s right. Zero… Zip… Zilch… Nada. I’ve yet to run a single race besides Fargo where my results contained anything other than the standard “overall finishing time”, “split times” and “age group placement” figures. This is not very innovative. The values we see in our race results today hark back to the days when races were timed with handheld stopwatches. It’s a bit frustrating to think about it that way.
What’s even more frustrating is knowing that there’s technically nothing stopping race timing companies from doing a deeper analysis of our race results. The standard timing chips that most races use already collect plenty of data. The real problem is creativity. Even though the data is sitting right in front of us, almost nobody has thought to look at it in different ways and figure out new uses for it. Until now.
I recently had the pleasure of speaking with Eila Stiegler, who is a quality analysis manager at a software company called Wolfram Research. Eila ran the 2015 Chicago Marathon and then used the Wolfram Language to analyze the race results. She figured out some interesting things about both the race and its participants.
This is a running blog, but I also have a computer science background, so I’m going to get technical for a minute. If you feel your eyes starting to glaze over, go ahead and skip the next paragraph.
For anyone still reading, the Wolfram Language is a multi-paradigm programming language that emphasizes symbolic computation, functional programming, and rule-based programming. It can be used for complex tasks like creating graphics and audio, analyzing 3D models and solving differential equations. Impressive stuff. If you’re a programmer, you can read more about it here.
So what additional information was Eila able to learn? I put some samples below. You can find the full analysis in Eila’s blog post on the Wolfram website. Even if you don’t have a technical background, you should take a look.
The charts below show how consistently people ran the race (average pace and variations at each split).
This chart shows how frequently runners passed each other on the course. Eila’s analysis also takes into account factors like the way people approach water stops. You can read more details in her post.
This chart shows a comparison of finishing times versus how far people had to travel to get to the race:
This video showing a visual recreation of the race is my favorite. The green dot represents the winning runner. The red dot represents Eila, so you can actually see where she was on the course in relation to the fastest runner at any given time. The blue dot represents the median for all runners and the purple bars represent the density of runners at any given point along the course. If you ran the 2015 Chicago Marathon, you are represented in one of those bars.
Cool stuff, huh? Like I said, there are a number of other calculations in Eila’s blog post. If you ran the Chicago Marathon last year, most of them will apply to you, so you should check them out.
So what can you do with all of this information? Anything you want, really. I can think of a few examples right off the top of my head:
- Regular readers know that I travel for races several times per year. Realistically, there’s no way I can PR in every race. So when I plan my schedule out for the year, I have to choose which races to go all in for and which ones to hold back on. Having access to an additional data point that tells me how distance traveled affects the finishing times of runners on a particular course would help with these decisions.
- I can also compare my performance throughout the entire course with that of other runners in my area and maybe find a training partner for my next race.
- I can look more closely at the variations in my split times and determine if I need to adjust my training to run more consistently.
- Comparing split time variations to elevations on hilly courses at high altitudes would help flush out spots where runners struggle the most.
- Or imagine if you could go a step further and combine data like this with data from a heart rate monitor. You could get a detailed report showing how your body reacted at specific points throughout a course. Imagine if you could then share that data with a trainer. You could use the information to make adjustments to your workout routines and ensure that your body is optimally conditioned to handle all the nuances of a specific race.
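To make the split time idea from the list above a little more concrete, here is a rough Python sketch of how I might score my own consistency. This is my own illustration, not something taken from Eila's analysis. I'm assuming the results give cumulative splits as (distance in km, elapsed seconds) pairs, and I'm using the coefficient of variation (standard deviation of pace divided by average pace) as the consistency score. The function names and sample splits are made up.

```python
from statistics import mean, pstdev


def segment_paces(splits):
    """Turn cumulative (distance_km, elapsed_seconds) splits into
    a pace in seconds per km for each segment."""
    paces = []
    prev_distance, prev_time = 0.0, 0.0
    for distance, elapsed in splits:
        paces.append((elapsed - prev_time) / (distance - prev_distance))
        prev_distance, prev_time = distance, elapsed
    return paces


def pace_variation(splits):
    """Coefficient of variation of segment pace: 0.0 means perfectly
    even pacing, larger values mean more uneven pacing."""
    paces = segment_paces(splits)
    return pstdev(paces) / mean(paces)


# Made-up splits: one perfectly even runner, one who fades.
even_runner = [(5, 1500), (10, 3000), (15, 4500)]
fading_runner = [(5, 1500), (10, 3100), (15, 4800)]

print(pace_variation(even_runner))
print(pace_variation(fading_runner))
```

The even runner scores exactly zero and the fading runner scores about 0.05, so a lower number means steadier pacing. Tracking that one number from race to race would be a simple way to see whether my training is actually making me more consistent.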
I recently wrote a post about how the Internet of Things is changing the running world. This type of data analysis falls into the same category. It has the potential to revolutionize running and racing.
Eila did a great job using the Wolfram Language to show the art of the possible. Notably, she didn’t collect any special data to do her analysis. She simply took the standard data that’s publicly available on the Chicago Marathon website and found some new ways to connect the dots and gain additional insights. This same data is already available for almost every other major race as well. The potential uses for this type of analysis are endless.
I would love to be able to click the link in a race results email one day and not only see my finishing times but also a set of charts and graphs that I can use to do an even deeper analysis of my performance. So here’s to hoping that race directors will see the potential that this has and start looking into ways to utilize it.