Do Women Marathoners Push Themselves As Hard As Men?

June 24, 2014

(This is the 14th post in a series that started here)

Studies, including a recent one of mine, show that on average, women tend to pace themselves more evenly than men when running marathons.

We know that splits tend to get more positive as runners’ finish times go up and that the average female runner is slower than the average male, so the raw numbers may understate this trend. Using split scores to mitigate the differences in finish times between women and men bears this out.

Since conventional wisdom says runners should strive for even splits in order to run their best marathon, it’s been said that this means that women are better at marathon pacing than men.

It’s also been hypothesized that women don’t push themselves as hard as men, which makes it easier for them to run more even splits, but keeps them from achieving finish times as fast as they might otherwise.

Perhaps a closer look at the data will tell us more.

In two recent posts, I studied 5K splits from the latest Boston, New York, and Chicago marathons. I discovered that as runners’ finish times got faster, their pace, relative to their previous splits, slowed over the last 7K and especially over the last 2K.

I decided to take one of those races (Boston 2014) and break the data down by gender.

This chart shows how the average 5K splits for men and women compare to the overall average. It also compares the splits for the runners with the most even half-marathon splits from each group.


Click on any image to enlarge


The chart shows that women’s splits at the end of the race are faster, compared to their earlier splits, than men’s. Since the average woman is slower than the average man, we expected that.

By only selecting runners with even splits, the differences between the shape of the lines for women and men were reduced, but the women’s line still trends comparatively faster.

The next chart tries to eliminate the differences due to speed by dividing the average splits for men and women into half-hour groups by finishing time.

In almost every case, even though the runners are grouped by equal finish times, the women’s splits at the end of the race are faster, relative to their earlier splits, than the men’s. That is, the women’s splits take on the shape of slower runners, though the finish times for each group of men and women are the same.

Since men are faster than women on average, comparing men and women with equal finish times means that the women in each group are faster, relative to all other women, than the men are relative to other men. That means the direct comparison might tend to understate the difference in split trends.

This chart shows that the distributions of finish times for men and women have roughly the same shape, except that the men’s times run about 20 minutes faster.

So in this chart, I divided the men up into half-hour groups that are 20 minutes faster than the women’s groups.

As predicted, this chart makes the difference in split trends between men and women more distinct.
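
For readers who want to try the shifted grouping on their own data, here is a minimal Python sketch. Only the half-hour group width and the 20-minute shift for men come from the post; the function names, the `offset_min` parameter, and the sample finish times are hypothetical, and the exact placement of the group edges is an assumption.

```python
def half_hour_group(finish_min, offset_min=0):
    """Return (start, end), in minutes, of the half-hour group holding finish_min.

    offset_min shifts the group edges: 0 for the women's groups, -20 to make
    the men's groups 20 minutes faster (edge placement is an assumption).
    """
    width = 30
    index = (finish_min - offset_min) // width
    start = index * width + offset_min
    return start, start + width


def fmt(minutes):
    """Format a whole number of minutes as H:MM."""
    return f"{minutes // 60}:{minutes % 60:02d}"


# Hypothetical runners: a woman finishing in 3:35 and a man finishing in 3:15.
women_group = half_hour_group(215)
men_group = half_hour_group(195, offset_min=-20)

print("women:", fmt(women_group[0]), "to", fmt(women_group[1]))
print("men:  ", fmt(men_group[0]), "to", fmt(men_group[1]))
```

With these edges, the women's 3:30 to 4:00 group lines up against the men's 3:10 to 3:40 group, so each pair of groups sits at roughly the same place in its own finish-time distribution.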

What’s it all mean?

In my original 5K split post, I guessed that slower runners weren’t pushing themselves quite as close to their potential as the faster runners, leaving them with more energy left after the Newton hills to speed up and look good for their finish line photos.

Since women’s splits take on the shape of slower runners when compared to men’s, even when the data is manipulated to minimize the effect of differences in speed, does this mean that women aren’t pushing themselves as hard as men?

Maybe, maybe not.

If men are pushing themselves harder, does that mean that they’re getting better results than comparable women?

Maybe, maybe not.

If you disagree, what’s your alternative explanation? Let me know what you think and maybe we can figure out if the data supports it.

We’ll divide up 5K splits by age in my next post.

Another Precinct Heard From: 5K Split Distribution at the 2013 Chicago Marathon

June 23, 2014

(This is the 13th post in a series that started here)

After my last post in this series, I obtained the 5K splits for the 2013 Chicago Marathon. Chicago is a notoriously flat course, as shown in this graphic:

Click on any image to enlarge


I wanted to see if the trends in runners’ paces over the last 2K on the flatter course were consistent with what we saw at Boston and New York.

Here’s the 5K pace chart for Chicago:

Once again, the faster runners were relatively slow over the last 2K, while slower runners picked up their pace.

The new chart reveals another pattern: as the courses get easier, more groups run faster over the final 2K. At Boston, the first group to pick up the pace at the end was the 4:00-4:29 runners. At Chicago, the 3:00-3:29 group managed the feat. And at New York, which most people think falls somewhere in between Boston and Chicago in difficulty, the 3:30-3:59 group was the first to speed up at the end.

I look at the difference between 5K splits for men and women in my next post.

“Grit Is In the Soul (of my Shoes)” in Stymie Magazine

June 18, 2014

Think marathon running automagically makes you an amazing person? The idea inspired this opinion piece, “Grit Is In the Soul (of my Shoes)”. You can read it at Stymie Magazine.

5K Split Distribution at the 2014 Boston Marathon

June 17, 2014

(This is the 12th post in a series that started here)

When I looked at the 5K splits from the 2014 Boston Marathon, I mostly saw what I expected to see.

I calculated the average 5K splits for the 2014 race both as a whole and broken down into specific groups of runners. This chart shows those 5K splits converted to a minute-per-mile pace:

Bos 2014 - 5K splits

Click on any image to enlarge
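
The conversion from a segment time to a minute-per-mile pace is simple arithmetic, sketched below in Python. The function name and the sample times are hypothetical. The one detail worth noting is that the final segment, from the 40K mark to the finish, is about 2.195K rather than 5K, so the segment length has to be passed in.

```python
KM_PER_MILE = 1.609344


def pace_min_per_mile(segment_seconds, segment_km=5.0):
    """Convert the time for one segment into a pace in minutes per mile."""
    miles = segment_km / KM_PER_MILE
    return segment_seconds / 60.0 / miles


# Hypothetical runner: 25:00 for a 5K segment, 11:00 from 40K to the finish.
pace_5k = pace_min_per_mile(25 * 60)
pace_last = pace_min_per_mile(11 * 60, segment_km=42.195 - 40.0)

print(f"5K segment:   {pace_5k:.2f} min/mile")
print(f"last segment: {pace_last:.2f} min/mile")
```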

Here’s the elevation map for the Boston course:


The Newton hills start at about 28K and end near the 33K mark. They don’t line up perfectly with the 5K splits so their effect is muted some, but you can still see from the 30K and 35K splits how the hills slow everyone down. Splitting up the data by finish time shows that the hills hit the slower runners harder.

Then comes something I found interesting. Once you get through the hills, the next 7K through Brookline is pretty easy, trending mostly flat or downhill. Overall, the average runner was able to take advantage of the terrain and pick up the pace. Perhaps unsurprisingly, the “Even splits” group did a particularly good job of making up time lost in the hills.

A closer look shows that the overall effect was created by the extreme groups – the very fastest runners in the race and the slowest. The middle of the pack averaged even slower after the hills than they did through Newton, though the course is much easier. Even the runners finishing between 2:30 and 3:00, a pretty fast group, lost ground.

The last 2K was even more surprising to me.

That part of the course has its challenges (the bump over the Mass Pike, the Mass Ave. underpass, and the climb up Hereford St.) but they’re nothing compared to what came earlier in the race. Countering that, almost everyone who runs a marathon gets a burst of energy when their goal finally comes within reach. And at Boston, the thickest packs of spectators start just past the 40K mark, in Kenmore Square. The roar of the crowd is relentless from Kenmore on in.

I expected to see splits dropping across the board. But the trend was clear. As runners’ finish times got faster, their times over the last 2K got slower (not in an absolute sense, but relative to their previous splits). 2:30 runners still finished faster than 5:30 runners, but the 2:30 runners slowed down while the 5:30 runners sped up. The fastest runners were dramatically (for them) slower over the last 2K.

Even the “Even splits” group lost ground after 40K, while the “Most positive” group made up more than anyone.

Maybe there’s an element of “oh well, I’m not winning” in the sub-2:20 group, but that doesn’t apply to the 3:30 marathoners, and they still slowed down.
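
To make “relative to their previous splits” concrete, here is a small Python sketch that turns cumulative times at each timing mark into segment paces, then into a percent change from one segment to the next. The timing marks are the standard 5K points plus the finish; the function names and the sample runner are hypothetical, and this is not necessarily how the charts were produced.

```python
MARKS_KM = [5, 10, 15, 20, 25, 30, 35, 40, 42.195]


def segment_paces(cumulative_min):
    """Pace in min/km for each segment, from cumulative times at each mark."""
    paces = []
    prev_time, prev_dist = 0.0, 0.0
    for time, dist in zip(cumulative_min, MARKS_KM):
        paces.append((time - prev_time) / (dist - prev_dist))
        prev_time, prev_dist = time, dist
    return paces


def relative_change(paces):
    """Percent change of each segment's pace versus the one before it.

    Positive means the runner slowed down; negative means they sped up.
    """
    return [100.0 * (b - a) / a for a, b in zip(paces, paces[1:])]


# Hypothetical runner: 5:00/km through 40K, then 5:15/km to the finish.
cumulative = [25, 50, 75, 100, 125, 150, 175, 200, 200 + 5.25 * 2.195]
changes = relative_change(segment_paces(cumulative))
print(f"last segment vs. previous: {changes[-1]:+.1f}%")
```

Because each change is measured against the runner's own previous segment, a 2:30 runner and a 5:30 runner can be compared on the same scale even though their absolute paces are far apart.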

It’s not just Boston. The 5K splits for New York show the same pattern:

NY 2013 - 5K splits

though the elevation profile is different:


To run your fastest race, you want to “leave it all out on the course”. If you have the energy to pick up the pace as you approach the finish, you probably could have run faster earlier on and ended up with a better finish time.

But there’s a fine line between running your best race and going out too fast. As we know, most runners run positive splits. Marathoning is hard.

The 5K charts show that in general, slower runners drop off from their initial pace sooner than faster runners.

For the slowest runners, part of the drop may be because, on average, they aren’t pushing themselves quite as close to their potential as the faster runners. That does leave the slower runners with more energy left after the Newton hills to speed up and look good for their finish line photos.

Mid-packers don’t give in as easily, holding on to their initial pace farther into the race. Actually, mid-packers might be better served by taking it a little easier from the start; on average, they’re not able to take advantage of the easier terrain after the hills.

Even the fastest runners have difficulty getting the first 40K just right, so they’re tired, but not too tired, for the last 2K. Apparently the best runners believe that if you’re going to make a mistake in pacing, it’s best to err slightly on the optimistic side and risk running out of gas a little early, just as long as you don’t get too sanguine in the early miles.

It sounds funny to say that the fastest runners “went out too fast” over the first 40K, but I think that’s what’s happening.


  • There is a wide variance within any of the groups of runners. While the average runner in a group may have a certain split pattern, many individuals within the group will have entirely different splits.
  • I used split scores to determine who belonged in the “Even splits” and “Most positive” groups. The first- and second-half splits for the 310 runners in the Even splits group were within ±0.2% of each other. The Most positive group was made up of about 300 runners whose second half was at least 20% slower than their first.
  • I looked into creating a “Most Negative” group. There weren’t any runners with 20% negative splits, and only about 100 with splits that were more than 2% negative. The average 5K splits for those runners were badly skewed by a few runners with individual splits way out of line with the rest of their race, perhaps because of portapotty breaks or other temporary issues, so I decided not to chart them.
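
The group selection described in the notes above could be sketched like this in Python. The notes say split scores were used, so this sketch applies both cutoffs to the split score (raw split divided by finish time, as a percentage). That is an assumption: if the 20% cutoff was measured against the first half instead, group membership would differ slightly. The function names and sample runners are hypothetical.

```python
def split_score_pct(first_half_min, second_half_min):
    """Raw split divided by finish time, expressed as a percentage."""
    raw_split = second_half_min - first_half_min
    return 100.0 * raw_split / (first_half_min + second_half_min)


def classify(first_half_min, second_half_min, even_pct=0.2, positive_pct=20.0):
    """Place a runner in a group using split-score cutoffs (assumed reading)."""
    score = split_score_pct(first_half_min, second_half_min)
    if abs(score) <= even_pct:
        return "Even splits"
    if score >= positive_pct:
        return "Most positive"
    return "Other"


# Hypothetical runners, (first half, second half) in minutes.
print(classify(100, 100.3))  # halves nearly identical
print(classify(90, 140))     # large slowdown in the second half
print(classify(95, 105))     # a typical moderate positive split
```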

More on this topic in my next post.



June 13, 2014


Plotting Split Scores

June 10, 2014

(This is the 11th post in a series that started here)

Last time, we defined a “split score” as a runner’s raw split divided by their finish time.

Split scores work great for individual runners, easily showing us that a 5-hour marathoner’s 10 minute split is smaller, relatively, than the same raw split would be for a 3-hour runner.

That’s all well and good, but split scores will only matter to anyone if we can use them to gain new insights about runners – in general, or when we group them by time, age, gender, or in other ways.

The core of any potential usefulness for split scores is their ability to make comparisons between runners of different abilities more meaningful – to say, “All else being equal, here’s how the 5 hour marathoners compare to the 3 hour marathoners.”

The math is easy, and the logic behind it makes superficial sense. What I’m not sure of is what, if anything, we gain by doing this. Because the main difference between 5 hour and 3 hour marathoners IS their finish time.

Whatever. Get to the point, Charbonneau. Show us some split scores. Maybe we’ll see something.

Here’s finish time vs. split score for Boston 2014:

Click on any image to enlarge


And to refresh your memory, here’s finish time vs. raw splits:

Boston 2014 Fin vs split w poly movave

You can’t compare the two directly, but you can see how the general shape of the split score scattergram hews closer to the horizontal axis.

If you plot raw splits vs. split score, you can see that the relationship is non-linear:

Bos 2014 split score vs raw

As it should be, since another way of looking at split scores is that if you make a right triangle by plotting splits versus finish times, the split score is the tangent of the angle between the hypotenuse of that triangle and the side represented by the finish time.

In spite of that, on the next chart, I plotted the moving averages and 4th order polynomials for the two data sets, scaling the raw split data so the linear regression for the two sets of data overlaid each other as much as possible. That choice is entirely arbitrary, and is probably more misleading than it is revealing since the relationship between the two data sets isn’t linear, but it does show something about how the two sets of curves change relative to one another:

Bos 2014 split and raw poly-mov

Let’s try comparing men vs. women. Here’s raw splits vs finish times, divided by gender:

Bos 2014 gender raw vs fin

And here’s the same chart, with split scores replacing raw splits on the Y axis:

Bos 2014 gender score vs fin

The use of split scores seems to accentuate the spread of values for any specific finishing time, which also increases the difference between men’s splits and women’s splits. Perhaps because split scores remove another source of autocorrelation error by factoring out finish time?

This chart tries to summarize the differences:

m-w raw-scores

Men have faster average finish times, but larger average splits, whether you measure raw splits or split scores. The difference is slightly greater for split scores. Note that the average split divided by the average finish time is not the same as the average split score.
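
The last point is easy to check with a few numbers. This Python sketch uses the sample runners A, C, and D from the post that defined split scores (raw splits of 30, 60, and 30 minutes on finish times of 2:30, 5:00, and 5:00); the variable names are just for illustration.

```python
from statistics import mean

# (raw split, finish time) in minutes for sample runners A, C, and D.
runners = [(30, 150), (60, 300), (30, 300)]

# Average of the individual split scores.
avg_of_scores = mean([split / finish for split, finish in runners])

# Average split divided by average finish time.
ratio_of_avgs = mean([s for s, _ in runners]) / mean([f for _, f in runners])

print(f"average split score:        {100 * avg_of_scores:.2f}%")
print(f"avg split / avg finish time: {100 * ratio_of_avgs:.2f}%")
```

The two numbers differ because dividing the averages weights each runner by finish time, so slower runners count for more, while averaging the scores weights every runner equally.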

There’s certainly a difference between raw splits and split scores. Does that difference help us find any better answers for our questions? What do you think? I still don’t know, but I did find one use for split scores in my next post.

What Are Split Scores?

June 8, 2014

(This is the 10th post in a series that started here)

Up until now, I’ve been talking a lot about the rate of change for marathon splits and not as much about the raw splits themselves.

That’s because raw split numbers aren’t always a very good way to compare splits, especially between individual runners.

Think about it. If a runner runs a 1:30 first half of a marathon and a 1:40 second half, she’s run a 3:10 marathon with a 10-minute positive split. Simple, right?

Now, suppose another runner runs a 2:30 first half and a 2:40 second half. That’s a 5:10 marathon, also with a 10-minute positive split.

Ten minutes is ten minutes. So when we compare the two races, both runners ran the same positive split, right?

Not really. 10 minutes is proportionally larger when compared to a 3:10 marathon than it is when compared to a 5:10. Just because both of them had raw splits of 10 minutes, saying that the splits are the same isn’t right. The 5:10 runner clearly ran closer to even splits than the 3:10 runner.

So if we’re going to look at runners and compare their splits or add splits together to analyze them in groups, we need to come up with a better way of assigning a value to each set of splits, a “split score” if you will.

It doesn’t have to be complicated. Let’s define a “split score” as “how far a runner is from even splits, relative to their finish time”.

Here are some sample splits:

Click any image to enlarge


I’ve plotted them on this chart:

sample split score

Runner A ran a 2:30 by running a 1:00 first half (wow!) and a 1:30 second half, giving him a raw split of 30 minutes (represented by the red line).

Our initial “split score” for A is 30 minutes divided by 2 hours and 30 minutes, or .2.

Note that the result is just “.2”, not “.2 minutes”. Since we’re dividing a time by a time, the “time” part drops out, leaving us with what they call a “dimensionless number”.

Most of us don’t have an intuitive feel for what a split of “point 2” means. To convert the score to something we do understand, we can turn it into a percentage by multiplying by 100. So A ends up with a split score of 20%.

Both runner C and runner D took 5 hours to finish the race, twice as long as runner A. Runner C’s raw split is 1 hour, also twice runner A’s, so C’s split score is also 20%.

On the other hand, while runner D’s raw split of 30 minutes is equal to runner A’s, since D’s finish time is twice A’s, D’s split score is half of A’s, or 10%.

Of course, as runner B shows, split scores can be negative, too (a 1:00 second half!?!).
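
The arithmetic above fits in a few lines of Python. This is only a sketch: the function name is made up, the half times for runners C and D are derived from their stated finish times and raw splits, and the negative-split runner at the end is hypothetical (runner B's first half isn't given in the text, so B is left out).

```python
def split_score_pct(first_half_min, second_half_min):
    """Raw split divided by finish time, expressed as a percentage."""
    raw_split = second_half_min - first_half_min
    finish_time = first_half_min + second_half_min
    return 100.0 * raw_split / finish_time


# (first half, second half) in minutes.
runners = {
    "A": (60, 90),                       # 2:30 finish, 30-minute raw split
    "C": (120, 180),                     # 5:00 finish, 1-hour raw split
    "D": (135, 165),                     # 5:00 finish, 30-minute raw split
    "3:10 runner": (90, 100),            # 10-minute raw split
    "5:10 runner": (150, 160),           # same raw split, slower finish
    "negative (hypothetical)": (100, 90),
}

scores = {name: split_score_pct(*halves) for name, halves in runners.items()}
for name, score in scores.items():
    print(f"{name}: {score:+.1f}%")
```

The 3:10 and 5:10 runners from the top of the post both have 10-minute raw splits, but the 5:10 runner's split score comes out smaller, which is the whole point of dividing by finish time.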

We’ll start to find out what, if anything, split scores are good for in my next post.


