January 21, 2019

## CFB Trends – Part 5

Yards matter. Sure, there are games where the team with fewer yards wins, but the team with more yards is more likely to win the game. That makes the stat we just looked at (plays per game) important and the one up next (yards per play) even more important. The graph below makes the big picture clear: Teams that gain more yards than their opponents win.

Not only that, but the margin is getting larger. In 2011 there was a 100-yard difference on average between winners and losers. By 2016 that number had grown to 104.1 yards.

## Geek Speak: Total Yards and Points Scored and Pearson Correlation

In a post on Wednesday I put forward the case for why total yards are important and referenced a chart from last year.  I wanted to include 2015 data, so here it all is.

The graph below contains 7,160 data points (though it’s difficult to tell) covering every college football game between 2 FBS teams since 2011 (3,580 games x 2 data points for each game).  As you can see, the slope remains up and to the right.

Yards gained is on the y axis (left) and points scored is on the x (across bottom) axis.

The Pearson correlation for this data is .78. Anything over .70 is generally considered a strong positive linear relationship, which supports my hypothesis: total yards matter, especially in the context of how many points a team is likely to score.
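For anyone who wants to reproduce this kind of number, here is a minimal sketch of the Pearson calculation. The six rows are made up for illustration (they are not the real 7,160 data points), and the `pearson` and `strength` helpers are hypothetical names, not the tooling used for the chart.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def strength(r, strong=0.70):
    """Label a coefficient using the .70 rule of thumb quoted above."""
    return "strong" if abs(r) > strong else "not strong"

# Hypothetical (points, total yards) rows, one per team per game.
team_games = [(45, 560), (17, 310), (31, 455), (24, 390), (38, 420), (10, 290)]
points = [p for p, _ in team_games]
yards = [y for _, y in team_games]

r = pearson(points, yards)
print(round(r, 2), strength(r))
```

With real data you would load one row per team per game, so each game contributes two points to the calculation.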

## College Football: Total yards is one of the most important metrics in football

Gaining more yards than your opponent is one of the most reliable predictors of which team is going to win. You can argue over which is more important, gaining yards on offense or giving up fewer yards on defense, but the fact is that teams that outgain their opponent have won 78% of the time since 2011, regardless of any other metric.

While both the yards gained by winners (+5.3%) and losers (+3.6%) have increased since 2011, what has really grown is the difference between the two.  In 2011 the average difference was 100 yards.  In 2015 it had grown to 121 yards, a 21% increase in 5 years.
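The 21% figure is simple percent-change arithmetic. A quick sketch, using only the two gap values quoted above (the `pct_change` helper is a hypothetical name):

```python
def pct_change(old, new):
    """Percentage change going from old to new."""
    return (new - old) / old * 100

# Average winner-minus-loser yardage gap quoted above.
gap_2011 = 100
gap_2015 = 121

print(f"{pct_change(gap_2011, gap_2015):.0f}%")  # 21%
```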

It’s about this time when someone says points are what matter and yards are irrelevant. Guess what? Points scored correlate closely with yards gained, as this post shows.

More yards equals more points on offense and the inverse is true on defense. Sure, there are outliers and some teams perform better in the red zone. Statistically speaking though, the odds of winning increase the bigger the positive total yardage difference between you and your opponent.

## Yards Still Matter

If you spend any time on this site you know that one of my favorite college football metrics is also one of the most straightforward: total yards. With the exception of yards per play, and it’s close, you’d be hard pressed to find a metric that correlates more strongly with winning than the total yardage differential between two teams. Why? Because yards = points.

Every time I hear someone say yards don’t matter, and provide an example of a game where a team with fewer yards won, my head wants to explode. Yes, there are exceptions to the rule. Just like there are exceptions to “You have to win the turnover battle” or “You have to run the ball to win”. For every example you provide of a team winning despite having fewer total yards, I can provide more where a team won the turnover battle and lost.

If you made it through freshman stats then the graph below should tell you a story.  I’ve plotted every college football game from 2011 through 2015 (D1 vs. D1 only, 3,580 games and 7,160 data points) with yards and points.  Notice the slope of the line that goes through the data points.  It’s pointing up, as in more yards means more points.

This is significant for Clemson because one of the narratives of the offseason is that the Tigers’ defense lost a lot of talent and will have to “rebuild”. Fair enough.

In the period covered, teams that reached the magic 500-yard mark won 79% of the time, without regard to any other metric. Clemson has reeled off 10 straight (and counting) 500-yard games. The mere fact that your offense gains 500 yards means you are highly likely to win.

But it gets better.

The Clemson defense gave up 313.0 yards per game in 2015. For argument’s sake, let’s say the Tigers’ 2016 defense regresses to “average”, which in NCAA terms in 2015 was 400 yards per game. That’s an additional 87 yards per game given up (a 27.8% increase).

In our mythical game Clemson gets 500 (or more) yards and gives up 400 yards (regressed to average).  What are the chances the Tigers win?  Over the same time period (2011-2015) teams with this profile won 94% of the time (671-43).
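If you want to run this kind of profile query yourself, a rough sketch follows. The rows and field names (`yards_for`, `yards_against`, `won`) are hypothetical, and the cutoffs assume the profile means at least 500 yards gained and no more than 400 allowed. The last two lines simply check the 671-43 record quoted above.

```python
def profile_win_rate(games, min_gained=500, max_allowed=400):
    """Win rate for team-games with at least `min_gained` yards gained
    and at most `max_allowed` yards allowed. Returns None if no rows match."""
    matches = [g for g in games
               if g["yards_for"] >= min_gained and g["yards_against"] <= max_allowed]
    if not matches:
        return None
    wins = sum(1 for g in matches if g["won"])
    return wins / len(matches)

# Hypothetical team-game rows, not real results.
games = [
    {"yards_for": 512, "yards_against": 388, "won": True},
    {"yards_for": 540, "yards_against": 401, "won": True},   # allowed too many, excluded
    {"yards_for": 505, "yards_against": 400, "won": False},
    {"yards_for": 430, "yards_against": 350, "won": True},   # gained too few, excluded
    {"yards_for": 601, "yards_against": 295, "won": True},
]
print(f"{profile_win_rate(games):.0%}")  # 67% (2 of the 3 matching rows)

# The record quoted above: 671 wins, 43 losses.
wins, losses = 671, 43
print(f"{wins / (wins + losses):.0%}")  # 94%
```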

To recap, the Clemson offense is likely going to be so good that the Tigers can absorb a defensive regression to “average” and still have a high probability of winning.

Obviously, there are no guarantees and every game is an independent data point on a graph such as the one above. The Tigers may reach 400 yards in one game (reducing the odds of winning) and 600 (very high probability of winning) in the next.

The point is that a team likely to gain 500 yards in any given game while giving up an “average” amount in the same game is still likely to win. That is also why our early win probabilities have the Tigers favored in all 12 games.

As we saw in the championship game, 500 yards of offense a game is not a 100% guarantee of a win, but 500 yards of offense combined with holding your opponent to 400 or fewer is about as close as it gets.

## Geek Speak – Yards Per Play (YPP) vs. Total Yards

Much as I did with the total yards metric, I plotted yards per play (YPP) for the 2,116 FBS vs. FBS games from the last three years.

Not surprisingly, the curves and results are nearly the same. In fact, the YPP metric has a slight edge – 79.0% to 77.7%.

However, two things stand out to me:

1. With the exception of the last range, in which both are at 100%, the total yards metric has a higher percentage than the YPP metric in each range. How then does the YPP metric have a higher overall percentage? Many more games (328 to 169) fall in the last three ranges of the YPP metric, so those ranges carry much more weight.
2. While the percentages rise from range to range without exception in the total yards metric, the YPP metric actually dips (slightly) from the 3-3.49 range to the 3.50-3.99 range. The sample size is small, an average of only 26 games per year fit in this range, so it’s likely an anomaly that will work itself out over time.
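The weighting effect in item 1 is easy to reproduce with made-up numbers. In this sketch (the bucket counts and rates are invented for illustration, not the real range data), metric B trails metric A in every range yet comes out ahead overall, because far more of its games land in the high-percentage ranges.

```python
def overall_rate(buckets):
    """Overall win rate from (games, win_rate) pairs, weighted by game count."""
    total_games = sum(games for games, _ in buckets)
    total_wins = sum(games * rate for games, rate in buckets)
    return total_wins / total_games

# Made-up ranges, lowest first: (games in range, win rate in range).
metric_a = [(600, 0.60), (300, 0.85), (100, 0.98)]
metric_b = [(300, 0.58), (300, 0.83), (400, 0.96)]

# B is lower than A in every single range...
print(all(b_rate < a_rate for (_, a_rate), (_, b_rate) in zip(metric_a, metric_b)))

# ...but higher overall, because 400 of its games sit in the top range.
print(f"A: {overall_rate(metric_a):.1%}  B: {overall_rate(metric_b):.1%}")
```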

There’s not a lot of difference in these metrics in my mind and that was part of my point in the total yard post. YPP is a simple and easy calculation, but you could easily use a metric that doesn’t even require a calculation (total yards) and get similar results.

## Geek Speak: Total Yards Matter – 2014 Version

While I don’t believe total yardage is the “end-all, be-all of football” it’s pretty clear to me that total yards are an important stat in college football.

Besides the obvious – it generally takes yards to score points – I have some numbers that back up this theory.

There are many guys smarter than me who say total yards mean little and are an “overrated” or “simplistic” metric, and who spend many hours devising complicated formulas to prove it.

I’m not smart enough to understand all of the mathematics behind those theories, but my general operating theory is “the simpler the better”.

It’s difficult to find a simpler metric than total yards, and this seems to give those smarter than me fits.

Specifically, outgaining your opponent is important. The more the better. If you think about it, outgaining your opponent takes into account many factors that occur during the game. If you turn the ball over consistently, for example, you are likely to gain fewer yards, score fewer points and win less often. Using the difference between the teams’ total yardage also means defense is factored into the equation.

So while gaining yards is important, this analysis looks at the difference in yardage between winners and losers. Put another way, if Team A gains 600 yards and gives up 575 yards in game 1 and gains 125 yards and gives up 100 in game 2, Team A has the same odds of winning both games.

It’s not about the number of yards you gain, it’s about the difference between the number of yards you gain and the number of yards your opponent gains.

The charts and graphs below cover 2,116 games (6 games resulted in teams having exactly the same number of yards) between Division I teams from 2011 through 2013 and tell a simple story: Outgain your opponent and you will likely win. The more you outgain your opponent the higher your odds of winning.
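To make the bucketing idea concrete, here is a rough sketch of how win percentage by yardage margin could be tallied. The six games and the 50-yard bucket width are assumptions for illustration, not the ranges used in the charts, and the code assumes no game ends tied on points.

```python
from collections import defaultdict

def win_pct_by_margin(games, width=50):
    """Group games by the yardage margin of the team that outgained its
    opponent (buckets `width` yards wide) and return the share of games
    the outgaining team won in each bucket."""
    tally = defaultdict(lambda: [0, 0])  # bucket start -> [wins, games]
    for yards_a, pts_a, yards_b, pts_b in games:
        margin = abs(yards_a - yards_b)
        if margin == 0:
            continue  # equal yardage: neither team outgained the other
        # True when the team with more yards also scored more points.
        outgainer_won = (yards_a > yards_b) == (pts_a > pts_b)
        bucket = margin // width * width
        tally[bucket][1] += 1
        if outgainer_won:
            tally[bucket][0] += 1
    return {b: wins / total for b, (wins, total) in sorted(tally.items())}

# Hypothetical games: (team A yards, team A points, team B yards, team B points).
games = [
    (450, 31, 300, 17),
    (380, 24, 400, 27),
    (410, 20, 390, 23),
    (350, 28, 350, 21),  # same yardage, skipped
    (520, 42, 310, 14),
    (300, 21, 470, 17),
]
for bucket, pct in win_pct_by_margin(games).items():
    print(f"{bucket}-{bucket + 49} yards: {pct:.0%}")
```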

A little further proof that yards matter? Teams with more yards than their opponents cover 64.6% of the time. And, as with the winning %, the higher the yardage differential the more likely a team is to cover, without exception.

Using the Pearson Coefficient I found a solid 0.606149 correlation between total yard differential and winning.

How did Clemson fare using this metric in 2013? I’ve previously posted on why I wasn’t that worried as Clemson fell behind in the Orange Bowl vs. Ohio State. The Tigers were 9-1 (lost to South Carolina) when they outgained their opponent and 1-1 when being outgained (beat Georgia, lost to Florida State). Against the spread the Tigers were 6-5 when outgaining an opponent and 1-1 when being outgained.

No, total yards aren’t the end-all, be-all of football. But total yards, specifically when compared to your opponent’s total yards, matter, and this simple metric can also increase the odds of picking the team that’ll not only win, but cover the spread, too.

It’s important not to confuse correlation with causation, and I’m not saying having more total yards by itself causes teams to win. Other factors (turnovers, for example) can cause a team to have more (turnovers gained) or fewer (turnovers lost) total yards and win or lose the game.

I’m saying total yards is an important factor in determining winners and losers, more than many want to acknowledge.

## Total Yards Matter: Clemson Version

On Tuesday I explained why total yards matter in the big picture of college football: total yards correlate to points at a much higher rate than any other statistic in college football.

I realize, however, that many of those who read this site don’t care about this stat for college football as a whole, but rather what it means for Clemson in particular. The answer: The same thing.

While the number of plays run for Clemson has a strong (but smaller) correlation to points, I’m struck by the other numbers below. The passing stats have a weak to almost nonexistent correlation to points scored and the rushing stats all have negative correlations.

If you think about it, the negative correlation to rushing numbers makes sense – for Clemson.  While the rushing game is important, the Tigers are much more efficient at passing and plays spent rushing decrease the overall efficiency of the offense (yards per play) even if the rushing plays are successful.

Running plays are obviously an important part of the offense and the analysis above made no distinction between a passing play when behind by 10 or a rushing play when ahead by 10.  Perhaps that is something I can develop at some point – analyzing these numbers by game situation.

The data also confirms what we learned in Tuesday’s post (total yards matter) and refines the findings for Clemson – Total yards (no matter how they are gained) and plays run are two key stats for the Tiger offense.

## Geek Speak: Why Total Yards Matter

A few years ago I happened upon a now forgotten web site that attempted to prove that yards don’t matter in football. All that mattered was points. My first thought was, “Well, yards are how you get points most of the time”.

Turns out I was right (for once).

The site went on and on with examples of how and why yards don’t matter. As I recall it was the heyday of the Honey Badger and LSU seemingly won week after week by scoring on special teams and defense.

Great, that’s one team out of 120 (at the time). For the vast majority of the other 119 teams and in the vast majority of games yards do matter. As a matter of fact, they matter more than anything else.

For 2011 and 2012, in games between FBS teams (1,404 games in all), the winning team gained more yards than the losing team 76% of the time.

It’s not rocket science.

What I didn’t analyze at that time was the correlation between yards and points scored.

Points seem kind of important, so I decided to find out which of the statistics I track most likely leads to points.

Out of the 12 (other than points scored) numerical categories I track, there is no greater correlation to points scored than total yards and, with the exception of yards per play, it’s not even close.

Total yards may not matter in a single game, season or perhaps (stretching a bit) for a particular team for some period of time, but in the big scheme of things, if you want to know who is going to score more points, figure out who is going to gain more yards.

Using the Pearson Coefficient I came up with a correlation of 0.766324 between total yards and points. You can read more about the Pearson Coefficient here, but the basic concept is a coefficient of 1 means there is a perfect positive correlation between the two variables and -1 means there is a perfect negative correlation between the two variables.

What does this tell us? There’s not a better predictor of points scored than yards gained.

Anything above 0.5 can be considered a high correlation.

The interesting numbers to me are the ones that coaches, announcers and writers (and bloggers) typically call important to winning but that have a low correlation to points scored, like plays run (technically this falls in the medium correlation range), time of possession and penalty yards.

Remember that correlation should not be confused with causation. For example, the number of passes thrown has a low (and negative) correlation to scoring points, but the number of passes a team throws may not necessarily be the cause of the number of points scored.

Finally, when I do an analysis of this type invariably someone asks, “What about defensive statistics?” Defensive numbers are included in these statistics in this way: For every yard an offense gains a defense gave that yard up. For example, an offense that averages 500 yards a game and then plays Alabama is likely to gain fewer yards and therefore score fewer points.

What we know from this analysis is that over the last two years a point was scored (or given up by a defense) for every 14.2 yards gained (or given up by a defense) on average.
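One plausible way to arrive at a yards-per-point figure is to divide total yards by total points across every team-game. That is an assumption on my part about the method, and the four rows below are made up so the arithmetic is easy to follow.

```python
def yards_per_point(team_games):
    """Total yards gained divided by total points scored across all rows."""
    total_yards = sum(yards for yards, _ in team_games)
    total_points = sum(points for _, points in team_games)
    return total_yards / total_points

# Hypothetical (total yards, points) rows, one per team per game.
team_games = [(426, 30), (355, 25), (497, 35), (284, 20)]

print(round(yards_per_point(team_games), 1))  # 14.2
```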

This analysis included both the statistics of the winning and losing teams in 1,404 games, 2 teams in each game and 2 variables (points and total yards gained) for each team per game.