July 28, 2014

Geek Speak – Yards Per Play (YPP) vs. Total Yards


Much as I did with the total yards metric, I plotted yards per play (YPP) for the 2,116 FBS vs. FBS games from the last three years.

Not surprisingly, the curves and results are nearly the same. In fact, the YPP metric has a slight edge – 79.0% to 77.7%.

YPP Chart and Graph 2014

However, two things stand out to me:

  1. With the exception of the last range, in which both are at 100%, the total yards metric has a higher winning percentage than the YPP metric in each range. How, then, does the YPP metric have a higher overall percentage? Far more games fall in the last three ranges under the YPP metric (328 to 169), so those high-percentage ranges carry much more weight in the overall figure.
  2. While the winning percentage rises from range to range without exception in the total yards metric, under YPP it actually dips (slightly) from the 3-3.49 range to the 3.50-3.99 range. The sample size is small, averaging only 26 games per year in this range, so it’s likely an anomaly that will work itself out over time.
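The weighting effect in the first point is easy to see with a small sketch. The per-range counts below are invented purely for illustration (they are not the numbers from the charts), and the names `range_pcts` and `overall_pct` are hypothetical helpers. One metric wins every individual range, yet the other comes out ahead overall because more of its games sit in the high-percentage ranges.

```python
# Hypothetical (games, wins) counts per range, lowest range first.
# Invented for illustration only; NOT the figures behind the charts.
total_yards_bins = [(100, 60), (100, 80), (20, 19)]
ypp_bins = [(60, 33), (80, 60), (80, 72)]


def range_pcts(bins):
    """Winning percentage inside each individual range."""
    return [100.0 * wins / games for games, wins in bins]


def overall_pct(bins):
    """Overall winning percentage: total wins divided by total games."""
    total_games = sum(games for games, _ in bins)
    total_wins = sum(wins for _, wins in bins)
    return 100.0 * total_wins / total_games


# Total yards is higher in every range...
print(range_pcts(total_yards_bins))
print(range_pcts(ypp_bins))

# ...but YPP is higher overall, because 80 of its 220 games fall in the
# top range versus only 20 of 220 for total yards.
print(round(overall_pct(total_yards_bins), 1))
print(round(overall_pct(ypp_bins), 1))
```

The overall figure is just a weighted average of the range percentages, with the game counts as the weights, so shifting games into the top ranges lifts it even when every individual range is slightly lower.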

There’s not a lot of difference in these metrics in my mind and that was part of my point in the total yard post. YPP is a simple and easy calculation, but you could easily use a metric that doesn’t even require a calculation (total yards) and get similar results.

50,000 Foot View of College Football

Random Numbers

The charts below tell the big picture story of college football from 2011-2013 and cover 2,116 games between two FBS teams.

Some things that I found within the data:

  1. Almost all categories for winners increased (far right column) over the 3 seasons.
  2. Losing teams had reduced numbers in most categories in 2013 compared to 2012.
  3. Turnovers have remained remarkably consistent for both winners (1.3 per game across all 3 seasons) and losers (slight variation in 2013).
  4. Winning teams average more penalty yards than losers.
  5. While the losing teams’ yards-per-pass average has remained constant, winning teams have increased theirs by 2.5% over the 3 seasons.
  6. Both have increased their yards per rush, but winners have increased at a higher rate.
  7. For winners, average rush yards increased by 9.2% and yards per rush by 5.1% from 2011 to 2013.
  8. Scoring is up for both winners (5.7%) and losers (2.9%).
  9. Both winners and losers have increased plays and total yards, but winners have increased at a higher rate than losers.
  10. As a whole, these numbers tend to lend credence to the theory that offenses are moving faster and have the upper hand (known as the Saban/Bielema Complex).

These numbers lay the foundation for an upcoming analysis by Paul Chimenti, who holds an MS in Mathematical Sciences with Statistics Concentration. Paul is using a statistics package that will arrange offenses and defenses in “clusters” based on metrics from the 2011-2013 seasons.

Winning Teams

2011-2013 Winners

Losing Teams

Losers 2011-2013

Geek Speak: Total Yards Matter – 2014 Version

Random Numbers

While I don’t believe total yardage is the “end-all, be-all of football,” it’s pretty clear to me that total yards are an important stat in college football.

Besides the obvious – it generally takes yards to score points – I have some numbers that back up this theory.

There are many guys smarter than me who say total yards mean little or are an “overrated” or “simplistic” metric, and who spend many hours devising complicated formulas to prove it.

I’m not smart enough to understand all of the mathematics behind those theories, but my general operating theory is “the simpler the better”.

It’s difficult to find a simpler metric than total yards, and this seems to give those smarter than me fits.

Specifically, outgaining your opponent is important. The more the better. If you think about it, outgaining your opponent takes into account many factors that occur during the game. If you turn the ball over consistently, for example, you are likely to gain fewer yards, score fewer points and win less often. Using the difference between the teams’ total yardage also means defense is factored into the equation.

So while gaining yards is important, this analysis looks at the difference in yardage between winners and losers. Put another way, if Team A gains 600 yards and gives up 575 in game 1, then gains 125 yards and gives up 100 in game 2, its differential is +25 yards both times, so by this metric Team A has the same odds of winning both games.

It’s not about the number of yards you gain, it’s about the difference between the number of yards you gain and the number of yards your opponent gains.

The charts and graphs below cover 2,116 games (6 games resulted in teams having exactly the same number of yards) between Division I teams from 2011 through 2013 and tell a simple story: Outgain your opponent and you will likely win. The more you outgain your opponent the higher your odds of winning.
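For anyone who wants to build a similar table from their own data, here is a minimal sketch of the idea: compute each game's yardage differential, group games into ranges, and count how often the team with more yards won. The 50-yard range width, the function names and the sample games (other than the Team A example above) are assumptions for illustration, not the setup behind the charts.

```python
from collections import defaultdict

# Hypothetical range width in yards; the width used for the charts is not stated.
BIN_WIDTH = 50


def yard_differential(yards_gained, yards_allowed):
    """Total yards gained minus total yards allowed in one game."""
    return yards_gained - yards_allowed


def win_pct_by_differential(games, bin_width=BIN_WIDTH):
    """games: iterable of (winner_yards, loser_yards) pairs.

    Returns {range_start: (games_in_range, pct_won_by_outgaining_team)}.
    Games with identical yardage are skipped, since nobody outgained anybody.
    """
    counts = defaultdict(lambda: [0, 0])  # range_start -> [games, wins]
    for winner_yards, loser_yards in games:
        diff = yard_differential(winner_yards, loser_yards)
        if diff == 0:
            continue
        range_start = abs(diff) // bin_width * bin_width
        counts[range_start][0] += 1
        if diff > 0:  # the winner was also the team with more yards
            counts[range_start][1] += 1
    return {start: (n, 100.0 * w / n) for start, (n, w) in sorted(counts.items())}


# Made-up sample: the first two rows are the Team A example (+25 both times).
sample_games = [(600, 575), (125, 100), (300, 320), (450, 380), (410, 330), (250, 400)]
for start, (n, pct) in win_pct_by_differential(sample_games).items():
    print(f"{start}-{start + BIN_WIDTH - 1} yards: {n} games, {pct:.1f}% won by the outgaining team")
```

Working with the absolute differential means each game is counted once, from the point of view of whichever team gained more yards.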

Winning Pct by TYA Chart

Winning Pct by TYA Graph

A little further proof that yards matter? Teams with more yards than their opponents cover 64.6% of the time. And, as with the winning %, the higher the yardage differential the more likely a team is to cover, without exception.

 

Cover Pct by TYA Chart

Cover Pct by TYA Graph

Using the Pearson coefficient, I found a solid 0.606149 correlation between total yard differential and winning.
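For reference, the Pearson coefficient can be computed in a few lines. This is only a sketch: it assumes each team-game is one observation with the yard differential as one variable and the result coded as 1 for a win and 0 for a loss, which may not be exactly how the 0.606149 figure was produced. The sample values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical team-games: yard differential and result (1 = win, 0 = loss).
differentials = [25, -25, 150, -150, -40, 40, 210, -210]
won = [1, 0, 1, 0, 1, 0, 1, 0]

r = pearson(differentials, won)
print(f"r = {r:.3f}")
```

A value near 1 would mean bigger yardage edges go hand in hand with winning; a value near 0 would mean yardage differential tells you little about the result.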

How did Clemson fare using this metric in 2013? I’ve previously posted on why I wasn’t that worried as Clemson fell behind in the Orange Bowl vs. Ohio State. The Tigers were 9-1 when they outgained their opponent (the loss came to South Carolina) and 1-1 when being outgained (beat Georgia, lost to Florida State). Against the spread the Tigers were 6-5 when outgaining an opponent and 1-1 when being outgained.

No, total yards aren’t the end-all, be-all of football. But total yards, specifically when compared to your opponent’s total yards, matter, and this simple metric can also increase the odds of picking the team that’ll not only win, but cover the spread, too.

It’s important not to confuse correlation with causation, and I’m not saying having more total yards causes teams to win by itself. Other factors (turnovers, for example) can cause a team to have more (turnovers gained) or fewer (turnovers lost) total yards and win or lose the game.

I’m saying total yards is an important factor in determining winners and losers, more than many want to acknowledge.

 

Figure The Odds – Final Numbers for 2013 Season

WP 3.1.1

Below are the final numbers of the “Figure The Odds” series for the 2013 season. While the numbers may not look impressive on the surface, realize that within these numbers the model managed to go 20-14 straight up and 17-17 against the spread in one of the craziest, most upset-filled bowl seasons in recent memory.

WP6

During the offseason I’ll continue to add to the database (currently 1,879 games) and hit the ground running in week 1 of 2014.

(The numbers above include a record of 2-4 straight up and 4-2 ATS in the first week of December).

Figure The Odds – Version 1.4

WP5

With the completion of last night’s games the model’s record now stands at 19-9 straight up and 13-15 against the spread.  I’ve noticed a potential issue with the Cover Probability portion of the algorithm (hence the .464 record) that will probably need to be addressed.  I’m not exactly positive how to best accomplish that, but for now we’ll plow ahead.

As a reminder, these probabilities are based on the results of 1,869 college football games from 2011 to 2013.

WP5

An important distinction here – I’m not predicting what will happen in these games – I’m saying that, given the data that I have, teams similar to Alabama have won 87.5% of the time since 2011.

To me this is a logical way to look at things. I can’t predict the future, but I do know what’s happened in the past.

Think of these less as predictions and more as a look at history of similar games.

 

Figure The Odds – Version 1.3

Metrics

The model picked up a little steam over the weekend and the record now stands at 13-7 straight up and 11-9 against the spread.

As a reminder, these probabilities are based on the results of 1,869 college football games from 2011 to 2013.

WP4

An important distinction here – I’m not predicting what will happen in these games – I’m saying that, given the data that I have, teams similar to Oregon have won 88.0% of the time since 2011.

To me this is a logical way to look at things. I can’t predict the future, but I do know what’s happened in the past.

Think of these less as predictions and more as a look at history of similar games.

Figure The Odds – Version 1.2

Random Numbers

Through 12 games we are 6-6 straight up and 6-6 against the spread. As a reminder, these probabilities are based on the results of 1,869 college football games from 2011 to 2013.

WP 3.1.1

An important distinction here – I’m not predicting what will happen in these games – I’m saying that, given the data that I have, teams similar to Notre Dame have won 88.0% of the time since 2011.

To me this is a logical way to look at things. I can’t predict the future, but I do know what’s happened in the past.

Think of these less as predictions and more as a look at history of similar games.

Figure The Odds – Version 1.1

Random Numbers

After going 4-2 against the spread in my debut, here’s the first wave of bowl games. These probabilities are based on the results of 1,869 college football games from 2011 to 2013.

WP2

An important distinction here – I’m not predicting what will happen in these games – I’m saying that, given the data that I have, teams similar to East Carolina have won 88.0% of the time since 2011.

To me this is a logical way to look at things. I can’t predict the future, but I do know what’s happened in the past.

Think of these less as predictions and more as a look at history of similar games.

Figure The Odds

Random Numbers

The probabilities below are based on research of 1,794 college football games between 2011 and 2013 using information that I have found particularly predictive of winning and, to a lesser extent, covering the spread (or not).  In general, I believe many models attempt to take too many factors into consideration.  Distinguishing between the signal and the noise is important.

While the data for the larger spreads is much clearer, there is a larger sample size for the smaller spreads which makes me more confident, in general, in those numbers (though I am positive Florida State beats Duke).

WP

An important distinction here – I’m not predicting what will happen in these games. I have no idea if the Seminoles will turn the ball over 5 times or if Jameis Winston will get hurt in the first series. What I’m saying is that, given the data that I have, teams with similar attributes to Florida State, facing a similar opponent, have won 96.2% of the time since 2011.

To me this is a logical way to look at things.  It’s easy to say Baylor should beat Texas.  But what effect will the freezing rain forecast for Waco Saturday have on these teams?  No one knows.  But I do know what’s happened in the past.

So think of these less as predictions and more as a look at history of similar games.
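The “look at history of similar games” idea can be sketched roughly as follows. The factors the model actually uses are not spelled out here, so this example simply assumes “similar” means the favorite’s point spread was within a small tolerance of the game in question. The function name, the tolerance and the sample history are all hypothetical.

```python
def historical_win_rate(history, spread, tolerance=1.5):
    """Share of similar past games the favorite won.

    history:   list of (favorite_spread, favorite_won) pairs from past games.
    spread:    the favorite's spread in the game being looked at (e.g. -28.0).
    tolerance: how close a past spread must be to count as "similar"
               (an assumed definition of similarity, not the model's).

    Returns (sample_size, win_pct); win_pct is None when no similar games exist.
    """
    similar = [won for past_spread, won in history
               if abs(past_spread - spread) <= tolerance]
    if not similar:
        return 0, None
    return len(similar), 100.0 * sum(similar) / len(similar)


# Made-up history: (favorite's spread, did the favorite win?)
history = [
    (-28.0, True), (-27.5, True), (-29.0, True), (-28.5, False),
    (-3.0, True), (-3.5, False), (-2.5, True), (-3.0, False),
]

for spread in (-28.0, -3.0, -14.0):
    n, pct = historical_win_rate(history, spread)
    if pct is None:
        print(f"spread {spread}: no similar games on record")
    else:
        print(f"spread {spread}: favorites won {pct:.1f}% of {n} similar games")
```

Returning the sample size alongside the percentage matters: a win rate built on a handful of games deserves less confidence than one built on hundreds.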

Defensive Efficiency Ratings Week 12

VB

Georgia Tech comes in a surprising 12th in defensive efficiency, while the Tigers are at 31. We’ll find out in a few hours whether the Jackets’ ranking is more competition or talent related.

As mentioned in my weekly piece at orangeandwhite.com, third downs on both sides of the ball are likely to be a key.