May 19, 2019

Geek Speak: Kmeans Clustering with College Football Defenses

Editor’s Note: Paul Chimenti is a marketing analyst and provides statistical analysis for Seldom Used Reserve. The tables below originated with analysis done by Paul. For more detailed information on methodology, data, assumptions, etc., please contact

The data below covers games from 2011-2013 between “Big 5” teams only (or teams that will be in a Big 5 conference this season, such as Louisville) and is an attempt to “cluster” defenses together using Kmeans clustering. If you’re not familiar with Kmeans clustering, you may want to read this page before attempting to digest the data.

First, let’s clarify what the data includes (and doesn’t include):

• Includes all games between two teams from the “Big 5” conferences (ACC, SEC, Big 10, PAC 12 and Big 12), plus Notre Dame and teams (like Louisville) that will be in a Big 5 conference this season.
• For teams like Louisville (and Notre Dame) only games against Big 5 teams are included.
• Does not include games against FCS teams or games with teams outside of the Big 5 – e.g. Clemson vs. Citadel and Clemson vs. Troy are not included.
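For the code-minded, here is a minimal sketch of how the inclusion rules above could be applied to a list of games. It is not Paul's actual process: the `TEAM_CONFERENCE` lookup, the `ALWAYS_INCLUDED` set, the helper names and the sample pairings are all assumptions made up for illustration.

```python
# Conferences named in the inclusion rules above.
BIG5_CONFERENCES = {"ACC", "SEC", "Big 10", "PAC 12", "Big 12"}

# Hypothetical lookup of each team's conference for the coming season.
# Only a handful of teams are listed; a real table would cover everyone.
TEAM_CONFERENCE = {
    "Clemson": "ACC",
    "Louisville": "ACC",       # treated as Big 5 because it joins this season
    "Michigan": "Big 10",
    "Notre Dame": "Independent",
    "Troy": "Sun Belt",
    "Citadel": "SoCon",
}

# Assumption: Notre Dame is handled as a named exception.
ALWAYS_INCLUDED = {"Notre Dame"}


def is_eligible(team):
    """True if the team counts as a Big 5 team under the rules above."""
    return team in ALWAYS_INCLUDED or TEAM_CONFERENCE.get(team) in BIG5_CONFERENCES


def keep_game(team_a, team_b):
    """A game is kept only when both opponents are eligible."""
    return is_eligible(team_a) and is_eligible(team_b)


# Illustrative pairings only, not actual schedule data.
games = [
    ("Clemson", "Citadel"),
    ("Clemson", "Troy"),
    ("Clemson", "Louisville"),
    ("Michigan", "Notre Dame"),
]
kept = [game for game in games if keep_game(*game)]
```

With these made-up pairings, only the Clemson vs. Louisville and Michigan vs. Notre Dame entries survive the filter.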

The data does not purport to tell you which defensive style (“cluster”) is better than another – Cluster 1 is not necessarily better than cluster 2, just different – but rather gives you an idea of defenses with similar attributes.
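To give a feel for how defenses end up grouped by similarity rather than ranked, here is a bare-bones Kmeans sketch in plain Python. Everything in it is an assumption for illustration: the team names and numbers are invented, the two features (yards and points allowed per game) are a guess at the kind of inputs used, and it uses 2 clusters where the analysis here uses 4. It is not the code behind the tables.

```python
import math

# Invented per-game averages: (yards allowed, points allowed). Not real data.
defenses = {
    "Team A": (310.0, 17.5),
    "Team B": (325.0, 19.0),
    "Team C": (440.0, 31.0),
    "Team D": (455.0, 33.5),
    "Team E": (318.0, 18.2),
    "Team F": (447.0, 32.0),
}


def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm, seeded with the first k points for repeatability."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


names = list(defenses)
labels, centroids = kmeans(standardize([defenses[n] for n in names]), k=2)
clusters = dict(zip(names, labels))
```

The cluster numbers that come out are just labels. Nothing in the algorithm says cluster 0 is better than cluster 1, only that the teams sharing a label have similar numbers. Standardizing first keeps yards (hundreds) from drowning out points (tens) in the distance calculation.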

I’ll have to admit that I was surprised to see Michigan in Cluster 1, but 348.1 yards per game is not a bad average these days, even if the Wolverines do play in the Big 10.

D Cluster 1 2013

Cluster 2 is where the Tigers reside, and that’s probably about right over the last 3 seasons. I would classify these as “decent”, but not top-tier, defenses. An interesting side note here (at least for me) is that Arizona gave up 95.4 more yards per game than Clemson, but only 1.8 more points. Perhaps turnovers were the key, as the Sun Devils averaged half a turnover more per game than Clemson.

D Cluster 2 2013

Cluster 3 introduces us to some of the more problematic defenses and includes one that many see as “good”: Ohio State. Again, we have to remember this is a 3-season window, not a look back at 2013 alone, and the clustering is not a referendum on “good” or “bad”, but rather a grouping of like defenses.

D Cluster 3 2013

Each team in Cluster 4 gave up at least 411.6 yards per game and a minimum of 29.9 points per game.  Ouch.

It’s also notable that 4 ACC teams reside in this cluster.  And remember how there were 5 Big 12 teams (of 10 conference teams) in Cluster 1 on the offensive side?  Well, there are 5 Big 12 teams in Cluster 4 on defense.

D Cluster 4 2013

