January 21, 2019

Geek Speak: K-means Clustering with College Football Offenses

Editor’s Note: Paul Chimenti is a marketing analyst who provides statistical analysis for Seldom Used Reserve. The tables below originated with Paul’s analysis. For more detailed information on methodology, data, assumptions, etc., please contact chimenti80@gmail.com.

The data below covers games from 2011-2013 between “Big 5” teams only (or teams that will be in the Big 5 this season, such as Louisville). It is an attempt to “cluster” similar offenses together using K-means clustering. If you’re not familiar with K-means clustering, you may want to read this page before attempting to digest the data.
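For readers who want a feel for the mechanics, here is a minimal sketch of K-means in plain Python. It is not Paul’s actual code: the team labels, the two features (plays per minute and yards per play) and every number are made up for illustration, and seeding the centroids from the first k rows is a simplification chosen only to keep the result repeatable.

```python
import math
from statistics import mean, pstdev

# Hypothetical offenses: (plays per minute of possession, yards per play).
# Labels and numbers are invented for illustration, not taken from the tables.
offenses = {
    "Team A": (2.8, 6.5),
    "Team C": (2.1, 6.8),
    "Team E": (2.2, 4.9),
    "Team B": (2.9, 6.2),
    "Team D": (2.0, 6.6),
    "Team F": (2.1, 5.0),
}


def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [pstdev(col) for col in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(row, centers, spreads))
        for row in rows
    ]


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: assign to the nearest centroid, then re-average."""
    # Assumption: seed the centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its closest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


labels = kmeans(standardize(list(offenses.values())), k=3)
clusters = dict(zip(offenses, labels))
for team, cluster in clusters.items():
    print(team, "-> cluster", cluster + 1)
```

Standardizing first keeps a stat measured in yards from drowning out one measured in plays per minute. The cluster numbers themselves are arbitrary labels, which is why one cluster is not “better” than another.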

First, let’s clarify what the data includes (and doesn’t include):

• Includes all games between two “Big 5” teams (ACC, SEC, Big 10, PAC 12 and Big 12), Notre Dame and teams (like Louisville) that will be in a Big 5 conference this season.
• For teams like Louisville (and Notre Dame) only games against Big 5 teams are included.
• Does not include games against FCS teams or teams outside the Big 5. Clemson vs. Citadel and Clemson vs. Troy, for example, are not included.
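The inclusion rules above boil down to a simple filter. The sketch below is one reading of them, not the filter Paul actually used: the membership set is a small partial sample, the matchups are illustrative rather than a real schedule, and treating Notre Dame and Louisville exactly like Big 5 members is an assumption.

```python
# Partial, illustrative membership list; a real run would need the full
# rosters of all five conferences.
BIG5_MEMBERS = {"Clemson", "Florida State", "Alabama", "Oregon"}
# Teams the post treats like Big 5 members: Notre Dame and incoming teams.
TREATED_AS_BIG5 = {"Notre Dame", "Louisville"}


def is_included(team, opponent):
    """Assumed rule: keep a game only if both sides are in the eligible pool."""
    eligible = BIG5_MEMBERS | TREATED_AS_BIG5
    return team in eligible and opponent in eligible


# Illustrative matchups, not a real schedule.
games = [
    ("Clemson", "Florida State"),
    ("Clemson", "Citadel"),
    ("Clemson", "Troy"),
    ("Louisville", "Florida State"),
]

kept = [game for game in games if is_included(*game)]
print(kept)
```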

The data does not purport to tell you which offensive style (“cluster”) is better than another – Cluster 1 is not necessarily better than cluster 2, just different – but rather gives you an idea of offenses with similar attributes.

Most of these are givens – you won’t get much argument about Clemson, Oregon, Oklahoma State and Texas Tech being clustered together.

But what about Indiana? The data shows the Hoosiers performed worse than the other Cluster 1 teams in every category except turnovers.

However, the Hoosiers played fast, averaging 2.85 plays per minute of possession, which by the way was second only to Oregon’s 2.86 in cluster 1.
[Table: O Cluster 1 2013]
In other words, Indiana played fast but not efficiently, and in that sense it makes sense that they’re included in Cluster 1.
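The pace figure quoted for Indiana and Oregon is presumably just total plays divided by minutes of possession. A quick sketch, with made-up totals picked only to show the arithmetic (the function name is mine, not from the analysis):

```python
def plays_per_minute(plays, possession_seconds):
    """Offensive plays run per minute of time of possession."""
    return plays / (possession_seconds / 60)


# Made-up totals: 855 plays across 300 minutes (18,000 seconds) of possession.
pace = plays_per_minute(855, 18_000)
print(round(pace, 2))  # 2.85
```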

Cluster 2 also makes sense to me, as it contains some very good but not fast-paced offenses like Florida State, Alabama, Georgia and Louisville.
[Table: O Cluster 2 2013]

Conversely, many would question Auburn in the 3rd cluster, as I did. But remember, this is a three-year window, so the horrid Tiger offense of 2011 most likely offset the torrid Auburn offense of 2013.
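To see how a multi-year window can mute one great season, consider the toy example below. The yards-per-play numbers are invented, and the simple unweighted average across seasons is an assumption; the actual analysis may pool the games differently.

```python
from statistics import mean

# Made-up yards-per-play figures for one team across the three seasons.
yards_per_play_by_season = {2011: 4.8, 2012: 5.0, 2013: 6.9}

three_year_average = mean(yards_per_play_by_season.values())
best_season = max(yards_per_play_by_season.values())

# The pooled figure lands well below the single best season.
print(round(three_year_average, 2), "vs", best_season)
```

A team clustered on the pooled figure looks far more ordinary than its best season alone would suggest.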
[Table: O Cluster 3 2013]

Cluster 1 confirms a couple of long-held assumptions of mine.

First, the Big 12 has been the “fastest” conference in college football: 5 of the 11 teams in the cluster come from that league, which has only 10 teams. Three more come from the PAC 12. That means 8 of the 11 teams in Cluster 1 come from conferences known for fast-paced offenses.

Secondly, Clemson is the lone ACC team in cluster 1 and that gives the Tigers an advantage over the rest of the ACC in general, Florida State’s dominance notwithstanding.

That doesn’t mean Clemson will win every game, because offense is only half the battle. But unless the opponent has a stout defense (e.g. Florida State), the Tigers have a decided advantage in most ACC games.
