Graphing changes in file size across git commits
Both as an excuse to learn gnuplot and as a way to track the growth of compiled JavaScript and CSS asset files, I was looking for a way to grab the size of a given file across a series of git commits and end up with output like this:
# size commit date
439323 d4d09e047d50388180a1e317efc61af5d8961275 20130201
439323 fd30e151e35efba1bda65488e621c7338895542e 20130130
439241 6ce650d7e97add955b7cd07150732890c0edaf49 20130129
439241 3c1d2aec69f874926965843800163be71ec5f376 20130128
If the name of the file stays the same, it turns out this is pretty simple. The following git command will show the size of the file for the commit in question:
git ls-tree -r -l <COMMIT> <PATH>
So we can do something like
git ls-tree -r -l HEAD~$COUNTER compiledjs.min.js
in a bash script and increment $COUNTER as far back as we want, grabbing the file size with some ugly use of tr and cut, e.g.:
git ls-tree -r -l HEAD~39 compiledjs.min.js | tr -s ' ' | tr '\t' ' ' | cut -d ' ' -f 4
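The loop described above might be sketched roughly like this. It is only a sketch under a few assumptions: file_sizes is a made-up function name, the commit date is used for the date column, and awk stands in for the tr and cut pipeline to pull out the size field.

```shell
#!/usr/bin/env bash
# Print "size commit date" for one path over the last COUNT commits on HEAD.
# file_sizes is a hypothetical helper name, not part of git.
file_sizes() {
    local path=$1 count=${2:-10} i sha date size
    echo "# size commit date"
    for ((i = 0; i < count; i++)); do
        # Resolve HEAD~i to a full SHA1; stop once history runs out.
        sha=$(git rev-parse --verify --quiet "HEAD~$i^{commit}") || break
        # Commit date as YYYY-MM-DD, with the dashes stripped.
        date=$(git show -s --format=%cd --date=short "$sha" | tr -d '-')
        # ls-tree -l prints: mode type object size<TAB>path
        size=$(git ls-tree -r -l "$sha" -- "$path" | awk '{print $4}')
        # Skip commits where the file does not exist.
        if [ -n "$size" ]; then
            echo "$size $sha $date"
        fi
    done
}
```

Running something like file_sizes compiledjs.min.js 40 > sizes.dat from inside the repository would then produce a file in the format shown at the top of this post.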
But if the name of the file changes across commits, as it will if you are tagging it with a date or SHA1 for cache-busting, this approach won’t work. The approach I came up with, which is hacky, involves creating and deleting temporary branches based on HEAD~1, HEAD~2, etc., and getting the requisite date, file size, and commit info by pattern-matching on the name of the file in question.
The shell script for the temporary-branch approach, along with some basic gnuplot commands to plot the output, is here: https://gist.github.com/4700556
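For comparison, the pattern-matching step can probably also be done without temporary branches, since git ls-tree -r -l lists every file in a commit along with its size. The following is a rough alternative sketch, not what the gist does: matched_sizes is a made-up name, and the regular expression passed to it is an assumption about how the cache-busted files are named.

```shell
#!/usr/bin/env bash
# Print "size commit date" for any path matching PATTERN (a bash extended
# regex) in each of the last COUNT commits on HEAD.
# matched_sizes is a hypothetical helper name, not part of git.
matched_sizes() {
    local pattern=$1 count=${2:-10} i sha date meta path size
    echo "# size commit date"
    for ((i = 0; i < count; i++)); do
        # Resolve HEAD~i to a full SHA1; stop once history runs out.
        sha=$(git rev-parse --verify --quiet "HEAD~$i^{commit}") || break
        date=$(git show -s --format=%cd --date=short "$sha" | tr -d '-')
        # Each ls-tree line is "mode type object size<TAB>path".
        git ls-tree -r -l "$sha" | while IFS=$'\t' read -r meta path; do
            [[ $path =~ $pattern ]] || continue
            # The size is the fourth whitespace-separated field of meta.
            read -r _ _ _ size <<< "$meta"
            echo "$size $sha $date"
        done
    done
}
```

A call such as matched_sizes '^compiledjs-[0-9]+[.]min[.]js$' 40 would match a date-stamped file name; the pattern would need adjusting for SHA1-tagged names.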