Store git activity in MySQL with PHP

Git hooks are saving me so much time and providing me with interesting solutions to problems I didn’t even know I had. I can’t be the only person who this would be useful to, so give it a go.

As I said, I work on loads of sites, and keeping track of what’s been done and where is sometimes a bit of a pain. I keep a todo list, but if I get an emergency email from someone, chances are that won’t go through my todos. It will, however, be put into version control.

So this morning I had the bright idea to write a git hook that pushes relevant information to MySQL so that I can run activity reports later. All my bare git repositories are stored in one directory on our dedicated server, so it’s just a matter of making sure each repository has the post-receive hook installed. I do this by keeping the actual hook in the same directory as all my repositories, then symlinking it into each repository’s hooks directory with a little script. Obviously, this assumes that your post-receive hook lives alongside your repositories, and that you want this hook everywhere. But that’s all true for me, so we’re all good. Once you’ve run the script, you’ll only have one hook to maintain, and every time you create a new repository you can just run the script again and everything will be up to date.
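For anyone who wants a starting point, here is a minimal sketch of that kind of symlinking script. It is an illustration rather than the exact script I use: the function name link_hooks is made up, and it assumes the hook file is called post-receive and sits directly in the directory that holds the bare repositories.

```shell
#!/bin/sh
# link_hooks REPO_ROOT (hypothetical helper name)
# Symlink REPO_ROOT/post-receive into every bare repository directly below REPO_ROOT.
link_hooks() {
    # resolve to an absolute path so the symlinks stay valid from anywhere
    root=$(cd "$1" && pwd) || return 1
    for repo in "$root"/*/; do
        # a bare repository keeps its hooks directory at the top level;
        # anything without one is skipped
        [ -d "${repo}hooks" ] || continue
        ln -sf "$root/post-receive" "${repo}hooks/post-receive"
    done
}
```

Call it as link_hooks /path/to/repositories. Because it uses ln -sf, running it again after creating a new repository simply refreshes the existing links and adds the missing one.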

Now for the hook. It’s not beautiful PHP, but little scripts like this rarely are, in my experience.

Create this table:

CREATE TABLE `log` (
`id` int(10) unsigned NOT NULL auto_increment,
`repo` varchar(255) NOT NULL,
`commit` varchar(40) NOT NULL,
`date` datetime NOT NULL,
`message` text NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `commit` (`commit`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Here’s your script. chmod +x it.

#!/usr/bin/env php
<?php
date_default_timezone_set('Europe/London');
exec('pwd',$pwd);
$repo = rtrim(array_shift($pwd),'/');
$repo = substr($repo,strrpos($repo,'/') + 1);
$db = new PDO('mysql:dbname=DB;host=127.0.0.1','USERNAME','PASSWORD');
exec('git log --all --pretty=format:"%H%n%ct%n%s%n%b%n<><><>"',$capture,$log);
if ($capture){
    // preprocess the log
    $commits = array();
    $current = array();
    foreach ($capture as $row){
        if (trim($row) === '<><><>') {
            $commits[] = $current;
            $current = array();
        } else {
            $current[] = $row;
        }
    }
    $v = array();
    $b = array();
    foreach ($commits as $commit){
        $sha = $commit[0];
        // subject line, plus the body (if there is one) after a blank line
        $body = trim(implode("\n", array_slice($commit, 3)));
        $m = $commit[2] . ($body === '' ? '' : "\n\n" . $body);
        $d = date('Y-m-d H:i:s',$commit[1]);
        $v[] = '(?,?,?,?)';
        $b[] = $repo;
        $b[] = $sha;
        $b[] = $m;
        $b[] = $d;
    }
    $stmt = $db->prepare('insert ignore into log (repo,commit,message,`date`) values ' . implode(',',$v));
    try {
        if ($stmt) {
            if (!$stmt->execute($b)) throw new PDOException(implode(' ', $stmt->errorInfo()));
        } else throw new PDOException(implode(' ', $db->errorInfo()));
    } catch (PDOException $e) {
        mail('EMAIL','Commit did not reach db',$e->getMessage());
    }
}
?>

So basically we’re extracting the log data we need, doing some funky stuff to handle multi-line commit messages (I like to store lots of details as my subject messages tend to be a bit vague!). Other than that, if you’re familiar with PHP, the above should be pretty self-explanatory. If it’s not, hit the comments and I’ll explain things.
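If you want to eyeball what the hook sees before trusting it with your database, the same record splitting can be sketched in shell. This is only an illustration and not part of the hook: summarise_log is a made-up name, and it assumes the exact format string used in the hook, with the <><><> marker on a line of its own.

```shell
#!/bin/sh
# summarise_log (hypothetical helper): read the hook's git log format on stdin
# and print "sha timestamp subject" for each commit, ignoring the body lines.
summarise_log() {
    awk '
        # the marker line closes the current record
        $0 == "<><><>" { if (n >= 3) print rec[1], rec[2], rec[3]; n = 0; next }
        # any other line belongs to the record being collected
        { rec[++n] = $0 }
    '
}
```

Run it from inside a repository with git log --all --pretty=format:"%H%n%ct%n%s%n%b%n<><><>" | summarise_log and you should get one line per commit, which makes it easy to spot a commit whose fields have ended up in the wrong order.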

I’ve only been using this a little while, but it seems to work very well. If you use it and stumble across any bugs, I’d love to know about them!

Update: I’ve today realised that git log only logs the currently selected branch (or master on a bare repo), so I’ve added the --all switch to git log to get the logs for every branch. Most of it’s just “Merged blah”, but that means it can be filtered easily, and I’d rather have everything and need to filter than be missing something important.

Managing multiple working copies with git

We have a CMS at Buffalo, which we have deployed to several servers for a few of our clients. Until recently, it was only two separate servers and was relatively easy to manage with Subversion (though slightly cumbersome: svn ci -m "blah"; ssh server; cd /site; svn up; exit; ssh server2; cd /site; svn up;) but now that we’re deploying the same CMS to 8 servers, simple changes and bug fixes are becoming a pain in the ass to deploy.

I’ve been tentatively looking into git for quite some time now. I’m incredibly cautious of the bleeding edge when it comes to how I make money, because I prefer to have things that I can rely on and that will serve me well rather than keep up with the latest fads. That’s not to say that I’m not interested in all the cool new jazz, and I keep up as much as family life permits, but I don’t dive in and adopt without learning and understanding everything I need first. Sensible? Yes. I’m surprised at how many people flout this ethos.

That being said, git has proved itself invaluable to me in the last 4 months. The way it encourages you to work is great. I do all changes on branches then merge them into master and remove branches to keep everything clean. I also have $deploy-staging and $deploy-live (where $deploy is a working copy) so that I can manage configuration for each working copy. This probably isn’t git best practice, but I’ve found it to be incredibly convenient. I work on up to 20 different sites in any week, so being able to merge changes and conflicts for live stuff locally saves me headaches galore. No-brainer. My git workflow goes something like this:

git checkout -b hotfix-phperror
# do some work
git commit -am "Hotfix for PHP Error fixed"
git checkout master
git merge hotfix-phperror
# resolve any conflicts
git checkout staging
git merge master
# test on staging - everything OK
git checkout live
git merge master

That might seem like quite a bit of work, but it keeps everything tidy. Unfortunately, it does mean that if things fail on staging, I still have the history of it in master. I suppose it would make more sense to merge the hotfix into staging, then merge that into live, then merge everything into master and tag once it’s verified working to keep it clean, but either way, this tip will work.

I’m under the impression that git frowns upon what I’m about to recommend, but it works so well that I can hardly ignore it! One proviso for this is that all the working copies you push to have to be the same. Lucky for us, the configuration for our CMS is held by the live site (separate repository) rather than the CMS itself, so each working copy is the same.

With the above in mind, I’m going to assume the following:

• You’ve got a central, bare repository to push to
• Your live branch (identical for all working copies) is tracked from said central repository

First off, you have to make sure that your local working copy has all remotes available. Obviously your bare repository will be available because that’s how you’re doing things anyway, but we also need working copies. You can check that everything’s there by running a quick git remote. If you’re all good, then we can start pushing code around.

As I said before, git doesn’t seem to like this sort of thing. If you try to push to one of your working copies where the checked-out branch is the one you’re pushing to, git will whine at you: the branch itself moves on, but the checked-out files and the index stay as they were, so the working copy looks as if it has local changes. You can sort this out by changing to the directory and running git reset --hard or git stash, but that defeats the purpose of this, so we need a workaround.

Luckily, git gives you all the hooks you need and more for this. We’ll be creating a post-update hook that forces the working copy to match what you’ve just pushed. With that in mind, cd to the root of the working copy on one of your remotes and create something like:

#!/bin/sh
cd ..
env -i git reset --hard

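Save it as .git/hooks/post-update and chmod it executable. The cd .. and env -i are there because git runs the hook from inside .git with GIT_DIR set, which would otherwise confuse git reset.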

Now, when you push to the repository git will moan at you, but this hook will run and force all your changes. Awesome.
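If you have several working copies to set up, writing the hook by hand gets old quickly. Here is a rough sketch of a helper that drops the hook above into a given working copy. The name install_deploy_hook is made up, and it assumes a normal, non-bare working copy with its .git directory at the top level.

```shell
#!/bin/sh
# install_deploy_hook WORKING_COPY (hypothetical helper name)
# Write the post-update hook shown above into WORKING_COPY and make it executable.
install_deploy_hook() {
    hook="$1/.git/hooks/post-update"
    # refuse to do anything if this does not look like a working copy
    [ -d "$1/.git/hooks" ] || { echo "no .git/hooks in $1" >&2; return 1; }
    cat > "$hook" <<'EOF'
#!/bin/sh
cd ..
env -i git reset --hard
EOF
    chmod +x "$hook"
}
```

Run it on each server as install_deploy_hook /path/to/working/copy. It overwrites any existing post-update hook, so check first if you already have one you care about.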

In the warning message git spits out, it threatens that new versions of git will auto-reject pushes to checked out branches, so a quick run of:

git config receive.denyCurrentBranch ignore

Will shut git up and have it doing what you tell it to. I suppose the reason it does this is it assumes that someone is working in that working copy and they don’t want you frivolously overwriting their code. Luckily, we know what we’re doing is safe so there’s no harm in doing this. Do remember, though, that all changes made in the working copy will be overwritten when you push using this method, so don’t work on live working copies. Don’t do that anyway, but definitely don’t do it here!

Now, you can do the following:

git remote add remote-alias ssh://root@blah/path/to/working/copy
git fetch
git push remote-alias live

In all the feedback, you’ll see something like “HEAD is now at 0d5431b HELLO!!!” (the hash and first line of your last commit message) which lets you know that things have worked.

Now that you’re able to push to remotes without complaints, from the comfort of your local working copy, you can deploy your local live branch to all its remote locations with a little bit of scripting:

for remote in `git remote`; do git push $remote branchname; done;

For future reference, we’ll assume that this isn’t the only time you’ll do this, so save the following as a script in your path somewhere (I call it git-rpush):

#!/bin/sh
for remote in `git remote`; do git push "$remote" "$@"; done

Which you can call with git rpush live (remember to chmod it executable). Assuming that all your working copies have the config set and the hook installed, you can now push a change to all of your remote working copies at once without having to remember which ones you actually need to push to, then spending half an hour SSH-ing all over the place to do it.
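One weakness of the one-liner is that a failed push to one server scrolls past and is easy to miss. Here is a slightly more careful sketch that carries on with the remaining remotes and then tells you which ones failed. The rpush function and the GIT variable are my own additions for illustration, not anything git provides; GIT just lets you swap in a different command for a dry run.

```shell
#!/bin/sh
# GIT is a hypothetical override so the git command can be replaced for dry runs
GIT=${GIT:-git}

# rpush REFS...: push the given refs to every configured remote,
# carrying on after a failure and reporting the failures at the end.
rpush() {
    failed=""
    for remote in $($GIT remote); do
        if ! $GIT push "$remote" "$@"; then
            failed="$failed $remote"
        fi
    done
    if [ -n "$failed" ]; then
        echo "push failed for:$failed" >&2
        return 1
    fi
}
```

To use it as the git-rpush script, add a final line reading rpush "$@" so the arguments you give to git rpush are passed straight through.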

I don’t know about you, but that’s just saved me a crap-load of time. Hope it helps someone!

Incidentally, if you do need this to be locally configurable, you can easily check out your branch and create a patch for configuration, then add the patch removal and application into hooks. I would do something like the following (assuming you already have your patch):

#!/bin/sh
cd ..
env -i git apply -R config.patch

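Save this as .git/hooks/pre-receive and chmod it executable. (Git has no pre-update hook; pre-receive is the one that runs before any refs are updated.)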

This will remove your config so that no conflicts occur during the update. All you need is to re-apply the patch in the post-update, after you’ve done the reset. Insert the following:

env -i git apply config.patch

Your config will be preserved. You can even keep these patches in version control, just change the name for all your remotes and update your hooks accordingly. Then enter my competition on how easy you can make your life by scripting it!

Any and all questions and improvements are welcome and appreciated.

I would like to thank the following post for hand-holding through the intricacies of what I was trying to achieve with hooks. So thanks!

Being compulsive isn’t always annoying

My two least favourite things about myself (well, about anyone really) are laziness and compulsiveness. Normally, my laziness takes over my compulsiveness, case-in-point: I have to do the washing up in the evening, I’m very lazy, the washing up doesn’t get done. Gross, I know.

I have, however, found a way to combat my laziness with its strongest rival (you guessed it): my compulsiveness! Lists. Until recently, I listed everything in my diary, in a notebook, on my laptop, basically anywhere I could find some paper. That is, until I started using GTD apps.

I started out with Cultured Code’s Things. A deceptively simple app that organises your lists into projects, areas and tags with a useful Quick Entry HUD for creating tasks as soon as you think of them (so they don’t just go in and out). I love the simplicity of Things, and the fact that it has a counterpart iPhone app (buggy though it may be) is very useful. I used it briefly in the beta, but I just couldn’t integrate it with my workflow. It wasn’t until I started listing everything I do that I saw how useful having this on a computer could potentially be.

As much as I love Things, it does have some holes. I won’t go into all of them, because a lot are bugs in the software or things that are on the roadmap, so it wouldn’t be fair. All I’ll say is that I’m no longer using Things, and have replaced it with Potion Factory’s The Hit List.

For me, The Hit List’s single feature that sets it apart from Things is the presence of timer functionality for each item in your lists. You simply select an item, hit B and the timer starts. As a freelancer (or anyone who works on billable projects, really) this is incredibly useful. I create lists that directly map onto projects that I’m working on, so being able to log time against each item shows me not only how long each aspect takes me (for future quoting), but how much to charge.

The Hit List is still in beta, but I’ve found it to be very usable. I’ve only had to restart it once, when the HUD wouldn’t invoke, and it’s not done it since so I’m happy! There is also the promise of an iPhone app (which I have found to be almost a must-have), which I hope materialises soon.

Now that I have my GTD app in-hand, I’m able to make lists and set deadlines. My compulsiveness doesn’t like deadlines, oh no. It forces me off the sofa and to the sink to do the washing up, or to take out the bins, or to send the email I’ve been putting off and offers laziness as a reward. How generous!

So, there you have it. If you’re compulsive like me, but also incredibly lazy at times, try pitting your compulsiveness against your laziness. If you give it the right ammo, compulsiveness always wins!