Saturday, November 14, 2009

Using Mercurial Bisect to Find Bugs

Yesterday I tried to find a regression bug in Grizzly that was preventing grizzly-sendfile from using blocking IO. I knew that the bug was not present in grizzly 1.9.15, but somewhere between that release and the current head someone introduced a changeset that broke things for me. Here is how I found out who that person was.

Grizzly is unfortunately still stuck with subversion, so the only thing (besides complaining) that I can do to make my life easier, is to convert the grizzly svn repo to some sane SCM, such as mercurial. I used hgsvn to convert the svn repo.

Once I had a mercurial repo, I wrote BlockingIoAsyncTest - a JUnit test for the bug. And that was all I needed to run bisect:
$ echo '#!/bin/bash                                                                          
mvn clean test -Dtest=com.sun.grizzly.http.BlockingIoAsyncTest' >
$ chmod +x
$ hg bisect --reset #clean repo from any previous bisect run
$ hg bisect --good 375 #specify the last good revision
$ hg bisect --bad tip #specify a known bad revision
Testing changeset 604:82e43b848ae7 (458 changesets remaining, ~8 tests)
517 files updated, 0 files merged, 158 files removed, 0 files unresolved
$ hg bisect --command ./ #run the automated bisect
(output from the test)

[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9 seconds
[INFO] Finished at: Sat Nov 14 11:41:07 PST 2009
[INFO] Final Memory: 25M/79M
[INFO] ------------------------------------------------------------------------
Changeset 500:983a3fc2debe: good
The first good revision is:
changeset:   501:b5239bf9427b
branch:      code
tag:         svn.3343
user:        jfarcand
date:        Wed Jun 17 17:20:32 2009 -0700
summary:     [svn r3343] Fix for
In under two minutes I found out who and with which revision caused all the trouble! Bisect is very useful for large projects, developed by multiple users, where the amount of code and changes is not trivial. Finding regressions in this way can save a lot of time otherwise spent by debugging.

The only caveat I noticed is that you need to create a shell script that is passed in as an argument to the bisect command. It would be a lot easier if I could just specify the maven command directly without the intermediate shell script.