Wide Awake Developers


Well Begun Is Half Done

How long is your checklist for setting up a new development environment? It might seem like a trivial thing, but setup costs are part of the overall friction in your project. I've seen three-page checklists that required multiple downloads, logging in as several users (root and non-root), and hand-typing SQL strings to set up the local database server.

I think the paragon of environment setup is the ubiquitous GNU autoconf system. Anyone familiar with Linux, BSD, or other flavors of UNIX will surely recognize this three-line incantation:

./configure
make
make install

The beauty of autoconf is that it adapts to you. In the open-source world, you can't stipulate one particular set of packages or versions, at least not if you actually want people to use your software and contribute to your project. In the corporate world, though, it's pretty common to see a project that requires a specific point-point rev (an exact x.y.z release) of some Jakarta Commons library, without ever documenting which version that is.

Then there are different places to put things: inside the project, in source control, or in the system. I recently went back to a project's code base after being away for more than two years. I thought we had done a good job of addressing the environment setup. We included all the deliverable jars in the code base, so they were all version controlled. But we decided to keep the development-only jars (like EasyMock, DBUnit, and JUnit) outside the code base. We did use Eclipse variables to abstract out the exact filesystem location, but when I returned to that code base, finding and restoring exactly the right versions of those build-time jars wasn't easy. In retrospect, we should have put the build-time jars under version control and kept them inside the code base.

Yes, I know that version control systems aren't good at versioning binaries like jar files. Who cares? We don't rev the jar files so often that the lack of deltas matters. Putting a new binary in source control when you upgrade from Spring 2.5 to Spring 2.5.1 really won't kill your repository. The cost of the extra disk space is nothing compared to the benefit of keeping your code base self-contained.

Maven users will be familiar with another approach. On a Maven project, you express external dependencies in a project model file. On the first build, Maven will download those dependencies from their "official" archives, then cache them locally. After that, Maven will just use the locally cached jar file, at least until you move your declared dependency to a newer revision. I have nothing against Maven. I know some people who swear by it, and others who swear at it. Personally, I just never got into it.
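The download-once, cache-afterwards behavior can be sketched in a few lines. This is only an illustration of the idea, not Maven's actual implementation or repository layout: the `resolve` function, the directory scheme, and the `fake_fetch` stand-in for a network download are all hypothetical names invented for this example.

```python
import tempfile
from pathlib import Path
from typing import Callable


def resolve(name: str, version: str, cache_dir: Path,
            fetch: Callable[[str, str], bytes]) -> Path:
    """Return the cached jar path, calling fetch only on a cache miss."""
    jar = cache_dir / name / version / f"{name}-{version}.jar"
    if not jar.exists():
        jar.parent.mkdir(parents=True, exist_ok=True)
        jar.write_bytes(fetch(name, version))
    return jar


# Hypothetical stand-in for a network download; it records each call.
downloads = []


def fake_fetch(name: str, version: str) -> bytes:
    downloads.append((name, version))
    return b"jar bytes"


cache = Path(tempfile.mkdtemp())
first = resolve("spring", "2.5", cache, fake_fetch)    # cache miss: fetches
second = resolve("spring", "2.5", cache, fake_fetch)   # cache hit: no fetch
third = resolve("spring", "2.5.1", cache, fake_fetch)  # new version: fetches
print(downloads)
```

The second call never touches `fake_fetch`; only moving the declared version to 2.5.1 triggers another download.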

Then there are JRE extensions. This project uses JAI, which wants to be installed inside the JRE itself. We went along with that, but I was stumped for a while today when I saw hundreds of compile errors even though my Eclipse project's build path didn't show any unresolved dependencies. Of course, when you install JAI inside the JRE, it just becomes part of the Java runtime. That makes it an implicit dependency. I eventually remembered that trick, but it took a while. In retrospect, I wish we had tried harder to bring JAI's jars and native libraries into the code base as an explicit dependency.

Does developer environment setup time matter? I believe it does. It might be tempting to say, "That's a one-time cost, there's no point in optimizing it." It's not really a one-time cost, though. It's one time per developer, every time that developer has to reinstall. My rough observation says that, between migrating to a new workstation, Windows reinstalls, corporate re-imaging, and developer churn, you should expect three to five developer setups per year on an internal project.
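To put rough numbers on that, here is a back-of-the-envelope calculation. Only the three-to-five setups per year comes from the observation above; the hours per setup are assumptions picked purely for illustration, so substitute your own.

```python
def annual_setup_hours(setups_per_year: int, hours_per_setup: float) -> float:
    """Total hours spent on environment setup in one year."""
    return setups_per_year * hours_per_setup


# Assumed figures, not measurements: a long manual checklist versus
# a mostly scripted setup.
MANUAL_HOURS = 8.0
SCRIPTED_HOURS = 0.5

for setups in (3, 5):
    manual = annual_setup_hours(setups, MANUAL_HOURS)
    scripted = annual_setup_hours(setups, SCRIPTED_HOURS)
    print(f"{setups} setups/year: {manual:.1f} h manual, {scripted:.1f} h scripted")
```

Under those assumed figures, the manual checklist costs 24 to 40 hours a year against 1.5 to 2.5 hours for the scripted one.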

For an open-source project, the sky is the limit. Keep in mind that you'll lose potential contributors at every barrier they encounter. Environment setup is the first one.

So, what's my checklist for a good environment setup checklist?

  • Keep the project self-contained. Bring all dependencies into the code base. Same goes for RPMs or third-party installers.
  • Make sure all JAR files have version numbers in their file names. If the upstream project doesn't build their JAR files with version numbers, go ahead and rename the jars.
  • Make bootstrap scripts for database actions such as user creation or schema builds.
  • If you absolutely must embed a dependency on something that lives outside the code base, make your build script detect its location. Don't rely on specific path names.
  • Don't assume your code base is in any particular filesystem on the build machine.
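Two of these rules are easy to check mechanically. The sketch below is one possible way to do it, not a prescribed tool: `unversioned_jars` and `locate_tool` are hypothetical helper names, the pattern only recognizes plain dotted numeric versions such as 2.5.1, and the `<HOME>/bin/<executable>` layout is an assumption about how the external tool is installed.

```python
import os
import re
import shutil
from pathlib import Path
from typing import List, Optional

# Matches names such as "junit-4.4.jar" or "spring-2.5.1.jar".
# Qualifiers like "-beta" are not recognized by this simple pattern.
VERSIONED_JAR = re.compile(r".+-\d+(\.\d+)*\.jar")


def unversioned_jars(names: List[str]) -> List[str]:
    """Return the jar file names that carry no version number."""
    return [n for n in names
            if n.endswith(".jar") and not VERSIONED_JAR.fullmatch(n)]


def locate_tool(env_var: str, executable: str) -> Optional[Path]:
    """Find an external tool without relying on a specific path name.

    Checks <env_var>/bin/<executable> first, then searches PATH.
    """
    home = os.environ.get(env_var)
    if home:
        candidate = Path(home) / "bin" / executable
        if candidate.exists():
            return candidate
    found = shutil.which(executable)
    return Path(found) if found else None


print(unversioned_jars(["junit-4.4.jar", "easymock.jar", "spring-2.5.1.jar"]))
# prints ['easymock.jar']

ant = locate_tool("ANT_HOME", "ant")
print(ant if ant else "ant not found; set ANT_HOME or add it to PATH")
```

A build script could run checks like these first and fail with a clear message, rather than letting a missing or mismatched jar surface later as a page of compile errors.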

I'd love to hear your own rules for easy development setup.

Comments

The last time I set up build scripts I had just one rule: sync and run should be a two-step process that even non-developers could perform easily. The rules you describe were corollaries to that one rule. As such, the only required prerequisites were the source control client and the Java SDK.

In my case, there were a handful of short (a couple of lines each) batch files in a scripts folder under the sync root containing just enough commands to fire up Ant (which was also sync'd from SCM) and invoke the applicable target. There were batch files for common targets for easy access. There was one that would fire up the server (building it and creating an initial database if necessary) and show the web page with the Java web start link on it.

We actually had 3 or 4 people not on the core development team who were able to trivially view the latest version (as in HEAD of the CVS trunk) of the software any time they wanted to see it (they could see from the CruiseControl server that the build was clean). Two were on another development team (embedded and hardware engineering) with which the software needed to integrate, one was a documentation person who needed to use the app in order to write about it and get screenshots, and one was a tester who sometimes wanted to dig into the source code of the app. TortoiseCVS was easy enough that any of them could grab a build any time they wanted to see the latest changes, including 5 minutes after they were committed.

Since the software was for storage area network management, there was also a "mock" server (and a batch file to build and fire it up, of course) that provided fake data that looked like a big storage area network so that you could browse or even do demos of new features without needing connectivity to any real networking hardware in the lab.

Check out Ivy for lib dependency management in Ant build scripts. It will support your checklist and allow you to keep all third party libs under version control. In a multi-project environment you could set up a separate project for third party libs.
