I spent a good part of last night troubleshooting a “works on my machine” problem.
Pain is often how we learn, so perhaps this pain was good. It reminded me of a concept that is really important to your software development infrastructure.
I have three golden rules of development environments and deployment:
- I should be able to run a software build locally that is the exact same build that will be run on the continuous integration server.
- Only bits that were built by the continuous integration server are deployed and the same bits are deployed to each environment.
- Differences in configuration between environments exist in only one place, and that place is on the machine itself.
You don’t have to follow these rules, but if you don’t, most likely you will experience some pain at some point.
If you are building a new deployment system, or revamping an old one, keeping these 3 rules in mind can save you a lot of headaches down the road.
I think it is worth talking about each rule and why I think it is important.
Rule 1: Local Build = Build Server Build
If you want your continuous integration environment to be successful, it needs to appropriately report failures and successes.
If your build server reports false failures, no one will believe it, and you will be troubleshooting build-configuration problems instead of actual software problems. Troubleshooting these kinds of problems provides absolutely no business value. It is just a time sink.
If your build server reports false successes, you will only discover the issue when you deploy the code to another environment; you will waste time deploying broken code, and you will have a long feedback loop for fixing it.
As a developer, I should be able to run the exact same command the build server will run when I check in my code. I would even recommend setting up a local version of the continuous integration server your company is using.
If you can be confident that a build which succeeds locally will not fail on the build server, you will never have to troubleshoot a false build failure. (The deployment could still fail, and the application could still behave differently in different environments, but at least you will know you are building the exact same bits using the exact same process.)
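One way to make rule 1 concrete is a single build entry point that both developers and the CI server invoke. This is a minimal sketch (all names and commands here are illustrative, not from the post): a build that succeeds locally has, by construction, run the same steps the build server will run.

```python
# Sketch of a single build entry point. Developers and the CI server both
# invoke exactly this script, so there is only one build process.
import subprocess

# Stand-ins for the real compile and test commands.
BUILD_STEPS = [
    ["echo", "compiling..."],
    ["echo", "running unit tests..."],
]

def run_build(steps=BUILD_STEPS):
    """Run each build step in order; stop and report failure on the first error."""
    for step in steps:
        if subprocess.run(step).returncode != 0:
            return False
    return True
```

The CI server's job then shrinks to "check out and run the build script," which is exactly what a developer does before checking in.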
Rule 2: Software Build Server Bits = Only Deployed Bits
Build it once and deploy those bits everywhere. Why?
Because it is a waste of time to build what should be the exact same bits more than once.
Because the only way to be sure the exact same code gets deployed to each environment (dev, test, staging, production, etc.) is to make sure the exact same bits are deployed in each environment.
The exact same SVN revision number is not good enough, because an application is more than source code. You can build the exact same source code on two different machines and get a totally different application. No, I’m not loony. This is a true statement. Different libraries on a machine could produce different binaries. A different compiler could produce different binaries.
Don’t take a risk. If you want to be sure that code you tested in QA will work in production exactly the same way, make sure it is the exact same code.
This means you can’t package your configuration with the deployment package. Yes, I know you have always done it that way. Yes, I know it is painful to figure out another way, but the time you will save by never having to question the bits of a deployment again will be worth it.
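One way to make "exact same bits" verifiable, rather than a matter of trust, is to have the CI server record a checksum of the artifact at build time and have every environment verify it before starting the application. A small sketch (the function names are mine, not from the post):

```python
# Sketch: verify that the deployed bits are the bits the CI server built,
# by comparing cryptographic checksums of the artifact.
import hashlib

def checksum(artifact: bytes) -> str:
    """SHA-256 fingerprint of a build artifact's contents."""
    return hashlib.sha256(artifact).hexdigest()

def same_bits(built: bytes, deployed: bytes) -> bool:
    """True only if the deployed artifact is byte-for-byte what was built."""
    return checksum(built) == checksum(deployed)
```

The checksum is computed once, on the build server; any environment that fails the comparison is running bits that did not come from the CI server.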
Rule 3: Environment Configuration Resides in Environment
Obeying this rule will save you a huge amount of grief.
Think about it.
If the only thing different in each environment is in one place, in one file in that environment, how easy will it be to tell what is different?
I know there are a lot of fancy schemes for adding configuration to the deployment package based on what environment the deployment is going to. I have written at least 3 of those systems myself.
But, they always fail somewhere down the line and you spend hours tracing through them to figure out what went wrong and ask yourself “how the heck did this work again?”
By making the configuration for an environment live in the environment and in one place, you take the responsibility of managing the configuration away from the software build process and put it in one known place.
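A minimal sketch of what rule 3 can look like in practice (the path, file format, and key names here are all hypothetical, not from the post): each machine carries one file at a well-known location, and the application reads it at startup. The deployment package never contains configuration at all.

```python
# Sketch of rule 3: the only environment-specific settings live in one
# file at a well-known path on each machine.
import configparser

def load_environment_config(path="/etc/myapp/environment.ini"):
    """Read this machine's configuration from its one known location.

    The deployment package expects this file to already exist on whatever
    machine it lands on; the same bits work in every environment.
    """
    parser = configparser.ConfigParser()
    with open(path) as f:
        parser.read_file(f)
    return dict(parser["environment"])
```

When something differs between QA and production, there is exactly one file per machine to diff.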
In my previous post, I talked about the idea of having a simple branching strategy and why I prefer one where everyone works off the same branch.
In this post I will show you how to create what I believe is the most simple and effective branching strategy.
Take a look at this diagram of a sample project’s code lines:
Walking through it
The idea here is very simple. Let’s walk through a development cycle together:
- Development starts. Everyone works off of trunk. Code is frequently checked into trunk, with many developers checking in code 3-4 times a day as they complete small, quality increments of work.
- The continuous build server is continuously building and checking the quality of the code every single time code is checked in. Any integration problems are immediately fixed.
- Enough features are done to create a release. Trunk is tagged for release and a release 1 branch is created representing the currently released production code.
- Developers continue to work on trunk not being interrupted by the release.
- A customer finds a high priority issue in Release 1.
- A Rel 1 Hot Fix branch is created, branched off of Release 1, to fix the high-priority issue. It turns out that a good fix will take some time, so the team decides the best course of action is to apply a temporary fix for now.
- Rel 1 Hot Fix is done and merged back into Release 1 branch. Release 1 is re-deployed to production.
- In the meantime another emergency problem shows up that must be fixed before the next release. Rel 1 Hot Fix 2 branch is created.
- The bug fix for Rel 1 Hot Fix 2 is a good fix which we want in all future releases. Rel 1 Hot Fix 2 branch is merged back to Release 1 branch, and merged back to trunk. Release 1 is redeployed.
- In the meantime, work has continued on trunk, and the team is ready for Release 2.
- Release 2 branch is created…
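In SVN terms, the walkthrough above boils down to a handful of cheap copies. This sketch runs the key steps against a throwaway local repository (all repository paths, branch names, and commit messages here are made up for illustration):

```shell
# Illustrative SVN commands for the walkthrough above, against a
# throwaway local repository.
set -e
REPO="$(mktemp -d)/repo"
svnadmin create "$REPO"
URL="file://$REPO"
svn mkdir -q -m "standard layout" "$URL/trunk" "$URL/branches" "$URL/tags"

# Enough features are done: branch trunk to create the release branch
# (branches are cheap copies in SVN).
svn copy -q -m "create Release 1 branch" "$URL/trunk" "$URL/branches/release-1"

# A high-priority issue appears: branch the hot fix off the release branch.
svn copy -q -m "create Rel 1 Hot Fix branch" \
    "$URL/branches/release-1" "$URL/branches/rel-1-hotfix"

# When the fix is done, merge it back into a working copy of release-1
# (and into trunk too, if the fix should be permanent):
#   svn merge "$URL/branches/rel-1-hotfix" path/to/release-1-working-copy
svn ls "$URL/branches"
```

Notice that trunk is never touched by any of this; development continues there uninterrupted.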
Breaking it down
I gave a pretty detailed walk-through for a very simple set of actual steps. But, I hope you can see how simple this process really is.
The basic idea here is that we are trying to decouple releases from development as much as possible. The team is always going to keep chugging along, building new features and enhancing the code base. When we decide we have enough features for a release, we simply branch off of trunk and create the release branch.
We can even do some testing on the release branch before we go to production if we need to without impacting future development.
The release branch code-lines never come back to trunk. They don’t need to; they only exist so that we can have the exact production code and make modifications to it as hot-fixes if we need to.
We branch hot-fixes off of the release branch so that we can work on them independently, because not all hot-fixes go back to the main code-line. We can make a hot-fix just for the current release, or we can merge it back to trunk to make it a permanent fix.
That is all there is to it. This kind of branching strategy almost completely eliminates merges. The only merges you ever do are small ones for hot-fixes.
Your branching strategy does not have to be complicated. A simple strategy like this can fit almost any software development shop.
Frequently disputed points
Almost immediately when I introduce this simple system someone says:
What about half-completed features? I don’t want to release half-completed features. Using this strategy with everyone working off trunk, you will always have half-completed features.
So what? How often does a half-completed feature actually cause a problem in the system? If the code is quality and incrementally developed, it should not impact the rest of the system. If you are adding a new feature, usually the last thing you do is actually hook up the UI to it. It won’t hurt anything to have its back-end code released without any way to get to it.
Continuous integration (especially running automated functional tests) trains you to always keep the system releasable with every commit of new code. It really isn’t hard to do; you just have to think about it a little bit.
If worse comes to worst and you have a half-finished feature that makes the code unreleasable, you can always pull that code out on the release branch. (Although I would highly recommend that you try to find a way to build the feature incrementally instead.)
If you know you’re going to do something that will disrupt everything, like redesigning the UI, or drastically changing the architecture, then go ahead and create a separate branch for that work. That should be a rare event though.
I need to be able to develop the features in isolation. If everyone is working off of trunk, I can’t tell if what I did broke something or if it is someone else’s code. I am impacted by someone else breaking things.
Good, that is some pain you should feel. It hurts a lot less when you’re continuously integrating vs. working on something for a week, merging your feature and finding that everything is broken.
It is like eating a meal. All the food is going to end up in the same place anyway. Don’t worry about mixing your mashed potatoes with your applesauce.
If something someone else is doing is going to break your stuff, it is better to fail fast than to fail later. Let’s integrate as soon as possible and fix the issue rather than waiting until we both think we are done.
Besides that, it is good to learn to always check in clean code. When you break the build for other people and they shoot you with Nerf guns and make you wear a chicken head, you learn to test your code locally before you check it in.
How to be successful
How can you be successful at this simple strategy?
- Make sure you have a continuous integration server up and running and doing everything it should be doing.
- When you work on code, find ways to break it up into small incremental steps of development which never break the system. Hook up the UI last.
- Always think that every time you check in code, it should be code you are comfortable to release.
- Check in code at least once a day, preferably as soon as you make any incremental progress.
- Test, test, test. Test locally, unit test, test driven development, automated functional tests. Have ways to be confident the system never moves backward in functionality.
- So important I’ll say it twice. Automated functional tests. If you don’t know how to do this, read this.
- Release frequently instead of hot-fixing. If you never hot-fix you will never have to merge. If you never have to merge, you will live a longer, less-stressed life.
- Don’t go back and clean up code later. Write it right the first time. Check it in right the first time.
Hopefully that helps you to simplify your branching process. Feel free to email me or post here if you have any questions, or are skeptical that this could work.
Source control management has always been one of those sticky topics that causes many questions. Many veteran programmers are baffled by the ins and outs of branching and merging. And for good reason; it is a difficult topic.
I’ve been around many different organizations. I’ve been the person who was told what the SCM strategy was, and I’ve been the person who designed it. I’ve seen it done just about every way possible, and after all that, do you know what I think? (I’ll give you a hint: it’s the name of this blog.)
Keep it simple. Working directly off the trunk is by far the best approach in my opinion.
It almost sounds like heresy when I actually type it onto my screen, but if you’ll bear with me for a moment, not only will I show you why I think this is essential for an Agile process, but I’ll show you how to make it work.
Continuous integration is the root
As a software development community, I think most of us can agree that CI is good and provides real value.
What is Continuous Integration?
Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily – leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible.
So the idea here is to integrate as soon as possible, the faster the better. If something is going to break, we want it to break sooner rather than later.
The best way for this to happen is that whenever I check in my code it is instantly integrated with everyone else’s code and we build and test it.
This is a change from the old way of doing things, in which I am working on my code in isolation for some period of time and periodically I show up with a bunch of code and merge it all together.
The only practical way to have continuous integration is to have everyone working off the same branch. It doesn’t have to be trunk, but it has to be the same branch. Sure, there are some ways to have people work off different branches and automatically merge them when they commit as a trial integration, but that is a large amount of effort, and it can’t handle conflicts.
One of the “big dogs” in Agile is collaboration. Collaboration is pretty important. Working on separate branches tends to go against this idea. I know you can still collaborate when you have separate branches, but it is more likely that you won’t.
In general I have found that software developers in particular are the kinds of people who will go off and do their own thing if given the chance. It is not a bad thing, but it means that we need to set up developers to collaborate. If your process isn’t explicitly fostering collaboration, it probably won’t happen (at least not as much as you would like it to.)
If we are all working on the same branch we have to communicate. We have to work together to accomplish a goal. I can’t just work in isolation, because what I am doing affects you, and what you’re doing affects me. We have to talk about it.
I’m more likely to watch your code come in and say “hey, where is your unit test, bub?” You’re more likely to watch my code come in and say, “is there a reason you’re not using superWidgetX to solve that problem?”
If I check in some junk and break the build, everyone is impacted. This might seem like a bad thing, but it fosters a team mentality instead of an individual mentality. Think about boot camp (or at least what you have seen on TV.) What happens when one guy screws up? EVERYONE does push-ups. Why? To build a team, a team mentality.
I have found in Agile processes like Scrum, the best quality work is done when developers are paired up working on a backlog item. This isn’t impossible to do with individual developer branches or feature branches, but it is much harder and less likely to happen.
Keeping it simple
One of the central themes of my blog… of my life… is keeping it simple. Simple is good, it makes everything easier.
I am notorious for walking into an organization and ripping out all the ridiculous process that provides no value and putting in very simple and lean process in its place. It is in my nature.
If you have ever spent some time in what I like to call “merge hell”, you’ll understand what I am about to talk about.
I have been working with source control for a very long time. I have been the “merge master,” merging huge numbers of branches together on large projects. I have worked on developer branches, feature branches, release branches and everything in-between, and I still can’t get it right!
Don’t get me wrong. I know the theory behind it, I know how to do it. But merging can be so complicated at times that the chance of making a mistake is very high.
You really have to ask yourself a question here: “Is all the overhead you incur from your complicated branching and merging strategy providing real value that a simpler strategy would not?”
My general rule of thumb is, never branch unless you have to.
A simple Agile branching strategy
In my next post, I will show you what I think is the most simple and effective branching strategy: one I have used effectively in the past and refined over time. I’ll sum it up briefly here.
- Everyone works off of trunk.
- Branch when you release code.
- Branch off a release when you need to create a bug fix for already released code.
- Branch for prototypes.
That’s it, stay tuned for next time when I talk about how to make this simple branching strategy practically work in your Agile environment.
In my earlier post, I talked about why you need to start using continuous integration and need a continuous integration server. In this post I will look more at continuous integration best practices when you have a continuous integration server set up.
Unit Tests and Code Coverage
One of the first things your continuous integration server needs to do is run and report on your unit tests as part of the build. Simply making sure the code compiles without errors is not a good enough quality checkpoint. Your unit tests should be executed as part of every build; every time someone checks in code, you want to know that the unit tests have not broken. Along with this, I would include code coverage reports. It is very useful to have your continuous integration server report code coverage metrics for your unit tests.
Static Analysis Tools
One of the best ways to enforce coding standards and quality standards is through the use of static analysis tools. Depending on the language you are using, there are several options available: for Java there are tools like PMD and Checkstyle, and for .NET there are tools like FxCop and StyleCop. Here is a good list of some of the ones available for each language. There are two basic categories of static analysis tools: ones that check formatting and convention, and ones that check for bad practices. I would strongly suggest employing both. Your CI server should run the static analysis tools at the end of each build and preferably track the number of violations introduced in each build. Some CI servers even allow you to fail the build if a certain number or percentage of the code has violations; I would highly recommend this for a new code base, because if you end up with a high number of violations, the violations just become noise. The CI server is a good way to make this visible to everyone on the project.
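The "fail the build over a threshold" gate can be a very small CI step. This sketch is my own illustration of the idea, counting violations in a Checkstyle-style XML report (Checkstyle's report format really does nest `error` elements under `file` elements; the gate function and threshold are assumptions):

```python
# Sketch of a static analysis quality gate for a CI build.
import xml.etree.ElementTree as ET

def count_violations(report_xml: str) -> int:
    """Count <error> elements in a Checkstyle-format XML report."""
    root = ET.fromstring(report_xml)
    return sum(len(f.findall("error")) for f in root.findall("file"))

def gate(report_xml: str, max_allowed: int = 0) -> bool:
    """True if the build may proceed; False means fail the build.

    On a new code base, keep max_allowed at zero so violations never
    accumulate into noise.
    """
    return count_violations(report_xml) <= max_allowed
```

The CI server runs the analysis tool, feeds its report to the gate, and marks the build red if the gate returns False.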
Push Button Deployments
If deploying your code to any environment requires more than the push of a button, you’re doing it wrong. I know this may sound like a bold statement, but why would you need to do anything more than this? One thing I always emphasize is that the only way for a build to get to an environment is for the CI server to push it there directly. Basically, you should only be deploying builds that were built by the CI server. When you follow this practice, you ensure that the code in an environment is exactly the bits you expect. Taking a build from somewhere else is error-prone and risky. Doing a build manually is a waste of time and also risky. Your deployment should be a simple button push that takes bits already built on your CI server and deploys them to the desired environment.
Build Notifications
Perhaps the most important part of a CI server is notification when a build fails, or when it is fixed. If you get this one wrong, you will be constantly chasing down developers who break the build, and your efforts will be in vain. It is very important that when a build fails it is a big deal, a really BIG DEAL. Some teams use flashing lights, others use large flat-screen TVs, and others just use emails, but however you do it, it must be effective and not ignored. I would recommend, at a minimum, setting up an email notification for when the build fails and when it is fixed. When there is a broken build, it is very important that fixing it becomes the top priority. One of the things which can help reduce build breaks is to make sure that developers have a way to run the exact same build that the CI server will run. If a developer can run the same build locally before checking in their code, there really aren’t many good excuses for breaking the build. (This will also require that the build be very fast; in one of my earlier posts I talked about having a dedicated developer tools team to do things like optimizing the build.)
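The mechanics of the notification itself are simple; what matters is the policy of notifying on every failure and every fix. A minimal sketch of composing such a message (addresses, subject wording, and the function name are all made up; actually sending it via SMTP is left out):

```python
# Sketch: compose a build-status notification email for the team.
from email.message import EmailMessage

def build_notification(build_id: str, broken: bool, committers) -> EmailMessage:
    """Build the email sent when a build breaks or is fixed."""
    msg = EmailMessage()
    state = "BROKEN" if broken else "fixed"
    msg["Subject"] = f"Build {build_id} is {state}"
    msg["To"] = ", ".join(committers)
    msg.set_content(
        f"Build {build_id} is {state}.\n"
        "Fixing a broken build is the team's top priority."
    )
    return msg
```

Whether this goes out as email, a flashing light, or a big red panel on a TV, the point is the same: a broken build must be impossible to ignore.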
Database Integration
I won’t go into extreme detail here about database integration, since the focus of this post is on CI best practices, but I want to make this important point: if your database is not in some way version controlled and built alongside your source code, being able to replicate any code build buys you little. Basically, if you cannot tie a reproducible version of the database to a version of the source code, you cannot actually roll back to a specific point in time. For some projects this is important, for others it is not, but for all it must be considered. Either way, you should at minimum consider how you will handle database integration with your CI server. One of the solutions I have helped employ on my current project is a database build which works very similarly to the source code build: database changes are written as SQL change scripts and applied in a defined order to build the database.
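The change-script approach described above can be sketched in a few lines. This is my own illustration (the `schema_version` table name and the pairing of version numbers with scripts are assumptions; shown against SQLite for self-containment): scripts are applied in order, and a version table records what has run, so the same database version can be rebuilt for any code version.

```python
# Sketch: build a database by applying ordered SQL change scripts,
# recording each applied version so the build is repeatable.
import sqlite3

def apply_change_scripts(conn, scripts):
    """scripts: ordered list of (version_number, sql_text) pairs."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0
    for version, sql in sorted(scripts):
        if version > current:          # apply only scripts not yet run
            conn.executescript(sql)
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
    conn.commit()
```

Because the scripts live in source control next to the code, checking out an old revision and rerunning the database build reproduces the schema that code expects.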
Is your team using some form of continuous integration?
If not, why not?
Continuous integration is one of the hallmarks of a good development process. I’ve done continuous integration for many years now on every project I work on. I am usually the one putting up the continuous integration server because I consider it a “must have.”
What is continuous integration?
If you don’t know what continuous integration is, it is basically a build server that builds the software every time someone checks in a change. If you are familiar with the terms nightly build, or weekly build, continuous integration is every check-in build.
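The "every check-in build" idea fits in a few lines of pseudologic. This toy sketch is my own illustration, not any real CI server's code (the repository and build functions are stand-ins): instead of building on a clock, build whenever the latest revision changes.

```python
# Toy sketch of a CI server's core loop: build on every check-in,
# not on a nightly schedule.
def watch_and_build(get_latest_revision, run_build, polls):
    """Poll the repository; run a build whenever a new check-in appears."""
    built = []
    last_seen = None
    for _ in range(polls):
        rev = get_latest_revision()
        if rev != last_seen:
            run_build(rev)       # every check-in gets its own build
            built.append(rev)
            last_seen = rev
    return built
```

Real CI servers add queues, notifications, and commit hooks on top, but this loop is the essential difference from a nightly build.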
I have a nightly build, why do I need a continuous integration build?
I will start here and assume you understand the value of a nightly build. The biggest issue and reason is one simple word: feedback.
The tighter your feedback loop, the more accurate your steering. Imagine if there were a 2-second delay between turning your steering wheel and the wheels actually turning. Or if, when you looked in your rear-view mirror, you actually saw the image that was there 2 seconds ago. Would that affect your driving ability? I certainly hope it would.
The driving metaphor captures the nightly build in contrast to a continuous build. When you get information about whether your build broke only once a night, you are getting it too late. By the time you course correct, you have lost valuable time, and some of your developers may have steered right into a ditch. As soon as someone has committed bad code, I want to know it, and I want them to know it. Every second the build remains broken is time that other developers are unable to get the latest code, build the system, or check that their changes did not break the build.
How do I get some of this continuous integration goodness?
There are many options out there, and most of them are pretty good actually.
For the .NET world, Visual Studio Team System actually has a continuous integration server integrated now, called Team Foundation Build.
For Java or .NET I am liking Hudson now, although it is probably better for Java.
What makes a good continuous integration system?
In my next post I’ll cover how to set up your server from a perspective of what should you have and what things are important. I’ll talk about unit test support, code coverage and all the other good things you should do with your CI server.