We often hear things like “we’ve set up CI”, which makes no actual sense when you consider what CI is. It’s not a server or a tool. Continuous Integration (CI) is an ongoing practice whereby we keep the code continuously integrated. That sounds simple but has more subtlety than you might expect. Many teams today that think they’re doing CI actually aren’t, and as a result aren’t getting the benefit they could.
Let’s look at where it came from and what problem we’re really trying to solve.
eXtreme Programming (XP) has been around since the mid-1990s and, right from the beginning, one of its core practices was continuous integration.
What this meant in the 1990s was that we would regularly copy our changes to a single common machine so that there was one copy of the full build that contained everyone’s changes. We were integrating multiple times a day to ensure that my changes would work with your changes and that there would be no surprises when we tried to run the application.
This was at a time when version control was not commonplace and many people were still unconvinced that version control was even a good idea. Even if you had version control, it might only be local to your own machine and not synced to a common shared repository. We would often copy source onto floppy disks in order to integrate it on the shared build server.
If you started your career in the last couple of decades, you may be horrified by this. At the time, though, the fact that we were able to integrate our code multiple times a day was seriously leading edge.
By the early 2000s, version control was becoming commonplace and we’d started to see the introduction of what we now call “CI Servers”: programs that would pull all the code, run all the tests and spit out a build.
It had become easier to continuously integrate our code. Now we only had to sync with the repository and check our code in frequently.
At this point, branching was relatively rare. We created a branch when we released something and we might create a development branch if we were making a really significant change, but otherwise, we all developed on the single main line in the repository. This made continuous integration relatively easy.
Then along came distributed version control systems, specifically Git. Git made branching so easy that people started branching to provide a bit of isolation when they were making changes. GitHub then published a recommendation that people do feature development on branches and then submit a pull request (PR) when they wanted to merge their changes again.
This was a serious blow to continuous integration. We immediately went from a place where we had been integrating regularly to a place where we were off working in isolation again and integrating less often. Now when we did integrate, we were more likely to have merge conflicts because we weren’t all synced all the time.
The fact that we were doing development on a feature branch was bad enough, but then we would often take days, or even weeks, to get PRs approved. Continuous integration had stopped. Now we were intermittently integrating, at best.
To make it even worse, teams started setting up their “CI Servers” to do builds off feature branches. No integration happens until the code gets back to the main branch, and yet teams would proudly say that they were doing CI on all the branches. That’s not CI.
Continuous integration means that we are continuously integrating all changes back to the main line. If we’re off on a branch then by definition, we’re not integrating.
The reality is that we don’t need to be integrating every minute. If we’re off the mainline for a couple of hours, we’re likely still getting most of the benefit of CI. If we’re isolated for days or weeks, then we aren’t.
Is it possible to be on a branch and be integrating fast enough? Yes, I’ve seen teams work on short-lived branches that get merged frequently, and they are getting good value from CI. I’ve seen teams where PRs get closed in minutes and that’s also ok.
Far more often, I see branches open for weeks and PRs open for days and that’s not CI. We’re not integrating continuously at that point.
So what should we be doing today to get the benefit of continuous integration?
- If at all possible, stop using feature branches. Do all your development on the main branch and commit frequently. This is called Trunk Based Development.
- If you must use feature branches then make them short-lived. Try to keep each one active for no more than a few hours.
- If you must use PRs then make it a priority to have them reviewed and merged. I’ve heard of one team that averages six minutes from PR creation to close. Aim for that level of responsiveness.
- Make small commits and more of them. Keep pulling the latest changes to ensure you’re integrating with what everyone else has committed.
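
If you want a rough way to see how close you are to this, you can measure how long each branch spends away from the main line. The snippet below is only a minimal sketch of that idea: the `Branch` record, the function names, and the four-hour limit (my stand-in for “a few hours”) are all assumptions made for illustration, and you would have to fill in the timestamps from your own version control or PR tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Branch:
    """Hypothetical record of one branch; not tied to any real tool's API."""
    name: str
    created: datetime            # when the branch diverged from main
    merged: Optional[datetime]   # None while the branch is still open


def time_off_main(branch: Branch, now: datetime) -> timedelta:
    """How long the work has been (or was) isolated from the main line."""
    end = branch.merged if branch.merged is not None else now
    return end - branch.created


def stale_branches(
    branches: List[Branch],
    now: datetime,
    limit: timedelta = timedelta(hours=4),  # assumed value for "a few hours"
) -> List[str]:
    """Names of branches that stayed off main for longer than the limit."""
    return [b.name for b in branches if time_off_main(b, now) > limit]
```

A branch merged six minutes after it was created would pass this check easily, while one that has been open for a couple of weeks would be flagged. The exact limit matters less than watching whether the list of flagged branches shrinks over time.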
If XP teams could do continuous integration back in the 1990s with the technology that was available then, there is nothing stopping us from doing it today.
One last thought: Many companies want to do Continuous Delivery so they have the ability to deploy multiple times a day. What I describe above is a prerequisite for that. You can’t do CD without CI and that’s why it’s often abbreviated as CI/CD.