The Importance of Clean Changelists

Every software company I have worked for uses some sort of source control software and honestly I don’t think I could refer to any business entity as a “software company” if they didn’t use some sort of source control. Personally, most of my experience has been with Helix / Perforce but I’ve spent time using Git SCM, IBM’s Rational ClearCase, Alienbrain, and yes, I’ve even used Microsoft Visual SourceSafe.

These tools are great for tracking changes to your source code over time, but as with most things, you need to figure out your best practices for how to use the tool in the most effective manner. One thing I see happening a lot is that a changelist (Perforce’s term for a list of files along with the operations performed on them) will get submitted that contains some big important change along with several other unrelated changes such as variable renaming or some other minor refactoring.

There are several problems with putting multiple, unrelated changes into a single changelist:

1. Code Reviews May Be Less Accurate

Imagine you are working on some software such as a game or productivity application that tracks various statistics while the user is interacting with the application. Each time they click on a button or other UI widget, perhaps a call is made to some backend service that tracks all of these various interactions. Depending on the implementation that exists, you may end up with a file somewhere in your code that contains hundreds of similar lines:

// skipped lines
//
RecordInteraction("main_window.file.new.mousedown");
// skipped lines
//
RecordInteraction("main_window.file.new.mouseup");
// skipped lines
//
RecordInteraction("main_window.edit.preferencs.mousedown");
// skipped lines
//
RecordInteraction("main_window.edit.preferences.mouseup");
// skipped lines
//
RecordInteraction("main_window.edit.copy.mousedown");
// skipped lines
//
RecordInteraction("main_window.edit.copy.mouseup");

Now imagine that an engineer finds out that one of the events wasn’t working properly due to a typo as in the “main_window.edit.preferencs.mousedown” event above (a missing ‘e’). So the engineer goes into the code and makes the correction, but then decides to remove the underscore from “main_window” for consistency with the current naming conventions of the project. Now if they put this code up for review, rather than having a single change to look at, the reviewer must look at perhaps hundreds of lines of changes, but only one that is an actual bug fix. The reviewer’s job is to now look at every single one of those lines to verify its correctness and may completely overlook the one actual change they were supposed to be reviewing in the first place.

2. Code Reviews Take Longer

Using the same example from above, a good code review would require the reviewer to hunt through every single line that changed and find the one that she’s supposed to be really focus on reviewing. Now imagine if there were two changelists instead:

  1. Changelist #1: Renamed “main_window” to “mainwindow” for consistency.
  2. Changelist #2: Fixed a typo in “main_window.edit.preferencs.mousedown” that was preventing this event from getting recorded.

Now the reviewer can very quickly check that the reviewee knows how to use “find and replace” in their editor for the first changelist and then only has a single line to look at for the second one! It’s a win-win all around.

3. Finding Application-Breaking Changelists Is Easier

At some point on every project, a change goes into the code that causes the application to break in new and mysterious ways. Sometimes, the best way to track down the culprit code-submission is to go through your source control’s history and look for the submission that caused the application to stop functioning correctly. If you are going through and diffing the past submissions and you come across one that had hundreds of changes in it, it would be far too easy to overlook the fact that something else in that changelist may have been the culprit.

4. Easier to Back Out Breaking Changes

Another thing that has caused pain in the past is when a good change like variable renaming is mixed with a bad change such as an inadvertent crash bug, perhaps. Depending on how extensive and interwoven those two changes are the changelist might have to get backed out in its entirety, variable renaming and all! This can get incredibly messy if future changelists were already using that new variable name in which case it’s safe to say that nobody is going to be happy.

Summary

Trust me when I say that it’s always better to isolate your changes as much as possible. These are lessons that have cost me personally more than my fair share of pain and suffering and I hope that you can learn from my mistakes as well as the mistakes of many of the other ghosts-of-engineering-past. As always, I’d love to hear any of your thoughts and opinions on this as well and keep an eye out for my next article on a step-by-step guide for taking your own Perforce Changelists and breaking them into their own isolated changes.

Advertisements