The deployment runbook is an artifact usually found within larger IT departments. It is designed to guide the deployment of ‘Enterprise’ applications that require a large number of steps, and a single deployment can typically span a number of days. Anyone who has experienced this beast knows that it is almost always a sign of an environment that is horribly complex and barely manageable.
In its common form, the deployment runbook is a master Excel spreadsheet, often with many worksheets within it, typically hundreds, if not thousands, of rows long, and almost always very colorful. Because it is usually the joint creation of multiple departments, it commonly takes multiple conference calls, often involving dozens of people, as well as dozens if not hundreds of revisions through email, to reach its final form.
What the deployment runbook attempts to do is to lay out in exacting detail every single step that is required (or believed to be required) to deploy an application or system successfully. The success of the runbook will often depend on its ‘maturity.’ If it is the first time that the runbook has been developed and employed, there is probably little chance that the experience will be successful. As it gets re-used and ‘refined’ through subsequent deployments, the gaps and pitfalls will lessen over time.
This ‘maturation’ of the deployment runbook can often be viewed, seemingly paradoxically, as a bad thing, as it is a signal that sub-optimal practices are being embedded and entrenched.
How does such a thing come into being? Oftentimes, it is a matter of consolidation or acquisition: multiple disparate systems must be made to play well together. Regardless, it seems to come into being most often because the technology in use, and the practices that have evolved around it, are almost impossible to automate (for either technical or cultural reasons).
Common steps within such a runbook (described at a very high level, of course) will involve things such as backing up databases before the deployment of new code, the recycling of subsystems such as web servers, message queues, and the like, and will almost always involve a step or set of steps where the configuration of multiple systems must be manually changed to reflect what is required for the production environment.
During the deployment, there will be multiple checkpoints at which a massively attended conference call is held to determine the status of the deployment, and all of these will be listed in the runbook. There will also be an entry within the runbook for noting every actual event that was not accurately reflected in the runbook, so that a meeting can be scheduled to discuss how to improve the runbook for the next deployment.
What should you do if you are faced with the existence of such a runbook? This depends on your role, of course, but at a base level, every attempt should be made to eliminate the need to manually change configurations (ways to do this can be found here and here). In almost every situation, these attempts will be looked upon favorably, and can make demonstrable improvements.
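As one illustration of removing a manual configuration step, here is a minimal sketch in Python. It assumes the application can read its settings from environment variables set per environment, rather than from files edited by hand during the deployment. The variable names (APP_DB_HOST, APP_DB_PORT, APP_QUEUE_URL) and the AppConfig class are hypothetical, invented only for this example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    # Hypothetical settings; a real application would have its own list.
    db_host: str
    db_port: int
    queue_url: str


def load_config(env=None):
    """Build the config from environment variables instead of hand-edited files."""
    env = os.environ if env is None else env
    required = ("APP_DB_HOST", "APP_DB_PORT", "APP_QUEUE_URL")
    missing = [name for name in required if name not in env]
    if missing:
        # Fail before the deployment starts, not halfway through the runbook.
        raise RuntimeError("missing configuration: " + ", ".join(missing))
    return AppConfig(
        db_host=env["APP_DB_HOST"],
        db_port=int(env["APP_DB_PORT"]),
        queue_url=env["APP_QUEUE_URL"],
    )
```

The point of the sketch is that the same build can be promoted from one environment to the next without anyone opening a config file, and a missing value is reported up front rather than discovered on a conference call.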
The hardest thing to do, but in many, if not most, cases, the best thing to do, is to think about how to make incremental changes that begin to reduce the scope of the deployment runbook. Ideally, you can find ways to eliminate it entirely, but like most ideals (cure cancer, eliminate poverty, end injustice, etc.), usually the best you can do is to strive for the ideal as best you can, knowing the result will forever be imperfect.
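One possible incremental change is to move a handful of rows out of the spreadsheet and into a small script that runs them in order and records what happened. The following is only a sketch under that assumption, written in Python; the run_steps function and the step names in the usage note are hypothetical, and the actions are placeholders for whatever the real steps would do.

```python
def run_steps(steps, log):
    """Run (name, action) pairs in order; stop at the first failure.

    Each action is a callable taking no arguments. Every start, success,
    and failure is appended to `log`, so the record of what actually
    happened is produced by the run itself.
    """
    for name, action in steps:
        log.append("start: " + name)
        try:
            action()
        except Exception as exc:
            log.append("failed: " + name + ": " + str(exc))
            return False
        log.append("done: " + name)
    return True
```

A call might look like run_steps([("back up database", backup), ("recycle web servers", recycle)], log), where backup and recycle are functions you would write yourself. Each step moved into such a list is one fewer row that has to be read aloud and executed by hand.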
Do not despair. The deployment runbook actually does serve a purpose, imperfectly attempting to manage an almost unmanageable situation, and in a way that is understandable and acceptable to CEO/CTO/etc. types (and this benefit should never be discounted). But it can be improved over time.
And if all else fails, you can always quit and go work somewhere else.