Release engineering is a discipline within a field of computer science called Software Configuration Management. For most people, this is just building and installing software. Here I will show that it can be much more.
Another case for full releases
So on the next release last week, when the deploy blew away the developer's changes, the only way to recover them was from the peer server. In a single-server environment, this would not have been possible.
Yet again, this shows that partial releases have no place in live server environments.
Full or partial releases
My answer is always "No!". There are a number of reasons why, but the simplest is that doing so defeats the basic tenets of release engineering: reproducibility, traceability, accountability.
As a simple example that recently occurred in my own job, we were starting to upgrade a pair of servers that had not been touched in nearly three years. We started in on the first of the pair; however, we soon found that the code needed to be reverted. When we redeployed the backup of the war file, the code exhibited incorrect behavior, especially compared to its twin. It turns out that soon after the deployment three years earlier, a developer had also deployed a single JSP file into the exploded war directory. Three years go by, the developer has moved on, managers have forgotten everything, two different release engineers have moved on. The developer was playing around with a production system, didn't document what happened and didn't re-release the product to correct the situation. As a result, the product could have been in jeopardy if we had not been cautious and worked on only one server at a time (another topic of discussion: a lot of developers want me to deploy to all the servers at once). By only working on one, we could compare the old war with the changed war on its twin.
This anecdote illustrates one major issue with partial deployments. There was no accountability for the product: neither the manager, the developer nor the release engineer at the time of the first release held themselves accountable. There was no traceability: no documentation was kept, no build records, no labeling - just a file plopped down into a directory. There was no reproducibility: the environment could not be reproduced without looking at the twin server. If that twin production server hadn't existed, then there would have been nothing to revert to.
Full deployments from a single release distribution build uphold all of these tenets. Anything less breaks release engineering practices.
How not to release
First, they would copy the software from one environment to the next instead of installing the software from distribution files. This led to a large number of conflicts within the base configurations (usually with the multicast addresses and ports which would be forgotten).
Next, they insisted on having tweaked startup programs that were never committed to a revision control system, that replaced the ones that would have been in revision control, and that the "installation" system needed to work around (i.e., not overwrite). Instead of the operations team working with development to agree on a set of startup programs that worked across the board and having those committed to revision control, there are multiple sets of install/start/stop programs that may or may not be correctly maintained and most assuredly are not properly tracked or auditable.
Lastly, the operations team insists on configuring their "large number" of J2EE server systems by using JConsole-like GUI applications using MBeans. This is laborious, time-consuming, error-prone and unwieldy. Most sites that I have worked at generate configuration files and push them out as text to the multiple servers, modified for each specific server and environment. These files can be recreated at any time and, if properly managed through version control, can even be rolled back to any point in time. Managing a large number of servers through a GUI is an exercise in futility.
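As a rough illustration of generating per-server configuration as text, here is a minimal Python sketch. The property names, host names, addresses and ports are all made up for the example; the only point is that one template plus a small table of per-environment values (both kept in version control) can regenerate any server's file on demand.

```python
from string import Template

# Hypothetical template; the property names are illustrative only.
CONFIG_TEMPLATE = Template(
    "env=$env\n"
    "host=$host\n"
    "multicast.address=$mcast_addr\n"
    "multicast.port=$mcast_port\n"
)

# Hypothetical per-environment values, which would live in version control.
ENVIRONMENTS = {
    "qa": {"mcast_addr": "239.0.0.10", "mcast_port": 45610},
    "prod": {"mcast_addr": "239.0.0.20", "mcast_port": 45620},
}


def render_config(env: str, host: str) -> str:
    """Return the configuration text for one host in one environment."""
    values = dict(ENVIRONMENTS[env], env=env, host=host)
    # substitute() raises KeyError when a placeholder has no value,
    # so an incomplete environment definition fails loudly.
    return CONFIG_TEMPLATE.substitute(values)


if __name__ == "__main__":
    for host in ("app01", "app02"):
        print(render_config("qa", host))
```

Because the multicast address and port come from the environment table rather than from a copied installation, they cannot be silently carried over from one environment to the next.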
Distribution program
Names of distribution files, server hosts and pathnames are captured in one of the aforementioned XML files. The distribution files are copied to the distribution directory on the remote server ('server:~/vrel#'); files that have already been copied are checked with MD5 checksums. The product's install program is also implicitly copied.
After the distribution files have been copied, the program calls the product's install program on each remote server.
There are options to manage the program flow and to access the XML data files.
The program is simple, quick and efficient.
A lot of past work laid the foundation for being able to write a program such as this: individual application accounts, an install template system with more than 90% common code across 20+ products, and more than 25 products working on the same Generalized Release Process.
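The copy-and-verify step described above can be sketched roughly as follows. This is not the actual distribution program: a local directory stands in for the remote distribution directory, the XML lookup is omitted, and the function names are my own for illustration.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def md5sum(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def distribute(files: list[Path], dist_dir: Path) -> list[Path]:
    """Copy files into dist_dir, skipping any whose checksum already matches.

    dist_dir is a local stand-in for the remote distribution directory.
    Returns the files that were actually copied.
    """
    dist_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in files:
        target = dist_dir / source.name
        if target.exists() and md5sum(target) == md5sum(source):
            continue  # already distributed and intact
        shutil.copy2(source, target)
        copied.append(source)
    return copied
```

Running it twice over the same files copies nothing the second time, and a file that has changed or been damaged since the last run is copied again.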
Application accounts
The application installation directory is managed by the product's install program. The only contents should be from the installation and from the application. Users should not "move" files aside, or copy files to that directory. This is the production installation directory, not a work-area.
The home directory is to house builds, distributions, patch files, other "cruft", depending on the host. Release builds are stored under the "releng" directory by release number, for example "~/releng/1.2". QA builds are stored under the "qa" directory in the same fashion, or as the QA engineer chooses. Distributions are stored in the home directory in sub-directories starting with "v" and the release number, e.g. "~/v1.2". The install program, always found in "sbin" in the source repository, is copied for each release (since there may be changes for that release) to the distribution directory (on the destination server).
Generally builds are performed in the engineering environment and no build tools are available in the production environment. Because of co-location, the engineering environments and QA environments share home directories and the production environment is isolated. Many of the products are installed and deployed on multiple hosts, within each environment. Therefore, it is important to have a shared resource (the home directory) in each environment to be able to install distributions or patches from.
Release numbers are just symbolic names; they could be anything. Conventionally they are dotted numeral sets that match tags in the source repository (e.g. "1.4.6.1"). But a release number could be "1.2-test0012" for a QA release. This may correspond to the "1.2" source tag.
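To make the naming conventions concrete, here is a small Python sketch. It assumes a release number is a dotted numeric tag optionally followed by a "-suffix"; since release numbers could really be anything, treat the pattern and the function names as one possible convention rather than a rule.

```python
import re

# Assumed convention: dotted numerals, optionally followed by "-suffix".
RELEASE_PATTERN = re.compile(r"^(?P<tag>\d+(?:\.\d+)*)(?:-(?P<suffix>.+))?$")


def source_tag(release: str) -> str:
    """Return the dotted numeric part of a release number."""
    match = RELEASE_PATTERN.match(release)
    if match is None:
        raise ValueError(f"not a conventional release number: {release!r}")
    return match.group("tag")


def release_build_dir(release: str) -> str:
    """Home-directory path where a release build is stored."""
    return f"~/releng/{release}"


def distribution_dir(release: str) -> str:
    """Home-directory path where a distribution is stored."""
    return f"~/v{release}"
```

For example, source_tag("1.2-test0012") gives "1.2" and distribution_dir("1.2") gives "~/v1.2".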
QA tagging
The QA tag is a label (or tag) that breaks the typical definition of a label - it repeatedly gets changed, much like a branch would.
In Subversion, I create a new peer directory to the standard "branches", "tags", "trunk" directories called "qa". This directory holds the QA tags. The QA tags are copied from trunk as the release tag would have been. Changes are merged as needed. When QA approves the product for release, the release engineer will make the release tag from the QA tag.
This has the advantage that development can continue without directly affecting the QA engineer and selective revisions can be merged into the QA tag as needed (not taking HEAD every time).
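The Subversion operations involved can be sketched as below. The functions only build the command lines (they do not run them), the repository URL and working-copy path are placeholders, and the log messages are my own wording. Note that the merge runs in a working copy of the QA tag and still has to be committed afterwards.

```python
from __future__ import annotations


def create_qa_tag(repo_url: str, release: str) -> list[str]:
    """Command that copies trunk to the QA tag for a release."""
    return ["svn", "copy", f"{repo_url}/trunk", f"{repo_url}/qa/{release}",
            "-m", f"Create QA tag {release}"]


def merge_revision(repo_url: str, revision: int, qa_workdir: str) -> list[str]:
    """Command that merges one trunk revision into a QA tag working copy."""
    return ["svn", "merge", "-c", str(revision), f"{repo_url}/trunk", qa_workdir]


def create_release_tag(repo_url: str, release: str) -> list[str]:
    """Command that copies the approved QA tag to the release tag."""
    return ["svn", "copy", f"{repo_url}/qa/{release}", f"{repo_url}/tags/{release}",
            "-m", f"Release {release} from QA tag"]


if __name__ == "__main__":
    repo = "http://svn.example.com/repo"  # placeholder URL
    for command in (create_qa_tag(repo, "1.2"),
                    merge_revision(repo, 1234, "qa-1.2"),
                    create_release_tag(repo, "1.2")):
        print(" ".join(command))
```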
Types of files to be released
I have divided files in the application into four possible types:
- program - binary executables, shell scripts and library files. A library file is code that is being used by another program, but is not executed directly.
- configuration - files that define values specific to this installation, e.g. db source, appl dir, peer hosts, etc.
- persistent - files that would need to survive upgrades and/or backups, e.g. databases
- transient - files that are created for temporary use, e.g. log files, caches
Program and configuration files could be set to read-only by the install program; configuration files changed to read-write as needed. The directories for the persistent and transient files should be read-write for the owner of the application and any other application that needs access.
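One way an install program might encode the four types and their permissions is sketched below in Python. The specific modes are my assumptions: read and execute for program files, read-only for configuration, and owner plus group read-write for the persistent and transient directories (with the group standing in for "any other application that needs access").

```python
import os
import stat
from enum import Enum


class FileType(Enum):
    PROGRAM = "program"
    CONFIGURATION = "configuration"
    PERSISTENT = "persistent"
    TRANSIENT = "transient"


# Assumed modes, not taken from any particular install program.
_MODES = {
    FileType.PROGRAM: 0o555,        # read-only, executable
    FileType.CONFIGURATION: 0o444,  # read-only
    FileType.PERSISTENT: 0o770,     # directory: owner and group read-write
    FileType.TRANSIENT: 0o770,      # directory: owner and group read-write
}


def mode_for(file_type: FileType) -> int:
    """Return the permission bits assumed for a file type."""
    return _MODES[file_type]


def is_writable_by_owner(file_type: FileType) -> bool:
    """True if the assumed mode lets the owner write."""
    return bool(mode_for(file_type) & stat.S_IWUSR)


def apply_mode(path: str, file_type: FileType) -> None:
    """Set the permissions on path according to its file type."""
    os.chmod(path, mode_for(file_type))
```

A configuration file that needs editing would be switched to read-write for the change and then passed through apply_mode again.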
With this simple separation, upgrades and upgrade programs can be made much more easily. Backups can be tailored as needed. And the application's release structure is similar to its memory structure: program code separated from data, separated from heap, separated from kernel, etc.
Generalized Release Process
Of course, the release processes are completely different. At first, I did not try to merge the different processes, but developers were starting to cross "company boundaries". I decided to create a new theory of process for the developers.
The release process is broken into four components:
- build
- distribution/packaging
- installation
- deployment
Build
The build step is the traditional build of software, but for QA and production it is an isolated build process within a standardized, controlled environment. The developer environment is too fluid and unstable to be used for releases. The production build environment should be separate and only accessible to the release engineers. This helps prevent unknowns from being introduced into the release process. The build is performed on a build host, often within a revision controlled work area ("sandbox"). Additionally, QA/Prod builds are to be made from tagged code. This allows for reproducibility. This implies that builds do not perform code management/version control. Version control should not be managed by build scripts (the exception being CruiseControl-like systems for automated testing based on code changes).
For some code bases, there will be nothing to build, e.g. only Perl or PHP modules. In this case, the build step would be empty, but would still exist.
Distribution/packaging
This step will copy deliverables into a presentable directory structure for the intended target platform or environment (Windows vs. Linux, JDK 1.4 vs. JDK 1.5, and eng vs. qa vs. prod). There may be different environment sets and multiple options. Examples include environments for eng, qa, and production; options for load testing, regression testing or CruiseControl; and a choice of database engine.
Build scripts (Makefile, build.xml, maven.xml) may represent the different options and environments by separate targets, variables or options. The build scripts would select the configuration files, libraries or executables to be "shipped".
The distribution step will also package the project for installation at a later time. It will not copy outside of the release engineering environment. Note that this is not packaging of jar/war/ear files – that should be part of the build process.
The purpose of the distribution step is to select proper files, position them, and then package them for later installation.
If there is no need for changes and no need to package deliverables (one file being delivered), then there is no need for this step. Distribution occurs on the build host often within the sandbox (local revision control work area).
As a practical matter, there may be two "targets" for this step: "distribution" and "packaging". Distribution should precede packaging in the process.
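The two targets can be sketched as two small Python functions. The "lib" and "conf" layout, the "app.properties" file name and the tar.gz format are assumptions made for the example; a real project would do this through its own build script targets.

```python
from __future__ import annotations

import shutil
import tarfile
from pathlib import Path


def stage_distribution(deliverables: list[Path], config_file: Path,
                       stage_dir: Path) -> Path:
    """Distribution: select files and position them under stage_dir."""
    (stage_dir / "lib").mkdir(parents=True, exist_ok=True)
    (stage_dir / "conf").mkdir(parents=True, exist_ok=True)
    for item in deliverables:
        shutil.copy2(item, stage_dir / "lib" / item.name)
    # The caller picks the configuration file for the target environment.
    shutil.copy2(config_file, stage_dir / "conf" / "app.properties")
    return stage_dir


def package_distribution(stage_dir: Path, release: str, out_dir: Path) -> Path:
    """Packaging: archive the staged tree as <out_dir>/v<release>.tar.gz."""
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"v{release}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Everything in the archive sits under a top-level v<release> directory.
        tar.add(stage_dir, arcname=f"v{release}")
    return archive
```

Keeping the two functions separate mirrors the ordering above: the staged tree can be inspected before it is packaged, and nothing leaves the release engineering environment until installation.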
Installation
Installation is the traditional step to create a copy of a software distribution and make it available for execution or deployment. There is no distinction between environments; the previous distribution step should have handled that. This step is solely to take a packaged copy of the built project and install it on a machine or machines. Any pre-execution, pre-deployment steps that need to be made are executed here. For example, creation of directories, setting of permissions, changes to databases.
Installation will most often occur on a different machine from the build host and outside of the sandbox. An installation will likely not have targets in the build scripts, unless a build script gets packaged with the application to be used as an install script.
Deployment
Deployment is the activation of the software on the installation machine. This could be deploying a servlet to Tomcat/JBoss, adding a cron job or starting a daemon that has been installed with a startup script or some other means.
After the software has been installed on a machine, the code is deployed from the installation point to the appropriate application server.