Release Engineering: 2007

QA tagging

A QA tag is a construct that I developed over time to deal with another concept called slushing. Slushing is the refreezing of an intermediate "QA release". Conceptually, you freeze your code, thaw it slightly to make a change (a bug fix during QA), then refreeze the code with the new change, repeating each time there is a QA iteration. At the end of QA testing, the release tag is made from the QA tag.

The QA tag is a label (or tag) that breaks the typical definition of a label - it repeatedly gets changed, much like a branch would.

In Subversion, I create a new directory called "qa" as a peer to the standard "branches", "tags" and "trunk" directories. This directory holds the QA tags. A QA tag is copied from trunk, just as a release tag would have been. Changes are merged in as needed. When QA approves the product for release, the release engineer makes the release tag from the QA tag.

This has the advantage that development can continue on trunk without directly affecting the QA engineer, and selected revisions can be merged into the QA tag as needed (rather than taking HEAD every time).
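As a rough sketch of one QA cycle, the Python below only builds the svn command lines and does not run them. The repository URL, the working copy name "qa-wc" and the convention of naming the QA tag after the release number are assumptions made for illustration; they are not part of the process described above.

```python
def qa_tag_commands(repo_url, release, fix_revisions, working_copy="qa-wc"):
    """Return the svn command lines for one QA cycle (nothing is executed).

    repo_url, release and working_copy are hypothetical example values.
    """
    trunk = f"{repo_url}/trunk"
    qa_tag = f"{repo_url}/qa/{release}"
    release_tag = f"{repo_url}/tags/{release}"

    commands = [
        # Freeze: copy trunk to the QA tag.
        ["svn", "copy", trunk, qa_tag, "-m", f"Create QA tag {release}"],
        # A working copy of the QA tag is needed to merge fixes into it.
        ["svn", "checkout", qa_tag, working_copy],
    ]
    for revision in fix_revisions:
        # Slush: merge one selected trunk revision, then refreeze by committing.
        commands.append(["svn", "merge", "-c", str(revision), trunk, working_copy])
        commands.append(
            ["svn", "commit", working_copy, "-m", f"Merge r{revision} into QA tag {release}"]
        )
    # QA approved: the release tag is made from the QA tag, not from trunk.
    commands.append(
        ["svn", "copy", qa_tag, release_tag, "-m", f"Release {release} from QA tag"]
    )
    return commands
```

Calling qa_tag_commands("http://svn.example.com/repo", "1.2.0", [101, 105]) would list the initial copy, a checkout, two merge/commit pairs and the final copy to "tags".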

Types of files to be released

One issue that I have seen come up again and again in badly conceived projects is the mixing of files that should have been properly partitioned when the distributions were created. One very simple example is log files being created in the application's root directory. This makes it more difficult for utilities and system maintenance routines to operate, and for upgrades to be performed within the release engineering framework.

I have divided files in the application into four possible types:

program - binary executables, shell scripts and library files. A library file is code that is used by another program, but is not executed directly.
configuration - files that define values specific to this installation, e.g. db source, appl dir, peer hosts, etc.
persistent - files that would need to survive upgrades and/or backups, e.g. databases
transient - files that are created for temporary use, e.g. log files, caches

Each of these types of files should be separated into different directory trees and should not be mixed. With proper design, this should be easy to achieve.

Program and configuration files could be set to read-only by the install program, with configuration files changed to read-write only when they need to be edited. The directories for the persistent and transient files should be read-write for the owner of the application and for any other application that needs access.

With this simple separation, upgrades and upgrade programs can be made much more easily. Backups can be tailored as needed. And the application's release structure is similar to its memory structure: program code separated from data, separated from heap, separated from kernel, etc.
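A minimal sketch of the four-way separation might look like the following. The directory roots (borrowed loosely from common Unix conventions), the product name "myapp" and the exact permission bits are all assumptions for illustration; only the four types and the read-only versus read-write split come from the text above.

```python
from enum import Enum
from pathlib import PurePosixPath


class FileType(Enum):
    PROGRAM = "program"
    CONFIGURATION = "configuration"
    PERSISTENT = "persistent"
    TRANSIENT = "transient"


# Hypothetical layout: one separate directory tree per file type, with the
# directory mode an install program might apply to it.
LAYOUT = {
    FileType.PROGRAM: (PurePosixPath("/opt/myapp"), 0o555),        # read-only
    FileType.CONFIGURATION: (PurePosixPath("/etc/myapp"), 0o555),  # read-only
    FileType.PERSISTENT: (PurePosixPath("/var/lib/myapp"), 0o770), # owner and group read-write
    FileType.TRANSIENT: (PurePosixPath("/var/tmp/myapp"), 0o770),  # owner and group read-write
}


def target_path(file_type, relative_name):
    """Return where a file of the given type belongs."""
    root, _mode = LAYOUT[file_type]
    return root / relative_name


def owner_can_write(file_type):
    """True if the owner write bit is set on the tree for this file type."""
    _root, mode = LAYOUT[file_type]
    return bool(mode & 0o200)
```

With a table like this, a backup job could walk only the persistent tree, and an upgrade could replace the program tree without touching anything else.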

Release notes... finally

I've finally gotten the company that I work for to standardize the release notes that are being submitted.

Previously, some groups would send text files with some information, other groups would send PDF or MS Word files with other information, and still others wouldn't send any release notes at all.

The principles of the new release notes format are:

easy to write
target the audience
document change (only)
information related to this release of this product
easy to read (audience targeting again)

The format is a text file broken into two sections:

A set of name/value pairs that show information like product name, release number, list of developers, etc.
Descriptions of super-system impact, changes to the product, and changes to be made to the OS, DB or release procedures.

The release notes should not include the complete release procedures. Those should be documented elsewhere. The release notes should only describe changes to the release procedures.
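To make the two-section layout concrete, here is a small parsing sketch. The exact syntax is not specified above, so the "Name: value" lines, the blank line separating the two sections, the "[Heading]" markers and every value in SAMPLE are assumptions invented for this example.

```python
# Hypothetical release notes file: name/value pairs, a blank line, then
# description sections introduced by "[Heading]" lines.
SAMPLE = """\
Product: billing-service
Release: 2.3.1
Developers: A. Developer, B. Developer

[Super-system impact]
None.

[Changes to the product]
Fixed rounding in invoice totals.

[Changes to release procedures]
Run the schema update before installing.
"""


def parse_release_notes(text):
    """Split release notes into (fields, sections) dictionaries."""
    header_text, _, body_text = text.strip().partition("\n\n")

    fields = {}
    for line in header_text.splitlines():
        name, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"expected 'Name: value', got {line!r}")
        fields[name.strip()] = value.strip()

    sections = {}
    current = None
    for line in body_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            current = stripped[1:-1].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)

    descriptions = {name: "\n".join(lines).strip() for name, lines in sections.items()}
    return fields, descriptions
```

A fixed layout like this keeps the notes easy to write by hand while still letting a script check that the required fields are present.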

Project vs. Product

One issue that has been going around my company lately is the difference between "project" and "product". This difference has affected the development cycle, project management and release engineering.

Simply, a project is a work effort and a product is a deliverable. Releases are made of products, not of projects.

A project could consist of multiple products over possibly multiple releases, or just be one release of one product. A project is a statement of work, of change.
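One way to picture the distinction is as a small data model. The class and field names below are hypothetical and are only meant to show that releases hang off products, while a project merely groups the releases it caused.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProductRelease:
    """A deliverable: one release of one product."""
    product: str
    version: str


@dataclass
class Project:
    """A work effort: a statement of change that results in product releases."""
    name: str
    releases: list = field(default_factory=list)

    def products(self):
        """Names of the products this project touches, sorted."""
        return sorted({release.product for release in self.releases})


# Example values are invented: one project spanning two products and
# two releases of one of them.
migration = Project(
    "currency-migration",
    [
        ProductRelease("billing", "2.3.0"),
        ProductRelease("billing", "2.3.1"),
        ProductRelease("portal", "1.8.0"),
    ],
)
```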

Generalized Release Process

As many companies have, my current employer has acquired a number of other companies and tried to integrate their technologies and employees into the fold.

Of course, the release processes are completely different. At first, I did not try to merge the different processes, but developers were starting to cross "company boundaries". I decided to create a new theory of process for the developers.

The release process is broken into four components:

build
distribution/packaging
installation
deployment

Build

The build step is the traditional build of software, but for QA and production it is performed as an isolated build process within a standardized, controlled environment. The developer environment is too fluid and unstable to be used for releases. The production build environment should be separate and accessible only to the release engineers. This helps prevent unknowns from being introduced into the release process. The build is performed on a build host, often within a revision-controlled work area ("sandbox").

Additionally, QA/Prod builds are to be made from tagged code. This allows for reproducibility. This implies that builds do not perform code management/version control. Version control should not be managed by build scripts (exception being CruiseControl-like systems for automated testing based on code changes).
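A build wrapper could enforce the tagged-code rule with a check along these lines. This is only a sketch: it assumes the repository layout with "tags" and "qa" directories described in the QA tagging post, the function name is made up, and a repository path that happens to contain a directory called "qa" or "tags" higher up would fool it.

```python
from urllib.parse import urlsplit

# Directories whose children count as tagged code (assumed layout).
TAG_ROOTS = ("tags", "qa")


def is_tagged_source(source_url):
    """True if the URL points at or below a named tag under tags/ or qa/."""
    segments = [part for part in urlsplit(source_url).path.split("/") if part]
    # Ignore the last segment so that a bare ".../tags" (no tag name) fails.
    return any(segment in TAG_ROOTS for segment in segments[:-1])
```

A QA or production build script would refuse to start when is_tagged_source() returns False for its source URL, leaving trunk and branch builds to the developer environment.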

For some code bases, there will be nothing to build, e.g. only perl or php modules. In this case, the build step would be empty, but would still exist.

Distribution/packaging

This step will copy deliverables into a presentable directory structure for the intended target platform or environment (Windows vs. Linux, JDK 1.4 vs. JDK 1.5, and eng vs. qa vs. prod). There may be different environment sets and multiple options. For example, there may be environments for eng, qa and production, and options for load testing, regression testing or CruiseControl, or for selecting different database engines.

Build scripts (Makefile, build.xml, maven.xml) may represent the different options and environments by separate targets, variables or options. The build scripts would select the configuration files, libraries or executables to be "shipped".

The distribution step will also package the project for installation at a later time. It will not copy outside of the release engineering environment. Note that this is not packaging of jar/war/ear files – that should be part of the build process.

The purpose of the distribution step is to select proper files, position them, and then package them for later installation.

If there is no need for changes and no need to package deliverables (for example, only one file is being delivered), then there is no need for this step. Distribution occurs on the build host, often within the sandbox (local revision control work area).

As a practical matter, there may be two "targets" for this step: "distribution" and "packaging". Distribution should precede packaging in the process.
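The two targets could be sketched in Python as two small functions. The build tree layout ("bin" plus "conf/app.<environment>.conf") and the file names are assumptions for the example; a real project would more likely express this as targets in its Makefile or build.xml.

```python
import shutil
import tarfile
from pathlib import Path


def distribute(build_dir, dist_dir, environment):
    """Select the proper files and position them for one environment."""
    build_dir, dist_dir = Path(build_dir), Path(dist_dir)
    # Position the built program files (copytree creates dist_dir as needed).
    shutil.copytree(build_dir / "bin", dist_dir / "bin")
    # Select the configuration file for the target environment (eng, qa, prod).
    (dist_dir / "conf").mkdir(parents=True, exist_ok=True)
    shutil.copy2(
        build_dir / "conf" / f"app.{environment}.conf",
        dist_dir / "conf" / "app.conf",
    )
    return dist_dir


def package(dist_dir, archive_path):
    """Package the positioned files for later installation."""
    dist_dir = Path(dist_dir)
    with tarfile.open(str(archive_path), "w:gz") as archive:
        # Store everything under a single top-level directory in the archive.
        archive.add(str(dist_dir), arcname=dist_dir.name)
    return Path(archive_path)
```

Keeping the two functions separate mirrors the ordering above: package() only ever sees a tree that distribute() has already tailored to one environment, and nothing is copied outside the release engineering area.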

Installation

Installation is the traditional step to create a copy of a software distribution and make it available for execution or deployment. There is no distinction between environments; the previous distribution step should have handled that. This step is solely to take a packaged copy of the built project and install it on a machine or machines. Any pre-execution, pre-deployment steps that need to be made are executed here. For example, creation of directories, setting of permissions, changes to databases.

Installation will most often occur on a different machine from the build host and outside of the sandbox. An installation will likely not have targets in the build scripts, unless a build script gets packaged with the application to be used as an install script.

Deployment

Deployment is the activation of the software on the installation machine. This could be deploying a servlet to Tomcat/JBoss, adding a cron job or starting a daemon that has been installed with a startup script or some other means.

After the software has been installed on a machine, the code is deployed from the installation point to the appropriate application server.

Installation program templates

I work at a company with a number of products. As a release engineer, part of my job is to try to standardize process as much as possible to enhance reproducibility.

One part of the process that many release engineers know to be tedious and labor-intensive is the creation, testing and maintenance of installation programs. I manage releases for over a dozen products; each had its own installation process. There had to be a better way.

I developed a Python script for installations that was specially designed with one small (5-10 line) section that could be replaced by a real segment of code for the product that needs the script. This one small section defines the directory structure for the product that will need to be (re-)created before the distribution can be put in place. It also defines what kind of distribution file should be expected as input to the program (tarfile, directory, single file, etc.).

This allows me to create template instantiations for the installation programs for each of the products based on one single install program. All the install programs behave the same way, have the same options and have the same feature sets.
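The script itself is not reproduced here, but the idea can be illustrated with a much-reduced sketch. Everything in it is an assumption: the marker comments, the variable names, the example product "myapp" and the three distribution types handled are invented for illustration and are not taken from the real script.

```python
import shutil
import tarfile
from pathlib import Path

# --- BEGIN product-specific section (the only part replaced per product) ---
PRODUCT = "myapp"                                # hypothetical product name
DIRECTORIES = ["bin", "conf", "data", "logs"]    # structure to (re-)create
DISTRIBUTION_TYPE = "tarfile"                    # "tarfile", "directory" or "file"
# --- END product-specific section ---


def install(distribution, install_root):
    """Create the product's directory structure, then put the distribution in place."""
    root = Path(install_root) / PRODUCT
    for name in DIRECTORIES:
        (root / name).mkdir(parents=True, exist_ok=True)

    if DISTRIBUTION_TYPE == "tarfile":
        with tarfile.open(str(distribution)) as archive:
            archive.extractall(str(root))
    elif DISTRIBUTION_TYPE == "directory":
        # dirs_exist_ok requires Python 3.8 or later.
        shutil.copytree(str(distribution), str(root), dirs_exist_ok=True)
    elif DISTRIBUTION_TYPE == "file":
        shutil.copy2(str(distribution), str(root / "bin"))
    else:
        raise ValueError(f"unknown distribution type: {DISTRIBUTION_TYPE}")
    return root
```

Because only the marked section differs between products, a fix to the shared logic can be regenerated into every product's install program at once.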