In Praise of Continuous Deployment

It doesn’t matter if you get there, every step along the way is an improvement.
Me, praising Continuous Deployment

Ever since coming across the idea on Eric Ries’s blog I’ve always been a big fan of Continuous Deployment. For those unfamiliar with the term, it means writing your code, testing frameworks and monitoring systems in such a way that it is possible to completely automate the process of going from source control commit to deployment to a live system without posing a quality melt down. This means teams can find themselves deploying 50 times a day as a matter of course.

Yeah

It’s not without it’s critics, and a lot of people see this as one way ticket to putting out poorly tested buggy code. I think that those folk completely miss the point and that in many scenarios in fact the opposite is true. The thing I really like though, is that, whether or you ever get to the point of automatically deploying every commit to live, every step that you might take to get there is hugely positive.

So, really, what would have to happen in order to employ a Continuous Deployment regime?

18 months ago my then team started to take this idea more seriously, I thought it would be interesting to give an overview of the steps taken towards Continuous Deployment, and since we’re certainly not there yet, what we plan to do in the future.

We started from a point where we would release to live environment every few weeks. Deployments, including pre and post deploy testing could take two people half a day sometimes more. I should also say that we are dealing with machine to machine SaaS systems where the expectation is that the service is always available.

Reduce manual deployment load

Our first efforts aimed to reduce the human load on deployment through automation. Fear meant that we still needed to ssh into every node to restart but every other step was taken care of. This meant that it eventually became common place to deploy multiple times a week across multiple platforms.

Improve system test coverage

Once a deploy was live we were still spending considerable time on behaviour verification. To address this we worked to improve our system and load testing capability. Doing so meant that we had more time to manually verify deploy specific behaviour, safe in the knowledge that the general behaviour was covered by the tester.

Improve system monitoring

This approach also requires a high level of trust in system monitoring. We have our own in house monitoring system whose capabilities we expanded during this period. In particular, we improved our expression language to better state what constituted erroneous behaviour and we also worked on better long term trend analysis, taking inspiration from this paper . It’s no surprise to me that it came out of IMVU who have been practicing Continuous Deployment for a long time.

Reduce deploy size

Since the act of deployment was now much less expensive we looked to reduce the number of changes that went out in each deploy. At first this felt false, after all if the user can’t use the feature in it’s entirety, what’s the point? We soon realised that smaller chunks were easier to verify and sped us up over time. We took an approach that I’ve since heard referred to as ‘experiments’ so that new functionality could be deployed live but was hidden from regular users. It meant that we could demo new functionality in production, without disrupting the business as usual service.

Embrace lean inspired methodology

Breaking down deploys into a few day’s worth of work also improved our lead time meaning that we could more responsive in the event of a change of plan. It was during this period that we switched from time boxing to Kanban. This is interesting since Continuous Deployment is often championed by the lean startup movement.

The future

More recently, actively pursuing Continuous Deployment has taken a back seat, but the next logical steps could be to further flesh out the system test coverage and then look to completely automate deployment to the staging environment (modulo database changes).

However, it doesn’t really matter what we do next, if it takes us a step closer to theoretically being able to deploy continuously it will undoubtedly improve our existing lead time and responsiveness.

This post contains a number of Continuous Deployment resources, but a few further articles I found to be interesting follow include:-

Bad reasons to adopt Kanban

David Anderson recently posted Why do we use Kanban?

He proposes two lists, one entitled ‘Why do we use Kanban’ and a second entitled ‘We do not use Kanban because’, the second list contains complimentary practices that are themselves not a reason to adopt Kanban. For instance it is not necessary to adopt Kanban if all you want to do is dispense with iterations.

The thing I found really interesting about this post is that the reasons my team first adopted a kanban system were drawn almost entirely from the second list. In particular our dissatisfaction with time boxing.

This may, in part, be due to us coming at this very much from a software development perspective, rather than considering the entire value stream.

David lists the following as reasons for practising Kanban

  • Evolutionary, incremental change with minimal resistance
  • Achieve sustainable pace by balance throughput against demand
  • Quantitative Management and emergence of high maturity behavior in alignment with senior management desire to have a highly predictable business
  • Better risk management (the emerging theme in the Kanban community)

It’s only through practising kanban and learning more about the lean principles underpinning it that the real benefits such as improved risk management and greater predictability have become apparent.

In Defence of ‘Scrum but…’

Scrum, as a framework, is practiced in many different organisations and can be considered a mainstream approach to software development. To me agile is way of scaling common sense and while I no longer practice Scrum I consider it an excellent implementation and would happily return to it if my circumstances were better suited to time boxing. However as with all great ideas that are adopted widely, there are many instances where people are simply doing it wrong.

Warren Buffett has a neat way of describing this:

First come the innovators, who see opportunities that others don’t. Then come the imitators, who copy what the innovators have done. And then come the idiots, whose avarice undoes the very innovations they are trying to use to get rich.

In the case of software it’s reasonable to replace the goal of getting rich with that of successfully delivering software that is valued by the customer.

As such there are a wealth of posts discouraging the adaption and variation of Scrum, in fact the notion of Scrumbut is so well understood it even has a definition.

scrumbut [skruhmbut] noun.
1. A person engaged in only partially Agile project management or development methodologies
2. One who engages in either semi-agile or quasi-waterfall development methodologies.
3. One who adopts only SOME tenents of the SCRUM methodology.
4. In general, one who uses the word “but” when answering the question “Do you do SCRUM?”

Project management will necessarily vary from company to company. As such the ubiquity of a term like Scrum has less value within a specific company, since the shared understanding of the company’s approach ought to be evident to all. When I talk to people outside of my company it’s useful to reference commonly understood terms as a starting point. For time boxed approaches Scrum is an obvious point of reference, but the specific system that I wish to describe is highly likely to deviate from Scrum in some way and I do not necessarily see this as a bad thing.

The challenge of software development is to produce something that someone will find valuable. There are plenty of different ways to achieve this and all are highly dependent on the context of the project. Generally an iterative and incremental approach that bakes quality in from the start is preferable and a number of ready made implementations exist such as Scrum, Kanban or Evo. In some cases it may be appropriate to lift something like Scrum wholesale, but in many other cases, there are legitimate reasons for adapting the process. For example a team might be writing an implementation of a system where the behaviour is well understood. In fact in order for the system to be compliant and therefore usable it must adhere to a large pre-written spec provided by an external party. Clearly this is at odds with Scrum but there is no need for the team to drop time boxing or stand ups as a result.

So adapting a process is not in itself something to be frowned upon. Clearly bad project management is bad, and if the manager blames their own failings on their inability to implement a methodology then larger problems exist. The key point is that good managers should not worry that they are running Scrumbut, or indeed if they are ‘doing it right’ as the only metric for correctness is that the team maximises the value they create.

Kanban in practice

My team has been using a Kanban board to manage our day to day work flow for the past few months having switched from time boxed iterations, what follows is a description of how we use the board in practice. This post concerns itself only with day to day development and does not cover high level planning.

First some context. The team is made up of four developers, a technical lead and myself as project manager. We are responsible for the ongoing development, deployment, support and maintenance of six hosted products that are expected to run continuously. We would expect to release new functionality to the live environment a ‘few’ times a week.

The Board

Our is board is positioned in clear view of the team and we conduct stand ups in front of it.

Example Kanban Board

The board contains 5 sections

  • Not Started – Stories we have committed to completing and that can be started at any time.
  • In progress – Stories where development and verification are underway
  • Ready to deploy – Stories that are ready to be deployed into the live environment
  • Done – Stories deployed live
  • Blocked – A place to track stories that have become blocked by parties outside of the team as project manager it is generally my responsibility to unblock these stories.

We limit work in progress in two places, currently we assign roughly a week’s work for ‘Not Started’ and a further week’s work for the combined total of ‘In Progress’ and ‘Ready to Deploy’.

We do not consider these limits to be ‘hard’ and will from time to time exceed them, however in doing so we will have had a conversation to ensure that we are happy to be exceeding the limit temporarily.

Stories

Stories are represented by post it notes, in addition to a brief description of the story each card is presented in a standard form like so:-

Post It Example

Size

Many lean practitioners view estimation to be a form of waste and therefore something to eliminate. Estimating is still valuable to us as it provides a means to build consensus over what the story entails, this is especially useful for less experienced members of the team who may not have considered all aspects of the story fully.

Project

While each story is a deployable artefact they generally form part of a larger set. By grouping projects in this way the story description can be simpler since the project itself provides context.

Dates

We record the date a story entered the ‘Not Started’ state, the date we began work on it and the date the story was deployed into the live environment. Doing so allows me to answer questions like ‘if we were to start a three point story right now, how long would it take to get it live?’, this in turn allows me to justify spending a fortnight on improving our deployment process as the positive impact is there for all to see.

Description

We try to keep the description as brief as possible, by ensuring that lead time is minimised to a few days it is practical to communicate decisions verbally. For complicated stories we can provide more specific details via a wiki or a tool like agilefant.

Pink Notes

In addition to planned project work we will receive a certain level of support and maintenance requests, that we represent by pink post its. These tasks are large enough to add up, but each one will typically take no more than half a day. Anyone within the team can create a pink note and we ensure that our board has enough slack to accommodate these new tasks.

New pink notes are discussed at stand up, after which there are three outcomes:-

  • The task has been prioritised and someone is already working on it (this is rare and occurrences should be minimised)
  • The task is added to our standard work flow and does not receive special prioritisation – we work on the assumption that our lead time is sufficiently low to respond quickly.
  • The task is removed and ignored. We explicitly do not use bug tracking software reasoning that tasks we choose to ignore when they are fresh in our minds are unlikely to be returned to.

The volume of pink notes is monitored so that more realistic estimates can be made for project completion as well as tracking gradual increases and decreases over time.

Maintenance

Smaller day to day maintenance tasks like reviewing error logs and looking into monitoring alerts/trends are handled by a designated maintainer. One off tasks that will take more than a few hours to complete can be added to the board as a pink note. The role of maintainer switches on a weekly basis amongst the developers and the aim is that the non maintaining developers can focus on project work and/or pink notes.

Performance Tracking

We attempt to track team performance over time so that we provide estimates for project work with greater confidence. It also means that we can track the impact of changes to the team e.g. what is the impact of moving a developer on loan to another team?

To do this we use two different methods. The first draws on Aslak Hellesøy’s work, and measures standard lean metrics such as lead time and WIP. The second provides some supplementary metrics such as finer granularity over what types of work were delivered on a weekly basis and finer granularity of lead time between phases. Aslak’s sheet generates a number of graphics, one being a cumulative flow chart which neatly visualises our recent Christmas change freeze.

A second sheet of my own making produces a chart for velocity, this is an agile rather than a lean metric, but I find it useful to track where we are spending our effort. The yellow ‘Total Velocity’ line saw tooths at about ~ 14 points, suggesting that we work to a fortnightly rather than weekly cadence. It also shows that during late Nov early Dec we spent more time on unplanned pink notes than we did on project work. Much of the pink note load will have come from outside of the team so it’s important that we have a way to make the impact visible.

The Switch to Kanban

Recently my team have been trialling the use of a Kanban board to manage our work flow and in doing so have dropped time boxing. So the far the switch has been very positive, while it may not be an appropriate transisition for all teams I thought that it might be useful to share our experiences.

In brief, Kanban is an agile development methodology that does not prescribe time boxing (i.e. fortnightly iterations), rather it focusses on limiting work in progress. For each stage of a story’s life span e.g. ‘in progress’ or  ‘ready to deploy’ the team has a maximum amount of work that can exist. As stories are completed the team pull new stories from the project backlog. For more information I would recommend Henrik Knibergs excellent comparison of Scrum and Kanban

Why Kanban?

We originally adopted time boxing about 6 months ago, the key benefits we were looking for included:-

  • A sense of rhythm and points to reflect on our working practices.
  • Better visibility over tasks that were dragging on.
  • A highly visible feedback loop to help improve our estimations.

For the most part the adoption of time boxing was very successful, we’ve certainly improved our abilities to estimate, and generally we have better visibility over our performance. The thing that didn’t work so well was working to the self imposed iteration deadline since it often felt false and sometimes was counter productive.

Any team working to an iteration will occasionally finish early or fail to complete all tasks, it’s not ideal but manageable so long as the team continually refines it’s working practices to minimise occurrences. In my team’s specific case this is especially hard since so much of our work is dependent on our interactions and integrations with companies much larger than ourselves. Experience tells us that it is practically impossible to predict the time spent on an integration and even harder to guess at what point the time will be spent. As a result, despite working in a predictable manner on the majority of our iteration goals it was common that we would under or over shoot.

This is hardly the end of the world though it started to chip away at the sanctity of the iteration and continually raised questions of why the system was pushing us in ways that felt wrong to us i.e. if we finish early then we can take another story from the back log so long as it fits into the time remaining, however ideally, for a given project, we’d want to the story with the greatest risk attached which may or may not fit in the time allotted. It felt that time boxing was nudging us away from this idea.

Furthermore, my company is such that our developers typically have a very good level of product domain knowledge, as such the need for a formal Scrum style Project Owner is reduced. This means that while we would demo our work to interested parties on completion of features, come the end of an iteration it was rare to demo our progress.

It was clear that while providing some benefits we were fighting against time boxing and we needed to find an approach that better described our work flow. What we were aiming for was not so much to complete a set of tasks in a pre-allotted time period, more that we should be able to regularly produce deployable features in a predictable manner. By limiting the work in progress rather than limiting the work per time Kanban presented a viable alternative we felt better reflected how we actually work, while preserving the discipline necessary to deliver working software multiple times a week.

Initial findings

Since mid iteration our practices were already similar to that of Kanban we found that migration to the system caused very little disruption and we did not notice a productivity hit while we got used to the new system.

The migration has brought about subtle but positive changes to how we work, largely due to the fact that by explicitly limiting the amount of current work it forces conversations to be had where we are close to violating the limits.

In addition to making it easier for us to refine our process, the main benefits so far include:-

  • Greater flexibility in our work flow, we can respond in days rather than the three weeks a two week iteration promises. This is key, as delivering supplier driven changes quickly provides significant competitive advantage.
  • On a related note, we no longer feel that we are fighting our process. It is no longer the case that we regularly take a course of action that violates our process (albeit for perfectly sensible reasons).
  • The board provides a highly visible place to display tasks, communicate priorities and highlight bottle necks. These benefits are not Kanban specific however Kanban’s greater emphasis on use of the board (through tracking work in progress limits) means that any benefits are magnified.
  • It saves us an iteration meeting – the role of the meeting is covered in stand ups or in project specific design meetings.
  • Our stand ups have become much more focussed, this is partly because we do them in front of the board. Again this is not Kanban specific but in Kanban running them anywhere else makes no sense at all. Id’ say that we’ve halved the length of time we devote to stands ups, where further discussion is necessary it is taken outside of the stand up.
  • Our retrospectives are no longer coupled to the period of our iteration, currently we still run retrospectives fortnightly, though since our stand ups are now more effective for day to day tweaks we may extend this period.

In terms of cons, thus far we do not feel that we have lost anything by adopting Kanban, the only potential problem that we have encountered is that greater discipline is required in ensuring that all tasks are completed in a timely manner.

The future

So far we are very happy with Kanban and will continue to use it.

In order to measure performance over time Kanban teams track two metrics.

  • Throughput
  • Lead time – the time it takes for a feature to be delivered after the team has committed to its delivery.

At present we do not have enough data for meaningful results but long term we will maintain something much like this as suggested here.

Further reading

In addition to Scrum vs Kanban comparison referenced above you may find the following useful.

Finally not everyone is as sold as I am on Kanban for some anti Kanban debate see:-

I found the comments that follow the posts to be of particular interest.