The Complex, the Complicated and what it means for NASA

When people say ‘Simplifying Complexity’ I hear ‘I don’t know what these words mean’. This isn’t entirely fair, as we shall see, but it gives you an idea of my preconceptions before attending a training session of that title.

The facilitator started off by inviting each member of the group to talk about the influence of complexity in their professional lives. As we went around the room it became clear that for most of the group complexity was a synonym for hard or complicated. Many would agree, including the Oxford English Dictionary. However, here was a group dominated by senior members of a very large IT professional services company. These are people for whom complexity science is highly relevant to their professional lives and people for whom there ought to be value in defining Complexity with a big C and in particular differentiating the Complex from the Complicated.

Cynefin

The Cynefin Framework characterises systems (and problem spaces) into five classes – Simple, Complicated, Complex, Chaotic and Disordered. Once the class of a system is identified the framework provides guidance on how to work most effectively with the system. On a personal level I find it very useful: it explains not only why an incremental and iterative approach generally works well in software development, but also under what circumstances such an approach does not hold. You won’t see NASA using Scrum for instance, more on that later.

Cynefin can be expressed visually like so:-

 

[Figure: the Cynefin framework, as of 1st June 2014. Credit: Wikipedia]

The key distinction I see is that between the Complex and the Complicated.

  • Complicated

An example of a complicated task is building a submarine. It’s definitely not trivial, but if the tasks are broken down sufficiently the problem is tractable and predictable – essentially it becomes a series of Simple tasks. By contrast a Complex task retains its complexity even after being broken down.

  • Complex

An example of a Complex task could be to ask a room of people to arrange themselves such that the closest person to them is exactly half the distance from them as that of the 2nd closest. It would be extremely difficult for an individual to orchestrate this task, instead each individual actor must make a small change, observe the consequence and then make a further change based on the feedback.
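To get a feel for this, the exercise can be sketched as a toy simulation. This is only an illustration under my own assumptions: people are points on a plane, `discomfort` is a made-up measure of how far someone is from the goal, and each person tries a small random move and keeps it only if their own discomfort falls. No claim is made that the room ever settles; every accepted move can upset the neighbours, which is rather the point.

```python
import math
import random


def nearest_two(positions, i):
    """Distances from person i to their closest and 2nd closest neighbours.

    Needs at least three people in the room.
    """
    x, y = positions[i]
    dists = sorted(
        math.hypot(x - ox, y - oy)
        for j, (ox, oy) in enumerate(positions)
        if j != i
    )
    return dists[0], dists[1]


def discomfort(positions, i):
    """Hypothetical measure: zero when the closest person is exactly
    half as far away as the 2nd closest."""
    d1, d2 = nearest_two(positions, i)
    return abs(d1 - d2 / 2)


def step(positions, i, rng, stride=0.05):
    """Probe (try a small move), sense (compare), respond (keep or undo)."""
    before = discomfort(positions, i)
    x, y = positions[i]
    candidate = (x + rng.uniform(-stride, stride),
                 y + rng.uniform(-stride, stride))
    trial = positions[:i] + [candidate] + positions[i + 1:]
    if before > discomfort(trial, i):
        return trial       # the move helped this person, so keep it
    return positions       # otherwise stay put


def simulate(n_people=8, rounds=200, seed=1):
    """Let everyone take turns adjusting; returns the final positions."""
    rng = random.Random(seed)
    positions = [(rng.random(), rng.random()) for _ in range(n_people)]
    for _ in range(rounds):
        for i in range(n_people):
            positions = step(positions, i, rng)
    return positions
```

Notice that nobody in the sketch holds a plan for the whole room. Each move is judged only by the person making it, and its effect on everyone else is discovered afterwards.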

What does this mean for Software?

Taking a manufacturing production line as an example, once it reaches steady state the system can be described as Complicated and is predictable. This is important because for many years Software Development looked to the world of manufacturing for guidance and, in doing so, implicitly defined software development as a Complicated task. The thinking was that with sufficient up front analysis the problem could be solved without a line of code being written. It is for this reason that the Waterfall model rose to such prominence and was so readily adopted.

More recently manufacturing has been shown to be an inadequate metaphor: software development is in fact more akin to product design, and therefore inhabits the Complex domain. This means that the correct approach, according to Cynefin, is to probe, sense and respond: in effect, the iterative and incremental approach practised by those inspired by Agile and Lean thinking.

So what does this mean for NASA?

If what I say is correct, what does this mean for NASA? After all they have some of the smartest brains on the planet available to think about this sort of thing and yet their processes appear to assume a Complicated rather than a Complex environment.

NASA is an extreme organisation and it is only natural that what works for them will deviate from the common case. Despite the challenging nature of their work I would argue that their domain is Complicated rather than Complex.

Complex systems are characterised by there being many unknowns and an inability to determine the nature of those unknowns at the beginning of the process. In NASA’s case they are genuinely in a position where, for software at least, the problem space is well defined.

For instance, the software runs upon hardware based on Intel’s 386 architecture, hardly cutting edge, but whose behaviour (warts and all) is well understood and has not changed in years. This means that the inputs to the system are well understood and the behaviour deterministic. NASA has managed to turn what would ordinarily be a Complex system (that of software development) into a Complicated system and in doing so their software teams work to vanishingly small defect rates, albeit at enormous cost and at the expense of delivery time. I found this article to be a fascinating insight into their working practices.

In conclusion, Complexity science provides a means to characterise systems, and the Cynefin framework provides definitions to differentiate the Complex from the Complicated. Software development is almost always Complex though it was originally assumed to be Complicated. This is why agile and lean approaches have been shown to be much more effective than traditional methods, such as Waterfall, that assume a Complicated system.

In Praise of Continuous Deployment

It doesn’t matter if you get there, every step along the way is an improvement.
Me, praising Continuous Deployment

Ever since coming across the idea on Eric Ries’s blog I’ve always been a big fan of Continuous Deployment. For those unfamiliar with the term, it means writing your code, testing frameworks and monitoring systems in such a way that it is possible to completely automate the process of going from source control commit to deployment to a live system without posing a quality melt down. This means teams can find themselves deploying 50 times a day as a matter of course.
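As a minimal sketch of the idea, and not of any particular tool, a commit can be thought of as passing through an ordered list of automated stages and only counting as live if every one of them succeeds. The stage names and the `run_pipeline` and `is_live` helpers below are illustrative assumptions of mine.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that takes a commit id and reports success.
Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(commit: str, stages: List[Stage]) -> List[str]:
    """Run each stage in order and stop at the first failure.

    Returns the names of the stages that passed.
    """
    passed = []
    for name, check in stages:
        if not check(commit):
            break
        passed.append(name)
    return passed


def is_live(commit: str, stages: List[Stage]) -> bool:
    """A commit is live only if no stage stopped it."""
    return len(run_pipeline(commit, stages)) == len(stages)
```

In use, the list might read something like unit tests, system tests, deploy, then monitoring checks. The interesting property is that a failure anywhere stops the commit without a human having to notice.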


It’s not without its critics, and a lot of people see this as a one-way ticket to putting out poorly tested buggy code. I think that those folk completely miss the point and that in many scenarios the opposite is in fact true. The thing I really like, though, is that whether or not you ever get to the point of automatically deploying every commit to live, every step that you might take to get there is hugely positive.

So, really, what would have to happen in order to employ a Continuous Deployment regime?

18 months ago my then team started to take this idea more seriously. I thought it would be interesting to give an overview of the steps we took towards Continuous Deployment and, since we’re certainly not there yet, what we plan to do in the future.

We started from a point where we would release to the live environment every few weeks. Deployments, including pre and post deploy testing, could take two people half a day, sometimes more. I should also say that we are dealing with machine to machine SaaS systems where the expectation is that the service is always available.

Reduce manual deployment load

Our first efforts aimed to reduce the human load on deployment through automation. Fear meant that we still needed to ssh into every node to restart, but every other step was taken care of. This meant that it eventually became commonplace to deploy multiple times a week across multiple platforms.

Improve system test coverage

Once a deploy was live we were still spending considerable time on behaviour verification. To address this we worked to improve our system and load testing capability. Doing so meant that we had more time to manually verify deploy specific behaviour, safe in the knowledge that the general behaviour was covered by the tester.

Improve system monitoring

This approach also requires a high level of trust in system monitoring. We have our own in house monitoring system whose capabilities we expanded during this period. In particular, we improved our expression language to better state what constituted erroneous behaviour and we also worked on better long term trend analysis, taking inspiration from this paper. It’s no surprise to me that it came out of IMVU, who have been practising Continuous Deployment for a long time.
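As a rough sketch of the kind of rule such a system might evaluate (this is not the in house expression language described above; the function name, the trailing-mean baseline and the 50% tolerance are all assumptions made for illustration), a new reading can be compared against the recent trend rather than against a fixed threshold:

```python
from statistics import mean


def is_anomalous(history, latest, tolerance=0.5):
    """Flag `latest` if it strays too far from the recent average.

    `history` is a list of recent readings (for example errors per minute).
    `tolerance` is the allowed deviation as a fraction of the baseline;
    0.5 is an arbitrary illustrative default, not a recommendation.
    """
    if not history:
        return False  # nothing to compare against yet
    baseline = mean(history)
    return abs(latest - baseline) > tolerance * abs(baseline)
```

A rule like this stays useful as normal traffic grows, where a hard coded limit would need constant retuning.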

Reduce deploy size

Since the act of deployment was now much less expensive we looked to reduce the number of changes that went out in each deploy. At first this felt false; after all, if the user can’t use the feature in its entirety, what’s the point? We soon realised that smaller chunks were easier to verify and sped us up over time. We took an approach that I’ve since heard referred to as ‘experiments’ so that new functionality could be deployed live but was hidden from regular users. It meant that we could demo new functionality in production, without disrupting the business as usual service.
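A minimal sketch of how an ‘experiment’ of this sort might look in code follows. The `EXPERIMENTS` table, the user names and the `render_dashboard` function are hypothetical; the only point is that the new code path ships to production but is reachable solely by users who have been opted in.

```python
# Hypothetical table: experiment name -> users allowed to see it.
EXPERIMENTS = {
    "new_report": {"demo_user", "product_owner"},
}


def is_enabled(experiment, user, experiments=EXPERIMENTS):
    """True if this user has been opted in to the named experiment."""
    return user in experiments.get(experiment, set())


def render_dashboard(user):
    """Regular users get the existing sections; opted-in users also
    see the unfinished feature."""
    sections = ["summary"]
    if is_enabled("new_report", user):
        sections.append("new_report")
    return sections
```

Removing the experiment once the feature is finished is then a small, separate deploy of its own.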

Embrace lean inspired methodology

Breaking down deploys into a few days’ worth of work also improved our lead time, meaning that we could be more responsive in the event of a change of plan. It was during this period that we switched from time boxing to Kanban. This is interesting since Continuous Deployment is often championed by the lean startup movement.

The future

More recently, actively pursuing Continuous Deployment has taken a back seat, but the next logical steps could be to further flesh out the system test coverage and then look to completely automate deployment to the staging environment (modulo database changes).

However, it doesn’t really matter what we do next, if it takes us a step closer to theoretically being able to deploy continuously it will undoubtedly improve our existing lead time and responsiveness.

This post contains a number of Continuous Deployment resources, but a few further articles I found to be interesting include:-

Why I don’t use bug tracking software – slides

In addition to speaking on Kanban at XP Day 2010 I also gave a short lightning talk based on my earlier fragile posts on bug tracking (1,2). Initially I was apprehensive about standing up in front of a room of agilistas and telling them I’d dispensed with digital bug tracking, but since the previous session had been about throwing everything out (really really everything, just deploy to live), I felt positively conservative.

Motivation (almost) for free

Individuals and interactions over processes and tools

So reads the first line of the Agile Manifesto. Agile (and Lean) frameworks differentiate themselves from traditional software project management in the value that they place in people.

I’m always confused when people attempt to separate out line and project management in the context of software. To me they are intrinsically linked, and cannot meaningfully be considered in isolation. In fact I’d go so far as to say that one of the principal drivers of the agile movement is not so much what it says about project management, significant as that is, but more what it says about motivation and simply letting smart people get on and do their jobs.

As an illustration, I’d like to use Martin Haworth’s ‘Ten Tips About People Management’ as a representative article of a more general approach to people management. For each point I’ve highlighted how it is commonly addressed in the context of an agile team.

1. Talk to Your People Often
By building a great relationship with your people you will bring trust, honesty and information. This gives you a head start in Performance Management of your people.

Daily stand ups provide a frequent and regular period to interact with the team. They don’t provide one on one time, but they do mean that raising impediments and sharing ideas is commonplace, such that it then feels much more natural for one on one conversations to occur.

2. Build Feedback In
On the job two-way feedback processes gets rid of the nasty surprises that gives Performance Management such a bad name. By building it in as a natural activity, you take the edge away.

Agile is all about rapid feedback, both at a technical level in the form of TDD and CI and at a personal level through daily stand ups and a commonly agreed definition of done.

3. Be Honest
By being frank and honest, which the preparation work in building a great relationship has afforded you, both parties treat each other with respect and see each other as working for everyone’s benefit.

With lead times measured in days, trust and openness are essential for any agile team; where honesty does not exist the whole process collapses.

4. Notice Great Performance
When you see good stuff, shout about it! Let people know. Celebrate successes and filter this into formal processes.

At a team level daily stand ups and visualisation of work flow provide this for free. Furthermore, since features are delivered in a state ready for production, regular product demos provide the customer with a hands on measure of progress.

5. Have a System
Performance Management is a process and needs some formality – especially for good personnel practice and record. This need not be complicated, but it needs to be organised and have timescales.

Agile frameworks have little to say directly on the subject of performance management, though there is an assumption that team members are continually looking to improve their skills and performance. Agile working models introduce the idea of cadence where there is a period dedicated to retrospection and continuous improvement.

6. Keep it Simple
But do keep it simple. If you have a relationship with your people that is strong anyway, you already know what they are about. Formal discussions can be friendly and simple, with formality kept to a minimum.

With an emphasis on verbal communication, it’s easy to have serious conversations in a relaxed but constructive manner.

7. Be Very Positive
Celebrate great performance! Focus on what’s going well. It’s about successes and building on strengths, not spending ages on their weaknesses – that serves no-one. Go with the positives!

Again, constant feedback through regular delivery of working software, reinforces and encourages good practice.

8. Achieve Their Needs
Remember that we all have needs that we want fulfilling. By working with your people to create outcomes that will do this, you will strengthen your relationships and channel effort in a constructive direction.

Since teams are typically cross functional, team members are exposed to a range of challenges, and have a clearer idea of where they feel they are strongest. While an agile framework does not specifically look to fulfil the longer term needs of an individual, it at least attempts not to pigeonhole them into specific roles.

9. Tackle Discipline
Whilst it often happens, Performance Management is not about managing indiscipline. That has to be managed in a different way. By setting clear standards in your business that everyone understands and signs up to, discipline becomes much, much easier.

In the same way that team success and good performance are highly visible to the team, individual poor performance is also obvious. Expectations of good practice are generally arrived at through team consensus and so could not be clearer. If the expectations are inappropriate or unrealistic then the team has the power to amend as necessary.

10. Learn from Mistakes
As part of regular on-the-job and informal review, mistakes will come to light; things will go wrong. By using the ‘What went well? And ‘What could you do differently?’ format, the unsatisfactory performance becomes controllable and a positive step.

Where teams are correct to take credit for their success, it is equally important that they take responsibility for failure. Examples include retrospectives or Lean style 5 Whys root cause analysis.

Software is an industry not always known for the strength of its people management. By pushing human issues to the fore, agile and lean frameworks ensure that not only is the management of the project considered, but also the management of the people. In a practical sense, while applying agile principles won’t necessarily make me a good manager, they make me less likely to be a bad one.

Agile makes me SMART

Performance review 101, objectives should be SMART. Theoretically, SMART objectives make an awful lot of sense to me, but when asked to set objectives for a developer to cover the next 6 months it really doesn’t feel practical.

Normally, where I have a management problem, the agile community is a great source of inspiration and advice. It turns out that while it has little to say about performance review (why should it?), I’ve been using SMART objectives all along in the form of stories.

Let’s have a look at what SMART means and how it applies to stories.

Specific

By breaking down epics into manageable features it’s possible to isolate a small chunk of behaviour, such that all involved have a clear idea of the scope of the problem.

Measurable

Acceptance tests provide a clear means to determine when the story is done. In many cases the tests can be expressed in terms of ‘Given a set of preconditions, when a specific event occurs, expect the following behaviour’. This serves to reduce ambiguity and provide black and white tests to determine success.
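To show the shape such a test might take, here is a small sketch in plain Python. The `Basket` class and the story it tests are invented purely for illustration; what matters is that the given, when and then steps map directly onto the lines of the test.

```python
class Basket:
    """Hypothetical domain object, present only to give the test
    something to exercise."""

    def __init__(self):
        self.items = {}

    def add(self, sku, quantity=1):
        self.items[sku] = self.items.get(sku, 0) + quantity

    def total_items(self):
        return sum(self.items.values())


def test_adding_an_item_already_in_the_basket_increases_its_quantity():
    # Given a basket that already holds one widget
    basket = Basket()
    basket.add("widget")

    # When the customer adds two more widgets
    basket.add("widget", quantity=2)

    # Then the basket holds three widgets on a single line
    assert basket.items == {"widget": 3}
    assert basket.total_items() == 3
```

The story is done when this passes and not before, which leaves very little room for debate at review time.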

Attainable

Stories are estimated by the developers charged with their implementation. If the developers cannot see a way to approach the problem then it will not be possible to provide an estimate and the story cannot be accepted. At this point the story will either need to be broken down, or reconsidered before being re-estimated in its new form.

Relevant

The story must in some way relate to the bigger picture. To do this, stories refer to features recognisable by the customer as opposed to tasks recognisable only to the development team.

Time Bound

Working in time boxed iterations, or alternatively a work in progress limited flow, means that clearly defined expectations for delivery exist. Where these expectations are not met it is clear to all and it is time to discuss what went wrong.
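For the work in progress limited case, the constraint can be sketched in a few lines. The `Column` class and its method names are hypothetical, and a real board is of course a wall or a tool rather than a Python object; the sketch only shows that a full column refuses new work until something is finished.

```python
class WipLimitReached(Exception):
    """Raised when a column is already holding as much work as it allows."""


class Column:
    def __init__(self, name, wip_limit):
        self.name = name
        self.wip_limit = wip_limit
        self.stories = []

    def pull(self, story):
        """Start a story, but only if there is capacity for it."""
        if len(self.stories) >= self.wip_limit:
            raise WipLimitReached(
                f"{self.name} is at its limit of {self.wip_limit}"
            )
        self.stories.append(story)

    def finish(self, story):
        """Completing a story is what frees capacity for the next one."""
        self.stories.remove(story)
```

Hitting the limit is the signal to stop and talk, in the same way that a missed iteration goal is in a time boxed process.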

I have a long held belief that one of the main reasons that the agile movement has been so effective is that it builds good management practices in from the ground up. SMART objectives in the form of stories are just one example.

Bad reasons to adopt Kanban

David Anderson recently posted Why do we use Kanban?

He proposes two lists, one entitled ‘Why do we use Kanban’ and a second entitled ‘We do not use Kanban because’. The second list contains complementary practices that are themselves not a reason to adopt Kanban. For instance, it is not necessary to adopt Kanban if all you want to do is dispense with iterations.

The thing I found really interesting about this post is that the reasons my team first adopted a kanban system were drawn almost entirely from the second list, in particular our dissatisfaction with time boxing.

This may, in part, be due to us coming at this very much from a software development perspective, rather than considering the entire value stream.

David lists the following as reasons for practising Kanban

  • Evolutionary, incremental change with minimal resistance
  • Achieve sustainable pace by balance throughput against demand
  • Quantitative Management and emergence of high maturity behavior in alignment with senior management desire to have a highly predictable business
  • Better risk management (the emerging theme in the Kanban community)

It’s only through practising kanban and learning more about the lean principles underpinning it that the real benefits such as improved risk management and greater predictability have become apparent.