In my last article I explored why high-performing teams and technology-driven organisations are investing in resilience to improve their ability to evolve business-critical systems quickly. Resilience is the key to being agile while remaining reliable and secure. In this article I’m going to show you how to get started gradually by investing in your own system’s resilience capacities through 7 properties that you can start developing today.

Invest in Resilience Capacities
The good news is that investing in resilience can start gradually. In fact I’d argue that the hardest part is the switch in mindset from “It must work” to “It will fail, so we need to be better prepared for that”. This mindset change is the force behind the cultural change in your approach to resilience and reliability. In parallel you can start to focus on the systemic capacities that you can develop to improve reliability and security while maintaining and improving your speed of delivery, such as:

  • Developing and improving your capacity to anticipate.
    Can we see problems coming? What signals are we looking out for?
  • Developing and improving your capacity to synchronize.
    When we anticipate something, how do we bring the right resources to bear?
  • Developing and improving your capacity to respond.
    With our resources in play, how do they respond? How effective are those responses?
  • Developing and improving your capacity to learn.
    Given how we anticipate, synchronize and respond to inevitable problems with reliability and security, how do we learn from them and share what we learn most effectively across the organisation?

In a future article we’ll dive into some of the strategies we’ve seen working to develop these capacities in different real-world contexts. For now, let’s dive a little deeper into how you can develop those capacities for the challenge of system reliability.

Develop Resilience Properties for a Specific Goal
In my experience there are 7 key measurable properties that you can develop across your business-critical socio-technical systems to improve your resilience for a given concern, e.g. reliability. Those 7 properties are best expressed as a set of questions we can ask about a desirable systemic quality, call it X for now, that we are looking to develop with resilience:

  • How do we define X?
  • How do we observe X?
  • How do we explore X?
  • How do we fix/improve X?
  • How do we continuously verify X?
  • How do we learn with regards to X?

For the case of developing your system’s reliability, you would interpret these 7 properties as:

  • How do we define reliability?
  • How do we observe reliability?
  • How do we explore reliability?
  • How do we fix/improve reliability?
  • How do we continuously verify reliability?
  • How do we learn with regards to reliability?

For each new desirable systemic quality that you wish to develop you build a plan that purposefully invests in developing each of these properties for that quality. As you iterate over those plans you are looking to gradually and measurably invest in developing your socio-technical system’s resilience capacities.
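
As a rough illustration of what such a plan might look like in practice, here is a minimal Python sketch. The question templates are taken directly from the list above; the function names and the idea of tracking answers in a dictionary are assumptions made purely for illustration, not a prescribed tool or format.

```python
# Question templates taken from the article's list; "{quality}" stands in for X.
QUESTION_TEMPLATES = [
    "How do we define {quality}?",
    "How do we observe {quality}?",
    "How do we explore {quality}?",
    "How do we fix/improve {quality}?",
    "How do we continuously verify {quality}?",
    "How do we learn with regards to {quality}?",
]


def questions_for(quality):
    """Return the property questions for one systemic quality (hypothetical helper)."""
    return [template.format(quality=quality) for template in QUESTION_TEMPLATES]


def open_questions(quality, answers):
    """Return the questions that have no answer recorded yet (hypothetical helper).

    `answers` maps a question string to whatever the team has agreed so far.
    """
    return [q for q in questions_for(quality) if not answers.get(q)]


if __name__ == "__main__":
    # Example: a team has only agreed on how it defines reliability so far.
    plan = {"How do we define reliability?": "Agreed availability targets per service"}
    for question in open_questions("reliability", plan):
        print("Still to answer:", question)
```

Iterating on the plan then means revisiting the unanswered questions for each quality, such as reliability or security, and recording what the team has put in place.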

Next Steps
In this article you’ve learned about the 4 resilience capacities that you can invest in to gain the advantages of speed of delivery without sacrificing crucially important concerns such as reliability and security. Those capacities can be improved and evolved for reliability and security by building concrete plans to define, observe, explore, fix/improve, continuously verify and learn: the 7 properties to develop for resilience in a particular area.

In future articles I’ll explore some concrete approaches to developing each of these resilience properties in specific contexts, such as tips for developing reliability in a FinTech environment.

Russ Miles, a Lead Associate of esynergy, is on a mission to help organisations establish agile, reliable, secure and, ultimately, resilient and humane socio-technical systems that enable all stakeholders, from the users and customers to the builders and operators, to thrive inside and outside of those systems.

Russ is currently a lead engineer with Segovia Technology at Crown Agents Bank, where his team develop the payment and foreign exchange systems that help incredible organisations such as the UN and Save the Children distribute much-needed funds to hard-to-reach countries and markets.

Russ is co-founder of the free and open source Chaos Toolkit project. He’s also an international consultant, trainer, speaker, and author. He is a recognised expert in Chaos Engineering and has contributed to “Chaos Engineering: System Resiliency in Practice” from O’Reilly Media as well as having written “Learning Chaos Engineering”, also by O’Reilly Media, where he explores how to build trust and confidence in modern, complex systems by applying chaos engineering to surface evidence of system weaknesses before they affect your users.

Russ can be reached on LinkedIn and on Twitter.
