
The concept of resilience: a new buzzword

February 22, 2021

There seems to be a lot of talk, and a lot of articles, these days about resilience.

I have somewhat ignored the term, but recently read an interesting piece in Forbes: What Is True Resilience? (Hint: It’s Not About Managing Risk).

Before I cover that piece, it is interesting to see what people have said about the difference between ‘risk’ and ‘resilience’.

One academic has written (key sentence highlighted):

Resilience is essential to living in a world filled with risk. Resilience has historically been defined as the ability to return to the status quo after a disturbing event. However, in the face of a changing climate and growing population, resilience cannot be based on the capacity to recover from the sorts of disasters we have faced in the past, but requires that we build capacity to avoid damage and/or recover from the sorts of disasters we can expect to face in the future. If our goal is a sustainable future, we must understand the risks we will face and prepare for those risks through adaptation and mitigation measures. Resilience is crucial in this endeavor, as it is our capacity to cope with both expected events and surprises. To this end, it is critical that we identify, assess, communicate about, and plan for risks that the future will bring.

The OECD has shared:

The ability of households, communities and nations to absorb and recover from shocks, whilst positively adapting and transforming their structures and means for living in the face of long-term stresses, change and uncertainty. Resilience is about addressing the root causes of crises while strengthening the capacities and resources of a system in order to cope with risks, stresses and shocks.

Professor Linkov of the University of Connecticut tells us:

Traditional risk management focuses on planning and reducing vulnerabilities. Resilience management puts additional emphasis on speeding recovery and facilitating adaptation.


The Forbes article is written by a practitioner rather than an academic or consultant. That makes it more interesting, as it is based on experience born of responsibility.

Will Grannis is the leader of Google Cloud’s Office of the CTO and says his customers are asking how his organization’s services “stay resilient in the face of many unexpected, unpredictable events”.

This experience is of interest:

Just this week, unprecedented weather patterns across the U.S. pushed many IT and business leaders to virtual “war rooms” in order to ensure capacity, networking, and applications were instantly and persistently available. But those rooms were in the moment, rapidly assembled and then rapidly disassembled—just like the technology that underpins the real-time applications and services we all depend on. This is the new normal, and it calls for a new model of operations. Rather than setting a fixed reliability as the calculation for contracts and practices, the focus must be on resiliency under any number of conditions.

Building on that, he says (key sentence and words highlighted):

True resilience isn’t about managing a particular instance of risk, but being ready for anything through the way you operate. Today’s disasters may come from wild, unanticipated success (leading to traffic spikes) as much as devastating unforeseen failure (be that a natural disaster, a political event, or a system configuration error that cascades into a global outage).

The rest of the article explains what happened at Google Cloud and some of their philosophy around architecting their services for the general (not specific) customer. There is a continuing article about their approach to resilient IT.


This reminds me of my own experience when I was a vice president in IT at a large financial institution. One of my responsibilities was to develop a disaster recovery plan for our two data centers. I was able to hire a wonderful lady, Ann Tritsch, as my DRP Coordinator (a direct report at manager level). She led the initial effort[1] and we soon faced an important question: did we need to build separate sections of the plan covering the various causes of a disaster?

Operations already had sound processes in place to address and recover promptly from a short outage and our task was to determine how data center operations would recover from an event or situation that would shut down one or both data centers for a longer period. This could be the result of:

  • A fire
  • An earthquake (we were in Southern California)
  • A flood (we were in an area that could possibly flood if a dam broke or there was an extended period of torrential rain), or
  • Some other reason

At that time, emerging thinking was that the planning should address how you recovered, regardless of the cause. That is how we built our plan (with the help of a software solution, I should add).


But the DRP was not enough.

We still had to concern ourselves with making sure the likelihood of a disaster and the effect on the business were minimized – given cost and other constraints.

For example, our senior vice president (Ron Reed) led an effort to determine whether it would be viable to establish what would amount to mirroring the data center. He was looking at the possibility of sending copies of every transaction processed at one data center to the other by satellite (which we did not yet have – and this was before the age of the internet). But the cost was prohibitive. In addition, the two data centers were less than 20 miles apart, so a regional disaster could well affect both.

Ann performed, with the assistance of the operations staff, a review that we would today call a risk assessment. It considered each of the causes we might anticipate and confirmed that we had an acceptable set of measures in place. For example, we considered loss of power and examined the power system and the ability to either switch to a different power station or rely on our battery back-ups. We also recognized that there was a single point of failure in the network, where all traffic from outside Southern California passed through a single station, but there was little we could do to minimize the possibility of an outage.


This still was not enough. While there were some causes of a prolonged outage that we could identify, there was always the possibility of an ‘unknown unknown’: something happening that we could not seriously identify as a likely event, such as being hit by a meteor, a pandemic (worse than today’s), or a terrorist attack.

With this in mind, we developed another plan that we called a Disaster Preparedness Plan. The DPP was designed to help us recover from any event (including unknown unknowns) that would cause more than a short disruption of our data center’s operations.

The DPP included a detailed Communications Plan. While we didn’t know with certainty who in management might be required to respond to the event or situation, we developed the necessary structure and processes.


Between these initiatives and plans, we did what we could to make ourselves what would today be called ‘resilient’.


What I like about the idea of resilience is that, rather than designing a response around specific foreseeable events and situations, it prepares you (as best you can) for what you cannot predict.

To quote the Google executive again:

True resilience isn’t about managing a particular instance of risk, but being ready for anything through the way you operate.


Personally, I believe in monitoring and considering what might happen so you can both include it in decision-making and be prepared to respond to foreseeable events and situations.

But I also believe in being as prepared as possible to respond to (and mitigate if you can) unforeseeable events and situations.


So, resilience merits our attention in addition to or as an integral part of any ‘risk management’ activity. (As usual, please note that I much prefer managing for success.)


There is one more, very important, aspect to this discussion.

In the same way that you should be prepared for, and resilient to, unforeseen adverse events and situations, you need to be agile, and sufficiently aware of and responsive to unforeseen opportunities!

People pay far more attention to the first and far too little to the second.


I welcome your thoughts.

[1] Unfortunately, we lost her before the plans were completed.

  1. February 22, 2021 at 8:37 AM

    It may be a “buzz word” in your domain, but resilience is the basis of all we do in the complex engineered systems domain.

    Here’s the framework
    https://www.incose.org/incose-member-resources/working-groups/analytic/resilient-systems

    Here are the practices
    https://www.sebokwiki.org/wiki/System_Resilience
    https://www.researchgate.net/publication/334549424_Systems_Engineering_for_Resilience

    If you watched Perseverance land https://mars.nasa.gov/mars2020/timeline/landing/ all the hardware, all the software, and all the “human” processes at JPL and around the world were specifically designed to be resilient; otherwise, the probability of success would be very low.
    Perhaps there is some guidance from the complex adaptive system of systems domain that would be applicable to your business domain?

  2. John J Brown
    February 22, 2021 at 9:09 AM

    Excellent post! I support the concept of “early detection and rapid response” as a key to resilience. I tried to include a graphic illustrating the concept but am not able to.

  3. February 22, 2021 at 12:44 PM

    There is another term that is a companion to resilience, but is too often used as a synonym, which it is not: sustainability. Climate change advocates in particular misuse the two.

    In this rapid-fire world of ours, we are beginning to see organizations and communities pulsed by hazards with a frequency and ferocity that were unimaginable 10 years ago, and the statistics are beginning to confirm this. The key thought is that resilience is a critical, but not an infinite, capacity. In an age where short-term thinking and strategic myopia are contagious, it is all too common for executives and individuals to fail to realize that resilience to an event or series of events is pointless if the fundamental premise of the situation is unsustainable.

    By way of example:

    – FEMA is still supporting raising or hardening structures in areas that are fundamentally unsustainable within the likely time horizon of economic utility, and the progression of sea level rise. It promotes resilience, but ignores fundamental sustainability.
    – Diversifying one’s portfolio of hydrocarbon production distributes risk and enhances resilience, but is of little benefit if expanding the portfolio in a de-carbonizing economy risks adding stranded assets.
    – Substituting and expanding combat drones to reduce the risk and increase the resilience to pilots is of little benefit if the fundamental data network that they all depend upon is not sustainable, and is susceptible to putting the entire enterprise at risk of compromise.
    – Creating crops that are resilient to drought is of diminished value if they are deficient in yield and nutritional value.
    – Building a fighter that is resilient against detection, but whose resulting design characteristics do not permit it to carry enough fuel and ordnance to sustain it in battle, is rather pointless.
    – Modernizing and distributing a nuclear force is of little benefit if a nation is greatly exposed to being incapacitated by a more likely cyber-attack.
    – The internet is only as resilient as the quality of its information, integrity, interoperability and source of energy.
    – Automating processes to reduce exposure to the cost and availability of labor is of little value if the talent needed to run the more complex automated process is even more scarce than the labor it seeks to replace.

    So the underlying theme is that resilience options must be considered in context to be effective. And no amount of resilience can achieve its goal in a context that is fundamentally unsustainable. In the long run, sustainability trumps resilience, and it’s important to know the difference.

  4. February 23, 2021 at 4:38 AM

    Great piece, and I fully agree. A lot of different “things” will be thrown at you in terms of disasters or disruptions and other changes in your business conditions. Your resilience is your ability to survive and even prosper despite all of these.

    That is NOT done through management of individual risks (as suggested by risk management traditionalists), but through intelligent building of a platform which enables effective handling.

    To provoke: for some, having a “super task force” able to make good and effective decisions fast can/will be more effective in ensuring resilience than any and all risk mitigation plans. Those doing that the most (combat soldiers) do not have risk management for this and that and the other specific risk, but they have practiced a load of threat scenarios to ensure that if/when something happens, they will know how to act effectively and fast.

  5. Tarun
    February 25, 2021 at 11:17 PM

    Very nicely articulated. As I am from India, where, as with many other things, we take our inspiration from nature.

    So, taking a cue from nature, to me resilience is defined by the two events that happen to plants cyclically every summer and during snow. In summer most of the grass dries out and is often burnt, naturally or otherwise; similarly, during the snow season the plants’ existence all but vanishes.

    However, come rains or spring, the new offshoots appear and there is flowering all over again.

    So to me resilience is the ability to adapt to changes and, when needed, go into survival mode so that when the time comes we are ready to flourish again.

    Resilience is the ability to adapt to changing times, good or bad, and make the best of the ongoing circumstances.
