Inability to Undo Implies Risk Assessment is Necessary
Recently, I was thinking about a past effort to develop communication patterns that facilitate the composition of, and reasoning about, loosely coupled medical systems, and I remembered the problem of supporting the orchestration pattern: a single controller orchestrates an activity by commanding and controlling multiple medical devices that are connected to or acting on a patient.
In a software-only environment, the above pattern would be easier to think about and achieve with the help of transactions (as found in databases). Loosely speaking, the controller would instruct each device to perform specific actions. If any device failed to perform its action, the controller would instruct the other devices to undo theirs. (We won’t focus on how an action may be undone, e.g., by rolling back the action or by performing a counter action that nullifies its effect.)
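This all-or-nothing behavior can be sketched as a compensation loop. The following is a minimal sketch, not a real device protocol: the (action, undo) pairs and the ActionFailed exception are hypothetical stand-ins for device commands.

```python
class ActionFailed(Exception):
    """Raised when a device cannot perform a commanded action."""

def orchestrate(steps):
    """Run (action, undo) pairs; on any failure, undo the completed
    actions in reverse order, mimicking all-or-nothing semantics."""
    completed = []  # undo callbacks for actions that already succeeded
    for action, undo in steps:
        try:
            action()
        except ActionFailed:
            for compensate in reversed(completed):
                compensate()  # roll back or counteract the earlier action
            return False
        completed.append(undo)
    return True
```

Note that the undo callbacks run in reverse order of the original actions, the usual convention for compensation, since later actions may depend on the effects of earlier ones.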
While all-or-nothing semantics of transactions offer a simple solution, they hinge on the ability to undo actions. Many actions in the real world cannot be undone.
A good non-software example is injecting the wrong medication into a patient’s bloodstream. A good software example is leaking sensitive data that was left unsecured.
When depending on an action that cannot be undone, assess the risk associated with the inability to undo the action and employ appropriate risk avoidance and mitigation strategies.
Assessing a risk means considering the risk itself, its likelihood, and its magnitude (impact). The choice of risk avoidance and mitigation strategies should follow from these three aspects.
In the first example, the risk could be injecting the wrong medication. Its likelihood could be once every 60 days in a specific hospital. Its impact could be the death of a patient. Since one death every 60 days is undesirable for a hospital (or any institution), both risk avoidance strategies (e.g., using medication checklists, cross-checking medication) and risk mitigation strategies (e.g., keeping counteracting medicine on hand) should be employed to handle the risk.
In the second example, the risk could be the failure of a word processor’s auto-save feature, which saves the current document to the disk every ten minutes. Its likelihood could be once every 1000 automatic saves. Its impact would be the loss of up to ten minutes of user effort. Clearly, this risk is not comparable to the one in the previous example, so we can decide not to adopt a risk avoidance strategy. As for mitigation, saving the document every ten minutes itself suffices as a risk mitigation strategy.
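The auto-save numbers can be turned into a quick expected-loss check, multiplying likelihood by impact. This is a sketch with one added assumption: an eight-hour working day, which is not part of the example above.

```python
def expected_loss(failure_rate, loss_per_failure):
    """Likelihood times impact: the core of the risk assessment."""
    return failure_rate * loss_per_failure

saves_per_day = 6 * 8  # a save every 10 minutes over an assumed 8-hour day
failures_per_day = saves_per_day * (1 / 1000)  # fails once every 1000 saves
minutes_lost_per_day = expected_loss(failures_per_day, 10)  # 10 min per failure
```

At these rates the expected loss works out to well under a minute of effort per day, which is why accepting the risk (with the periodic save as the only mitigation) is defensible.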
Suppose the likelihood changed to once every ten automatic saves. Now ten minutes of user effort are lost for every 100 minutes of effort, which would be unacceptable. In this case, we can adopt a risk avoidance strategy of saving the file both to the disk and to the cloud. (Of course, we then have to consider the risk of failing to save the file to the cloud and the risk of a data breach in the cloud.)
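Assuming disk and cloud failures are independent, the benefit of the redundant save can be estimated by multiplying the two failure probabilities. The cloud failure rate below is an illustrative assumption, not a figure from the example.

```python
def both_fail_probability(p_disk, p_cloud):
    """Chance that a single save attempt fails on both targets,
    assuming the two failures are independent."""
    return p_disk * p_cloud

p_disk = 1 / 10     # the degraded rate from the example: once every ten saves
p_cloud = 1 / 1000  # assumed failure rate for cloud saves
p_both = both_fail_probability(p_disk, p_cloud)
```

With these numbers, work is lost only when both saves fail on the same attempt, i.e., once every 10,000 saves; the independence assumption is what makes the redundancy pay off.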
Why is this Important in Software Development?
With software systems becoming more connected and more decentralized (including IoT), we are getting into a world where considering and handling failures of components outside a system will become an integral part of designing, developing, deploying, and operating the system.
Writing this post, I can’t help but think of the ways this aspect will affect our current software development practices, e.g., software design, build, testing, and maintenance. Since most of our current practices are geared toward reasoning about minimally connected or centralized systems, they will have to change quite a bit if they are to enable the development of new systems that are highly connected or decentralized.
As we try to adopt and adapt existing practices to such new systems, we will realize the limits of these practices and devise new ones. We are already witnessing such realizations and new practices with the rise of microservice architecture. Even so, could this be just the tip of the iceberg? Will we have deeper realizations that overhaul the way we think, reason, and go about developing this new breed of software systems? If so, what will they be?
If you are interested in how we develop software, then I am certain you too will find this new frontier both interesting and exciting.