This week I gave two keynote presentations at two different conferences (ICSOB 2017 in Essen and EASE 2017 in Karlskrona – please don’t ask me how many days on the road I log every year). As part of my keynotes, I bring up our “Stairway to Heaven” model (see figure below) and the adoption of continuous deployment in almost every industry that I work with. Interestingly, at the end of both talks the same question came from the audience: how do we protect the quality of systems deployed in the field when we adopt continuous deployment?
The interesting aspect of this question is that it rests on two underlying assumptions. The first is that if testing and validation are done slowly and manually, quality will be higher than if testing is done automatically and fast. The second is that quality is higher when I build the software for a system, freeze it, test the heck out of it and then put it into products during manufacturing with the intent of never touching it again.
Once we make these assumptions explicit, it is obvious that both are wrong. No amount of pre-deployment testing will find each and every issue in each and every deployment of the system. Customers use systems in different contexts, configure them differently and use them differently. Setting up a test and validation environment that covers all permutations would be prohibitively expensive. This is also the case for safety-critical systems, even if we set higher pre-deployment testing standards for these types of systems. However, any quality assurance system that incorporates the post-deployment phase inherently leads to higher quality than one that does not.
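To make the combinatorial argument concrete, here is a back-of-the-envelope sketch in Python. The configuration dimensions and their sizes are entirely made up for illustration; the point is only how quickly the permutations multiply:

```python
from math import prod

# Hypothetical configuration dimensions for a deployed system.
# The numbers are illustrative, not taken from any real product.
dimensions = {
    "hardware_variants": 12,
    "os_versions": 8,
    "feature_flag_combinations": 2 ** 10,  # 10 independent boolean flags
    "locales": 30,
    "customer_integrations": 25,
}

permutations = prod(dimensions.values())
print(f"Distinct field configurations: {permutations:,}")
# Roughly 74 million combinations. At one hour of validation per
# configuration, exhaustive pre-deployment testing would take more
# than 8,000 years of sequential test time -- and this toy example
# still ignores differences in actual usage.
```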
Concerning the second assumption, it is interesting to see the difference in approach between the software & system safety community and the security community. The latter will actively patch security holes through new software deployments when security issues are identified. The safety community will do everything it can to avoid deploying new software versions, as the risk of introducing new safety issues is considered to outweigh the benefit of fixing a known safety concern and the costs of re-certification are prohibitively high.
It is obvious that a validation and deployment infrastructure that collects information from deployments in the field, identifies concerns before customers even notice them and deploys new software whenever quality concerns of any type are identified is superior to a system that lacks some or all of these characteristics. I would even go as far as to say that we have a moral and ethical obligation to adopt this approach, especially in safety- and security-critical systems.
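What would such an infrastructure look like in its simplest form? The sketch below shows one iteration of that post-deployment loop in Python, assuming a hypothetical telemetry schema and hypothetical CI/CD hooks (`build_and_test_candidate` and `deploy_to_fleet` are placeholders, not a real API):

```python
import statistics
from dataclasses import dataclass


@dataclass
class FieldReport:
    """Telemetry record sent home by a deployed unit (hypothetical schema)."""
    unit_id: str
    version: str
    error_rate: float  # errors per 1,000 operations


def detect_regression(reports: list[FieldReport], baseline: float, sigma: float) -> bool:
    """Flag a quality concern when the fleet-wide error rate drifts well
    above the known-good baseline -- ideally before customers notice."""
    fleet_rate = statistics.mean(r.error_rate for r in reports)
    return fleet_rate > baseline + 3 * sigma


def build_and_test_candidate() -> str:
    # Placeholder: in practice this would trigger the automated CI
    # pipeline and return an artifact that passed the full test suite.
    return "build-1234"


def deploy_to_fleet(build: str) -> None:
    # Placeholder: in practice a staged rollout with health checks.
    print(f"rolling out {build} to the fleet")


def quality_loop(reports: list[FieldReport], baseline: float = 0.5, sigma: float = 0.1) -> None:
    """One iteration of the loop: monitor the field and, when a concern is
    identified, roll out a fix instead of waiting for the next
    manufacturing cycle."""
    if detect_regression(reports, baseline, sigma):
        deploy_to_fleet(build_and_test_candidate())


if __name__ == "__main__":
    # Simulated fleet whose error rate (0.9) exceeds baseline + 3*sigma (0.8),
    # so the loop triggers a new deployment.
    fleet = [FieldReport(f"unit-{i}", "2.3.1", 0.9) for i in range(100)]
    quality_loop(fleet)
```

The specific thresholds are beside the point; what matters is the closed loop: field data flows back into development, and fixes flow back out continuously.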
In many contexts, ranging from medical to automotive and aeronautics, certification institutes hinder innovation by causing major delays in the adoption of new technologies, approaches and methods. Their standard argument is that they protect human lives by doing so. However, we fail to account for all the lives that could have been saved if innovations had been adopted earlier. Who is accountable for the lives lost due to delays in the adoption of new innovations? How do we measure those?
So, next time you feel the need to grasp for the good old days when everything was milk and honey, where men were men and women were women, and a product was optimal at the time of acquisition and then slowly deteriorated as it aged, remember that continuous deployment allows us to enjoy products that become better, safer and more secure throughout their lifetime. Remember that any system that uses continuous integration, continuous deployment, monitoring of systems in the field and identification of post-deployment issues is inherently of higher quality. And, finally, remember that speed drives quality, rather than being detrimental to it.