May 1st, 2023 by Adam Sandman

As we deploy AI and ML-based systems and applications, one question keeps coming up: how do we test them? With traditional software systems, humans write requirements, develop the system, and then have other humans (aided by computers) test it to ensure the results match the requirements. With AI/ML-developed systems, there often are no discrete requirements. Instead, there are large data sets, models, and feedback mechanisms. In this article, originally published on AI Business, we discuss the risks of letting machines check themselves and provide some potential solutions.

The Risks of Trusting Machines to Check Themselves

We are surrounded by risks every day, in every organization and with every activity we undertake. The field of risk management is a discipline devoted to trying to identify, quantify and manage these risks so that we as humans can live safe, secure and happy lives. Every time you fly in a plane, drive a car, swallow a pill or eat a salad, you are relying on the tireless work done every day by risk managers to make that activity safe whilst still economic.

[Figure: risk cube]

Risk management is the understanding of events that might happen, regardless of how unlikely or remote. For example, when driving a car, there is the risk that a tire might blow out: the probability is low, but the impact if it happens is quite high. Conversely, the probability that a headlight will stop working is relatively high, but the impact if that happens is moderate, though not zero (which is why you have two headlights). When assessing risks, you need to consider the combination of impact/severity and probability/likelihood. This combination is called the risk exposure, and it determines how much effort should be put into mitigating the risk (reducing either its severity or its probability).
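To make the idea of risk exposure concrete, here is a minimal Python sketch. It assumes a simple 1 to 5 scale for both probability and impact and uses their product as the exposure score, which is one common convention rather than the only one. The `Risk` class, the `rank_by_exposure` helper and the scores given to the two car examples are illustrative assumptions, not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int       # assumed scale: 1 (negligible) to 5 (catastrophic)

    @property
    def exposure(self) -> int:
        # A simple product of likelihood and severity; other formulas are possible.
        return self.probability * self.impact


def rank_by_exposure(risks):
    """Return the risks sorted from highest to lowest exposure."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


# Illustrative scores for the two examples in the text (assumed, not measured).
risks = [
    Risk("tire blowout", probability=1, impact=4),
    Risk("headlight failure", probability=4, impact=3),
]

for risk in rank_by_exposure(risks):
    print(f"{risk.name}: exposure {risk.exposure}")
```

With these assumed scores the headlight failure ranks above the tire blowout, which shows how a frequent, moderate risk can warrant more mitigation effort than a rare, severe one.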

Identifying Risks Overlooked by Humans

Traditionally, the first step in risk management is to identify all the risks, and then quantify their probability and severity. This is done by humans brainstorming possible events that can happen, and using their previous industry knowledge to make an educated guess about the probability and severity. One of the major issues in this process is that humans (especially in teams) are affected by biases such as confirmation bias, hindsight bias, and availability bias, which means that we tend to overestimate previously known risks and either ignore or downplay obscure edge cases that might be really serious.

Using Artificial Intelligence (AI) in the field of risk management offers the potential to address these issues in several ways. Firstly, when you have large datasets of real-world evidence, AI applications and Machine Learning (ML) models can both uncover hitherto unknown risks, so-called “zero-day” risks. For example, an AI system that is monitoring seemingly unrelated data feeds for weather, hospital stays and traffic could identify a risk in a new automotive system (e.g., a new autopilot mode) that had been obscured from view. In addition to identifying new risks from large uncorrelated data sets, an AI system could take previously identified risks and systematically quantify their impact and probability using external environment data rather than human intuition. This would take a lot of the guesswork and variability out of classic risk management models.

Understanding Risks in Our Models

Conversely, using AI and ML for safety-critical systems such as automotive or aerospace autopilots can itself introduce new risk into the system. When you build an AI/ML model, unlike a traditional algorithmic computer system, you are not defining the requirements and the coding steps to fulfill those requirements. Instead, you are providing a data set, some known rules and boundary conditions, and letting the model infer its own algorithms from the provided data.

That lets AI/ML systems create new methods that were not previously known to humans, and find correlations that are novel and potentially groundbreaking. However, these new insights are unproven, and may be only as good as the limited data set they were based on. The risk is that you start using these models for production systems and they behave in unexpected and unpredictable ways. This is a classic example of a problem that needs independent risk analysis and risk management, either from a separate AI/ML model that can “check” the primary one, or from human experts who can validate the possible risks that could occur in the AI system.

Harmony Between Humans and Machines

So, introducing AI and ML into complex safety or mission-critical systems brings both benefits and challenges from a risk management perspective. A sensible framework for adopting AI/ML techniques recognizes these factors and attempts to maximize the benefits whilst mitigating the challenges.

[Figure: risk model between different AIs]

Humans can check the machines

As you begin to implement AI/ML models, making sure that you have a clear grasp of the business requirements, use cases, and boundary conditions (or constraints) is critical. Defining the limits of the data sets employed and the specific use cases that the model was trained on will ensure that the model is only used to support activities for which its original data set was representative. In addition, having humans independently check the results predicted by the models is critical. This independent verification and validation (IV&V) will allow humans to ensure the AI models are being used appropriately, without introducing unnecessary risk into the system. For example, when deploying an autopilot to an automotive or aviation platform, such IV&V will verify the plane or car behaves as expected, either in a simulator or real-world test environment.
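One simple way to enforce the limits of a training data set is to record the range of each input seen during training and route anything outside that range to a human reviewer. The sketch below is a minimal illustration under that assumption. The function names `fit_envelope` and `out_of_envelope` and the feature names `speed_kmh` and `visibility_m` are hypothetical, and a real system would likely need a richer notion of "representative" than per-feature minimum and maximum values.

```python
def fit_envelope(training_rows):
    """Record the minimum and maximum seen for each feature in the training data."""
    envelope = {}
    for row in training_rows:
        for feature, value in row.items():
            low, high = envelope.get(feature, (value, value))
            envelope[feature] = (min(low, value), max(high, value))
    return envelope


def out_of_envelope(envelope, row):
    """Return the features in `row` that fall outside the training range."""
    flagged = []
    for feature, value in row.items():
        if feature not in envelope:
            # A feature never seen in training cannot be judged by the model.
            flagged.append(feature)
            continue
        low, high = envelope[feature]
        if not low <= value <= high:
            flagged.append(feature)
    return flagged


# Hypothetical training data and a new input to check before trusting the model.
training_rows = [
    {"speed_kmh": 30, "visibility_m": 800},
    {"speed_kmh": 110, "visibility_m": 200},
]
envelope = fit_envelope(training_rows)

new_input = {"speed_kmh": 140, "visibility_m": 500}
flagged = out_of_envelope(envelope, new_input)
if flagged:
    print("Route to human review, outside training range:", flagged)
```

Here the speed of 140 km/h exceeds anything in the training data, so the input is flagged for independent review rather than passed straight to the model.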

Machines can check the humans

Conversely, AI models can be used to check the risk management models and assessments made by humans to ensure they are not missing zero-day risks. Here the computer acts as the IV&V for the human decision-making process. This is a mirror image of the previous case: we’re taking advantage of machine learning models to avoid human biases and find flaws in our existing risk models.
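As a rough illustration of a machine checking human assessments, the sketch below compares human probability estimates against rates observed in operational data and flags large disagreements, including events that are missing from the human risk register altogether. Plain frequency counting stands in for the AI model here. The function name `flag_disagreements`, the `factor` threshold and all of the numbers are assumptions made for the example.

```python
def flag_disagreements(human_estimates, observed_counts, total_trips, factor=5.0):
    """Flag events whose observed rate differs from the human estimate by more than `factor`."""
    flagged = {}
    for event, count in observed_counts.items():
        observed_rate = count / total_trips
        estimate = human_estimates.get(event)
        if estimate is None:
            # The data shows an event that nobody brainstormed.
            flagged[event] = "not in human risk register"
        elif observed_rate > estimate * factor:
            flagged[event] = "underestimated"
        elif observed_rate * factor < estimate:
            flagged[event] = "overestimated"
    return flagged


# Hypothetical per-trip probabilities estimated by a human team.
human_estimates = {"tire blowout": 0.001, "sensor glare": 0.0001}

# Hypothetical event counts observed across 10,000 trips.
observed_counts = {"tire blowout": 12, "sensor glare": 40, "lane marking loss": 7}

for event, reason in flag_disagreements(human_estimates, observed_counts, 10_000).items():
    print(f"{event}: {reason}")
```

In this made-up data the tire blowout estimate holds up, the sensor glare risk appears to have been underestimated, and lane marking loss surfaces as a risk the team never listed.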

Harmony

In conclusion, instead of seeing AI as an alternative to classic risk management methods, we should see the inherent harmony between computers and humans checking each other’s assumptions and recommendations. Using a combination of manual risk assessment techniques and AI tools, we can reduce the overall risk in systems while at the same time allowing unprecedented innovation and breakthroughs. If you look at many historical safety accidents (Three Mile Island, Chernobyl, etc.), it is the human factor that often presents the most risk after all.

Adam Sandman is a visionary entrepreneur and a respected thought leader in the enterprise software industry, currently serving as the CEO of Inflectra. He spearheads Inflectra’s suite of ALM and software testing solutions, from test automation (Rapise) to enterprise program management (SpiraPlan). Adam has dedicated his career to revolutionizing how businesses approach software development, testing, and lifecycle management.