
Lessons from the Ever Given and Archegos: Four Ways Predictive Models Fail

April 9, 2021 by Robert Grossman

The Hierarchy of Uncertainty and Why Models Break Down

Figure 1. The container ship Ever Given blocking the Suez Canal (2021) [1]. Source: Copernicus Sentinel data (2021), processed by Pierre Markuse.

In the news over the past month were two stories: the Ever Given blocked the Suez Canal, halting an estimated 10% of global shipping, and the collapse of Archegos Capital Management led to billions of dollars of losses, including roughly $2 billion for Nomura, Japan’s largest bank, and $5 billion for Credit Suisse, an investment bank headquartered in Switzerland. Lloyd’s of London is expected to take losses of approximately $100 million on the delayed shipping that it insured.

Banks run multiple types of risk models to protect themselves from these types of events, as do insurance companies. In this post, we examine some of the reasons that predictive models have trouble with these types of losses and what can be done about it. More generally, we look at the types of uncertainty that arise when building machine learning and AI models.

The Hierarchy of Unknowns in Machine Learning

We assume that we have a predictive risk model that has inputs or features. It is quite helpful to distinguish between four types of unknowns and to view these as forming a hierarchy.

  1. Known stochastic variance. Stochastic here simply means that the inputs or features are not deterministic but instead vary, and that this variation can be described by probability distributions. This is the normal state of affairs for most models. These models have false positives and false negatives, but they are the types of models that data scientists build routinely. One way these models break down is through drift: over time, the behavior that a model attempts to capture tends to drift, and the model should be updated and re-estimated to capture this drift.
  2. Unknown stochastic variance. It’s common not to have enough data to characterize the probability distributions of the variables driving your model. In this case, getting more data is critical. Long-tailed distributions, and distributions associated with power laws, are a particular challenge, since they can create situations in which collecting enough data is especially difficult (see the sketch after this list). There are a number of specialized techniques for this case, such as catastrophe modeling (“cat modeling”), which is used to predict catastrophic losses, such as those associated with earthquakes or hurricanes.
  3. Unknown variables, features, actors and interactions. In practice, models are approximate and usually do not include all the relevant features. Over time, as modelers’ understanding improves, additional features can be added to improve the performance of the model. These days this is often done by collecting large amounts of data and using deep learning to create features automatically. This is a very effective approach, but the larger the dataset, the more likely it is to be biased, which introduces biases into the model.
  4. New behavior, new interactions and new actors. As container ships grew in size, they became large enough to block the canal, a new behavior. You can also think of this as an emergent behavior: the size of ships has been increasing for a long time, but as sizes pass certain thresholds, new types of behavior emerge, such as blocking a particular canal. As another example, one of the reasons that Archegos Capital collapsed is the instability caused by a new type of complex financial instrument called a total return swap [2], which again you can think of as a new type of behavior associated with a new type of financial instrument.
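
The gap between Level 1 and Level 2 is easy to see in simulation. The following sketch is purely illustrative (it is mine, not part of the original analysis): it compares a Gaussian input, whose sample mean settles down quickly, with a heavy-tailed Pareto input, whose sample statistics keep jumping as rare extreme draws arrive, which is exactly the situation in which collecting enough data is hard.

```python
# Illustrative sketch: Level 1 (well-characterized Gaussian) vs. Level 2
# (heavy-tailed) uncertainty. Not from the original post.
import numpy as np

rng = np.random.default_rng(42)

for n in (1_000, 10_000, 100_000):
    gaussian = rng.normal(loc=0.0, scale=1.0, size=n)
    # numpy's pareto(a) samples a Lomax distribution; for a = 1.5 the mean
    # exists (1 / (a - 1) = 2) but the variance is infinite, so sample
    # statistics never stabilize the way Gaussian statistics do.
    heavy_tailed = rng.pareto(1.5, size=n)
    print(f"n={n:>7}: gaussian mean={gaussian.mean():6.3f}, "
          f"heavy-tailed mean={heavy_tailed.mean():7.3f}, "
          f"heavy-tailed max={heavy_tailed.max():12.1f}")
```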

The Difference Between Level 1 and Level 2 Uncertainty

It’s standard in data science to distinguish between risk and uncertainty: You are dealing with risk when you know all the alternatives, outcomes and their probabilities (Level 1 uncertainty above). You are dealing with uncertainty when you do not know all the alternatives, outcomes or their probabilities (Level 2 or higher levels of uncertainty above).

The Difference Between Level 3 and Level 4 Uncertainty

There is a subtle but critical difference between Level 3 and Level 4 uncertainty. With Level 3 uncertainty, the features, actors, or interactions are already present, but not yet understood and not yet included in the model. Once identified, there may or may not be enough data to accurately model their distributions (the difference between Level 1 and Level 2 variables and features). With Level 4 uncertainty, new behavior appears, such as the use by investors of total return swaps, or the transportation of containers by ships so large that they can completely block a canal.

Black Swans

A black swan can be defined as an unpredictable or unforeseen event, typically one with extreme consequences. Black swan events were popularized by Nassim Taleb in his influential book of that name [3]. The term “black swan” has been used since the second century to refer to something impossible or unlikely, since Europeans had seen only white swans until Dutch explorers visiting Australia saw a black swan in 1697.

Black swans can arise from Levels 2, 3 and 4 of the hierarchy of uncertainty. For example, they can arise when tail events occur under Level 2 uncertainty, when not-yet-identified behavior occurs under Level 3 uncertainty, or when new behavior or new actors arise under Level 4 uncertainty.

Taleb has also pointed out the fragile and unstable states that often result from new interactions, such as the excessive risk taking by banks, the bursting of the housing bubble, and the credit illiquidity that led to the 2007-2008 financial crisis [4].

Deep Uncertainty

Another way of thinking about the different types of uncertainty in the hierarchy is through the concept of deep uncertainty. One definition of deep uncertainty is that [5]:

  1. The likelihood of future events and outcomes cannot be well characterized with existing data and models;
  2. The uncertainty cannot be reduced by gathering additional information; and
  3. Stakeholders disagree on the consequences of actions.

There is an emerging field called modeling under deep uncertainty (MUDU), which is establishing best practices for building models under deep uncertainty [5].

Best Practices

Figure 2 summarizes some best practices for dealing with different types of uncertainty. With Level 1 uncertainty, simply improving the model helps, while re-estimating the model is necessary to manage drift. With Level 2 uncertainty, getting more data can help, and it’s critical to understand whether a variable is long tailed and whether a power law is involved; if so, you may not be able to collect enough data before a black-swan-like event occurs. With Level 3 uncertainty, the emphasis is less on improving the model and more on gaining insight into root causes, external interactions, and the robustness of the system. Finally, with Level 4 uncertainty, the best approach is to exercise more caution and to be quicker than your competitors to detect new actors and new interactions and to take appropriate action.

Figure 2. Some best practices for managing uncertainty.
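
As a concrete example of the Level 1 practice of detecting and managing drift, the hedged sketch below flags input features whose live distribution has shifted away from the training distribution. The two-sample Kolmogorov-Smirnov test is one common choice for this, not necessarily the procedure behind Figure 2, and all names and data here are illustrative.

```python
# A hedged sketch of drift detection via a per-feature two-sample
# Kolmogorov-Smirnov test. Names and data are illustrative only.
import numpy as np
from scipy.stats import ks_2samp

def drifted_features(train, live, feature_names, alpha=0.01):
    """Return (name, statistic, p-value) for features whose live
    distribution differs significantly from the training distribution."""
    flagged = []
    for i, name in enumerate(feature_names):
        stat, p_value = ks_2samp(train[:, i], live[:, i])
        if p_value < alpha:
            flagged.append((name, stat, p_value))
    return flagged

# Synthetic example: the second feature drifts upward in production.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(5_000, 2))
live = np.column_stack([
    rng.normal(0.0, 1.0, 5_000),   # stable feature
    rng.normal(0.5, 1.0, 5_000),   # drifted feature
])

print(drifted_features(train, live, ["feature_a", "feature_b"]))
```

When a feature is flagged, the model can be re-estimated on recent data, as described above.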

References

[1] Container Ship ‘Ever Given’ stuck in the Suez Canal, Egypt – March 24th, 2021. Contains modified Copernicus Sentinel data [2021], processed by Pierre Markuse (License: Creative Commons Attribution 2.0 Generic)

[2] Quentin Webb, Alexander Osipovich and Peter Santilli. “What Is a Total Return Swap and How Did Archegos Capital Use It?” Wall Street Journal, March 30, 2021.

[3] Taleb, Nassim Nicholas. The Black Swan: The Impact of the Highly Improbable. Random House, 2007.

[4] Taleb, Nassim Nicholas. Antifragile: Things That Gain from Disorder. Random House, 2012.

[5] Walker, Warren E., Robert J. Lempert, and Jan H. Kwakkel. “Deep Uncertainty.” Delft University of Technology 1, no. 2 (2012).


Profiles in Analytics: Frank Knight

March 15, 2021 by Robert Grossman

Frank Knight at work. Source: University of Chicago Library, https://www.lib.uchicago.edu/media/images/Knight.original.jpg.

I have written several short profiles in this blog about individuals who have contributed to the practice of analytics or made intellectual contributions that can be applied to the practice of analytics, including:

  • Claude Hopkins: An Early Advocate of Test, Measure, and Refine
  • George Heilmeier: Twelve Rules for a Chief Analytics Officer, which covers Heilmeier’s Catechism and Heilmeier’s Twelve Rules for a new CIO and adapts them to the duties of a Chief Analytics Officer
  • William H. Foege: Why Great Machine Learning Models are Never Enough: Three Lessons About Data Science from Dr. Foege’s Letter
  • In Chapter 8 of my book Developing an AI Strategy: a Primer, I also provide brief profiles of Kenneth R. Andrews (1916-2005) and his four-step process for strategic planning; H. Igor Ansoff (1918-2002) and what is now called the Ansoff Matrix; and Bruce D. Henderson (1915-1992) and the experience curve.

In this post, I want to discuss some of Frank Knight’s insights about risk [1] and how they can be applied to the practice of analytics today.

Frank Knight and the Chicago School of Economics

Frank Knight (1885 – 1972) was a professor of economics at the University of Chicago from 1928 to 1955, where he was one of the founders of what became known as the Chicago School of Economics. His students included three Nobel prize winners in economics: Milton Friedman, George Stigler and James M. Buchanan.

His intellectual interests were broad, stretching from economics to social policy, political history and philosophy. He taught courses on the history of economic thought and on the relationship between economics and social policy, and he was one of the founding faculty members of the University of Chicago’s Committee on Social Thought in the early 1940s [2]. He was cross-appointed as a Professor of Social Science in 1942 and a Professor of Philosophy in 1945, and was named the Morton D. Hull Distinguished Service Professor in 1946 [2].

Frank Knight’s Book – Risk, Uncertainty and Profit

In this post, I would like to discuss the important distinction between risk and uncertainty that Knight introduced in his 1921 book Risk, Uncertainty and Profit [1], which was based on his PhD dissertation. (The book is now in the public domain.) One of his key insights was that perfect competition would not eliminate profits, since even with perfect competition different firms would make different judgements due to the presence of uncertainty, which he distinguished from risk.

Modeling risk vs planning for uncertainty. At the most basic level, phrased as we would describe it today, Knight distinguished risk and uncertainty this way: you are modeling risk when you know the different variables, alternatives, and outcomes and can estimate their probabilities. You are planning for uncertainty when you do not know or cannot measure the relevant variables, alternatives, and outcomes, and are developing plans and mechanisms to deal with this uncertainty.

It is interesting to note that when Knight’s book was published in 1921, mathematics and econometric modeling had not yet come to dominate the field; his entire book (at least from a quick scan) does not contain a single equation, although I counted six graphs illustrating concepts such as supply and demand.

Modeling Risk vs Planning for Uncertainty

When modeling risk, you can use standard machine learning models. When planning for uncertainty, you need to consider different scenarios and try to create appropriate plans for those you can imagine, and even for those that are challenging to imagine. As an example, planning for uncertainty includes thinking about what today we would call black swan events [3].

Another way to look at this distinction is that risk involves objective probabilities about events that can be measured, while uncertainty involves subjective probabilities about events that have to be postulated. Both are important in the practice of analytics.

A third way to look at this distinction, described in his book, uses markets: if there is a market in which you can insure against an unknown outcome, it is risk; if there is no such market, it is uncertainty [1].

Knight viewed entrepreneurs as those willing to put up with uncertainty in certain circumstances and to manage it in search of profits [1].

As more and more complicated risk instruments were developed over the years, fewer outcomes needed to be classified as uncertainty. On the other hand, as more and more complicated risk instruments were introduced, uncertainty increased, fragility increased, and the number of unknown unknowns increased. This created situations in which markets could collapse, as in the 2008 financial crisis, with mortgage-backed securities introducing their own set of uncertainties.

Risk, Uncertainty and the Practice of Analytics

In last month’s post, I discussed the importance of distinguishing between developing analytic models, developing a system that employs analytics, and growing an analytics business. Recall that I use the term analytics to include machine learning, data science, statistical modeling, and AI. Let’s apply Knight’s distinction between risk and uncertainty to analytic models, systems, and businesses.

Quantifying risk when developing models. I have found it helpful at times to apply Knight’s distinction between risk and uncertainty to the practice of analytics. When developing machine learning or AI models, it’s important to quantify the errors and risks associated with a model by understanding the stochastic nature of its inputs, hidden variables, and outputs and being able to describe their probability distributions. It’s equally important to be able to estimate the confidence levels of the parameters in the model.
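
As one way to make this concrete, the sketch below uses a percentile bootstrap to put confidence intervals around the coefficients of a simple model. This is a generic technique of my choosing, not a method from the post, and the model and data are illustrative.

```python
# A minimal, illustrative sketch: percentile-bootstrap confidence
# intervals for the coefficients of a logistic regression model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def bootstrap_coef_intervals(X, y, n_boot=500, alpha=0.05, seed=0):
    """Refit the model on bootstrap resamples and return percentile
    intervals for each coefficient."""
    rng = np.random.default_rng(seed)
    n = len(y)
    coefs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample with replacement
        model = LogisticRegression().fit(X[idx], y[idx])
        coefs.append(model.coef_.ravel())
    coefs = np.asarray(coefs)
    lo = np.percentile(coefs, 100 * alpha / 2, axis=0)
    hi = np.percentile(coefs, 100 * (1 - alpha / 2), axis=0)
    return lo, hi

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(1_000, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=1_000) > 0).astype(int)

low, high = bootstrap_coef_intervals(X, y)
print("95% CIs:", list(zip(low.round(2), high.round(2))))
```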

Managing uncertainty when developing ML or AI systems. On the other hand, when developing systems that use machine learning (ML) or AI, it’s important to use engineering best practices to reduce the impact of uncertainty. For example, one best practice is to “fuzz” the system you are building by sending noise as inputs for days on end, to make sure that the system behaves gracefully no matter what inputs it receives, including non-numeric inputs, non-printing characters, and so on. Although fuzzing is standard in testing software for security vulnerabilities, it is also an excellent method for testing AI-based systems to make sure they perform well in practice. Another standard best practice is to exponentially dampen temporally varying input variables so that the system still provides approximately correct predictions even when one or two of those inputs are in error.
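
A minimal sketch of both practices follows. The predict function and the DampedInput class are hypothetical stand-ins of my own, not code from the post; a production fuzzer would draw from a much richer space of malformed inputs.

```python
# Hedged sketches of (1) fuzzing a prediction function and (2) exponential
# damping of a temporally varying input. All names are hypothetical.
import math
import random

def fuzz_predict(predict, n_trials=10_000, seed=0):
    """Send garbage inputs to `predict` and confirm it fails gracefully:
    it may raise a handled error, but must never crash or return NaN."""
    rng = random.Random(seed)
    garbage = [float("nan"), float("inf"), -1e308, "", "???", None, [], {"x": 1}]
    for _ in range(n_trials):
        x = rng.choice(garbage + [rng.uniform(-1e6, 1e6)])
        try:
            y = predict(x)
            assert not (isinstance(y, float) and math.isnan(y)), \
                f"model returned NaN for input {x!r}"
        except (TypeError, ValueError):
            pass  # an explicit, handled rejection is acceptable behavior

class DampedInput:
    """Exponentially damp a temporally varying input so that a single
    erroneous reading is smoothed rather than passed straight through."""
    def __init__(self, alpha=0.1):
        self.alpha = alpha   # weight given to each new observation
        self.value = None

    def update(self, x):
        self.value = x if self.value is None else \
            self.alpha * x + (1 - self.alpha) * self.value
        return self.value

# Hypothetical model wrapper that validates its input before predicting.
def predict(x):
    if not isinstance(x, (int, float)) or not math.isfinite(x):
        raise ValueError(f"bad input: {x!r}")
    return 2.0 * x + 1.0

fuzz_predict(predict)                          # passes silently if well-behaved

smoother = DampedInput(alpha=0.1)
for reading in [1.0, 1.1, 0.9, 500.0, 1.0]:    # one erroneous spike
    print(round(smoother.update(reading), 3))
```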

Managing uncertainty in ML or AI businesses. Finally, uncertainty is also critical in building profitable machine learning or AI businesses. Good business strategies are robust in the sense that they survive not just contingencies that are likely and predictable, but also those for which there is not enough data to quantify the risk and those involving outcomes that have not yet been observed. Today, we might talk about how a business can survive disruptive changes in a market, but this can also be thought of as a type of what is sometimes called Knightian uncertainty.

References

[1] Knight, Frank Hyneman. Risk, Uncertainty and Profit. Houghton Mifflin, 1921. The book is in the public domain, and copies are freely available online.

[2] University of Chicago Library, Guide to the Frank Hyneman Knight Papers 1908-1979, University of Chicago Frank Hyneman Knight Special Collection.

[3] Taleb, Nassim Nicholas. The Black Swan: The Impact of the Highly Improbable. Random House, 2007.

