The behavioral psychology behind stress testing

Written by FactSet Insight | Jun 20, 2010

If you were to predict people’s reactions to a stressful event, such as a fire, where would you start? Would you gauge their responses to a similar high-stress event like an earthquake, multiply their reactions to a mildly stressful event like a traffic jam, or monitor their emotions on a relaxing weekend day?

The Basel Committee on Banking Supervision recently released a very interesting document called “Principles for Sound Stress Testing Practices and Supervision.” The ideas expressed in it are broadly relevant and go far beyond banking industry-specific issues. Here are two examples:

“Most risk management models, including stress tests, use historical statistical relationships to assess risk. They assume that risk is driven by a known and constant statistical process i.e. they assume that historical relationships constitute a good basis for forecasting the development of future risks.”

“…given a long period of stability, backward-looking historical information indicated benign conditions so that these models did not pick up the possibility of severe shocks nor the build up of vulnerabilities within the system. Historical statistical relationships, such as correlations, proved to be unreliable once actual events started to unfold… Extreme reactions (by definition) occur rarely and may carry little weight in models that rely on historical data.”

These quotes are interesting to us for two reasons. First, they reflect a very common misconception: that assuming a constant process and measuring stable correlations is pretty much the only way to use historical data. Second, they indirectly validate the method for performing factor stress tests that FactSet introduced last spring, called Event Weighted testing.

As for the first point, the crux of the matter is the assumption of a stable and linear system. This type of thinking can be roughly understood in the following way: Airline stocks depend on oil prices through their inputs. Oil stocks depend on oil prices through their outputs. Therefore, there should be some stable economic relationship between the airline companies and the oil companies that would cause their stocks to move together in a stable pattern. This is the constancy assumption. Using correlation as the measure of that relationship, which implicitly assumes a normal distribution, completes the picture by introducing the assumption of linearity.

Stability and linearity of relationships are the ideas that led to the much publicized talk of “decoupling” of international markets from the U.S. in the summer of 2008. It was prominent enough for Donald Kohn, the Vice Chairman of the Federal Reserve, to discuss it at the International Research Forum on Monetary Policy as late as June 2008. The idea was that foreign markets, and especially emerging markets, had diversified their trade and could withstand a U.S. meltdown. Of course, the ensuing events quickly put that idea to rest, but it seemed plausible based on the assumption of stability and linearity.

What it failed to take into account is that during extreme events the forces in play are not really economic, but rather trading-induced and psychological. The financial system is not some random process, but is driven by real human beings who frequently make very similar judgments based on their emotions, especially in times of crisis (call it herd behavior if you like; I just don’t like that term).

There is no permanent linear correlation of the kind this reasoning would describe:
When stock A goes down 4%, stock B usually goes down by about 2%; therefore, when stock A goes down 40%, stock B will go down by about 20%. No, in fact stock B is likely to follow stock A all the way down to 40%, because in a crisis the economic relationships are not nearly as important. This discontinuity produces a so-called “rise in correlations,” which the document cites.
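
The arithmetic of that example can be written out as a small sketch. The function names, the 0.5 ratio and the 10% crisis threshold below are hypothetical values chosen purely for illustration; nothing here is a calibrated model or part of any FactSet product.

```python
def linear_response(move_a, ratio=0.5):
    """Constant-ratio view: stock B always moves `ratio` times stock A."""
    return ratio * move_a


def crisis_response(move_a, ratio=0.5, crisis_threshold=-0.10):
    """Hypothetical regime switch: once A falls past the (assumed)
    threshold, B follows A one for one instead of at the calm ratio."""
    if move_a <= crisis_threshold:
        return move_a
    return ratio * move_a


# Compare the two views for a small move and for a crash in stock A.
for move_a in (-0.04, -0.40):
    print(
        f"A: {move_a:+.0%}  "
        f"linear B: {linear_response(move_a):+.0%}  "
        f"crisis B: {crisis_response(move_a):+.0%}"
    )
```

Both views agree for the 4% move. They disagree only in the crash, which is exactly where a stress test is supposed to be informative.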

One could say hindsight is 20-20. But the idea of decoupling was not difficult to disprove even based on the historical data available in August of 2008.

Consider the following table:

Correlations of S&P 500 with Emerging Markets as of 8/31/2008

The “Calm” row contains correlations of the S&P 500 with various emerging markets, calculated using one year of data prior to 8/31/2008. However, if we used only the 25 most extreme days during the 8/31/2007 to 8/31/2008 period, the picture would be completely different. The decoupling and the suspect diversification benefits vanish even without considering the events of fall 2008. This is because extreme events, though set off by different triggers (in fact, an infinity of possible triggers), produce similar environments. Those environments can be modeled by focusing on the historical extreme data points, rather than on all historical data without any regard for the period from which it was drawn.
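
The mechanics of that comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the returns are synthetic, generated with a regime switch built in, so the output shows how the calculation works and says nothing about actual markets. The helper names and the choice to rank days by the absolute move of the base index are also assumptions, not a description of FactSet's implementation.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def extreme_day_correlation(base, other, n_extreme=25):
    """Correlation using only the n_extreme days on which `base` moved most."""
    ranked = sorted(range(len(base)), key=lambda i: abs(base[i]), reverse=True)
    picked = ranked[:n_extreme]
    return pearson([base[i] for i in picked], [other[i] for i in picked])


# SYNTHETIC daily returns (an assumption, not market data).
# Every tenth day is a stress day on which both markets move together;
# on the other days the second market is mostly idiosyncratic.
random.seed(0)
sp500, emerging = [], []
for day in range(250):
    if day % 10 == 0:
        sp = random.gauss(0.0, 0.04)
        em = sp + random.gauss(0.0, 0.005)
    else:
        sp = random.gauss(0.0, 0.01)
        em = 0.2 * sp + random.gauss(0.0, 0.01)
    sp500.append(sp)
    emerging.append(em)

full_year_corr = pearson(sp500, emerging)
extreme_corr = extreme_day_correlation(sp500, emerging, n_extreme=25)
print(f"full year: {full_year_corr:.2f}  25 most extreme days: {extreme_corr:.2f}")
```

Ranking by the size of the base index move is the simplest way to pick the extreme days. Selecting days from a specific historical episode, or weighting days rather than including or excluding them, would be equally reasonable variations.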

This is precisely the idea behind Event Weighted stress testing, which FactSet introduced in presentations last spring. In summary, stress tests should not use the data from benign conditions AT ALL. Depending on the stress test, the historical data points used to model the correlations have to be drawn from the most similar extreme events. Risk in financial markets is very much about people’s behavior, and you don’t learn about people’s behavior during a fire by observing them on a calm picnic day.
