Over the last couple of weeks, the U.S. dollar has lost some strength against the currencies of the United States’ major trading partners, prompting my colleagues Velislav Bodurov and Bono Nonchev and me to question how strong and characteristic the movements are, and what methodology should be used to model them. What follows is our analysis, which looks at the U.S. Dollar Index (DXY) using BISAM's patented Cognity® real-world risk modeling approach.
Introduction
One of the most widely known measures of the value of the United States Dollar is the U.S. Dollar Index (DXY), which has been frequently mentioned in the financial news in recent weeks. The index measures the value of the U.S. dollar relative to a basket of major currencies. It is almost unique in construction due to its fixed composition over time.
The DXY was introduced in 1973, and the only change since then came in January 1999, when the euro replaced a basket of European currencies. On that date, the euro’s weight in the U.S. Dollar Index was set at 57.6%. The weights of the remaining currencies are given in Table 1.
Recently, the DXY has reflected a weakening of the U.S. dollar. However, such local short-term gains and losses are nothing extraordinary. That is why we have decided to look more holistically at the DXY risk profile and assess what methodology should be used to model the risk of the index.
Following our general approach to validating fat-tailed VaR risk models, we have executed out-of-sample daily risk backtesting for the period from January 1976 to February 2017, spanning more than 40 years. The results from our model are compared with those of the Normal EWMA (decay factor 0.94) and Historical models. We compare the Value at Risk (VaR) 99%, VaR 95% and volatility forecasts of the three models over the 40-year timeframe and over shorter periods of characteristic movements.
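For readers less familiar with the two benchmark models, the following is a minimal sketch of how a Normal EWMA forecast and a Historical VaR are commonly computed. It is an illustration only, not the implementation used in the backtest: the function names, the seed value for the variance, the zero-mean assumption and the quantile convention are all assumptions made for this example, and the return series is made up.

```python
import math
from statistics import NormalDist


def ewma_vol_forecasts(returns, lam=0.94):
    """Zero-mean EWMA volatility forecasts, aligned with returns[1:].

    The forecast for returns[t] uses only returns[:t], so it is out-of-sample.
    """
    var = returns[0] ** 2  # assumed seed; other initializations are possible
    sigmas = []
    for r in returns[1:]:
        sigmas.append(math.sqrt(var))           # forecast made before r is seen
        var = lam * var + (1.0 - lam) * r ** 2  # update once r is observed
    return sigmas


def normal_var(sigma, level=0.99):
    """Normal VaR as a positive loss: z(level) * sigma, assuming zero mean."""
    return NormalDist().inv_cdf(level) * sigma


def historical_var(window, level=0.99):
    """Empirical loss quantile over past returns (one convention of several)."""
    losses = sorted(-r for r in window)
    idx = min(len(losses) - 1, max(0, math.ceil(level * len(losses)) - 1))
    return losses[idx]


returns = [0.004, -0.006, 0.002, -0.011, 0.007, -0.003]  # made-up daily returns
sigmas = ewma_vol_forecasts(returns)
print([round(normal_var(s), 5) for s in sigmas])
print(round(historical_var(returns[:-1], level=0.95), 5))
```

The key design point is that each forecast is produced before the corresponding return is observed, which is what makes the comparison out-of-sample.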
Empirical Tests
The VaR estimates for the DXY are compared based on the number of violations. Namely, a good risk model ensures that the VaR at confidence level (1 - α) is violated approximately a fraction α of the time. For example, VaR 99% (α = 0.01) should be violated approximately one percent of the time if the model is to produce consistent results.
Along with the VaR tests, we have tested the three models’ volatility forecasting power using the so-called bias test, which looks at the realized return for the period (t, t+1) scaled by the volatility predicted at time t. If the model is good, the scaled returns should have a standard deviation of approximately one; the realized standard deviation is called the “bias statistic.” If the bias statistic is consistently greater than one, the model underestimates the true volatility, and if it is consistently less than one, the model overestimates it.
To further strengthen the analysis, we have compared the Cognity model’s performance with that of the widely used Normal EWMA and Historical models over several selected sub-periods, along with the total backtest period (January 2, 1976 to February 10, 2017). The periods are shown below in Figure-1. Each one covers a characteristic movement of the index: strong periods for the U.S. dollar, weak periods, and some more neutral periods. In Figure-1, the strong periods are marked in green, the weak periods in red and the neutral ones in orange. The aim is to assess model quality in different regimes and make sure that it is consistent.
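The violation count itself is simple to compute. The sketch below assumes VaR is quoted as a positive loss number and that a violation is a day on which the realized loss is strictly larger than the forecast; both conventions, the function names and the sample numbers are assumptions for illustration.

```python
def count_var_violations(returns, var_forecasts):
    """Count days on which the realized loss exceeds the forecast VaR.

    var_forecasts are positive loss numbers, one per return.
    """
    return sum(1 for r, v in zip(returns, var_forecasts) if -r > v)


def violation_rate(returns, var_forecasts):
    """Fraction of days with a violation; compare with alpha (e.g. 0.01)."""
    return count_var_violations(returns, var_forecasts) / len(returns)


returns = [-0.03, 0.01, -0.01, -0.05]  # made-up daily returns
var_99 = [0.02, 0.02, 0.02, 0.02]      # made-up VaR forecasts
print(count_var_violations(returns, var_99), violation_rate(returns, var_99))
```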
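As a rough illustration of the bias test, the sketch below scales each realized return by its volatility forecast and takes the standard deviation of the result. The function name is hypothetical, the data are simulated, and the use of the sample standard deviation (which subtracts the mean) is an assumption; a zero-mean version is an equally plausible convention.

```python
import random
from statistics import stdev


def bias_statistic(returns, vol_forecasts):
    """Standard deviation of returns scaled by their volatility forecasts."""
    scaled = [r / s for r, s in zip(returns, vol_forecasts)]
    return stdev(scaled)


random.seed(0)
true_vol = 0.005
returns = [random.gauss(0.0, true_vol) for _ in range(2000)]  # simulated data

# A forecast equal to the true volatility should give a value near one,
# while a forecast that is half the true volatility should give about two.
print(bias_statistic(returns, [true_vol] * len(returns)))
print(bias_statistic(returns, [true_vol / 2] * len(returns)))
```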
Analysis of VaR Exceedings
Table-2 below clearly showcases the forecasting power of the Cognity model in terms of VaR 99% forecasting. The Cognity model has 110 violations of VaR 99% for the total backtest period between January 2, 1976 and February 10, 2017. The 95% confidence interval for the number of violations given the backtest time window of 10,727 days is 88 to 128. In other words, if a model produces anywhere from 88 to 128 VaR 99% exceedings over that period, we can be relatively confident that it is in line with what is expected from a theoretically perfect model. With 110 violations, the Cognity model falls well within that range.
From the reported number of VaR 99% violations we conclude that the Cognity model performs equally well in strong, weak and neutral periods. On the other hand, both the Historical and Normal models fail this test for the entire backtest period. Moreover, the Normal model fails for all the periods. For example, the number of violations for the whole period under the Normal model is almost twice as high as expected. In terms of the level of exceedings, however, the Historical model still seems to perform reasonably well for VaR 99% forecasting.
Table-3 confirms the accuracy of the Cognity model, as well as the well-known fact that the Normal and Historical models work reasonably well at a 95% confidence level. However, as was the case with VaR 99%, the only model that passes all the tests for each period is the Cognity model.
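One way to arrive at an interval of this kind is to treat each day as an independent trial with violation probability α and read off the central quantiles of the resulting binomial distribution. The article does not state which construction was used, so the sketch below is an assumption; the function name is hypothetical. With 10,727 days and α = 0.01 the expected count is about 107, and the interval should land close to the range quoted above.

```python
import math


def binomial_interval(n, p, coverage=0.95):
    """Central interval for the number of violations under Binomial(n, p).

    Returns the smallest counts whose cumulative probability reaches the
    lower and upper tail quantiles, assuming independent daily violations.
    """
    lo_q = (1.0 - coverage) / 2.0
    hi_q = 1.0 - lo_q
    cdf = 0.0
    lower, upper = None, n
    for k in range(n + 1):
        # Log of the binomial pmf, computed via lgamma to avoid overflow.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1.0 - p))
        cdf += math.exp(log_pmf)
        if lower is None and cdf >= lo_q:
            lower = k
        if cdf >= hi_q:
            upper = k
            break
    return lower, upper


lo, hi = binomial_interval(10727, 0.01)
print(lo, hi, lo <= 110 <= hi)
```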
Empirical Probability of Exceedings
The fact that the Normal model has more than twice the number of VaR 99% exceedings of the Fat-Tailed model over such a long period (more than 10,000 days) merits further investigation. We have therefore examined how much the results depend on the chosen time window.
The next analysis shows how the results would have looked if we had run the experiment at earlier points in time, starting with an initial time window of one year and adding one day at a time until we reach the whole sample of 40 years. At each point the plotted value shows the realized exceedings as a percentage of the time window, which is dubbed the “Empirical Probability of Exceedings.” The end date of the time window is shown on the x-axis.
For VaR 99%, we would naturally expect that, for a good model, about one percent of the days are exceedings and that the empirical probability moves closer to one percent as the time window grows. This, however, happens only for the Cognity model, with the other models consistently producing too many observed exceedings.
Figure-2 below shows that the empirical probability of a VaR 99% violation for the Cognity model converges to one percent. It is in fact sufficiently close to one percent even for a time window of five years (approximately 1,250 business days, shown in the chart as the time window ending in November 1981). The empirical probabilities of a VaR 99% violation under both the Historical and Normal EWMA models also become more stable as the backtest time window grows. However, the Historical model converges to 1.20% and the Normal EWMA model converges to a number larger than two percent, which shows the inability of those two models to consistently predict Value-at-Risk at higher confidence levels.
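The expanding-window calculation can be sketched as a running count of violations divided by the number of days seen so far. The function name, the one-year length of 250 business days and the toy input are assumptions for illustration.

```python
def empirical_exceedance_probability(violations, min_window=250):
    """Expanding-window violation frequency.

    violations: one boolean per backtest day (True means VaR was exceeded).
    Returns (window_length, frequency) pairs, starting once min_window days
    are available and growing by one day at a time.
    """
    path = []
    count = 0
    for days, hit in enumerate(violations, start=1):
        count += 1 if hit else 0
        if days >= min_window:
            path.append((days, count / days))
    return path


# Toy input: a single violation on the last of 100 days.
flags = [False] * 99 + [True]
path = empirical_exceedance_probability(flags, min_window=50)
print(path[0], path[-1])
```

Plotting the second element of each pair against the window end date gives a curve of the kind described above; for a well-calibrated VaR 99% model it should settle near 0.01.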
Quite similar results are obtained for the behavior of the empirical probabilities for VaR 95% violations, presented on Figure-3.
Bias Comparison
The final step of the analysis is to assess the volatility forecasting power of the three models. The 250-day rolling bias statistics for the three models, along with the confidence bounds, are plotted in Figure-4. This picture illustrates why counting exceedings is not by itself sufficient to validate a risk model.
In terms of the VaR violation tests, the Historical model performed reasonably well, as shown in the previous sections. However, Figure-4 makes it clear that there are periods in which the Historical model significantly underestimates or overestimates the volatility of the DXY.
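A rolling bias statistic can be sketched by applying the same scaled-return calculation to each 250-day window. The function names are hypothetical, and the bounds shown use the common approximation 1 ± sqrt(2/T) for normally distributed scaled returns, which may differ from the confidence bounds actually plotted.

```python
import math
from statistics import stdev


def rolling_bias(returns, vol_forecasts, window=250):
    """Bias statistic over each trailing window of scaled returns."""
    scaled = [r / s for r, s in zip(returns, vol_forecasts)]
    return [stdev(scaled[end - window:end])
            for end in range(window, len(scaled) + 1)]


def bias_bounds(window=250):
    """Approximate 95% bounds around one, assuming normal scaled returns."""
    half_width = math.sqrt(2.0 / window)
    return 1.0 - half_width, 1.0 + half_width


print(bias_bounds(250))
print(rolling_bias([1.0, -1.0, 1.0, -1.0, 1.0], [1.0] * 5, window=3))
```

Values that stay above the upper bound for long stretches point to persistent underestimation of volatility, and values below the lower bound to overestimation.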
On the other hand, the Normal EWMA model has a much more reasonable volatility forecast, compared with the Historical model. To further explore the comparison with the Cognity model, we remove the rolling bias statistics of the Historical model and compare the Cognity model and the Normal EWMA model in more detail on Figure-5.
The overall higher level of the bias statistic for the Normal EWMA model is clearly visible. The chart in Figure-4 is summarized in Table-4, which quantifies the Normal model’s tendency to underestimate risk on a regular basis.
This effect is especially visible at the beginning of the backtesting period, where the rolling bias statistic drifts significantly away from its ideal value of one. Also, note in Figure-4 that in more recent crisis periods (for example, the period right after the financial crisis in 2008) both the normal and fat-tailed models deviate from one, but the effect is more pronounced for the normal model. Of course, one might argue that a normal model can be complemented by a series of historical or hypothetical stress scenarios in order to assess the risk of one’s portfolio under extreme market events. Without resorting to such explicit ad-hoc adjustments, however, the Cognity Fat-Tailed methodology on a stand-alone basis provides better estimates of daily risk (measured as standard deviation in this case) than the industry standard approach.
Below, Table-5 reports the bias statistic for each model calculated over the entire backtesting period – 10,727 days in total. The results posted here lead us to several conclusions:
- The bias statistic computed using the whole sample at once can be a misleading indicator of model behavior. Indeed, the Historical model posts a bias statistic that is on average close to one (albeit significantly different from one in some periods). This does nothing more than prove that the Historical model is good “on average.” Looking at specific periods reveals that this is not a sufficient criterion: there are periods of significant bias that can lead to calamitous consequences if not considered.
- As Figure-5 shows, the bias statistic for the Normal model generally follows the same trends as that of the Cognity model, but has an overall level higher than one, which shows that the Normal model underestimates the volatility of the index. This can be clearly seen in the table below.
Conclusion
In this short note, we have explored the modeling of the US Dollar Index with three methodologies. Our findings can be summarized as follows:
- The Historical model produces Value-at-Risk exceedings at the 99% level that are in line with expectations (although the model is prone to producing clustered violations), but the forecasted standard deviation can be over- or underestimated for extended periods.
- The Normal model produces an excessive number of VaR violations at the 99% level although it performs well on the rolling bias test.
- Out of the three models the one that captures both the specific tail behavior of returns and the standard deviation is the Cognity Fat-Tailed model. It provides us with superior risk measures as evidenced by the VaR violations and bias tests.
Overall, only the Cognity Fat-Tailed model provides strong, out-of-the-box performance on both VaR and volatility levels, throughout both high and low volatility regimes, as well as through both strong and weak periods for the U.S. dollar.
This article originally ran on the Bisam Insight blog