We explained that there are five potential methodologies for linking macro-economic factors to default probabilities for the purpose of stress testing according to the Federal Reserve’s Comprehensive Capital Analysis and Review 2016 scenarios, scenarios of the European Banking Authority, or scenarios for other internal or third party purposes. The five potential methodologies are listed here:
- Use the same logistic regression formula and inputs used to determine the current term structure of defaults and historical defaults
- Use the lagged default probability as an input for a simple naïve model to predict forward default probabilities
- Use current macro factors to predict current default probabilities
- Use current macro factors and financial statement ratios and stock price inputs known at time zero to predict future default probabilities
- Use current macro factors, forecasted financial statement ratios and forecasted stock price inputs to predict future default probabilities
There are also two econometric techniques that are popular methods for linking macro-economic factors to a time series of default probabilities:
- An ordinary least squares regression fitted to the transformed default probabilities
- A fractional regression using the logistic distribution fitted to the actual unannualized default probabilities, expressed as a decimal.
We conduct an effective challenge of the ordinary least squares approach using data for Wells Fargo & Co. default probabilities from Kamakura Risk Information Services’ KRIS default probability and bond information service. We explain why fractional regression (logistic) is so effective as a challenger model that we reject the ordinary least squares approach in favor of fractional regression.
Champion and Challenger Models for Predicting Future Default Probabilities
In this section we discuss all five estimation strategies and the two econometric procedures for executing them. We start with Method 1 and Method 5, because both approaches are immediately rejected on theoretical and practical grounds. Method 1 applies the same logistic formula and inputs used to generate current and historical (in sample) default probabilities to out of sample default probability forecasting. The Kamakura Risk Information Services Version 6.0 Technical Guide lists the explanatory variables that are inputs to the Jarrow-Chava reduced form model. Twenty-seven of those inputs are related to company-specific financial statements, stock prices or both. Because 27 of the 47 inputs are unknown on future dates, Method 1 is rejected for this reason alone.
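Method 5 substitutes forecasted financial statement ratios and forecasted stock price inputs into the non-linear logistic formula. Angrist and Pischke (2009) warn against this kind of two stage procedure:
“Forbidden regressions were forbidden by MIT professor Jerry Hausman in 1975, and while they occasionally resurface in an under-supervised thesis, they are still technically off-limits. A forbidden regression crops up when researchers apply 2SLS [two stage least squares] reasoning directly to non-linear models.”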
They go on to say (2009, page 192):
“As a rule, naively plugging in first-stage fitted values in non-linear models is a bad idea. This includes models with a non-linear second stage as well as those where the [conditional expectation function] for the first stage is non-linear.”
Earlier (page 122), Angrist and Pischke address the simpler case where both the first stage and the second stage are linear:
“The [two stage least squares] name notwithstanding, we don’t usually construct 2SLS estimates in two steps. For one thing, the resulting standard errors are wrong…Typically, we let specialized software routines…do the calculation for us. This gets the standard errors right and helps to avoid other mistakes.”
We refer interested readers to Section 4.6 of Angrist and Pischke (2009) for the full details of the problems with this pseudo-two stage approach to generating 13 quarter scenarios, like those required for the Federal Reserve’s 2016 Comprehensive Capital Analysis and Review process, for the default probabilities of ABC Company. This is a topic described in further detail on the Kamakura Corporation website. For this reason, we also reject Method 5.
A Discussion of Methods 2, 3, and 4 using KRIS 3 Month Default Probabilities on Wells Fargo & Co. quarterly through September 30, 2015.
Methods 2, 3, and 4 represent an alternative to the rejected Methods 1 and 5 that is both robust and practical. Kamakura refers to this approach as the “reduced reduced form” approach. Here is an example of how it works.
We start with the logistic regression model discussed above, and we use it to produce estimated default probability values for each of our corporate counterparties over a long historical period. We take Wells Fargo & Co. as an example, using the KRIS Version 6 Jarrow Chava public firm model from January 1990 to the present. We have the monthly history of Wells Fargo & Co. default probabilities with maturities of 1 month, 3 months, 6 months, 1 year, 2 years, 3 years, 5 years, 7 years and 10 years. We assume these default probabilities are "true" or correct. If our forward looking simulation will have quarterly time steps, we want to "reduce" the "reduced form" default probabilities for Wells Fargo & Co. to a "reduced reduced form" model that has only known inputs like these:
- Financial statement inputs and stock price information from time zero, not from the future points in time.
- The firm’s default probabilities at time zero, not from the future points in time.
- Macro factors which will be known at the future points in time.
Methods 2, 3, and 4 differ only in which of the three classes of inputs are used. We compare them using standard model validation procedures. For expository purposes we decide to derive a model for the 3 month default probabilities (in blue) for Wells Fargo & Co., which are graphed on a log scale for KRIS version 6.0 below along with 1 year (red) and 10 year (orange) annualized default probabilities:
For exposition purposes, we use quarterly data, even though we have monthly observations, to avoid “overlapping” the three month default probabilities which would trigger autocorrelation in the error terms of the regression. To derive this reduced reduced form model, we fit the logistic regression formula to the quarterly history of Wells Fargo & Co. 3 month default probabilities. We have three potential sets of explanatory variables:
Method 2: Time zero 3 month default probability only
Method 3: Macro factors only
Method 4: Macro factors and time 0 financial ratios and stock price data
We have two potential econometric techniques, which we discuss below.
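The quarterly sampling described above can be sketched in a few lines. This is only an illustration: the monthly values are made up (they are not KRIS data), and the helper names `month_end` and `quarter_end_observations` are hypothetical.

```python
import calendar
from datetime import date


def month_end(year, month):
    """Return the last calendar day of the given month."""
    return date(year, month, calendar.monthrange(year, month)[1])


def quarter_end_observations(monthly):
    """Keep only the March, June, September and December observations,
    so that consecutive 3 month default probabilities do not overlap."""
    return [(d, p) for d, p in monthly if d.month % 3 == 0]


# Hypothetical monthly 3 month default probabilities, expressed as decimals.
monthly = [(month_end(2015, m), 0.0005 + 0.0001 * m) for m in range(1, 10)]
quarterly = quarter_end_observations(monthly)

for d, p in quarterly:
    print(d.isoformat(), f"{p:.4f}")
```

Each retained observation covers a three month window that ends where the next one begins, which is the property the regression relies on.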
If we select a list of macro factors that is rich enough, any errors in fitting the "true" time series of Wells Fargo & Co. default probabilities will be uncorrelated not only with the macro factors but also with the errors of any other counterparty. This is an insight of Jarrow, Lando and Yu.
Given the time series data for the Wells Fargo & Co. example, we want the formula we use for simulation forward to be consistent with the logistic regression formula for default probabilities. There are a number of techniques by which we can achieve this objective:
- Maximum likelihood estimation
- Fractional regression, logistic (a special case of maximum likelihood estimation)
- Non-linear least squares
- Ordinary least squares on the transformed default probability
For reasons that we explain below, fractional regression with the logistic distribution is best practice even though for many readers common practice is the ordinary least squares approach (see Williams [2015] for more on its use). The ordinary least squares approach produces biased estimates of the default probability conditional on the input values (macro factors and others) as we explain below. The KRIS version 6.0 reduced reduced form equations are generated from fractional regression. For model validation purposes, we compare fractional regression results with the ordinary least squares approach in this appendix.
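A minimal sketch of fractional regression with the logistic distribution may help fix ideas. It assumes a single explanatory variable, made-up data, and a damped Newton iteration on the Bernoulli quasi-log-likelihood in the spirit of Papke and Wooldridge [1996]. The function names and the data are hypothetical, and this is not the KRIS implementation.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def quasi_loglik(a, b, x, p):
    """Bernoulli quasi-log-likelihood for fractional responses p in (0, 1)."""
    total = 0.0
    for xi, pi in zip(x, p):
        g = logistic(a + b * xi)
        total += pi * math.log(g) + (1.0 - pi) * math.log(1.0 - g)
    return total


def fit_fractional_logit(x, p, max_iter=50, tol=1e-12):
    """Fit P = logistic(a + b * x) directly to the untransformed P values."""
    p_bar = sum(p) / len(p)
    a, b = math.log(p_bar / (1.0 - p_bar)), 0.0  # start at intercept-only fit
    for _ in range(max_iter):
        g = [logistic(a + b * xi) for xi in x]
        # Score vector: sum of (actual - fitted) times each regressor.
        s0 = sum(pi - gi for pi, gi in zip(p, g))
        s1 = sum((pi - gi) * xi for pi, gi, xi in zip(p, g, x))
        # Negative Hessian (2 x 2), inverted in closed form.
        w = [gi * (1.0 - gi) for gi in g]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        da = (h11 * s0 - h01 * s1) / det
        db = (h00 * s1 - h01 * s0) / det
        # Halve the step if it would lower the quasi-log-likelihood.
        step, current = 1.0, quasi_loglik(a, b, x, p)
        while (quasi_loglik(a + step * da, b + step * db, x, p)
               < current - 1e-12 and step > 1e-8):
            step *= 0.5
        a, b = a + step * da, b + step * db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Hypothetical quarterly default probabilities (decimals) and one macro factor.
pd_series = [0.0004, 0.0006, 0.0011, 0.0030, 0.0090, 0.0045, 0.0015, 0.0007]
macro = [-0.5, -0.2, 0.3, 1.1, 2.0, 1.4, 0.6, 0.0]

a_hat, b_hat = fit_fractional_logit(macro, pd_series)
fitted = [logistic(a_hat + b_hat * xi) for xi in macro]
print(f"intercept={a_hat:.4f} slope={b_hat:.4f}")
```

At the solution, the fitted default probabilities sum to the actual default probabilities, so the level of the default probability is matched on average. The ordinary least squares fit on the transformed variable does not have this property.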
Background on the Ordinary Least Squares Approach
Both the ordinary least squares approach and the fractional regression approach are based on the logistic formula for the unannualized default probability expressed as a decimal (not percent), P[t]:
Fractional regression using the logistic distribution fits the time series of default probabilities directly, without using any transformations. If we transform both sides of the equation, we can explain a transformation Z (the logit of the default probability) as a linear function of our explanatory variables:
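In conventional notation, the two relationships referred to above can be written as follows. This is a sketch under an assumed sign convention, chosen so that Z[t] is the logit of P[t]. Here α is the intercept, β_i are the coefficients, and X_i[t] are the n explanatory variables.

$$P[t] = \frac{1}{1 + \exp\left(-\alpha - \sum_{i=1}^{n} \beta_i X_i[t]\right)}$$

$$Z[t] = \ln\left(\frac{P[t]}{1 - P[t]}\right) = \alpha + \sum_{i=1}^{n} \beta_i X_i[t]$$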
Ordinary least squares regression can be used to derive the intercept alpha and beta coefficients and to determine the magnitude of the error term in fitting the transformed variable Z[t]. Unfortunately, this method is a biased estimator of the absolute level of the default probability for reasons we explain in the next section. Although the ordinary least squares method is commonly used, Kamakura has upgraded to fractional logistic regression for the KRIS Version 6.0 reduced reduced form equations because fractional logistic regression has succeeded in an “effective challenge” of the ordinary least squares approach.
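The ordinary least squares approach on the transformed variable can be sketched as follows, assuming a single explanatory variable and made-up data. The helper `ols_simple` and the series are hypothetical. The point is only that the regression is fitted to Z[t], not to P[t].

```python
import math


def logit(p):
    """Transform a default probability (decimal) to Z = ln(P / (1 - P))."""
    return math.log(p / (1.0 - p))


def ols_simple(x, y):
    """Closed-form ordinary least squares for y = alpha + beta * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical quarterly default probabilities (decimals) and one macro factor.
pd_series = [0.0004, 0.0006, 0.0011, 0.0030, 0.0090, 0.0045, 0.0015, 0.0007]
macro = [-0.5, -0.2, 0.3, 1.1, 2.0, 1.4, 0.6, 0.0]

z = [logit(p) for p in pd_series]
alpha, beta = ols_simple(macro, z)

# Residuals and root mean squared error are measured on Z[t], not on P[t].
residuals = [zi - (alpha + beta * xi) for xi, zi in zip(macro, z)]
rmse_z = math.sqrt(sum(r * r for r in residuals) / (len(z) - 2))
print(f"alpha={alpha:.4f} beta={beta:.4f} rmse on Z={rmse_z:.4f}")
```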
For purposes of our Wells Fargo & Co. example, we derive the following linear regression for Method 2, a naïve model that is closely related to the common practice assumption that the distribution of future default probabilities is the same as their historical distribution (no matter how the forecast of macro factors may vary). A simple naïve model says the Wells Fargo & Co. transformed three month default probability is a function of its lagged value. The results of such a linear regression, using the Z[t] transformed default probabilities as the dependent variable, are shown here:
The naïve regression explains 54.93% of the variation of Wells Fargo’s transformed 3 month default probabilities on a quarterly basis from January 1990 through September 2015. The error term is implicitly assumed to have a mean of zero and a standard error (deviation) of 0.88595 (note that this is the standard error for the transformed Z[t] variable, not the raw default probability itself). The lagged transformed default probability is statistically significant. When we reverse the transformation of Z[t], the equation we use to simulate the 3 month default probability one quarter forward is this:
Note the error term on the right hand side of the equation. The error term u[t] will be simulated as a normally distributed variable with a mean of zero and a standard deviation of 0.88595. This specification has the same bias that we discuss below in a more general context.
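The one quarter forward simulation can be sketched as below. Only the standard deviation of 0.88595 comes from the text. The intercept `ALPHA`, the slope `BETA`, and the starting default probability are hypothetical placeholders, because the fitted coefficients are not reproduced here.

```python
import math
import random

ALPHA = -1.9      # hypothetical intercept, not the fitted value
BETA = 0.74       # hypothetical coefficient on the lagged Z[t]
SIGMA = 0.88595   # standard deviation of the error term, from the text


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_next_pd(p_now, rng):
    """Draw next quarter's default probability from the naive model."""
    z_next = ALPHA + BETA * logit(p_now) + rng.gauss(0.0, SIGMA)
    return inv_logit(z_next)


rng = random.Random(42)
p0 = 0.001  # hypothetical current 3 month default probability (decimal)

draws = [simulate_next_pd(p0, rng) for _ in range(50_000)]
mean_simulated = sum(draws) / len(draws)

# Value obtained by setting the error term to its mean of zero.
no_error_value = inv_logit(ALPHA + BETA * logit(p0))

print(f"mean of simulated P: {mean_simulated:.6f}")
print(f"P with u set to zero: {no_error_value:.6f}")
```

Under these assumptions the average simulated default probability is noticeably higher than the value obtained by setting the error term to zero. That gap is the bias discussed in general terms later in this appendix.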
We now compare this naïve Method 2 model with Method 3, which uses only macro-economic factors for inputs. For our candidate macro factors, we use the macro-economic factors named by the Board of Governors of the Federal Reserve for their Comprehensive Capital Analysis and Review stress-testing process in 2016. We generate a number of explanatory variables that are “derived variables” from these macro factors. A simple step-wise regression which explains the transformed Wells Fargo & Co. 3 month default probability as a function of these macro factors using this Method 3 (macro factors only) via an ordinary least squares regression is given here:
Note that we have reduced the root mean squared error on the transformed variable Z[t] to 0.6756 and increased the adjusted r-squared to 0.7171, much higher than our naïve model in Method 2. The explanatory variables are the 1 year change in home prices, the VIX volatility index, and the 1 year change in the unemployment rate. The difference between actual and forecasted 3 month Wells Fargo default probabilities is the idiosyncratic component of Wells Fargo & Co. default risk, 28.29% of the total risk of the firm on a transformed basis. Jarrow, Lando and Yu explain that, if our list of candidate macro factors is complete, this idiosyncratic risk of Wells Fargo will be uncorrelated with the idiosyncratic risk of any other firm.
Next, we fit the same equation using fractional regression and the logistic distribution. Unlike the ordinary least squares approach, which fits the transformed variable Z[t], fractional regression fits the time series of the default probability P[t] directly, so the accuracy statistics are not directly comparable. The results of the fractional regression using the macro factor variables as the only candidate variables are shown here:
The fractional regression results reveal that there are in fact 15 explanatory variables, not 3, that are statistically significant when fitting the level of default probabilities directly rather than fitting the transformed variable Z[t]. We can compare the ordinary least squares method and the fractional regression method by generating the fitted default probabilities for both methods and then running a simple ordinary least squares regression of the form
Actual Default Probability = A + B (Fitted Default Probability)
The results for the fractional regression case are shown here:
Measured on a sum of squared errors basis, fractional regression using the logistic distribution gives us an adjusted r-squared of 0.9952 and a root mean squared error of 0.00035 (0.035%). Moreover, the coefficient on the fitted Kamakura default probability is very close to 1 at 1.041806. For the ordinary least squares basis, we get the following results:
The adjusted r-squared for the ordinary least squares case, after transforming the fitted variable Z[t] to get a fitted default probability P[t], is only 0.3650, more than 60 percentage points lower than the fractional regression case. There are two other (related) problems. First, the root mean squared error is more than ten times higher at 0.00403 (0.403%). Second, the coefficient on the fitted default probability is far larger than one at 1.775374. This means that our fitted default probabilities (and the related macro factor sensitivities) are biased low to a significant degree.
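The validation regression of actual on fitted default probabilities can be sketched as follows. The data are made up, and the function name `validation_regression` is hypothetical. The second fitted series is deliberately scaled down to show how fitted values that are too low produce a slope well above one.

```python
import math


def validation_regression(actual, fitted):
    """Regress Actual = A + B * Fitted and return A, B, adjusted r-squared
    and root mean squared error."""
    n = len(actual)
    mean_f, mean_a = sum(fitted) / n, sum(actual) / n
    sxx = sum((f - mean_f) ** 2 for f in fitted)
    sxy = sum((f - mean_f) * (a - mean_a) for f, a in zip(fitted, actual))
    b = sxy / sxx
    a0 = mean_a - b * mean_f
    sse = sum((a - a0 - b * f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean_a) ** 2 for a in actual)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    rmse = math.sqrt(sse / (n - 2))
    return a0, b, adj_r2, rmse


# Hypothetical actual default probabilities (decimals).
actual = [0.0004, 0.0006, 0.0011, 0.0030, 0.0090, 0.0045, 0.0015, 0.0007]

# Two hypothetical sets of fitted values: one exact, one biased low.
fitted_exact = list(actual)
fitted_low = [0.55 * a for a in actual]

a_exact, b_exact, _, _ = validation_regression(actual, fitted_exact)
a_low, b_low, _, _ = validation_regression(actual, fitted_low)
print(f"slope with exact fit: {b_exact:.4f}")
print(f"slope with fitted values biased low: {b_low:.4f}")
```

An unbiased set of fitted values should give an intercept near zero and a slope near one. A slope well above one indicates that the fitted values understate the actual default probabilities.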
This bias is a fatal flaw in the ordinary least squares technique and we reject it for this reason. In the next section we explain the source of the bias, and then proceed to test the incremental accuracy of adding company specific inputs as of time zero. We close this section with a graph of the fitted and actual (green) default probabilities for Wells Fargo & Co. using macro factors as the only explanatory variables. The fitted values using fractional regression are in blue, and the fitted values using ordinary least squares are in red:
Why is the Ordinary Least Squares Estimate of the Reduced Reduced Form Equation Biased?
The bias that stems from transforming the fitted variable Z[t] to a fitted default probability P[t] was noted by Papke and Wooldridge [1996] in their paper proposing fractional regression for fitting a time series variable that always falls in the range from zero to one, like our time series of default probabilities. In this section, we follow Chapter 8 of Stock and Watson’s Introduction to Econometrics (page 277, third edition update, 2015) to explain why the ordinary least squares methodology used in the example above produces biased estimates of the absolute level of default probabilities.
When we reverse the transformation of the ordinary least squares estimates for the default probability P[t], we get the following (with the error term u explicitly included):
Regression estimated:
Inverted relationship for P[t]
Note that we can rewrite this so an exponential term in u[t] is multiplied by the other exponential term in the denominator:
We need an estimate of the mean value of P[t] at each point in time t, conditional on the values of the n explanatory variables X1, X2, …Xn. The expected value of P[t] conditional on the values of the inputs X1, X2, ...Xn is
In our work above, we assumed that
But this assumption is false even when
That is because
For example, consider the case where u[t] is normally distributed with mean m and standard deviation of σ. We know from the appendix to Chapter 4 of Jarrow and Turnbull [2000, second edition, pages 112-113] that the expected value of a lognormally distributed variable can be written
In our example, we have implicitly assumed
This assumption is false, and the result is a biased estimate of P[t]. In KRIS Version 6.0, we avoid this bias by adopting the fractional regression (logistic) approach. We now turn to the 4th modeling option, where we include company specific variables known at time zero and test their significance using a different econometric relationship for each time horizon, from 1 quarter through 13 quarters in CCAR.
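For reference, the steps of the argument in this section can be written out explicitly. This is a sketch that assumes the logistic formula takes the standard form with intercept α and coefficients β_i, and that Z[t] is the logit of P[t]. The sign convention is an assumption.

The regression estimated on the transformed variable is

$$Z[t] = \ln\left(\frac{P[t]}{1 - P[t]}\right) = \alpha + \sum_{i=1}^{n} \beta_i X_i[t] + u[t]$$

Inverting for P[t] and separating the exponential term in u[t] gives

$$P[t] = \frac{1}{1 + \exp\left(-\alpha - \sum_{i=1}^{n} \beta_i X_i[t] - u[t]\right)} = \frac{1}{1 + \exp\left(-\alpha - \sum_{i=1}^{n} \beta_i X_i[t]\right)\exp\left(-u[t]\right)}$$

The quantity needed is the conditional expectation

$$E\left[P[t] \mid X_1, \ldots, X_n\right] = E\left[\frac{1}{1 + \exp\left(-\alpha - \sum_{i=1}^{n} \beta_i X_i[t]\right)\exp\left(-u[t]\right)} \,\middle|\, X_1, \ldots, X_n\right]$$

Dropping the error term amounts to assuming

$$E\left[P[t] \mid X_1, \ldots, X_n\right] = \frac{1}{1 + \exp\left(-\alpha - \sum_{i=1}^{n} \beta_i X_i[t]\right)}$$

which does not hold in general even when E[u[t] | X_1, ..., X_n] = 0, because the exponential is a non-linear function of u[t]:

$$E\left[\exp\left(-u[t]\right)\right] \neq \exp\left(-E\left[u[t]\right]\right) = 1$$

If u[t] is normally distributed with mean m and standard deviation σ, the lognormal mean is

$$E\left[\exp\left(u[t]\right)\right] = \exp\left(m + \tfrac{1}{2}\sigma^2\right)$$

so with m = 0 the term E[exp(-u[t])] equals exp(σ²/2), which exceeds one whenever σ is positive.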
Method 4: Adding Time Zero Company Specific Explanatory Variables
We now turn to Method 4, where we add time zero financial statements as candidate explanatory variables. We focus most heavily on the fractional regression approach, now that the bias in the ordinary least squares approach has been measured and explained. Method 4 (and a strict implementation of Method 2 if the analyst has high standards) works like this. We have a first regression that forecasts the 3 month default probability one quarter forward. This regression uses the financial and stock price inputs lagged one quarter, but the macro factors are their contemporary values because they are “known” as part of the simulation. The second regression forecasts the default probability two quarters ahead using financial and stock price inputs lagged two quarters, but again the macro factors are not lagged. The third regression uses macro factors with the other inputs lagged three quarters and so on. For a simulation of 13 quarters, as in the Federal Reserve’s CCAR stress test program, we need 13 regressions for Wells Fargo & Co. instead of one equation as in Method 3. The first equation for Wells Fargo & Co. using fractional regression for Method 4 looks like this:
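The data alignment behind the 13 regressions can be sketched as follows. The series are made up and the function name `build_horizon_dataset` is hypothetical. The sketch shows only how each forecasting horizon pairs contemporaneous macro factors with company specific inputs lagged by that horizon.

```python
def build_horizon_dataset(pd_series, macro, company, horizon):
    """Return one row per quarter t >= horizon, pairing the default
    probability and macro factor at t with the company input at t - horizon."""
    rows = []
    for t in range(horizon, len(pd_series)):
        rows.append({
            "pd": pd_series[t],                      # dependent variable at t
            "macro": macro[t],                       # macro factor, not lagged
            "company_lagged": company[t - horizon],  # known `horizon` quarters earlier
        })
    return rows


# Hypothetical quarterly series of equal length (values are placeholders).
n_quarters = 40
pd_series = [0.0005 + 0.0001 * (t % 7) for t in range(n_quarters)]
macro = [0.1 * t for t in range(n_quarters)]
company = [100.0 + t for t in range(n_quarters)]

# One dataset, and hence one regression, per horizon from 1 to 13 quarters.
datasets = {
    h: build_horizon_dataset(pd_series, macro, company, h)
    for h in range(1, 14)
}

for h in (1, 2, 13):
    print(f"horizon {h}: {len(datasets[h])} observations")
```

Longer horizons lose observations at the start of the sample, which is one practical cost of fitting a separate equation for each forecasting horizon.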
There are 6 variables with a prefix of “l1” (for lagged one quarter) that are statistically significant, including an excess return variable, a stock price volatility variable, three accounting ratios, and a stock price ratio. Because these variables are statistically significant but were omitted from the “macro factors only” specification in Method 3, we know that there is omitted variable bias at least at a 1 quarter forecasting horizon. We compare our fitted fractional regression estimates of the default probability with an ordinary least squares estimate in this log scale graph:
The results for forecasting default probabilities two quarters ahead using macro factors and company specific inputs lagged two quarters are given here:
Again, six company specific variables are statistically significant, and four of those six were also significant at a one quarter forecasting time horizon. The log scale graph of fitted versus actual default probabilities two quarters forward is shown here:
In the 13th quarter, we use financial ratios and stock price inputs that are lagged 13 quarters. The results look like this:
There are 4 time zero variables that remain statistically significant: company size, stock price volatility, the net income to total assets ratio, and the KRIS statistically implied credit rating ordinal variable. We now compare methods 3 and 4 for both the fractional regression and ordinary least squares method.
Comparing Method 3 and Method 4
The graph below shows the adjusted r-squared for both the fractional regression approach and the ordinary least squares approach using two different sets of explanatory variables: macro factors only, and macro factors plus time zero company specific inputs. The adjusted r-squareds are from the regression specification
Actual Default Probability = A + B (Fitted Default Probability)
Here are the results:
Both fractional regression specifications overwhelmingly dominate the ordinary least squares specifications for the reasons explained above. The differential between the fractional regression equation with both macro factors and time zero company specific inputs and the fractional regression with macro factors only is not distinguishable for Wells Fargo & Co., but it is very significant for some firms, like PNC Financial Services Group Inc. shown below:
The results confirm that using company specific inputs does add incremental accuracy for most firms, but at a cost of adding a different (but related) regression for each forecasting horizon.
IMPORTANT CONCLUSION: Fractional regression (logistic) succeeds in an effective challenge of a more conventional approach, using ordinary least squares on the transformed default probabilities. Fractional regression is clearly the best practice approach from an accuracy point of view.
We now turn to the implied movements in default probabilities using the actual Federal Reserve CCAR scenarios and the historical statistical relationships for Wells Fargo & Co.
Implied CCAR Stress Test Results for Wells Fargo & Co.
Using both macro factors and time zero company specific inputs for the stress test gives the following default probability movements for Wells Fargo & Co. using the CCAR 2016 scenarios and a log scale:
A regular scale shows the same default probability movements, and it makes it clear that history implies a high degree of default risk for Wells Fargo & Co. in the Federal Reserve’s severely adverse scenario (in red). The stress test results that an insider would generate, using full information of all assets and liabilities of the firm, will of course be different. The degree of difference depends on the degree to which the future for the firm is like its past.
Conclusion
We have conducted an effective challenge of the commonly used ordinary least squares approach for linking macro-economic factors to historical default probability movements. The effective challenge is provided by fractional regression using the logistic distribution. We conclude that fractional regression is much more accurate than the ordinary least squares approach. We also show why the ordinary least squares approach is a biased estimator of the historical default probabilities. We conclude that the fractional regression approach is the best practice estimator of default probabilities for stress testing purposes. We illustrate its use on historical three month default probabilities for Wells Fargo & Co.
References
Angrist, Joshua D. and Jorn-Steffen Pischke, Mostly Harmless Econometrics: An Empiricist’s Companion, Princeton University Press, Princeton, 2009.
Baum, Christopher F. “Modeling Proportions,” The Stata Journal, Volume 8, Number 2, 299-303, 2008.
Bharath, Sreedhar and Tyler Shumway, "Forecasting Default with the Merton Distance to Default Model," Review of Financial Studies, May 2008, pp. 1339-1369.
Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, The Econometrics of Financial Markets, Princeton University Press, 1997.
Goldberger, Arthur S. A Course in Econometrics, Harvard University Press, 1991.
Hamilton, James D. Time Series Analysis, Princeton University Press, 1994.
Hansen, Bruce E. Econometrics, University of Wisconsin, January 15, 2015.
Jarrow, Robert, Donald R. van Deventer and Xiaoming Wang, “A Robust Test of Merton’s Structural Model for Credit Risk,” Journal of Risk, fall 2003, pp. 39-58.
Johnston, J. Econometric Methods, McGraw-Hill, 1972.
Maddala, G. S. Introduction to Econometrics, third edition, John Wiley & Sons, 2005.
Papke, Leslie E. and Jeffrey M. Wooldridge, “Econometric Methods for Fractional Response Variables with an Application to 401(K) Plan Participation Rates,” Journal of Applied Econometrics, Volume 11, 619-632, 1996.
Stock, James H. and Mark W. Watson, Introduction to Econometrics, third edition, Pearson/Addison Wesley, 2015.
Studenmund, A. H. Using Econometrics: A Practical Guide, Addison-Wesley Educational Publishers, 1997.
Theil, Henri. Principles of Econometrics, John Wiley & Sons, 1971.
Wooldridge, Jeffrey M. Econometric Analysis of Cross Section and Panel Data, The MIT Press, 2002.