16 Independence Assumptions

Author

Jean-Stanislas Denain, Jacob Steinhardt, Rahul Shah

16.1 Warm-up Exercise

What is the probability that I (Jacob) eat cereal for breakfast tomorrow?

Students responded with:

What is the probability that I eat cereal at least once in the next week?

and:

respectively.

You’ll note that students assumed these would not be independent (except for maybe the last group).

The reasoning is that if you have cereal one day, you are more likely to run out of cereal by the next day, or to want some variety in your diet and have a different breakfast.

We can condition on different outcomes by defining: \[ Z_i = \begin{cases} 1 \text{ if Jacob likes cereal} \\ 0 \text{ otherwise} \end{cases} \] and calculate probabilities!
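As a minimal sketch of this conditioning, the snippet below compares a naive forecast that treats the seven days as independent with one that conditions on the latent variable. All numbers (`p_likes`, `p_day_if_likes`, `p_day_otherwise`) are hypothetical values chosen for illustration; none come from the lecture.

```python
# Hypothetical numbers for illustration only; none come from the lecture.
p_likes = 0.5          # assumed P(Z = 1): Jacob likes cereal
p_day_if_likes = 0.4   # assumed daily P(cereal | Z = 1)
p_day_otherwise = 0.0  # assumed daily P(cereal | Z = 0)
days = 7

# Marginal probability of cereal tomorrow (law of total probability).
p_tomorrow = p_likes * p_day_if_likes + (1 - p_likes) * p_day_otherwise

# Naive forecast: treat the 7 days as independent draws at the marginal rate.
p_week_naive = 1 - (1 - p_tomorrow) ** days

# Latent-variable forecast: days are independent only *given* Z.
p_week_latent = (
    p_likes * (1 - (1 - p_day_if_likes) ** days)
    + (1 - p_likes) * (1 - (1 - p_day_otherwise) ** days)
)

print(f"P(cereal tomorrow)          = {p_tomorrow:.3f}")
print(f"P(at least once), naive     = {p_week_naive:.3f}")
print(f"P(at least once), latent Z  = {p_week_latent:.3f}")
```

Under these assumed numbers the weekly probability can never exceed `p_likes`, so the latent-variable forecast is noticeably lower than the naive one even though both agree on tomorrow's probability.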

Exercise: Calculate probabilities with \[Z_i = \begin{cases}2 \text{ if Jacob likes cereal a lot} \\ 1 \text{ if Jacob likes cereal, but will get sick of it after awhile} \\ 0 \text{ otherwise}\end{cases}\] for a better forecast!

16.2 Other Non-Independence Examples

16.2.1 Latent Variables

With each of the following examples, we consider specific latent variables that induce dependence between events:

Covid lockdowns per state
- Are some states more politically inclined to do/not do lockdowns?
- Is there herd behavior? When one state does something big, other states may follow to save face.
Rain in next week vs. on Wednesday
Celtics losing next game to Nets vs. next 3 games
- Injuries
- If they lose, they will try harder (team rivalries)
- How well have they been playing this season (team chemistry)
- Psychological factors (if they start off on a bad foot, they will be playing catch-up, which may make them more likely to do worse)
- Tiredness of players (often seen with back-to-back games or long-distance travel)

Brainstorming: other examples?

Flights leaving airport in a week -> bad weather could make all days have few flights -> border closure
Performance of overall stock market (and correlation across different stock markets)

Think about some yourself!

16.3 Two Consequential Wrong Predictions

2008 financial crisis
2016 US presidential election

Both were (partly) failures to account for non-independence!

16.3.1 2016 Election – A Simple Model

Each state \(\mathrm{s}\) has \(\mathrm{N}_{\mathrm{s}}\) polls conducted in that state
Poll \(i\) in state \(s\) has sample size \(n_{i, s}\) and \(k_{i, s}\) respondents who will vote for Clinton
Aggregate polls together - total margin of error is approximately \(\frac{1}{\sqrt{\text{total sample size}}}\).
Assume each state’s vote share has Gaussian error around the polling results
Simulate draws of all 50 states, look at how often Clinton wins across many different draws
This gives \(>99\%\) probability to Clinton winning

Even if we change the distribution used to model each state's error from Gaussian to the heavier-tailed Student-\(t\) distribution, we still give \(>99\%\) probability to Clinton winning.
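To see why the independence assumption matters more than the choice of error distribution, here is a rough simulation sketch. The state list, margins, and standard deviations are invented for illustration and are not real 2016 polling data. The `shared_sd` term (an error common to all states) is an assumption added here to illustrate non-independence; it is not part of the simple model described above.

```python
import random

# Hypothetical toy map: (electoral votes, polled Clinton margin in points).
# These numbers are invented for illustration, not real 2016 polling data.
STATES = [(20, 3.0), (16, 2.5), (10, 3.5), (29, 2.0), (18, 3.0),
          (15, 2.5), (11, 4.0), (6, 2.0), (10, 3.0), (13, 3.5)]
TOTAL_EV = sum(ev for ev, _ in STATES)


def clinton_win_prob(state_sd, shared_sd, n_sims=20000, seed=0):
    """Fraction of simulations in which Clinton wins a majority of EVs.

    state_sd:  std. dev. of the independent per-state error (points)
    shared_sd: std. dev. of an error common to all states (points);
               this shared term is an illustrative assumption
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # One draw of the shared error affects every state in this simulation.
        shared = rng.gauss(0.0, shared_sd) if shared_sd > 0 else 0.0
        ev = sum(e for e, margin in STATES
                 if margin + shared + rng.gauss(0.0, state_sd) > 0)
        wins += ev > TOTAL_EV / 2
    return wins / n_sims


p_independent = clinton_win_prob(state_sd=2.0, shared_sd=0.0)
p_correlated = clinton_win_prob(state_sd=2.0, shared_sd=3.0)
print(f"independent state errors: {p_independent:.3f}")
print(f"with shared polling error: {p_correlated:.3f}")
```

With independent errors, a loss requires many states to miss in the same direction by chance, which almost never happens. A single shared error moves all states together, so the win probability drops substantially in this toy setup.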

16.3.2 2008 Financial Crisis

Basic background: many people were offered “subprime” (i.e. risky) mortgages on their houses. In other words, they were offered loans with an (initially) low interest rate despite not having strong finances.
- Rates increased over time, but housing values were also increasing.
The loans were partly financed through retirement accounts.
But retirement accounts are supposed to be low risk – so how was this possible?

16.3.2.1 Collateralized Debt Obligations (CDOs)

CDOs are a way to turn risky financial instruments (bets) into a less risky bet
Simplest way to reduce risk: take \(N\) bets (mortgages) and average
But can do better, with tranches, ranked from senior to junior
If a mortgage defaults, most junior tranches take losses first

The main idea was that senior tranches should be very unlikely to ever take heavy losses.

These risk assessments assumed mortgage defaults were independent, or at least not too correlated. But if national housing prices dropped, many people would default at once. Even senior tranches might not pay out.
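The effect of correlated defaults on a senior tranche can be sketched with a small simulation. Everything here is a simplifying assumption: 100 equal-sized mortgages, a default counted as a total loss, a senior tranche that is hit once more than 30% of the pool defaults, and a latent "housing crash" variable with made-up probabilities chosen so that the marginal default rate is 10% in both scenarios.

```python
import random


def senior_hit_prob(p_crash, p_default_crash, p_default_normal,
                    n_mortgages=100, attach=0.30, n_sims=10000, seed=0):
    """Estimate P(fraction of defaults in the pool exceeds `attach`).

    A latent housing crash occurs with probability p_crash and sets the
    per-mortgage default probability for that whole simulation.
    All parameter values used below are hypothetical.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        crash = rng.random() < p_crash
        p = p_default_crash if crash else p_default_normal
        # Given the latent state, mortgages default independently.
        defaults = sum(rng.random() < p for _ in range(n_mortgages))
        hits += defaults / n_mortgages > attach
    return hits / n_sims


# Both scenarios have the same 10% marginal default rate:
#   independent: 0.10 always
#   correlated:  0.10 * 0.55 + 0.90 * 0.05 = 0.10
p_indep = senior_hit_prob(p_crash=0.0, p_default_crash=0.10,
                          p_default_normal=0.10)
p_corr = senior_hit_prob(p_crash=0.10, p_default_crash=0.55,
                         p_default_normal=0.05)
print(f"senior tranche hit, independent defaults: {p_indep:.4f}")
print(f"senior tranche hit, correlated defaults:  {p_corr:.4f}")
```

Averaging over many mortgages only protects the senior tranche when defaults are close to independent. In this sketch, the same average default rate produces a senior-tranche loss roughly as often as the assumed crash itself occurs.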

This happened, and highly leveraged investment banks collapsed (+ many other bad things).

16.4 Gaussian Copulas

Unfortunately, real life isn’t always Gaussian; it is often closer to a Bernoulli distribution (you either get back everything or you get back nothing).

Consider a random vector \(\left(X_1, X_2, \ldots, X_d\right)\). Suppose its marginals are continuous, i.e. the marginal CDFs \(F_i(x)=\operatorname{Pr}\left[X_i \leq x\right]\) are continuous functions. By applying the probability integral transform to each component, the random vector \[ \left(U_1, U_2, \ldots, U_d\right)=\left(F_1\left(X_1\right), F_2\left(X_2\right), \ldots, F_d\left(X_d\right)\right) \] has marginals that are uniformly distributed on the interval \([0,1]\). The copula of \(\left(X_1, X_2, \ldots, X_d\right)\) is defined as the joint cumulative distribution function of \(\left(U_1, U_2, \ldots, U_d\right)\) : \[ C\left(u_1, u_2, \ldots, u_d\right)=\operatorname{Pr}\left[U_1 \leq u_1, U_2 \leq u_2, \ldots, U_d \leq u_d\right] \]

The main idea behind Gaussian copulas is to simulate the correlation structure you want with a Gaussian and then apply a transform to each component to get the marginal shape you want.
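A minimal sketch of this recipe for two variables, using only the Python standard library: draw correlated standard Gaussians, push each through the standard normal CDF to get uniform marginals, then threshold the uniforms to obtain Bernoulli outcomes. The correlation `rho = 0.6` and default probability `p_default = 0.1` are hypothetical values, and the function names are illustrative.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and std. dev. 1


def gaussian_copula_pair(rho, rng):
    """Draw (U1, U2) with uniform marginals and Gaussian-copula dependence."""
    z1 = rng.gauss(0.0, 1.0)
    # z2 is standard normal with correlation rho to z1.
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    # Probability integral transform: Phi(Z) is uniform on [0, 1].
    return STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)


def sample_defaults(rho, p_default, n, seed=0):
    """Turn each uniform into a Bernoulli(p_default) outcome by thresholding."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u1, u2 = gaussian_copula_pair(rho, rng)
        pairs.append((int(u1 < p_default), int(u2 < p_default)))
    return pairs


pairs = sample_defaults(rho=0.6, p_default=0.1, n=50000)
rate_1 = sum(a for a, _ in pairs) / len(pairs)
rate_2 = sum(b for _, b in pairs) / len(pairs)
joint = sum(a and b for a, b in pairs) / len(pairs)
print(f"marginal default rates: {rate_1:.3f}, {rate_2:.3f}")
print(f"joint default rate:     {joint:.3f} (independence would give 0.010)")
```

Each marginal stays at the chosen Bernoulli rate, while the joint default rate is well above the product of the marginals, which is the dependence inherited from the underlying Gaussian.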

Even though this gets the correlation right, it gets the tails of the distribution wrong. Thus, in practice, \(t\)-copulas are often used instead of Gaussian copulas.

16.4.1 Tail dependence

Despite correlation in the bulk, Gaussian copulas do not have tail dependence: an extreme value in one variable does not imply extreme values in the others.
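One rough way to see this numerically is to estimate how often the second variable is in its top 1% given that the first one is, for a Gaussian dependence structure versus a Student-\(t\) one with the same correlation parameter. The sketch below uses empirical quantiles, so the result depends only on the copula and not on the marginals. The values `rho = 0.5`, `nu = 3`, and the 99% threshold are assumptions chosen for illustration.

```python
import math
import random


def correlated_normals(rho, rng):
    """Two standard normals with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    return z1, z2


def sample_gaussian(rho, n, seed=0):
    rng = random.Random(seed)
    return [correlated_normals(rho, rng) for _ in range(n)]


def sample_student_t(rho, nu, n, seed=0):
    """Bivariate t: correlated normals divided by a shared chi-square scale."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z1, z2 = correlated_normals(rho, rng)
        # chi-square(nu) is Gamma(shape=nu/2, scale=2).
        s = math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)
        out.append((z1 / s, z2 / s))
    return out


def upper_tail_coexceedance(samples, q):
    """Empirical P(X2 above its q-quantile | X1 above its q-quantile)."""
    k = int(q * len(samples))
    t1 = sorted(x for x, _ in samples)[k]
    t2 = sorted(y for _, y in samples)[k]
    n_first = sum(x > t1 for x, _ in samples)
    n_both = sum(x > t1 and y > t2 for x, y in samples)
    return n_both / n_first


n = 100000
tail_gauss = upper_tail_coexceedance(sample_gaussian(0.5, n), q=0.99)
tail_t = upper_tail_coexceedance(sample_student_t(0.5, 3, n), q=0.99)
print(f"Gaussian copula:  P(extreme 2 | extreme 1) ~ {tail_gauss:.2f}")
print(f"t-copula (nu=3):  P(extreme 2 | extreme 1) ~ {tail_t:.2f}")
```

For the Gaussian case this conditional probability keeps shrinking toward zero as the threshold `q` moves closer to 1, whereas for the \(t\) case it levels off at a positive value. That limiting value is what tail dependence refers to.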