20  Kelly Betting

Author

Rahul Shah, Jean-Stanislas Denain, Jacob Steinhardt

In this lecture, we will describe Kelly Betting, which is a reasonable heuristic to decide how much of your money to invest in an asset that is profitable in expectation, but risky.

As a first example, consider the following situation:

Coin Flip

You pay \(n\) dollars, and a fair coin is flipped:

  • if the outcome is Heads, you receive \(3n\) dollars: your net earnings are \(2n\) dollars
  • if the outcome is Tails, you receive nothing: your net earnings are \(-n\) dollars

Question: What value of \(n\) do you choose?

This is a good bet in expectation: your expected wealth is \(\frac 1 2 \times (3n) + \frac 1 2 \times (-n) = +n\) dollars. Therefore, if you wanted to maximize your expected amount of dollars, you should choose the highest possible value of \(n\) and bet all of your money.

However, this is very risky: if you bet all of your money, you will lose everything with probability \(\frac 1 2\), if the coin comes up Tails. Thus, most of you are unlikely to bet all of your money, but rather some fraction of your money. The Kelly Criterion provides a useful rule of thumb to choose this fraction.

To introduce Kelly betting, we will first introduce a sequential version of the Coin Flip situation above, and show how maximizing the most likely or median value of your wealth can lead to more reasonable choices decisions than maximizing your expected wealth. Then, we will introduce the Kelly Criterion in a more general setting. Finally, we will explain how to generalize Kelly betting to situations where you have to choose between multiple bets.

20.1 Maximizing the typical value of your wealth

Consider the following iterated version of the Coin Flip example above:

Iterated Coin Flip

There are 100 days. Every day, you get the same offer:

  • You pay some fraction \(\alpha\) of your current wealth
  • A fair coin is flipped
    • if the outcome is Heads, you receive \(3\) times the amount you paid
    • if the outcome is Tails, you receive nothing

We make the simplifying assumption that \(\alpha\) is the same for every day: you have to choose it once and for all.

Question: What fraction \(\alpha\) of your current wealth should you bet on each day?

Let \(x\) be your wealth at the beginning of the first day. After the first coin flip, your wealth becomes:

  • \(x - \alpha x + 3 \alpha x = (1 + 2 \alpha) x\) if the coin comes up Heads
  • \(x - \alpha x = (1 - \alpha) x\) if the coin comes up Tails

After 100 days, your wealth is

\[ (1 + 2 \alpha)^k (1 - \alpha)^{100 - k} x, \text{ with } k \sim \text{ Binomial}(n = 100, p = 1/2). \]

As in the previous example, the expected value of this quantity is greatest when \(\alpha = 1\): to maximize your expected wealth, you should bet all of your money every day. However, this means that every day you have a probability of one half of losing all of your money. Even though you have a tiny chance of becoming enourmously rich, you will be ruined with overwhelming probability \(1 - (1 / 2)^{100}\).

Instead of looking at the expectation of your wealth after 100 days, you could focus on its value in the typical case, by considering the modal or median value of your wealth. This corresponds to taking \(k = 50\), hence a typical wealth of

\[ [(1 + 2 \alpha)(1 - \alpha)]^{50}. \]

You can then maximize this modal value over \(\alpha\): this is tantamount to finding the maximum of the second degree polynomial \((1 + 2 \alpha)(1 - \alpha)\), which is reached at \(\alpha = \frac 1 4\), leading to a median wealth of \(\left(\frac 9 8\right)^{50}\).

Let us now consider a different bet:

Iterated Biased Coin Flip

Once again, there are 100 days, and every day you get the same offer:

  • You pay some fraction \(\alpha\) of your current wealth
  • With probability \(\frac 2 3\), you receive twice the amount you paid
  • With probability \(\frac 1 3\), you receive nothing

Question: What fraction \(\alpha\) of your current wealth should you bet on each day?

The same reasoning as before shows that your wealth after 100 days is

\[ (1 + \alpha)^k (1 - \alpha)^{100 - k}, \text{ with } k \sim \text{ Binomial}(n = 100, p = 2/3). \]

The median and most likely value of this random variable is \((1 + \alpha)^{\frac 2 3 \cdot 100} (1 - \alpha)^{\frac 1 3 \cdot 100}\). So we want to find \(\alpha \in [0, 1]\) that maximizes the cubic \((1 + \alpha)^2 (1 - \alpha)\): this corresponds to \(\alpha = \frac 1 3\). This makes us approximately 290x our wealth!

In both these examples, we saw how maximizing the modal or median value of your wealth leads to more intuitive levels of risk-taking than maximizing the expected value. In the next section, we will introduce the Kelly Criterion in a more general setting.

Code
import numpy as np
import numpy.random as rnd
import matplotlib.pyplot as plt

def simulate(choices, probs, alpha, time_steps=100, num_trials=1000):
    def outcome():
        return rnd.choice(choices, size=num_trials, p=probs)
    wealth = np.ones(num_trials)
    for t in range(time_steps):
        wealth = wealth + alpha * wealth * outcome()
    theoretical_wealth = np.exp(time_steps * np.sum(np.array(probs) * np.log(1 + alpha * np.array(choices))))
    return wealth, theoretical_wealth

def make_plot(wealth, theoretical_wealth, bins=np.geomspace(1e-4, 1e6, num=25)):
    print('Median payoff: %.2e (Theory = %.2e)' % (np.median(wealth), theoretical_wealth))
    plt.hist(wealth, bins=bins)
    plt.axvline(x = theoretical_wealth, color = 'r', label = 'Theory')
    plt.semilogx()
    plt.show()
    
choices = [-1, 2]
probs = [0.5, 0.5]
wealth, theoretical_wealth = simulate(choices, probs, 0.8)
make_plot(wealth, theoretical_wealth)
Median payoff: 6.31e-15 (Theory = 6.31e-15)

Code
choices = [-1, 1]
probs = [1.0/3, 2.0/3]
wealth, theoretical_wealth = simulate(choices, probs, -10)
make_plot(wealth, theoretical_wealth)
/var/folders/1d/cjs9g5m16s957s_0r5w_7pcc0000gn/T/ipykernel_83400/2423885937.py:11: RuntimeWarning: invalid value encountered in log
  theoretical_wealth = np.exp(time_steps * np.sum(np.array(probs) * np.log(1 + alpha * np.array(choices))))
Median payoff: 3.28e+97 (Theory = nan)

Code
choices = 2   ** np.array([-np.inf, 0, 1, 2, 3, 4, 5, 6, 7]) - 1
probs   = 0.5 ** np.array([1, 2, 3, 4, 5, 6, 7, 8, 8])
wealth, theoretical_wealth = simulate(choices, probs, 0.12)
make_plot(wealth, theoretical_wealth, bins=np.geomspace(1e-5, 1e10, num=25))
Median payoff: 2.93e+01 (Theory = 3.66e+01)

20.2 Kelly Betting: the general case

20.2.1 Maximizing the Expected Logarithm

Suppose that:

  • Your initial wealth is \(S\) dollars
  • Every day \(t\):
    • You invest some fraction \(\alpha\) of your current wealth
    • You receive \(X_t\) times the amount you paid

Moreover, we make the following assumptions:

  • The \(X_t\) are sampled independently from the same distribution: \(X_t \overset{iid}{\sim} p\)
  • \(\alpha X_t > -1\) almost surely: you cannot lose more than what you invested

Then, after \(T\) days, your wealth is:

\[ S (1 + \alpha X_1)(1 + \alpha X_2) \dots (1 + \alpha X_T). \]

In general, we understand the limiting behavior of sums of random variables better than the limiting behavior of products. This motivates taking the logarithm, and forming the average:

\[ \frac 1 T \sum\limits_{t=1}^T \log \left(1 + \alpha X_t\right). \]

As \(T\) tends to infinity, we can apply the Law of Large Numbers to this average, which converges almost surely to the expected logarithm:

\[ \mathbb{E}_{X \sim p}\left[ \log (1 + \alpha X) \right]. \]

We can then pick \(\alpha\) to maximize this expectation, which is called the Kelly Criterion.

20.2.2 The Kelly Criterion for binary bets

The following bet is a common use case of the Kelly Criterion:

  • With probability \(p\), you win and are returned \(b + 1\) times your investment, so you net \(b\) times your investment.
  • With probability \(q = 1- p\), you lose and receive nothing.

In this case, the expected logarithm is

\[ p \log \left(1 + b\alpha\right) + q \log \left(1 - \alpha\right) \]

The derivative vanishes when:

\[ \frac {p b} {1 + b \alpha} - \frac {q} {1 - \alpha} = 0, \]

which is equivalent to:

\[ \alpha = p - \frac q b. \]

This fraction increases when when your probability of winning \(p\) increases, and when \(b\) increases. It vanishes when \(b = q / p\): in this case, the bet has zero expectation and you have no edge, so the Kelly criterion recommends betting nothing.

As an example, consider the Iterated Biased Coin Flip example from the previous section. In that case, \(p = 2/3\), \(q = 1/3\) and \(b = 1\), thus the Kelly criterion suggests \(\alpha = 1/3\). This is the same as the result obtained by maximizing the median value of the wealth.

20.3 Choosing between multiple bets

So far, we have considered cases where there is only one possible bet: you can either invest your money in this bet, or keep it. However, a more realistic situation is that at any point in time you have to choose between multiple bets. In these cases, how should you split your money between these bets?

For example, suppose you could invest a fraction \(\alpha_1\) of your wealth in a first bet with return \(X^{(1)} \sim p_1\), and a fraction \(\alpha_2\) of your wealth in a second bet with return \(X^{(2)} \sim p_2\). Using the same reasoning as the previous section, you could maximize the following quantity over \((\alpha_1, \alpha_2)\):

\[ \mathbb{E}_{X^{(1)} \sim p_1, X^{(2)} \sim p_2}\left[ \log \left(1 + \alpha_1 X^{(1)} + \alpha_2 X^{(2)}\right) \right]. \]

However, this optimization problem often becomes intractable as the number of possible bets becomes large. For example, think about the enormous number of possible investments a trader can make at any point in time.

One idea could be to use the Taylor approximation \(\log(1 + z) \simeq z - \frac {z^2} 2\). For a single bet, this yields:

\[ \mathbb{E}_{X \sim p}\left[ \log (1 + \alpha X) \right] \simeq \alpha \mathbb{E}(X) - \frac {\alpha^2} 2 \mathbb{E}(X^2), \]

which is maximized by \(\alpha = \frac {\mathbb{E}(X)} {\mathbb{E}(X^2)}.\) Intuitively, this approximation favors bets whose returns have a high expected value but a low variance.

When choosing between \(n\) bets with returns \(X^{(i)} \sim p_i\), a first simpler case is when these bets are all independent. Then, the same reasoning applies to each bet independently, and

\[ \alpha_i \propto \frac {\mathbb{E}(X^{(i)})} {\mathbb{E}(X^{(i)^2})}. \]

The formula becomes more complicated when the \(X^{(i)}\) are no longer independent:

\[ \mathbb{E}_{\forall i, X^{(i)} \sim p_i}\left[ \log \left(1 + \sum\limits_{i=1}^n \alpha_i X_i\right) \right] \simeq \alpha^\top \mathbb{E}(X) - \frac 1 2 \alpha^\top \mathbb{E}(XX^\top) \alpha, \]

which is maximized by \(\alpha = \mathbb{E}(XX^\top)^{-1} \mathbb{E}(X)\) (assuming that \(\mathbb{E}(XX^\top)\) is invertible). Qualitatively, this approximation favors diversification, by assigning greater fractions \(\alpha_i\) to bets that are less dependent on the others.