Module 0.2: Probability, Statistics & Risk Basics

Investment success requires thinking in probabilities, not certainties. Warren Buffett describes risk as permanent loss of capital, not volatility, yet most investors conflate the two. This module equips you with the statistical language and reasoning framework used by professional investors to understand, measure, and manage risk. We begin with descriptive statistics (measures of return and volatility) and advance to probability distributions, correlation, and Bayesian thinking.

Lesson 1: Descriptive Statistics: Mean, Median, Standard Deviation

The Mean (Average) Return

The mean (or average) return is the sum of all returns divided by the number of periods. It is the most common measure of expected return.

Suppose a stock had annual returns of: +15%, +8%, -5%, +22%, +3%, -2%, +18%, +12%

The mean return is: (15 + 8 - 5 + 22 + 3 - 2 + 18 + 12) / 8 = 71 / 8 = 8.875%

This is the "average" return you earned per year. If you invested $10,000 at the beginning and earned 8.875% annually for 8 years:

$10,000 × (1.08875)^8 ≈ $19,744

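But did you actually earn that rate each year? No: you earned +15%, then +8%, and so on. Compounding the actual yearly returns turns $10,000 into about $19,203, less than the mean rate implies, because volatility drags compound growth below the arithmetic mean. The mean is useful for comparison but can be misleading when returns are volatile.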
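The gap between the mean rate and the realized growth can be checked directly. The following is a minimal Python sketch using the eight returns above; the variable names are illustrative and not part of the original lesson.

```python
from math import prod

# Annual returns from the example above, expressed as decimals
returns = [0.15, 0.08, -0.05, 0.22, 0.03, -0.02, 0.18, 0.12]
start = 10_000

# Arithmetic mean: sum of returns divided by the number of periods
mean_return = sum(returns) / len(returns)

# Ending value if the mean rate were earned every single year
value_at_mean = start * (1 + mean_return) ** len(returns)

# Ending value when the actual year-by-year returns are compounded
value_actual = start * prod(1 + r for r in returns)

# Geometric mean: the constant annual rate that reproduces the actual ending value
geometric_mean = (value_actual / start) ** (1 / len(returns)) - 1

print(f"Arithmetic mean return: {mean_return:.3%}")
print(f"Ending value at the mean rate: ${value_at_mean:,.0f}")
print(f"Ending value at actual returns: ${value_actual:,.0f}")
print(f"Geometric mean return: {geometric_mean:.3%}")
```

The geometric mean comes out below the arithmetic mean whenever returns vary from year to year, which is why compounding the actual series ends lower than compounding the average.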

The Median Return

The median is the middle value when returns are sorted in order. Using the previous example, sorted: -5%, -2%, +3%, +8%, +12%, +15%, +18%, +22%

With 8 values, the median is the average of the 4th and 5th values: (8 + 12) / 2 = 10%

The median return (10%) is higher than the mean return (8.875%) because the -5% and -2% outliers on the low end pull the mean down. In skewed distributions, the median is often more representative of a "typical" year.

Mean vs. Median: Suppose a portfolio returns -50% one year and +100% the next. The mean is +25%, and with only two observations the median is +25% as well. Yet $100 invested would fall to $50 after year one and return to $100 after year two: back to the start, a compound return of 0%. Neither the mean nor the median captures this. The geometric (compound) average does, which makes it the more honest measure of what an investor actually earned over a volatile period.

Standard Deviation and Variance

Variance measures how spread out returns are from the mean. Standard deviation is the square root of variance and is expressed in the same units as returns (percentage points).

Using our previous 8-year return series (mean = 8.875%):

| Year | Return | Deviation from Mean | Squared Deviation (%²) |
|------|--------|---------------------|------------------------|
| 1 | 15% | 6.125% | 37.5 |
| 2 | 8% | -0.875% | 0.77 |
| 3 | -5% | -13.875% | 192.5 |
| 4 | 22% | 13.125% | 172.3 |
| 5 | 3% | -5.875% | 34.5 |
| 6 | -2% | -10.875% | 118.3 |
| 7 | 18% | 9.125% | 83.3 |
| 8 | 12% | 3.125% | 9.8 |

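Variance = (37.5 + 0.77 + 192.5 + 172.3 + 34.5 + 118.3 + 83.3 + 9.8) / 8 ≈ 81.1 (in squared percentage points)

Standard Deviation = √81.1 ≈ 9.0%

Dividing by 8 gives the population variance. Dividing by 7 (n - 1) instead gives the sample variance, which is the usual choice when the eight years are treated as a sample of a longer history.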

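This means returns typically deviated from the mean by about 9.0 percentage points (strictly, this is the root-mean-square deviation rather than the simple average of the deviations). A stock with a standard deviation of 9% is more volatile than one with 5% but less volatile than one with 15%.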
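The same statistics can be reproduced with Python's standard `statistics` module. This is a sketch for checking the arithmetic, not a prescribed workflow; it also shows the sample standard deviation (dividing by n - 1) for comparison.

```python
import statistics

# The eight annual returns, in percentage points
returns_pct = [15, 8, -5, 22, 3, -2, 18, 12]

mean = statistics.fmean(returns_pct)               # arithmetic mean
median = statistics.median(returns_pct)            # average of the two middle values
pop_variance = statistics.pvariance(returns_pct)   # sum of squared deviations / n
pop_std = statistics.pstdev(returns_pct)           # square root of population variance
sample_std = statistics.stdev(returns_pct)         # uses n - 1 in the denominator

print(f"Mean: {mean:.3f}%  Median: {median:.1f}%")
print(f"Population variance: {pop_variance:.2f}  Population std dev: {pop_std:.2f}%")
print(f"Sample std dev: {sample_std:.2f}%")
```

The population functions (`pvariance`, `pstdev`) match a calculation that divides by the number of observations; `stdev` is slightly larger because it divides by one fewer.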

Real Example: Apple vs. Coca-Cola Return Volatility

Over the past 10 years (simplified):

Apple: mean return 21% annually, standard deviation 35%
Coca-Cola: mean return 12% annually, standard deviation 16%

Apple offered higher returns (21% vs. 12%) but with much higher volatility (35% vs. 16%). An investor seeking lower volatility would prefer Coca-Cola despite lower average returns. One seeking growth might accept Apple's volatility for its upside potential.

This is the core trade-off in investing: higher returns require accepting higher volatility (risk).

Lesson 2: Probability Distributions and Market Behavior

The Normal (Gaussian) Distribution

The normal distribution is bell-shaped: most observations cluster near the mean, with fewer observations in the tails (extreme values). Many natural phenomena follow normal distributions.

For a normally distributed variable with mean μ and standard deviation σ:

68% of observations fall within 1σ of the mean
95% fall within 2σ of the mean
99.7% fall within 3σ of the mean

In investing, suppose a stock has an expected return of 12% and standard deviation of 20%. Under a normal distribution:

68% probability the return is between -8% and +32% (12% ± 20%)
95% probability the return is between -28% and +52% (12% ± 40%)
5% probability the return is below -28% or above +52%

This is useful for understanding typical ranges, but the normal distribution has a critical flaw: it underestimates the frequency of extreme events.
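These ranges can be verified with `statistics.NormalDist` from the Python standard library. The sketch below assumes the same 12% mean and 20% standard deviation; the helper name `prob_between` is illustrative.

```python
from statistics import NormalDist

# Normal model of annual returns: 12% expected return, 20% standard deviation
stock = NormalDist(mu=0.12, sigma=0.20)

def prob_between(dist, low, high):
    """Probability that a draw from dist falls between low and high."""
    return dist.cdf(high) - dist.cdf(low)

p_one_sigma = prob_between(stock, -0.08, 0.32)   # mean plus or minus 1 sigma
p_two_sigma = prob_between(stock, -0.28, 0.52)   # mean plus or minus 2 sigma
p_loss = stock.cdf(0.0)                          # chance of a negative year

print(f"Within 1 sigma: {p_one_sigma:.1%}")
print(f"Within 2 sigma: {p_two_sigma:.1%}")
print(f"Probability of a losing year: {p_loss:.1%}")
```

The exact two-sigma probability is slightly above 95%; the 68/95/99.7 figures are rounded rules of thumb.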

Fat Tails and Non-Normal Distributions

Stock market returns do NOT follow normal distributions. Real markets exhibit "fat tails": more extreme events than the normal distribution predicts.

Historical evidence:

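A normal model handles ordinary years reasonably well: with a 12% mean and 20% standard deviation, it puts the chance of an annual return below -20% at about 5%, or roughly once every 18 years
In reality, the S&P 500 experiences -20% or worse annually about once per 20 years, so the normal model's failure shows up further out in the tail and over shorter horizons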
The crash of 1987 (about -22% in one day) was roughly 20 standard deviations from the mean daily return, an event a normal distribution treats as effectively impossible

These departures from normality are described by "kurtosis" (excess kurtosis means fat tails) and "skewness" (asymmetry). Investment risk is NOT normally distributed. Black swan events (market crashes, financial crises) are more frequent than a normal model suggests.

Fat Tails Require Buffer Capital: If you are comfortable with "normal" volatility, you are still unprepared for crash scenarios. Successful investors build cash buffers precisely because they expect non-normal tail events. The 2008 financial crisis, the 2020 pandemic crash, and the 2022 rate shock all featured "impossible" tail events.

Comparing Distributions with Expected Value

Expected value (or expected return) combines probability with outcomes. If a stock has a 60% chance of returning +20% and a 40% chance of returning -10%:

Expected return = (0.60 × +20%) + (0.40 × -10%) = 12% - 4% = 8%

This is more sophisticated than just taking the mean of historical returns. You are weighting outcomes by their probability.

Lesson 3: Correlation, Diversification, and the Two-Asset Portfolio

Correlation Basics

Correlation measures how two investments move together. It ranges from -1 (perfect negative: they move in opposite directions) through 0 (uncorrelated) to +1 (perfect positive: they move together).

Examples of correlation:

Apple and Microsoft: highly positive correlation (~0.8), both tech stocks
Real estate and bonds: low correlation (~0.2)
Stocks and gold: negative correlation (~-0.15), gold often rises when stocks crash
Two different gold mining stocks: positive correlation (~0.6)

To calculate correlation, you measure the covariance (how returns move together) and normalize by each asset's standard deviation:

Correlation = Covariance / (StdDev₁ × StdDev₂)

Covariance is the average product of deviations from means for both assets.
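The formula above can be sketched in a few lines of Python. The two return series below are hypothetical numbers chosen only to exercise the functions; they are not data for any real asset, and the function names are illustrative.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Average product of deviations from the means (population form)."""
    mean_x, mean_y = mean(xs), mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """Covariance normalized by the two standard deviations."""
    std_x = sqrt(covariance(xs, xs))  # covariance of a series with itself is its variance
    std_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)

# Hypothetical annual returns for two assets (illustrative only)
asset_1 = [0.10, -0.04, 0.15, 0.07, -0.02]
asset_2 = [0.06, -0.01, 0.09, 0.02, 0.01]

rho = correlation(asset_1, asset_2)
print(f"Correlation: {rho:.2f}")
```

Because the same divisor appears in the covariance and in both variances, using n or n - 1 throughout gives the same correlation.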

Two-Asset Portfolio Risk and Diversification

Suppose you have two stocks:

Stock A: expected return 12%, standard deviation 25%
Stock B: expected return 8%, standard deviation 15%
Correlation between A and B: 0.3 (low correlation, good for diversification)

If you invest 50% in each stock, the portfolio expected return is:

E(Portfolio) = (0.50 × 12%) + (0.50 × 8%) = 10%

The portfolio standard deviation (volatility) is:

σ(Portfolio) = √[(0.50)² × (25%)² + (0.50)² × (15%)² + 2 × 0.50 × 0.50 × 0.3 × 25% × 15%]

σ(Portfolio) = √[0.015625 + 0.005625 + 0.005625]

σ(Portfolio) = √0.026875 ≈ 16.4%

Now consider if the correlation were perfect (+1.0) instead:

σ(Portfolio) = √[(0.50)² × (25%)² + (0.50)² × (15%)² + 2 × 0.50 × 0.50 × 1.0 × 25% × 15%]

σ(Portfolio) = √[0.015625 + 0.005625 + 0.01875]

σ(Portfolio) = √0.04 = 20%

With perfect correlation, the portfolio volatility is just the weighted average of individual volatilities: (50% × 25%) + (50% × 15%) = 20%.

With low correlation (0.3), the portfolio volatility is only about 16.4%, lower than the 20% weighted average. This is the magic of diversification: combining assets with low correlation reduces portfolio risk without reducing expected return.

Diversification Is a "Free Lunch" When Correlation Is Low: A 50/50 portfolio of these two assets with zero correlation has a standard deviation of about 14.6%, lower than either individual asset. You get the benefits of both assets while reducing total risk. This is why correlation is critical: high correlation destroys diversification benefits.
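The two-asset volatility formula is easy to mistype by hand, so a small function helps. The sketch below applies it to Stock A and Stock B at several correlations; the function name `portfolio_std` is an illustrative choice.

```python
from math import sqrt

def portfolio_std(weight_a, std_a, std_b, rho):
    """Standard deviation of a two-asset portfolio.

    weight_a is the fraction in asset A; the remainder goes to asset B.
    rho is the correlation between the two assets.
    """
    weight_b = 1 - weight_a
    variance = (
        (weight_a * std_a) ** 2
        + (weight_b * std_b) ** 2
        + 2 * weight_a * weight_b * rho * std_a * std_b
    )
    # Guard against tiny negative values from floating-point rounding
    return sqrt(max(variance, 0.0))

# Stock A: 25% std dev, Stock B: 15% std dev, 50/50 weights
for rho in (1.0, 0.3, 0.0, -1.0):
    sigma = portfolio_std(0.50, 0.25, 0.15, rho)
    print(f"correlation {rho:+.1f}: portfolio std dev = {sigma:.1%}")
```

Only at a correlation of +1 does the result equal the weighted average of the two volatilities; every lower correlation gives a smaller number.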

Real Diversification Example: Stocks and Bonds

A portfolio is 70% stocks (expected return 10%, std dev 18%) and 30% bonds (expected return 4%, std dev 6%).

Expected return: (0.70 × 10%) + (0.30 × 4%) = 7.0% + 1.2% = 8.2%

Standard deviation (assuming stock-bond correlation of -0.1):

σ = √[(0.70)² × (18%)² + (0.30)² × (6%)² + 2 × 0.70 × 0.30 × (-0.1) × 18% × 6%]

σ = √[0.015876 + 0.000324 - 0.000454]

σ = √0.015746 ≈ 12.5%

A pure stock portfolio would have 18% volatility. By mixing in 30% bonds (which move opposite to stocks in many scenarios), the portfolio volatility drops to about 12.5% while maintaining a decent return (8.2%). This is why professional investors mix asset classes.

Lesson 4: Understanding Risk: Volatility vs. Permanent Loss of Capital

Volatility Is Not Risk (Buffett's Insight)

Warren Buffett distinguishes between:

Volatility: short-term price fluctuations
Risk: permanent loss of capital

A high-quality company might have a volatile stock price but low risk of capital loss if fundamentals are solid. A weak company might have a stable price but high risk if underlying business is deteriorating.

For example:

Coca-Cola's stock might drop 20% in a market crash (high volatility), but its business is strong and you'll recover your money (low risk)
A penny stock might be stable, then collapse 80% when the business fails (low volatility, high risk)

Academic finance measures volatility (standard deviation). Value investors measure business risk: Will the company still have a strong moat and profitable business in 10 years?

Volatility Creates Opportunity for Value Investors: When good businesses become volatile and cheaper, they present buying opportunities. If Coca-Cola drops 40% but the business hasn't changed, the price decline represents risk to short-term traders but opportunity to long-term investors. Confusing volatility with risk is the most expensive mistake retail investors make.

Value at Risk (VaR) and Downside Risk

Value at Risk (VaR) estimates the loss that should not be exceeded over a given time period at a given confidence level. It is a threshold, not a maximum: worse outcomes still occur with probability (1 - confidence level). It attempts to quantify tail risk.

Using the earlier example: stock with 12% expected return and 20% volatility. What is the 95% VaR (the return level breached only 5% of the time)?

Assuming normal distribution:

95% VaR = Expected return - 1.645 × Standard deviation

95% VaR = 12% - (1.645 × 20%) = 12% - 32.9% = -20.9%

This suggests that 5% of the time, returns could be -20.9% or worse. (The more familiar 1.96 multiplier leaves 2.5% in each tail, so it corresponds to the 97.5% VaR of -27.2%.) However, we know distributions have fat tails, so the actual downside is worse than VaR suggests.
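A normal-model VaR can be computed for any confidence level by looking up the one-tailed z-value rather than memorizing multipliers. The sketch below uses `NormalDist.inv_cdf` from the standard library; the function name `normal_var` is illustrative. A one-tailed 5% cutoff corresponds to a z-value of about 1.645, while 1.96 leaves only 2.5% in the lower tail.

```python
from statistics import NormalDist

def normal_var(expected_return, std_dev, confidence=0.95):
    """Return level breached with probability (1 - confidence) under a normal model."""
    # inv_cdf gives the z-value for the lower tail, e.g. about -1.645 for 5%
    z = NormalDist().inv_cdf(1 - confidence)
    return expected_return + z * std_dev

var_95 = normal_var(0.12, 0.20, confidence=0.95)
var_975 = normal_var(0.12, 0.20, confidence=0.975)

print(f"95% VaR:   {var_95:.1%}")
print(f"97.5% VaR: {var_975:.1%}")
```

Both figures rest on the normal assumption, so they understate losses when returns have fat tails.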

Lesson 5: Bayesian Thinking and Base Rate Neglect

Bayes' Theorem in Plain Language

Bayes' Theorem is: P(A|B) = P(B|A) × P(A) / P(B)

In plain language: the probability of A given B equals the probability of B given A times the prior probability of A, divided by the overall probability of B.

This matters enormously in investing. Suppose:

A = "This company will go bankrupt"
B = "The company missed earnings guidance"

Investors often see B (missed earnings) and overestimate P(A|B) because they ignore the base rate P(A). Most companies don't go bankrupt. When a good company misses earnings once, P(bankruptcy|missed earnings) might be 5%, not 50%.
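To see how the base rate dominates, here is a minimal Python sketch of Bayes' Theorem applied to the bankruptcy example. All three input probabilities are assumptions made up for illustration: a 2% base rate of bankruptcy, a 90% chance that a company heading for bankruptcy misses guidance, and a 30% chance that a healthy company misses guidance.

```python
def bayes_posterior(prior, p_evidence_given_a, p_evidence_given_not_a):
    """P(A|B) = P(B|A) * P(A) / P(B), with P(B) expanded over A and not-A."""
    p_evidence = (
        p_evidence_given_a * prior
        + p_evidence_given_not_a * (1 - prior)
    )
    return p_evidence_given_a * prior / p_evidence

# Hypothetical inputs (assumptions, not data)
p_bankrupt = 0.02                # P(A): base rate of bankruptcy
p_miss_if_bankrupt = 0.90        # P(B|A)
p_miss_if_healthy = 0.30         # P(B|not A)

posterior = bayes_posterior(p_bankrupt, p_miss_if_bankrupt, p_miss_if_healthy)
print(f"P(bankruptcy | missed guidance) = {posterior:.1%}")
```

Even though a miss is three times as likely for a failing company under these assumptions, the posterior stays in the single digits because so few companies start out in the failing group.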

Real Investing Example: Growth Stock Valuations

Suppose:

Base rate: 95% of high-growth companies produce long-term shareholder value
Observation: A growth stock has dropped 40% after missing revenue guidance
Question: What is the probability it will produce good returns from here?

Many investors see the 40% drop and assume the probability of future returns is now low. But using Bayesian reasoning:

Base rate: 95% of growth companies succeed long-term
The 40% drop is new information, but it doesn't overturn a 95% base rate
Probability of future success given miss: might be 75% instead of 95%, but still highly likely

Investors who ignore base rates make two mistakes:

Selling good companies after temporary setbacks (recency bias)
Chasing poor companies with recent strength (momentum bias without fundamental analysis)

Anchor to Base Rates: Before analyzing a specific company, know the base rate for its category. If you're analyzing a biotech startup, the base rate of failure is 80%+. If you're analyzing a Fortune 500 company, the base rate of business continuity is 99%+. These base rates should anchor your probability estimates.

The Hot Hand Fallacy and Mean Reversion

Investors often believe in "hot hands": the idea that recent winners will continue winning. But regression to the mean is powerful in markets.

Suppose you observe:

Fund A returned 25% last year (vs. 10% market average)
Fund B returned 6% last year (vs. 10% market average)

Base rate: Most outperformance is due to luck, not skill. The probability that Fund A will beat the market again next year is roughly 50-50 (random luck).

Research confirms: funds with high recent returns do NOT reliably outperform in the future. In fact, high past returns often predict mean reversion (disappointment). This is why "past performance does not indicate future results" is more than a legal disclaimer: it is what the evidence shows.
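A small simulation illustrates why pure luck produces a roughly 50-50 repeat rate. The sketch below assumes a deliberately extreme world in which no fund has any skill: each fund's excess return over the market is random noise with a 10% standard deviation. The function name, fund count, and seed are illustrative choices.

```python
import random

def repeat_outperformance_rate(n_funds=2000, seed=42):
    """Share of last year's top-decile funds that beat the market again,
    in a simulated world where excess returns are pure luck."""
    rng = random.Random(seed)
    # Excess return over the market in each year: independent noise, no skill
    year_1 = [rng.gauss(0.0, 0.10) for _ in range(n_funds)]
    year_2 = [rng.gauss(0.0, 0.10) for _ in range(n_funds)]

    # Rank funds by year-1 excess return and keep the top 10%
    ranked = sorted(range(n_funds), key=lambda i: year_1[i], reverse=True)
    winners = ranked[: n_funds // 10]

    # Count how many of those winners beat the market in year 2
    beat_again = sum(1 for i in winners if year_2[i] > 0)
    return beat_again / len(winners)

rate = repeat_outperformance_rate()
print(f"Top-decile funds that beat the market again: {rate:.0%}")
```

In this no-skill world, last year's stars look impressive in hindsight yet repeat only about half the time, which is the pattern the text attributes to luck.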

Lesson 6: Expected Value and Decision-Making Under Uncertainty

Expected Value Framework

Expected value (EV) is the probability-weighted sum of all outcomes. Professional investors use EV for every decision.

Suppose you are considering two investments:

Investment A:

70% probability of +30% return
30% probability of -10% return
EV = (0.70 × 30%) + (0.30 × -10%) = 21% - 3% = 18%

Investment B:

50% probability of +25% return
50% probability of +10% return
EV = (0.50 × 25%) + (0.50 × 10%) = 12.5% + 5% = 17.5%

Investment A has higher expected value (18% vs. 17.5%), but it also has a 30% chance of a 10% loss. Investment B is more certain but slightly lower EV.

The choice depends on your risk tolerance and portfolio size. If you have a large, diversified portfolio, the higher EV of A likely dominates. If you have concentrated wealth, the certainty of B might be preferable.
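Comparing the two investments on both expected value and spread of outcomes makes the trade-off concrete. The following sketch represents each investment as a list of (probability, return) pairs; the function names are illustrative.

```python
from math import sqrt

def expected_value(outcomes):
    """Probability-weighted sum of returns; outcomes is a list of (prob, ret)."""
    return sum(prob * ret for prob, ret in outcomes)

def outcome_std(outcomes):
    """Standard deviation of returns around the expected value."""
    ev = expected_value(outcomes)
    return sqrt(sum(prob * (ret - ev) ** 2 for prob, ret in outcomes))

investment_a = [(0.70, 0.30), (0.30, -0.10)]
investment_b = [(0.50, 0.25), (0.50, 0.10)]

for name, outcomes in (("A", investment_a), ("B", investment_b)):
    ev = expected_value(outcomes)
    sd = outcome_std(outcomes)
    worst = min(ret for _, ret in outcomes)
    print(f"Investment {name}: EV = {ev:.1%}, std dev = {sd:.1%}, worst case = {worst:.0%}")
```

Investment A's half-point EV advantage comes with more than twice the dispersion of B and the only losing outcome, which is the information a risk-tolerance judgment needs.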

Applying Expected Value to Stock Valuation

Suppose a stock is trading at $50. You believe:

40% probability the business thrives, and the stock reaches $100 in 5 years
50% probability the business is stable, and the stock reaches $60 in 5 years
10% probability the business struggles, and the stock reaches $20 in 5 years

Expected value in 5 years: (0.40 × $100) + (0.50 × $60) + (0.10 × $20) = $40 + $30 + $2 = $72

If you require a 10% annual return, the present value of $72 in 5 years is:

PV = $72 / (1.10)^5 = $72 / 1.6105 ≈ $44.71

The fair value is $44.71, but the stock trades at $50. On an expected value basis it is overpriced, and you should not buy it.

Alternatively, if the stock traded at $40, it would be underpriced ($40 < $44.71 fair value), and you should consider buying.
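The valuation steps above (probability-weight the future prices, then discount at the required return) can be sketched as two small functions. The names `expected_price` and `present_value` are illustrative, and the probability check is an added safeguard rather than part of the original method.

```python
def expected_price(scenarios):
    """Probability-weighted future price; scenarios is a list of (prob, price)."""
    total_prob = sum(prob for prob, _ in scenarios)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(prob * price for prob, price in scenarios)

def present_value(future_value, required_return, years):
    """Discount a future value back to today at a constant annual rate."""
    return future_value / (1 + required_return) ** years

# Thrive, stable, and struggle scenarios from the example
scenarios = [(0.40, 100.0), (0.50, 60.0), (0.10, 20.0)]
current_price = 50.0

future = expected_price(scenarios)
fair_value = present_value(future, required_return=0.10, years=5)

print(f"Expected price in 5 years: ${future:.2f}")
print(f"Fair value today: ${fair_value:.2f}")
print("Underpriced" if current_price < fair_value else "Overpriced")
```

Changing `current_price` to 40.0 flips the verdict, matching the alternative case in the text.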

Lesson 7: Practical Risk Management and Scenario Analysis

Scenario Analysis: Three Cases

Rather than a single forecast, professional investors model multiple scenarios:

Company: A software SaaS business trading at $100 per share

Bear Case (30% probability):

Competitors cut prices, customer churn accelerates
Revenue grows only 5% annually for 5 years
Margins compress by 2 percentage points
Valuation multiple contracts from 20x to 15x sales
Stock price reaches $60

Base Case (50% probability):

Market grows 15% annually, company takes share
Revenue grows 20% annually for 5 years
Margins stable
Valuation multiple stays at 20x sales
Stock price reaches $150

Bull Case (20% probability):

Company expands into adjacent markets, dominates
Revenue grows 40% annually for 5 years
Margins expand by 3 percentage points
Valuation multiple increases to 25x sales
Stock price reaches $250

Expected value: (0.30 × $60) + (0.50 × $150) + (0.20 × $250) = $18 + $75 + $50 = $143

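If the stock trades at $100, the expected value of $143 in five years implies upside of 43%, or about 7.4% annualized. Whether that is attractive depends on your required return: discounted at a 10% annual rate, $143 is worth about $88.79 today, which is below the $100 price. The bear case risk (30% probability of a $40 loss) is also real and must be acceptable to the investor.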

Position Sizing Based on Conviction

Position sizing is the most powerful risk management tool available to individual investors.

High conviction (70%+ probability of success): 5-10% of portfolio
Medium conviction (50-70%): 3-5% of portfolio
Low conviction (40-50%): 1-3% of portfolio
Avoid positions with less than 40% expected success

By concentrating capital in highest-conviction ideas while maintaining diversification across 20-30 positions, you optimize risk-adjusted returns.

Warren Buffett often holds concentrated positions because his conviction is extraordinarily high. Most investors should maintain broader diversification given lower certainty.
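The conviction tiers above can be written as a simple lookup. This is only a sketch of the rule of thumb: the function name is hypothetical, and because the stated tiers share their 50% and 70% endpoints, assigning those boundary values to the higher tier is an assumption.

```python
def position_size_range(p_success):
    """Map an estimated probability of success to a (min, max) portfolio weight.

    Boundary values (0.50 and 0.70) are assigned to the higher tier,
    which is an assumption; the tiers in the text overlap at those points.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    if p_success >= 0.70:
        return (0.05, 0.10)   # high conviction
    if p_success >= 0.50:
        return (0.03, 0.05)   # medium conviction
    if p_success >= 0.40:
        return (0.01, 0.03)   # low conviction
    return (0.0, 0.0)         # below 40%: avoid the position

for p in (0.80, 0.60, 0.45, 0.30):
    low, high = position_size_range(p)
    print(f"P(success) = {p:.0%}: allocate {low:.0%} to {high:.0%} of portfolio")
```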

Lesson 8: Practice Problems and Self-Assessment

Problem 1: Portfolio Standard Deviation

You have a 60/40 portfolio: 60% stocks (20% std dev) and 40% bonds (5% std dev). Assume correlation of -0.2 between stocks and bonds. Calculate the portfolio standard deviation.

Answer: σ = √[(0.60)² × (20%)² + (0.40)² × (5%)² + 2 × 0.60 × 0.40 × (-0.2) × 20% × 5%]

= √[0.0144 + 0.0004 - 0.00096]

= √0.01384 ≈ 11.8%

Problem 2: Bayesian Update

A company has a 1% base rate of bankruptcy. You observe a major customer left (bad news). This information makes bankruptcy 5x more likely (5% vs. 1%). What should your new probability estimate be?

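Answer: The problem states the updated probability directly: 5 × 1% = 5%. (If "5x more likely" instead means the news is five times as likely for a company headed to bankruptcy, updating the odds from 1:99 to 5:99 gives 5/104 ≈ 4.8%, nearly the same.) Note that even with bad news, the roughly 95% probability of survival remains high. This is Bayesian updating: the base rate (1%) adjusted by new information yields about 5%.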

Problem 3: Expected Value Decision

Investment A: 60% chance of +40%, 40% chance of -20%. Investment B: 80% chance of +15%, 20% chance of 0%. Calculate EV for both and recommend.

Answer: EV(A) = 0.60 × 40% + 0.40 × (-20%) = 24% - 8% = 16%. EV(B) = 0.80 × 15% + 0.20 × 0% = 12%. Investment A has higher EV (16% vs. 12%), but also higher risk. Choose based on risk tolerance.

Self-Practice Prompt 1: Find a stock you own or are considering. Estimate three scenarios (bear, base, bull) with probabilities, and calculate expected value.

Self-Practice Prompt 2: Calculate the correlation between two stocks you follow (use 10 years of annual returns). How does this correlation affect portfolio diversification if you owned both?

Self-Practice Prompt 3: Identify a recent news event affecting a company you follow. Use Bayesian reasoning: what was the base rate of success, and how does this news change your probability estimates?
