AP Statistics Unit 4: Random Variables
Discrete Random Variables and Probability Distributions
In AP Statistics, understanding how to quantify uncertain outcomes is fundamental. A Random Variable serves as a numerical description of the outcome of a statistical experiment. It connects probability theory to numerical data.
Definitions & Categories
A Random Variable (usually denoted by capital letters like $X$ or $Y$) takes numerical values that describe the outcomes of some chance process. There are two distinct types you must be able to identify:
- Discrete Random Variable: Has a countable number of possible values. There are gaps between the values.
- Example: The number of heads in 3 coin flips ($0, 1, 2, 3$). You cannot flip 1.5 heads.
- Example: Shoe size ($6, 6.5, 7$, etc.). Even though there are decimals, the options are distinct steps.
- Continuous Random Variable: Can take any value within an interval on the number line. There are uncountably many possible values, with no gaps between them.
- Example: The exact time it takes to run a mile ($5.432…$ minutes).
- Example: The height of a randomly selected student.
Discrete Probability Distributions
The probability distribution of a discrete random variable lists all possible values the variable can take and their corresponding probabilities. This is often presented as a table.
Requirements for a Valid Distribution:
- Every probability $p_i$ must be between 0 and 1: $0 \leq p_i \leq 1$
- The sum of all probabilities must equal 1: \sum p_i = 1
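These two requirements can be checked mechanically. Here is a minimal sketch in Python; the helper name `is_valid_distribution` is illustrative, not standard terminology:

```python
def is_valid_distribution(probs, tol=1e-9):
    """Check the two requirements for a discrete probability distribution."""
    each_in_range = all(0 <= p <= 1 for p in probs)  # 0 <= p_i <= 1
    sums_to_one = abs(sum(probs) - 1) < tol          # sum of p_i = 1
    return each_in_range and sums_to_one

# Number of heads in 3 fair coin flips: P(0)=1/8, P(1)=3/8, P(2)=3/8, P(3)=1/8
print(is_valid_distribution([1/8, 3/8, 3/8, 1/8]))  # True
print(is_valid_distribution([0.5, 0.6]))            # False: sums to 1.1
```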

Continuous Random Variables and Density Curves
While this section focuses heavily on discrete math, remember that continuous random variables are described by Density Curves.
- The probability is the area under the curve.
- The probability of any single, exact point is always 0 (e.g., $P(X = 3.51200…) = 0$). We only calculate probabilities for intervals (e.g., $P(X < 5)$).
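A quick simulation makes this concrete. The sketch below (assuming a uniform distribution on $[0, 10]$, chosen purely for illustration) estimates an interval probability and shows that hitting one exact point essentially never happens:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible
# Simulate a continuous variable: X uniform on [0, 10]
draws = [random.uniform(0, 10) for _ in range(100_000)]

# Interval probability = area under the density curve
p_interval = sum(1 for x in draws if x < 5) / len(draws)    # close to 0.5

# A single exact point has zero width, hence zero area
p_exact = sum(1 for x in draws if x == 3.512) / len(draws)  # 0.0
print(p_interval, p_exact)
```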
Mean and Standard Deviation of Random Variables
Just like we calculate summary statistics ($\bar{x}$ and $s_x$) for sample data, we calculate parameters ($\mu_X$ and $\sigma_X$) for theoretical probability distributions.
The Expected Value (Mean)
The mean of a discrete random variable $X$ is also called its Expected Value, denoted as $E(X)$. It represents the long-run average outcome if the chance process were repeated many times.
\mu_X = E(X) = \sum x_i p_i
To calculate: Multiply each possible value by its probability and sum them up.
Variance and Standard Deviation
These measure the variability (spread) of the random variable. The Variance is the weighted average of the squared deviations from the mean.
Variance Formula:
Var(X) = \sigma_X^2 = \sum (x_i - \mu_X)^2 p_i
Standard Deviation Formula:
\sigma_X = \sqrt{\sum (x_i - \mu_X)^2 p_i}
TI-84 Tip: You can calculate these quickly by entering the values in List 1 (L1) and probabilities in List 2 (L2), then running
1-Var Stats L1, L2.
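The same computation the calculator performs can be sketched directly from the formulas above. This Python helper (`discrete_stats` is an illustrative name) applies $\mu_X = \sum x_i p_i$ and $\sigma_X^2 = \sum (x_i - \mu_X)^2 p_i$ to the coin-flip distribution:

```python
import math

def discrete_stats(values, probs):
    """Mean, variance, and SD of a discrete random variable from its distribution."""
    mean = sum(x * p for x, p in zip(values, probs))               # mu_X = sum x_i p_i
    var = sum((x - mean) ** 2 * p for x, p in zip(values, probs))  # sigma_X^2
    return mean, var, math.sqrt(var)

# Number of heads in 3 fair coin flips
mean, var, sd = discrete_stats([0, 1, 2, 3], [1/8, 3/8, 3/8, 1/8])
print(mean, var, sd)  # 1.5, 0.75, ~0.866
```

Entering the same lists as L1 and L2 and running 1-Var Stats L1, L2 on a TI-84 returns matching values.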
Worked Example: The Raffle Ticket
A school raffle sells tickets for \$10.
- 1 prize of \$500 (Probability = 0.001)
- 10 prizes of \$50 (Probability = 0.01)
- Everyone else wins \$0 (Probability = 0.989)
Let $X$ be the net gain from one ticket.
1. Set up the Probability Distribution:
| Outcome | Net Gain ($x_i$) | Probability ($p_i$) |
|---|---|---|
| Grand Prize | $500 - 10 = 490$ | 0.001 |
| Runner Up | $50 - 10 = 40$ | 0.010 |
| Loser | $0 - 10 = -10$ | 0.989 |
2. Calculate Expected Value:
E(X) = (490)(0.001) + (40)(0.010) + (-10)(0.989)
E(X) = 0.49 + 0.40 - 9.89 = -9.00
Interpretation: On average, for every ticket purchased, the buyer expects to lose \$9.00.
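The arithmetic above is easy to verify in a few lines of Python, using the net gains and probabilities from the table:

```python
# Raffle distribution from the table: net gain and probability per outcome
net_gains = [490, 40, -10]
probs = [0.001, 0.010, 0.989]

# E(X) = sum of (value * probability)
expected = sum(x * p for x, p in zip(net_gains, probs))
print(round(expected, 2))  # -9.0: the buyer loses $9.00 per ticket on average
```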
Combining Random Variables
In AP Statistics, you often need to analyze what happens when you modify a random variable or combine two of them together.
Linear Transformations
A linear transformation occurs when you apply the equation $Y = a + bX$ to a random variable.
- $a$: Adds/subtracts a constant (shifts the distribution).
- $b$: Multiplies/divides by a constant (scales the distribution).
Effects on Summary Statistics:
| Statistic | Effect of adding $a$ | Effect of multiplying by $b$ | Formula for $Y = a + bX$ |
|---|---|---|---|
| Center (Mean) | Changes | Changes | $\mu_Y = a + b\mu_X$ |
| Spread (SD) | No Change | Changes by $\lvert b\rvert$ | $\sigma_Y = \lvert b\rvert\sigma_X$ |
| Spread (Variance) | No Change | Changes by $b^2$ | $\sigma_Y^2 = b^2\sigma_X^2$ |

Key Takeaway: Adding a constant shifts the mean but does not affect the spread. If everyone in the class gets 5 bonus points, the average goes up, but the gap between the highest and lowest score remains the same.
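The bonus-points scenario can be sketched with the formulas from the table. The starting mean of 70 and SD of 8 below are hypothetical numbers chosen for illustration:

```python
def transform_stats(mu_x, sigma_x, a, b):
    """Apply Y = a + bX to the summary statistics of X."""
    mu_y = a + b * mu_x         # mean both shifts and scales
    sigma_y = abs(b) * sigma_x  # SD only scales; adding a has no effect
    return mu_y, sigma_y

# Everyone gets 5 bonus points: Y = 5 + 1*X
print(transform_stats(70, 8, a=5, b=1))    # (75, 8): mean up 5, spread unchanged

# Scores doubled instead: Y = 0 + 2*X
print(transform_stats(70, 8, a=0, b=2))    # (140, 16): both mean and SD scale
```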
Sums and Differences of Random Variables
When we combine two independent random variables $X$ and $Y$:
1. Means (Expected Values):
The mean of the sum (or difference) is the sum (or difference) of the means. Independence is not required for this rule.
- $\mu_{X+Y} = \mu_X + \mu_Y$
- $\mu_{X-Y} = \mu_X - \mu_Y$
2. Variances (Spread):
The variance of the sum (or difference) is the SUM of the variances.
\sigma^2_{X \pm Y} = \sigma^2_X + \sigma^2_Y
Conditions & Caveats:
- Condition: $X$ and $Y$ must be INDEPENDENT.
- Standard Deviation: You CANNOT add standard deviations ($\sigma_{X+Y} \neq \sigma_X + \sigma_Y$). You must convert to variance first, add, and then take the square root.
- Subtraction: Even when finding the difference ($X - Y$), the variances ADD. Variability always accumulates; it never cancels out.
Memory Aid: The Pythagorean Theorem of Statistics
Just as $c = \sqrt{a^2 + b^2}$, the standard deviation of the combination is:
\sigma_{X \pm Y} = \sqrt{\sigma^2_X + \sigma^2_Y}

Worked Example: Commute Time
You take a bus and then walk to school.
- Bus time ($B$): Mean = 20 min, SD = 4 min.
- Walk time ($W$): Mean = 10 min, SD = 2 min.
- Assume times are independent.
Total time $T = B + W$
- Mean: $\mu_T = 20 + 10 = 30$ minutes.
- SD: $\sigma_T = \sqrt{4^2 + 2^2} = \sqrt{16 + 4} = \sqrt{20} \approx 4.47$ minutes.
- Note that simple addition (4+2=6) would be incorrect.
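The commute calculation, with the incorrect naive version shown for contrast, in a short Python sketch:

```python
import math

mu_b, sigma_b = 20, 4  # bus: mean 20 min, SD 4 min
mu_w, sigma_w = 10, 2  # walk: mean 10 min, SD 2 min

mu_t = mu_b + mu_w                            # means always add: 30
sigma_t = math.sqrt(sigma_b**2 + sigma_w**2)  # variances add (requires independence)
print(mu_t, round(sigma_t, 2))                # 30 4.47

# Naive (wrong) approach: adding SDs directly overstates the spread
print(sigma_b + sigma_w)                      # 6, not 4.47
```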
Common Mistakes & Pitfalls
Adding Standard Deviations Directly:
- Mistake: $\sigma_{X+Y} = \sigma_X + \sigma_Y$.
- Correction: Always square them to get variances, add the variances, then take the square root: $\sigma_{T} = \sqrt{\sigma_X^2 + \sigma_Y^2}$.
Subtracting Variances:
- Mistake: For $D = X - Y$, calculating $\sigma^2_D = \sigma^2_X - \sigma^2_Y$.
- Correction: Unpredictability always grows. Even if you subtract variables, you ADD their variances: $\sigma^2_D = \sigma^2_X + \sigma^2_Y$.
Confusion on Linear Transformations:
- Mistake: Thinking adding a constant increases standard deviation.
- Correction: Adding a constant shifts the graph left/right but does not stretch it. Spread stays the same.
Assuming Independence:
- Mistake: Calculating combined variance without checking if events are independent.
- Correction: In FRQs (Free Response Questions), explicitly state "Assuming X and Y are independent…" before combining variances.
Discrete vs. Continuous Probability:
- Mistake: Trying to find $P(X=5)$ for a continuous variable.
- Correction: In continuous distributions, probability at a specific point is 0. You must calculate the probability of a range, like $P(4.9 < X < 5.1)$.