AP Statistics Unit 7: Confidence Intervals for Quantitative Means (One-Sample and Two-Sample)
Introduction to t-Distributions
Why you need something other than the normal distribution
When you build a confidence interval for a population mean, you’re trying to estimate an unknown parameter: the true mean of a population, written as μ. In the “best case,” you would know the population standard deviation σ, which tells you how spread out individual observations are around μ. If σ were known, the standardized statistic

z = (x̄ − μ) / (σ/√n)

would follow a standard normal distribution (under the usual random sampling conditions), and you could build a confidence interval using a normal critical value.
In real life (and on AP Statistics), σ is almost never known. Instead, you estimate it with the sample standard deviation s. That seemingly small substitution creates extra variability: different random samples produce different s values, and that uncertainty must be reflected in your model for the standardized statistic.
That is exactly why the t-distribution exists.
What a t-distribution is
A t-distribution is a family of distributions used to model the standardized sample mean when σ is unknown and you use s instead. The corresponding standardized statistic is

t = (x̄ − μ) / (s/√n)

This statistic follows a t-distribution (assuming the conditions for inference are met). Unlike the standard normal distribution, the t-distribution depends on degrees of freedom.
Degrees of freedom (df): what they mean here
For the one-sample mean setting, the degrees of freedom are

df = n − 1

Conceptually, degrees of freedom measure how much “independent information” is available to estimate variability. Because s is computed using x̄ (the sample mean), you lose one degree of freedom—hence df = n − 1.
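To make these pieces concrete, here is a short sketch (with made-up sample data and a hypothetical helper name) that computes the t statistic and its degrees of freedom using only the Python standard library:

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Standardize the sample mean using s in place of sigma (illustrative helper)."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)       # sample SD: divides by n - 1, matching df = n - 1
    se = s / math.sqrt(n)            # standard error of the sample mean
    return (xbar - mu0) / se, n - 1  # t statistic and degrees of freedom

# Hypothetical sample of 6 measurements, standardized against mu0 = 10.0
t_stat, df = one_sample_t([9.8, 10.2, 10.1, 9.9, 10.4, 10.0], 10.0)
```

Note that `statistics.stdev` divides by n − 1, which is exactly the degrees-of-freedom idea in numerical form.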
How t-distributions compare to the standard normal
All t-distributions are:
- Centered at 0 and symmetric like the standard normal.
- More spread out than the standard normal, especially in the tails.
Those heavier tails matter: they make confidence intervals wider (more cautious) to account for the extra uncertainty from estimating σ with s.
As n increases, s becomes a better estimate of σ, and the t-distribution approaches the standard normal distribution. In practice, for large n, t critical values are close to z critical values.
Critical values and notation
For confidence intervals, you use a critical value from a t-distribution. The notation

t*

means “the t critical value that captures the middle C of the distribution,” where C is your confidence level (like 0.95 for a 95% interval), using the correct degrees of freedom.
For example, for a 95% confidence interval, t* is chosen so that 95% of the t-distribution lies between −t* and t*.
Notation reference (common symbols you must read fluently)
| Quantity | Meaning | Typical notation |
|---|---|---|
| Population mean | Parameter you want | μ |
| Sample mean | Statistic from data | x̄ |
| Population standard deviation | Usually unknown | σ |
| Sample standard deviation | Used when σ is unknown | s |
| Sample size | Number of observations | n |
| Degrees of freedom (one-sample) | Determines t-shape | df = n − 1 |
| Standard error of x̄ | Estimated SD of x̄ | SE = s/√n |
| t statistic | Standardized mean using s | t = (x̄ − μ)/(s/√n) |
Example: seeing how df changes the critical value
Suppose you want a 95% confidence interval.
- With a small sample like n = 5 (df = 4), t* is noticeably larger than 1.96 (the z critical value).
- With a large sample like n = 100 (df = 99), t* is only slightly larger than 1.96.
The key takeaway is not memorizing specific numbers—it’s understanding the direction: smaller samples mean larger t*, which means wider intervals.
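One quick way to see this direction is to line up 95% critical values from a standard t table next to z* = 1.96 (the specific df values below are illustrative table entries, not numbers AP requires you to memorize):

```python
# 95% two-sided critical values taken from a standard t table.
t_star = {4: 2.776, 9: 2.262, 24: 2.064, 99: 1.984}
z_star = 1.960

for df in sorted(t_star):
    print(f"df = {df:3d}: t* = {t_star[df]:.3f}  (z* = {z_star:.2f})")
# As df grows, t* shrinks toward z*, so t intervals narrow toward z intervals.
```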
Exam Focus
- Typical question patterns:
- Identify whether to use z or t (AP almost always expects t for means because σ is unknown).
- Determine degrees of freedom and select/interpret t*.
- Compare the shape/spread of t-distributions as df changes.
- Common mistakes:
- Using a normal critical value (like 1.96) when σ is not given.
- Using df = n instead of df = n − 1 for one-sample mean procedures.
- Thinking the t-distribution is skewed; it is symmetric—its difference is heavier tails.
Constructing a Confidence Interval for a Population Mean
What a confidence interval for a mean is (and what it is not)
A confidence interval for a population mean is a range of plausible values for , based on sample data. The interval is built from:
- a point estimate (usually x̄), and
- a margin of error that accounts for sampling variability.
A 95% confidence interval does not mean there is a 95% probability that μ is in your computed interval. After you compute it, the interval either contains μ or it doesn’t. The “95%” refers to the long-run success rate of the method: if you repeated the sampling process many times, about 95% of those intervals would capture μ.
Why the t-interval works
Because you don’t know σ, you use s to estimate the spread of the sampling distribution of x̄. The estimated standard deviation of x̄ is called the standard error:

SE = s/√n

Then you take your point estimate x̄ and go out t* standard errors in both directions.
The one-sample t confidence interval formula
A one-sample t interval for the population mean μ is

x̄ ± t* · (s/√n)

Equivalently, you can write it as

x̄ ± t* · SE

Where:
- x̄ is the sample mean.
- s is the sample standard deviation.
- n is the sample size.
- t* comes from the t-distribution with df = n − 1 at the chosen confidence level.
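The formula translates directly into a small function (the function name and summary values below are hypothetical; t* must come from a table or technology):

```python
import math

def t_interval(xbar, s, n, t_star):
    """One-sample t interval: xbar ± t* · s/√n.
    t_star must match df = n - 1 and the chosen confidence level."""
    me = t_star * s / math.sqrt(n)   # margin of error
    return xbar - me, xbar + me

# Hypothetical summary: n = 16, xbar = 100, s = 8; for 95% and df = 15, t* = 2.131.
low, high = t_interval(100, 8, 16, 2.131)
```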
Conditions for using a one-sample t interval (what AP wants you to check)
You’re expected to justify inference with conditions. A common AP-friendly structure is:
- Random: Data come from a random sample or a randomized experiment.
- Normal (or approximately normal sampling distribution of x̄):
- If the population is approximately normal, you’re fine.
- If n is large (commonly n ≥ 30), the Central Limit Theorem supports that x̄ is approximately normal.
- If n is small, you should check that the sample distribution looks roughly symmetric with no strong outliers.
- Independence: Observations are independent. If sampling without replacement, a common check is the 10% condition:

n ≤ N/10

where N is the population size.
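The 10% condition is just an inequality, so it is easy to check in one line (function name is illustrative):

```python
def ten_percent_ok(n, N):
    """10% condition for sampling without replacement: n must be at most N/10."""
    return n <= N / 10

# A sample of 25 bars from a production run of 500 passes; 80 from 500 would not.
```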
These conditions matter because the t-interval is derived assuming the t statistic behaves like a t-distribution. Strong skewness with a small sample or extreme outliers can break that approximation.
How confidence level, sample size, and variability affect the interval
It helps to predict what happens before calculating.
- Increasing the confidence level (like 90% to 95%) increases t*, so the interval gets wider.
- Increasing sample size n decreases s/√n, so the interval gets narrower.
- Increasing variability (larger s) increases the standard error, so the interval gets wider.
This is the logic behind margin of error:

ME = t* · (s/√n)
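The margin-of-error formula makes these trade-offs visible. For instance (using 95% table values for t*), quadrupling n roughly halves ME:

```python
import math

s = 10                                   # hypothetical sample SD
me_n25  = 2.064 * s / math.sqrt(25)      # n = 25  -> df = 24, t* = 2.064
me_n100 = 1.984 * s / math.sqrt(100)     # n = 100 -> df = 99, t* ≈ 1.984
# me_n25 ≈ 4.13, me_n100 ≈ 1.98: larger n shrinks both s/√n and t*.
```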
Interpreting the interval in context (the wording AP expects)
A correct interpretation:
- Names the confidence level.
- Refers to the parameter μ (not x̄).
- Uses the context (what the mean represents).
Template you can adapt:
“We are C% confident that the true mean (context) for (population) is between (lower) and (upper).”
Worked example: one-sample t interval
A nutrition researcher takes a random sample of 25 energy bars of a certain brand and measures calories per bar. The sample mean is x̄ = 214 calories and the sample standard deviation is s = 10 calories. Construct a 95% confidence interval for the true mean calories per bar, μ.
Step 1: Identify the procedure and check conditions
- We want μ and σ is not given, so we use a one-sample t interval.
- Random: stated random sample.
- Independence: reasonable if the sample is less than 10% of all bars produced (assume yes).
- Normal: n = 25 is moderately sized; if no strong skew/outliers are indicated, t procedures are typically considered reasonable.
Step 2: Degrees of freedom

df = n − 1 = 25 − 1 = 24
Step 3: Find the critical value
For 95% confidence with df = 24, use t* = 2.064 from a t table or technology. (You do not need to memorize it; you must select it appropriately.)
Step 4: Compute the standard error

SE = s/√n = 10/√25 = 10/5 = 2 calories
Step 5: Compute the interval
If t* is approximately 2.064 for df = 24, then

x̄ ± t* · SE = 214 ± (2.064)(2) = 214 ± 4.128

So the interval is approximately (209.9, 218.1).
Interpretation (in context)
“We are 95% confident that the true mean calories per bar for this brand is between about 209.9 and 218.1 calories.”
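The arithmetic in Steps 2 through 5 can be checked in a few lines (summary values consistent with the interval reported above: n = 25, x̄ = 214, s = 10 are assumed here):

```python
import math

n, xbar, s, t_star = 25, 214, 10, 2.064    # df = n - 1 = 24
se = s / math.sqrt(n)                       # 10 / 5 = 2
me = t_star * se                            # 2.064 * 2 = 4.128
low, high = xbar - me, xbar + me            # about (209.9, 218.1)
```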
What can go wrong (common conceptual traps)
- Confusing standard deviation and standard error: s describes the spread of individual data values; SE = s/√n describes the spread of x̄ across samples.
- Ignoring outliers: A single extreme value can inflate s and distort x̄, making the interval less meaningful.
- Thinking higher confidence means ‘more accurate’: Higher confidence increases reliability but widens the interval. You gain certainty by giving a wider range.
Exam Focus
- Typical question patterns:
- “Construct and interpret a 95% confidence interval for μ” given x̄, s, and n.
- “Check conditions” using a graph (dotplot/histogram/boxplot) and sampling description.
- Compute or interpret the margin of error and explain how to make it smaller.
- Common mistakes:
- Interpreting the interval as a probability statement about μ.
- Using σ-based formulas or z critical values when only s is available.
- Forgetting to include context and the population/parameter in the interpretation.
Confidence Interval for a Difference of Two Means
The goal: comparing two population means
Often you’re not just estimating one mean—you’re comparing two groups. For example:
- Do students who sleep at least 8 hours have a higher mean test score than those who sleep less?
- Is the mean recovery time different for two medical treatments?
Here, you want the difference between two population means:

μ₁ − μ₂
A confidence interval for a difference of two means gives a plausible range of values for that parameter.
Two-sample setting: what data structure you need
To use a two-sample t interval (for independent groups), you need:
- Two independent samples (or two independent randomized groups).
- A quantitative variable measured for both groups.
Each group has its own sample statistics:
- Group 1: n₁, x̄₁, s₁
- Group 2: n₂, x̄₂, s₂
Your point estimate of μ₁ − μ₂ is

x̄₁ − x̄₂
Why the standard error is different for two means
A sample mean varies from sample to sample. When you subtract two sample means, the variability adds (in a variance sense), so the standard error for the difference is

SE = √(s₁²/n₁ + s₂²/n₂)
This captures two sources of sampling variability—one from each group.
The two-sample t confidence interval (independent samples)
A two-sample t interval for μ₁ − μ₂ is

(x̄₁ − x̄₂) ± t* · √(s₁²/n₁ + s₂²/n₂)

The main new practical issue is degrees of freedom. Many technologies compute an approximate df automatically (often associated with Welch’s method). On AP Statistics, using technology for df and t* is acceptable; if you must approximate by hand, a common conservative choice sometimes taught is

df = min(n₁ − 1, n₂ − 1)
If your course emphasizes calculator-based inference, you’ll typically report the interval produced by the two-sample t-interval procedure.
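As a sketch, the interval computation can be wrapped in a small function (the name and the sample summaries below are hypothetical; t_star comes from technology or the conservative df):

```python
import math

def two_sample_t_interval(x1, s1, n1, x2, s2, n2, t_star):
    """Two-sample t interval for mu1 - mu2 (independent groups)."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)   # variability from both groups adds
    me = t_star * se
    return (x1 - x2) - me, (x1 - x2) + me

# Hypothetical summaries: group 1 (n=30, mean 80, SD 6), group 2 (n=25, mean 75, SD 5);
# conservative df = min(29, 24) = 24 gives t* = 2.064 at 95%.
low, high = two_sample_t_interval(80, 6, 30, 75, 5, 25, 2.064)
```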
Conditions (what you must justify)
You typically justify conditions for each group plus independence between groups.
- Random: Each sample is random, or treatments were randomly assigned.
- Independence within each group: If sampling without replacement, check the 10% condition separately:

n₁ ≤ N₁/10 and n₂ ≤ N₂/10

where N₁ and N₂ are the population sizes.
- Independent groups: The two samples/groups do not influence each other (no pairing, no repeated measures on the same individuals).
- Normal (within each group): Each group’s data are approximately normal, or each sample size is large enough for the sampling distribution of each mean to be approximately normal.
A major warning sign is outliers or strong skew in either group when sample sizes are small.
Paired data is a different procedure (important distinction)
Students often confuse “two groups” with “two-sample.” If measurements are naturally matched (same subjects before/after, twins, matched pairs), you do not use the two-sample t interval above. Instead, you compute differences for each pair and run a one-sample t interval on the differences. In this section, the focus is the independent two-sample interval, but you should always ask: “Are the observations paired?”
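To see the contrast, paired data reduces to a one-sample t interval on the per-pair differences (the before/after scores below are made up):

```python
import math
import statistics

before = [72, 80, 65, 90, 77, 84]                # hypothetical scores for 6 subjects
after  = [75, 83, 64, 95, 80, 88]
diffs = [a - b for a, b in zip(after, before)]   # [3, 3, -1, 5, 3, 4]

n = len(diffs)
dbar = statistics.mean(diffs)                    # mean of the differences
sd = statistics.stdev(diffs)                     # SD of the differences
t_star = 2.571                                   # 95% critical value for df = n - 1 = 5
me = t_star * sd / math.sqrt(n)
low, high = dbar - me, dbar + me                 # one-sample interval on the differences
```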
Interpreting a two-mean interval (including the ‘zero check’)
A correct interpretation names μ₁ − μ₂ and uses context:
“We are C% confident that the true difference in mean (context) between (population 1) and (population 2) is between (lower) and (upper), where the difference is defined as μ₁ − μ₂.”
A powerful insight comes from checking whether the interval contains 0:
- If 0 is inside the interval, then a difference of 0 is plausible, so there is not clear evidence of a difference (at a level consistent with that confidence).
- If 0 is not inside the interval, then 0 is not plausible, suggesting a real difference in means.
Be careful: this is a relationship to hypothesis testing ideas, but the interpretation is still about plausible values for μ₁ − μ₂.
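The zero check itself is a single comparison (the interval endpoints below are made up):

```python
def contains_zero(low, high):
    """True when 0 is a plausible value for mu1 - mu2."""
    return low <= 0 <= high

no_clear_difference = contains_zero(-0.5, 2.0)   # 0 inside: difference of 0 is plausible
clear_difference = not contains_zero(1.2, 4.8)   # 0 outside: a real difference is suggested
```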
Worked example: two-sample t interval for μ₁ − μ₂
A school compares mean weekly study time for students in two programs.
- Program 1: n₁ = 35, x̄₁ = 15.6 hours, s₁ = 4.3 hours
- Program 2: n₂ = 40, x̄₂ = 14.0 hours, s₂ = 3.3 hours
Construct a 95% confidence interval for μ₁ − μ₂, where μ₁ is the mean study time for Program 1 and μ₂ is the mean study time for Program 2.
Step 1: Choose procedure and check conditions
- We are estimating μ₁ − μ₂ with σ₁ and σ₂ unknown, so use a two-sample t interval.
- Random: assume each group is a random sample from its program (or students were randomly sampled).
- Independence within groups: reasonable if each sample is less than 10% of its program population.
- Independent groups: two different sets of students.
- Normal: both sample sizes are fairly large (35 and 40), so inference is typically reasonable.
Step 2: Compute the point estimate

x̄₁ − x̄₂ = 15.6 − 14.0 = 1.6 hours
Step 3: Compute the standard error
Compute pieces:

s₁²/n₁ = 4.3²/35 ≈ 0.528 and s₂²/n₂ = 3.3²/40 ≈ 0.272

So

SE = √(0.528 + 0.272) = √0.800 ≈ 0.895
Step 4: Find t*
Using technology (or a conservative df approximation), df will be around the smaller of 39 and 34 if using df = min(n₁ − 1, n₂ − 1), giving df = 34. For 95% confidence, t* is a bit above 2.
Step 5: Build the interval
If t* is approximately 2.03, then

(x̄₁ − x̄₂) ± t* · SE = 1.6 ± (2.03)(0.895) ≈ 1.6 ± 1.82

So the interval is approximately (−0.22, 3.42).
Interpretation
“We are 95% confident that the true mean difference in weekly study time (Program 1 minus Program 2) is between about -0.22 and 3.42 hours.”
Because 0 is inside the interval, a true difference of 0 hours is plausible based on these data.
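As with the one-sample example, the arithmetic can be checked directly (the group summaries here are the reconstructed values assumed above, consistent with the reported interval):

```python
import math

n1, x1, s1 = 35, 15.6, 4.3
n2, x2, s2 = 40, 14.0, 3.3
se = math.sqrt(s1**2 / n1 + s2**2 / n2)     # about 0.895
me = 2.03 * se                              # t* ≈ 2.03 with conservative df = 34
low, high = (x1 - x2) - me, (x1 - x2) + me  # about (-0.22, 3.42)
```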
What goes wrong most often in two-mean intervals
- Mixing up the order of subtraction: You must define the parameter as μ₁ − μ₂ and then compute x̄₁ − x̄₂ in the same order. If you swap, your interval changes sign.
- Treating paired data as independent: If the same individuals are measured twice, independence is violated; use a one-sample interval on differences instead.
- Forgetting the square root or squaring incorrectly in the standard error:
Students sometimes compute √(s₁/n₁ + s₂/n₂) instead of √(s₁²/n₁ + s₂²/n₂), which can drastically shrink the interval incorrectly.
Exam Focus
- Typical question patterns:
- “Construct and interpret a confidence interval for μ₁ − μ₂” and state what it suggests about a difference.
- Determine whether the situation is independent two-sample or paired, and justify the correct method.
- Explain how changing n₁ or n₂ affects the margin of error.
- Common mistakes:
- Reporting an interval but interpreting it as “most sample means will fall here” instead of interpreting it as plausible values for μ₁ − μ₂.
- Using df = n₁ + n₂ − 2 automatically (that df corresponds to a pooled approach under equal-variance assumptions, which is not the default in many AP Stats courses).
- Failing to address conditions for both groups (especially checking for skew/outliers separately).