Unit 7: Inference for Quantitative Data: Means
Why Inference for Means Uses the t Distribution
Parameters, statistics, and the problem we’re solving
Inference is about using data from a sample to make a justified claim about a population. For quantitative data (numbers you can average), the population feature you usually care about is the population mean, written as \mu. Because you almost never know \mu, you collect a sample and compute the sample mean \bar{x} as an estimate.
The key challenge is uncertainty: \bar{x} changes from sample to sample. This unit is about measuring that uncertainty when your parameter is a mean, and using it to build:
- Confidence intervals (estimate \mu with a range of plausible values)
- Significance tests (evaluate whether data provide convincing evidence against a claim about \mu)
Why we can’t usually use the normal (z) procedures
If you somehow knew the population standard deviation \sigma, then the standardized statistic
z=\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}
would follow a standard normal model (approximately, under appropriate conditions). But in real life, \sigma is almost never known. Instead, you estimate it with the sample standard deviation s.
That substitution changes the sampling behavior: when you estimate variability from the same sample you’re using to estimate the mean, you add extra uncertainty. That extra uncertainty is exactly why we use Student’s t distribution instead of the normal distribution.
The t distribution: what it is, where it came from, and how it behaves
The Student’s t distribution was introduced in 1908 by W. S. Gosset, a British statistician employed by the Guinness Brewery in Dublin. Guinness required its employees to publish anonymously, so Gosset wrote under the pen name “Student” — which is where the distribution gets its name.
A t distribution is a family of bell-shaped, symmetric distributions centered at 0. Like the normal, it’s used for standardized statistics, but it has heavier tails and is a bit lower near the mean. Heavier tails matter because they assign more probability to extreme values. This reflects the fact that s itself fluctuates from sample to sample, so your standardized statistic is “less stable” than z.
When doing inference for means with unknown \sigma, the standardized statistic becomes:
t=\frac{\bar{x}-\mu}{s/\sqrt{n}}
If the population is normally distributed, then this statistic follows a t distribution.
The exact t distribution you use depends on the degrees of freedom (df). For a one-sample mean procedure,
df=n-1
The smaller the df value, the more spread out the t distribution is. As n gets larger (df increases), the t distribution gets closer and closer to the standard normal. In practice:
- Small n: t has noticeably heavier tails (larger critical values, wider intervals)
- Large n: t is very close to z
Because \sigma is almost always unknown in the real world, t procedures are the default choice for inference about means.
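To see the convergence numerically, here is a small sketch comparing two-sided 95% critical values. The t* values are hardcoded from a standard t table (computing t quantiles directly would require a statistics library beyond what this sketch assumes):

```python
# Two-sided 95% critical values from a standard t table, versus z*.
# As df grows, t* shrinks toward z* = 1.960, so t-based intervals
# narrow toward z-based intervals.
t_star_95 = {4: 2.776, 9: 2.262, 24: 2.064, 99: 1.984}  # df -> t*
z_star_95 = 1.960

for df in sorted(t_star_95):
    excess = t_star_95[df] - z_star_95
    print(f"df={df:3d}: t* = {t_star_95[df]:.3f} (excess over z*: {excess:+.3f})")
```

Notice that at df = 4 the critical value is over 40% larger than z*, while by df = 99 the difference is negligible.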
When t procedures are valid: the conditions you must check
AP Statistics emphasizes that inference procedures are only trustworthy when certain conditions are met. For t procedures about means, think in three categories.
1) Random: Data come from a random sample or a randomized experiment.
- Random sampling supports generalizing to the population.
- Random assignment (in an experiment) supports cause-and-effect conclusions.
2) Independence: Observations are independent.
- A common check for sampling without replacement is the 10% condition: the sample size n should be no more than 10% of the population size.
3) Normal (or large sample): The sampling distribution of \bar{x} (or of differences) is approximately normal.
- If the population distribution is roughly normal, you’re fine even with small n.
- If the population is not normal, you typically want a “large enough” sample so the Central Limit Theorem makes \bar{x} approximately normal.
- A common rule of thumb is n\ge 30 for CLT-based reasoning, but strong skewness or outliers can still cause trouble.
A practical way to handle the Normal condition is to look at graphs (dotplot, histogram, boxplot) and ask: is the distribution roughly symmetric and free of extreme outliers? If not, do you have a large sample size to rely on the CLT?
Exam Focus
- Typical question patterns:
- Explain why t procedures are used instead of z procedures when \sigma is unknown.
- Identify the parameter and state the appropriate df for a given situation.
- Check conditions (Random, 10%, Normal/large sample) from a description and/or graph.
- Common mistakes:
- Using z critical values (or saying “normal”) when the problem clearly indicates \sigma is unknown.
- Forgetting the 10% condition when sampling without replacement.
- Treating “n is kind of big” as automatic permission even when there are extreme outliers.
One-Sample t Confidence Intervals for a Population Mean
What a confidence interval means (and what it doesn’t)
A confidence interval for \mu is a range of values that are plausible for the true population mean, based on your sample. The logic is:
- The sample mean \bar{x} is your best point estimate of \mu.
- The sample-to-sample variation in \bar{x} is summarized by the standard error.
- You create an interval by taking \bar{x} and moving out by a margin of error.
The key interpretation (the one AP wants) is about the long-run performance of the method:
If we repeatedly took random samples of size n from the same population and built a t interval each time, then about C% of those intervals would capture the true mean \mu.
A very common misconception is to say “there is a C% probability that \mu is in this interval.” After you compute an interval, \mu is fixed; the interval either contains it or not. The confidence level describes the method, not the one interval.
The sampling distribution ideas behind the interval
Your sample mean is just one of a whole universe of sample means. If n is sufficiently large (or if the population is normally distributed), then:
- The set of all sample means is approximately normally distributed.
- The mean of the set of sample means equals \mu, the population mean.
- The standard deviation of the set of sample means is
\frac{\sigma}{\sqrt{n}}
Because we typically do not know \sigma, we estimate it using s. In that case, the estimated standard deviation of \bar{x} is the standard error:
SE=\frac{s}{\sqrt{n}}
The structure of a one-sample t interval
The one-sample t confidence interval for \mu is:
\bar{x}\pm t^*\left(\frac{s}{\sqrt{n}}\right)
Here’s what each piece does:
- \bar{x}: center of the interval (your estimate)
- \frac{s}{\sqrt{n}}: standard error of \bar{x}
- t^*: critical t value based on the confidence level and df=n-1
The margin of error is:
ME=t^*\left(\frac{s}{\sqrt{n}}\right)
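The formula translates directly into code. In this sketch the caller supplies t* (looked up from a table or calculator for the chosen confidence level with df = n − 1); the numbers in the usage line are hypothetical:

```python
from math import sqrt

def one_sample_t_interval(x_bar, s, n, t_star):
    """Return (lower, upper) for x_bar +/- t* * (s / sqrt(n))."""
    se = s / sqrt(n)       # standard error of the sample mean
    me = t_star * se       # margin of error
    return (x_bar - me, x_bar + me)

# Hypothetical data: x_bar = 50, s = 10, n = 25, t* = 2.064 (95%, df = 24)
print(one_sample_t_interval(50, 10, 25, 2.064))
```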
Why sample size and confidence level matter
The standard error shrinks like 1/\sqrt{n}, so larger samples give more precise estimates.
If you increase n:
- \frac{s}{\sqrt{n}} tends to decrease
- df increases, making t^* smaller
- the interval gets narrower
If you raise the confidence level (say 90% to 95%):
- t^* increases
- margin of error increases
- the interval gets wider
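The 1/\sqrt{n} behavior is worth seeing concretely: quadrupling the sample size only halves the standard error. A minimal sketch, holding s fixed at an illustrative value:

```python
from math import sqrt

s = 8.0  # sample standard deviation (held fixed for illustration)
for n in (25, 100, 400):            # each step quadruples n
    print(f"n={n:4d}: SE = {s / sqrt(n):.2f}")
# SE halves each time (1.60 -> 0.80 -> 0.40): precision improves
# like 1/sqrt(n), so quadrupling n only doubles precision.
```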
Example: building and interpreting a one-sample t interval
Scenario. A random sample of n=25 customers is surveyed about the amount of time (in minutes) they spend in a store. The sample mean is \bar{x}=72.4 minutes and the sample standard deviation is s=8.0 minutes. Construct and interpret a 95% confidence interval for the true mean time \mu.
Step 1: Identify the parameter.
- \mu = the true mean time spent in the store by all customers in the population of interest.
Step 2: Check conditions.
- Random: stated as a random sample.
- Independence: assume the population of customers is at least 10 × 25 = 250, so the 10% condition is satisfied.
- Normal/large sample: with n=25, you’d want the distribution to be roughly symmetric with no extreme outliers.
Step 3: Compute the interval.
df=25-1=24
SE=\frac{8.0}{\sqrt{25}}=\frac{8.0}{5}=1.6
For 95% confidence and df=24, t^*\approx 2.064.
ME=2.064(1.6)=3.3024
So:
72.4\pm 3.3024
Interval endpoints:
- Lower: 72.4-3.3024=69.0976
- Upper: 72.4+3.3024=75.7024
Rounded reasonably: (69.1,75.7) minutes.
Step 4: Interpret in context.
We are 95% confident that the true mean time \mu that all customers spend in the store is between about 69.1 and 75.7 minutes.
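The arithmetic in Step 3 can be checked with a few lines; the critical value t* = 2.064 is taken from a t table (95% confidence, df = 24), as in the example:

```python
from math import sqrt

x_bar, s, n = 72.4, 8.0, 25
t_star = 2.064                      # 95% confidence, df = n - 1 = 24
se = s / sqrt(n)                    # 8.0 / 5 = 1.6
me = t_star * se                    # 2.064 * 1.6 = 3.3024
lower, upper = x_bar - me, x_bar + me
print(f"({lower:.1f}, {upper:.1f}) minutes")  # (69.1, 75.7) minutes
```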
Example 7.1: gas mileage confidence intervals and “what confidence is this margin?”
When a random sample of 10 cars of a new model was tested for gas mileage, the results showed a mean of 27.2 miles per gallon with a standard deviation of 1.8 miles per gallon.
1) What is a 95% confidence interval estimate for the mean gas mileage achieved by this model? Assume the population of mpg results for all new-model cars is approximately normally distributed.
Parameter: Let \mu represent the mean gas mileage (mpg) in the population of cars of this new model.
Procedure: One-sample t-interval for a population mean.
Checks: Random sample is stated, n=10 is assumed less than 10% of all such cars, and the population is approximately normal.
Mechanics (technology): A calculator t-interval gives (25.912,28.488).
Conclusion: We are 95% confident that the true mean gas mileage is between 25.91 and 28.49 mpg.
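The calculator result can be reproduced by hand. A sketch using the table value t* ≈ 2.262 (95% confidence, df = 9):

```python
from math import sqrt

x_bar, s, n = 27.2, 1.8, 10
t_star = 2.262                      # 95%, df = 9 (t table)
se = s / sqrt(n)                    # ~0.5692
me = t_star * se                    # ~1.2876
print(f"({x_bar - me:.3f}, {x_bar + me:.3f})")  # ~(25.912, 28.488)
```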
2) Based on this confidence interval, is the true mean mileage significantly different from 25 mpg?
Yes. Because 25 is not in the interval of plausible values (about 25.9 to 28.5), there is convincing evidence that the true mean mileage differs from 25 mpg.
3) Determine a 99% confidence interval.
Mechanics (technology): (25.35,29.05).
Conclusion: We are 99% confident the true mean mpg is between 25.35 and 29.05 mpg. Notice the higher confidence produces a wider interval (less specific).
4) What would the 95% confidence interval be if the same sample mean of 27.2 and standard deviation of 1.8 had come from a sample of 20 cars?
Mechanics (technology): (26.36,28.04).
Conclusion: We are 95% confident the true mean mpg is between 26.36 and 28.04 mpg. Notice the larger sample size produces a narrower interval.
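Parts 3 and 4 can be checked the same way; the table critical values assumed here are t* ≈ 3.250 (99%, df = 9) and t* ≈ 2.093 (95%, df = 19):

```python
from math import sqrt

x_bar, s = 27.2, 1.8

# Part 3: 99% confidence, n = 10, df = 9
me_99 = 3.250 * (s / sqrt(10))
print(f"99%, n=10: ({x_bar - me_99:.2f}, {x_bar + me_99:.2f})")  # (25.35, 29.05)

# Part 4: 95% confidence, n = 20, df = 19
me_95 = 2.093 * (s / sqrt(20))
print(f"95%, n=20: ({x_bar - me_95:.2f}, {x_bar + me_95:.2f})")  # (26.36, 28.04)
```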
5) With the original data, with what confidence can we assert that the true mean gas mileage is 27.2\pm 1.04?
The interval 27.2\pm 1.04 is (26.16,28.24). Convert the margin of error to a t critical value using SE=s/\sqrt{n}.
SE=\frac{1.8}{\sqrt{10}}\approx 0.569
t^*=\frac{1.04}{0.569}\approx 1.827
With df=9, the central area between -1.827 and 1.827 is about 0.899 (using a calculator t CDF), so the confidence level is about 89.9%.
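The back-calculation of t* can be sketched as follows; the final area step still needs a calculator’s t CDF, so it is left as a comment:

```python
from math import sqrt

s, n = 1.8, 10
me = 1.04                      # stated margin of error
se = s / sqrt(n)               # ~0.569
t_star = me / se               # ~1.827
print(round(se, 3), round(t_star, 3))
# With df = 9, a calculator gives tcdf(-1.827, 1.827, 9) ~ 0.899,
# so the implied confidence level is about 89.9%.
```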
What can go wrong with one-sample t intervals
Confusing s and \sigma is a major issue: t intervals use s because \sigma is unknown. Also, ignoring outliers with small samples can be disastrous because one extreme outlier can dramatically affect both \bar{x} and s. Finally, df errors are common: for a one-sample interval, df=n-1.
Exam Focus
- Typical question patterns:
- Construct a one-sample t interval from summary statistics or calculator output.
- Interpret a confidence interval correctly in context (including what \mu represents).
- Explain how changing n or confidence level affects the margin of error.
- Use an interval to argue whether a particular value (like 25 mpg) is plausible.
- Determine what confidence level corresponds to a given “estimate ± margin of error.”
- Common mistakes:
- Saying “95% of data fall in the interval” instead of “95% confident about \mu.”
- Using z^* instead of t^*.
- Reporting an interval for individuals rather than for the population mean.
One-Sample t Tests for a Population Mean
What a significance test is really asking
A significance test uses sample data to evaluate a claim about a population parameter. For a one-sample mean test, the parameter is \mu and the hypotheses look like:
H_0: \mu=\mu_0
and one of these alternatives:
H_a: \mu\ne\mu_0
H_a: \mu>\mu_0
H_a: \mu<\mu_0