Unit 9: Inference for Quantitative Data: Slopes

Inference about a Regression Slope: Parameters, Purpose, and Causation

When you fit a least-squares regression line to data, you get a sample-based line that summarizes the relationship between an explanatory variable (often called x) and a response variable (often called y). The sample slope tells you how y tends to change as x increases. Because it comes from one sample, that slope can vary from sample to sample.

Inference for slopes is about moving from “what we see in this sample” to “what we can reasonably claim about the population relationship.” In AP Statistics, that population relationship is described by the population regression line.

\mu_y = \beta_0 + \beta_1 x

Some textbooks/software use different symbols for the same population model, so you may also see:

\mu_y = \alpha + \beta x

These are equivalent notations:

  • The population intercept is either beta0 or alpha.
  • The population slope is either beta1 or beta.

Your sample regression line looks similar:

\hat{y} = b_0 + b_1 x

The central inference questions in this unit are:

  • Do the data provide evidence that the population slope (beta1) is different from 0?
  • If so, what plausible values might beta1 take?

Why does testing a slope of 0 matter so much? If the population slope is 0, the population regression line is flat, meaning the population mean response does not change as x changes. In that case, a linear relationship is not supported (even though individual points can still vary widely).
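These sampling-variability ideas can be made concrete with a short simulation. The sketch below (all numbers are illustrative assumptions, and simulation itself is beyond what AP asks you to do) repeatedly samples from a population whose true slope is exactly 0. The fitted sample slopes still scatter around 0, which is why observing a nonzero sample slope is not, by itself, evidence of a relationship.

```python
import random

def fit_slope(xs, ys):
    """Least-squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

random.seed(1)
slopes = []
for _ in range(1000):                                    # 1000 samples of size 15
    xs = [random.uniform(0, 10) for _ in range(15)]
    ys = [20 + 0 * x + random.gauss(0, 4) for x in xs]   # true slope is 0
    slopes.append(fit_slope(xs, ys))

# sample slopes scatter around 0 even though the population slope is 0
mean_slope = sum(slopes) / len(slopes)
```

Because every sample's slope differs from 0 just by chance, inference asks whether the observed slope is larger than this chance variation can plausibly explain.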

Link to correlation

Earlier, correlation measured the direction and strength of linear association. Here, you focus on slope because slope has a direct real-world interpretation in context (units of y per 1 unit of x). In simple linear regression with one explanatory variable, testing whether there is a linear relationship is equivalent to testing whether the population correlation is 0, but when the prompt is about regression you should use slope language.
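The slope/correlation equivalence can be checked numerically: the t statistic computed from the slope, b1 / SE_b1, is algebraically identical to t = r * sqrt(n - 2) / sqrt(1 - r^2) computed from the correlation. A quick sketch with made-up data:

```python
import math

def two_t_statistics(xs, ys):
    """Return (t from slope, t from correlation); they are algebraically equal."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

    b1 = sxy / sxx                      # sample slope
    b0 = ybar - b1 * xbar               # sample intercept
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))        # residual standard deviation
    se_b1 = s / math.sqrt(sxx)          # standard error of the slope

    r = sxy / math.sqrt(sxx * syy)      # sample correlation
    t_slope = b1 / se_b1
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t_slope, t_corr

# made-up data for illustration
t_slope, t_corr = two_t_statistics([1, 2, 3, 4, 5, 6],
                                   [2.1, 2.9, 4.2, 4.8, 6.1, 6.8])
```

The two values agree to floating-point precision, so a test of "slope = 0" and a test of "correlation = 0" always reach the same conclusion in simple linear regression.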

What inference can (and cannot) say

Slope inference supports conclusions about association in a population. Whether you can claim causation depends on how the data were produced:

  • If the data come from a randomized experiment, a significant slope can support a cause-and-effect conclusion (within the scope of the experiment).
  • If the data come from an observational study, even a significant slope supports only association; lurking variables may explain the pattern.

Exam Focus

Typical question patterns:

  • “Do these data provide convincing evidence of a linear relationship between x and y in the population?”
  • “Interpret the slope (or a confidence interval for slope) in context.”
  • “Given computer output, identify and interpret the slope test and its conclusion.”

Common mistakes:

  • Treating the sample slope as if it were the population slope without accounting for sampling variability.
  • Claiming causation from a non-randomized observational study.
  • Reversing variables in an interpretation (“for every 1 unit increase in y, x increases…”).

Sampling Distribution of the Slope and Why We Use a t Model

To do inference, you need a model for how the sample slope varies from sample to sample.

The big idea: slope is a statistic with variability

If you repeatedly took samples (or repeated an experiment) and fit a least-squares line each time, you would not get the same slope every time. Some slopes would be larger, some smaller, due to random scatter.

When certain conditions are met, the sampling distribution of the sample slope can be modeled approximately by a normal distribution with mean equal to the true population slope and standard deviation equal to the true standard deviation of sample slopes. Using common “sampling distribution language,” you may see:

  • mean of the sampling distribution: mu-sub-b (the mean of sample slopes)
  • standard deviation of the sampling distribution: sigma-sub-b (the true SD of sample slopes)

In practice, we do not know sigma-sub-b, so we estimate it with a standard error computed from sample data. That estimation step is what leads to a t model.

Why a t distribution appears

In many AP Statistics inference settings, you use a t distribution when:
1) You estimate an unknown population standard deviation from the data, and
2) The sampling distribution is approximately normal.

For slope inference, you do not know the true variability of errors around the population line, so you estimate it from the sample residuals. Using the estimated standard error of the slope (often denoted SE for slope, and sometimes written as s-sub-b) leads to a t distribution.

Notation and what the symbols mean

One common source of confusion is mixing up parameters, statistics, and measures of variability. The table helps keep roles clear.

Concept                       Population (parameter)   Sample (statistic)   Estimated variability
Intercept                     \beta_0                  b_0                  SE_{b_0}
Slope                         \beta_1                  b_1                  SE_{b_1}
Error (scatter around line)   (unknown)                (estimated)          s

The residual standard deviation (standard deviation of residuals) is:

s = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}

The degrees of freedom are:

df = n - 2

The n minus 2 appears because two parameters (slope and intercept) are estimated from the data.
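As a sketch of how s and the degrees of freedom fit together, the code below fits a least-squares line to a tiny made-up data set and computes the residual standard deviation:

```python
import math

# tiny made-up data set for illustration
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
      / sum((x - xbar) ** 2 for x in xs))
b0 = ybar - b1 * xbar                     # least-squares fit

ssr = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s = math.sqrt(ssr / (n - 2))              # df = n - 2: slope and intercept were estimated
df = n - 2
```

Dividing by n - 2 rather than n reflects that two quantities (b0 and b1) were already estimated from the same data.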

Although you typically do not compute the standard error of the slope by hand in AP Statistics (technology usually gives it), it responds to data features in predictable ways:

  • More scatter around the line (larger residual SD) makes the slope standard error larger.
  • More data (larger sample size) tends to make the slope standard error smaller.
  • More spread in x values (larger SD of x, often written as s-sub-x) tends to make the slope standard error smaller because the slope becomes easier to estimate.
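These three behaviors follow from the formula SE_b1 = s / sqrt(sum of (x - xbar)^2), which you are not expected to compute by hand; the sketch below just plugs in illustrative numbers to confirm each bullet.

```python
import math

def slope_se(s, xs):
    """SE of the slope given residual SD s and the x values."""
    xbar = sum(xs) / len(xs)
    sxx = sum((x - xbar) ** 2 for x in xs)
    return s / math.sqrt(sxx)

xs = list(range(10))                                # x values 0..9
base = slope_se(2.0, xs)

more_scatter = slope_se(4.0, xs)                    # residual SD doubled -> SE larger
more_data = slope_se(2.0, xs * 4)                   # same x spread, 4x the points -> SE smaller
more_spread = slope_se(2.0, [2 * x for x in xs])    # x values twice as spread out -> SE smaller
```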

The t statistic for slope

To test a claim about the population slope or to build a confidence interval, you standardize the difference between the sample slope and the hypothesized slope.

t = \frac{b_1 - \beta_1}{SE_{b_1}}

Under the usual conditions, this statistic follows a t distribution with:

df = n - 2

In hypothesis testing, the most common null value is a slope of 0, so the test statistic becomes:

t = \frac{b_1 - 0}{SE_{b_1}}

Many resources write the sample slope as b and the population slope as beta, and the standard error as s-sub-b. In that notation, the same idea is:

t = \frac{b - \beta}{s_b}
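In either notation, the arithmetic is a single standardization step. A minimal sketch with made-up numbers:

```python
# Made-up values for illustration: sample slope, its standard error, and the
# hypothesized population slope under the null hypothesis.
b1 = 1.8
se_b1 = 0.6
beta1_null = 0

t = (b1 - beta1_null) / se_b1
print(round(t, 2))  # 3.0
```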

Exam Focus

Typical question patterns:

  • “Calculate the test statistic for slope given the slope estimate and its standard error.”
  • “State the degrees of freedom for the slope t test.”
  • “Explain (conceptually) what makes the standard error of the slope large or small.”

Common mistakes:

  • Using the wrong degrees of freedom.
  • Thinking a large correlation automatically implies a statistically significant slope (sample size matters).
  • Treating the residual standard deviation as the same thing as the slope standard error.

Conditions for Valid Slope Inference (and How to Check Them)

The t procedures for regression slope rely on assumptions about how the data were produced and how the residuals behave. It helps to separate theoretical model conditions (what is assumed about the population relationship) from practical sample-based checks (what you can verify with one data set).

Theoretical regression conditions (population-level)

The classic theoretical assumptions for inference on slope are:
1) The true relationship between the response and explanatory variables is linear.
2) The standard deviation of the response values does not vary with x (constant variance).
3) For each fixed x, the response values are approximately normally distributed.

Practical conditions you check with sample graphs and study design

Because you only have a single sample, you approximate these assumptions using the fitted line, residuals, and what you know about data collection.

1) Linear relationship

Inference about a linear slope only makes sense if the relationship between x and the mean of y is reasonably linear. Check this using a scatterplot of y versus x and/or a residual plot. A curved pattern in the residual plot indicates the linear model is missing structure.

2) Independence and random sampling/assignment

Observations (and residuals) should be independent. This is primarily about how the data were collected.

  • For a random sample, independence is supported when the sample is less than 10% of the population size (often called the 10% condition when sampling without replacement).
  • For an experiment, independence is supported by random assignment and by ensuring one subject’s outcome does not affect another’s.

A special warning: data collected over time (daily prices, temperatures, etc.) often show time correlation that violates independence.

3) Normality of residuals

The t procedures assume that, around the line, the deviations are approximately normal. In practice, you check whether residuals are roughly normal using a histogram or a normal probability plot of the residuals. You do not need the original y values to be normal; you need residuals to be approximately normal.

4) Equal variance (constant spread)

The residual spread should be roughly constant across the range of x. A “fan” or “funnel” pattern in a residual plot indicates nonconstant variance. A related practical phrasing you’ll see is “there should be no apparent pattern in the residual plot.”

5) No influential outliers

Outliers can strongly affect regression results. A point extreme in x (high leverage) can pull the line; a point with an unusually large residual can distort the fit. Check the scatterplot and residual plot for unusual points.
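In an AP setting these checks are done visually, but a rough numeric sketch can mimic the equal-variance check: split the residuals by low versus high x and compare their spreads. The data and fitted line below are made up; a large ratio would suggest the fan pattern.

```python
import statistics

def residuals(xs, ys, b0, b1):
    """Residuals y - yhat around an assumed fitted line yhat = b0 + b1 * x."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# made-up data with roughly constant scatter around y = 1 + 2x
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.2, 4.8, 7.3, 8.9, 11.1, 12.8, 15.2, 16.9]
res = residuals(xs, ys, 1.0, 2.0)

low = statistics.stdev(res[:4])       # spread of residuals at low x
high = statistics.stdev(res[4:])      # spread of residuals at high x
ratio = max(low, high) / min(low, high)
```

This is only a stand-in for reading the residual plot, not a formal test; on the exam, describe what the plot shows.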

A common AP justification paragraph

When justifying inference, it’s common to reference design and graphs in context:

  • Linear: scatterplot/residual plot shows a roughly linear pattern.
  • Independent: random sample or randomized experiment; mention the 10% condition when appropriate.
  • Normal: residual histogram/normal plot looks approximately normal.
  • Equal variance: residual plot shows roughly constant spread (no fanning).
  • Outliers: no influential outliers (or you address why they’re a concern).

Exam Focus

Typical question patterns:

  • “Check conditions for inference for regression slope using the provided scatterplot and residual plot.”
  • “Explain which condition is violated when the residual plot shows a curved pattern or fanning.”
  • “Would it be reasonable to use a t test for slope here? Justify.”

Common mistakes:

  • Checking normality of y instead of normality of residuals.
  • Forgetting to mention independence/data collection.
  • Saying “since n is large, conditions are met” without addressing linearity or influential points.

Confidence Intervals for the Population Slope

A confidence interval for the population slope gives a range of plausible values for the true population slope, interpreted as the long-run average change in the population mean response for a 1-unit increase in x.

What a confidence interval is doing conceptually

Instead of asking whether the slope is exactly 0, a confidence interval asks which slopes are consistent with the data, given random sampling variation. Narrow intervals indicate more precision (often due to larger n, less scatter, and more spread in x); wide intervals indicate more uncertainty.

The interval formula

A t interval for the population slope uses:

b_1 \pm t^* SE_{b_1}

The degrees of freedom are:

df = n - 2

If you are given raw data, you can typically find the confidence interval directly using statistical software or a graphing calculator.

Interpreting the interval correctly (in context)

A correct interpretation must:
1) Refer to the parameter (the population slope), not the sample slope.
2) Use context (what x and y represent).
3) Use units (units of y per unit of x).

A strong template is:

We are C% confident that for each 1-unit increase in x, the population mean of y changes by between (lower bound) and (upper bound) units, on average.

Avoid saying “There is a C% chance the true slope is in the interval.” After you compute an interval, the true slope is fixed; the interval is what would vary across samples.

Connection between confidence intervals and significance tests

A key link:

  • If a C% confidence interval for the population slope does not contain 0, then a two-sided test at significance level alpha = 1 - C (in decimal form; for example, alpha = 0.05 for a 95% interval) would reject the null hypothesis of slope 0.
  • If the interval contains 0, the corresponding two-sided test would fail to reject.

Worked example (confidence interval)

A study examines weekly study time x (hours) and exam score y (points). Regression output gives: sample slope 2.40, slope standard error 0.75, and sample size 18.

Degrees of freedom:

df = 18 - 2

A 95% critical value is approximately:

t^* \approx 2.12

Margin of error:

ME = 2.12(0.75) = 1.59

Interval:

2.40 \pm 1.59

So the interval is approximately 0.81 to 3.99. Interpretation: you are 95% confident that for each additional hour studied per week, the population mean exam score increases by between about 0.81 and 3.99 points, on average. This is about the mean response, not guaranteed change for an individual student.
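A quick check of the arithmetic above, using the values stated in the example (t* ≈ 2.12 is taken as given rather than recomputed):

```python
# values from the worked example: b1 = 2.40, SE = 0.75, t* = 2.12 (df = 16)
b1, se_b1, t_star = 2.40, 0.75, 2.12

me = t_star * se_b1                      # margin of error
lower, upper = b1 - me, b1 + me
print(round(lower, 2), round(upper, 2))  # 0.81 3.99
```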

Example 9.1 (SAT verbal and SAT math)

Information concerning SAT verbal scores (x) and SAT math scores (y) was collected from 15 randomly selected students. A linear regression printout was produced.

1) The regression equation uses the y-intercept and slope found in the “Coef” column of the printout.

2) A 95% confidence interval for the slope is found using a t interval with:

df = 15 - 2

The residual standard deviation reported was 16.69, and the standard error of the slope was given in the output. The 95% critical values were found as plus/minus invT(0.975, 13), which is plus/minus 2.160. The resulting 95% confidence interval for the true slope was 0.64 to 0.89.

Conclusion in context: We are 95% confident that the interval from 0.64 to 0.89 captures the slope of the true regression line relating SAT math score (y) to SAT verbal score (x). Equivalently, for every 1-point increase in verbal SAT score, the average increase in math SAT score is between 0.64 and 0.89 points.

3) A slope of 0 would mean the model predicts the same math score no matter the verbal score (no linear relationship). Because 0 is not in the interval 0.64 to 0.89, there is convincing evidence that SAT math and SAT verbal scores are linearly related.

Exam Focus

Typical question patterns:

  • “Construct and interpret a C% confidence interval for the population slope given regression output.”
  • “Does the interval provide evidence of a positive linear relationship? Explain.”
  • “Use the interval to make a decision about a slope test at a given significance level.”

Common mistakes:

  • Interpreting the slope interval as describing individual change rather than change in the mean response.
  • Using the wrong degrees of freedom.
  • Reversing variables or omitting units in the interpretation.

Significance Tests for the Population Slope

A significance test for slope checks whether the data provide convincing evidence that the true slope differs from a hypothesized value, most commonly 0.

Standard hypothesis test setup

Most AP problems use:

H_0: \beta_1 = 0

With an alternative chosen based on context:

H_a: \beta_1 \ne 0

or

H_a: \beta_1 > 0

or

H_a: \beta_1 < 0

A two-sided alternative is common when you are simply asking whether there is a linear relationship. A one-sided alternative is appropriate only when a direction is justified in advance by context.

Test statistic and p-value

The test statistic is:

t = \frac{b_1 - 0}{SE_{b_1}}

with:

df = n - 2

The p-value is the probability, assuming the null hypothesis is true, of observing a sample slope at least as extreme as the one observed (in the direction(s) specified by the alternative). A low p-value tells you that if the two variables truly had no linear relationship in the population, it would be highly unlikely to obtain a sample slope as extreme as the one found.
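The meaning of the p-value, "assuming no linear relationship, how unusual is a slope this extreme?", can be illustrated with a permutation simulation. This is not the AP t test itself, just a conceptual sketch on made-up data: shuffling the y values breaks any real link with x, mimicking a true null hypothesis.

```python
import random

def fit_slope(xs, ys):
    """Least-squares slope for paired data."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

random.seed(3)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.0, 2.6, 3.1, 3.9, 4.2, 5.1, 5.8, 6.2]    # made-up strong positive trend
observed = fit_slope(xs, ys)

count = 0
reps = 2000
for _ in range(reps):
    shuffled = ys[:]
    random.shuffle(shuffled)                      # break any x-y link: "H0 true"
    if fit_slope(xs, shuffled) >= observed:       # one-sided: at least as extreme
        count += 1
p_hat = count / reps
```

Here almost no shuffled data sets produce a slope as steep as the observed one, so the estimated p-value is small, matching the idea that a low p-value means the observed slope would be unlikely under no linear relationship.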

Important: strong evidence that there is some linear association does not mean the association is strong.

“Significant” does not mean “important”

Statistical significance answers “Is there evidence the effect exists beyond random noise?” It does not automatically answer “Is the effect large enough to matter?” With very large samples, even tiny slopes can be statistically significant.

Worked example (hypothesis test)

Using the earlier study-time example (slope 2.40, SE 0.75, n 18), test at alpha 0.05 for a positive linear relationship.

Hypotheses:

H_0: \beta_1 = 0

H_a: \beta_1 > 0

Test statistic:

t = \frac{2.40}{0.75} = 3.20

Degrees of freedom:

df = 18 - 2

With t equal to 3.20 and 16 degrees of freedom, the p-value is well below 0.01. Because the p-value is less than 0.05, reject the null hypothesis. There is convincing evidence that the population slope is positive, meaning greater weekly study time is associated with a higher population mean exam score.
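A quick check of the mechanics above, using the stated values (slope 2.40, SE 0.75, n = 18):

```python
# values given in the worked example
b1, se_b1, n = 2.40, 0.75, 18

t = (b1 - 0) / se_b1            # test statistic under H0: slope = 0
df = n - 2
print(round(t, 2), df)  # 3.2 16
```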

Writing strong conclusions

A complete conclusion includes:

  • A decision (reject or fail to reject the null).
  • A statement about evidence.
  • The parameter and context.

Avoid saying “accept the null.” You either reject it or fail to reject it.

Example 9.3 (tennis racket serving speeds)

Ten randomly selected professional tennis players were measured for serving speed in mph using an old racket (x) and after using a newly developed racket (y).

1) Is there evidence of a straight-line relationship with positive slope?

Parameter: Let beta represent the slope of the true regression line for predicting serving speed after using the new racket from serving speed before using the new racket.

Hypotheses:

H_0: \beta = 0

H_a: \beta > 0

Procedure: t test for the slope of a regression line.

Checks: The data come from a random sample, the scatterplot is approximately linear, there is no apparent pattern in the residual plot, the histogram of residuals is approximately normal, and the sample size 10 is less than 10% of all professional players.

Mechanics: Using regression inference software (for example, LinRegTTest on the TI-84 or LinearReg tTest on the Casio Prizm) gives a p-value of 0.00019.

Conclusion in context: Because 0.00019 is less than 0.05, reject the null hypothesis. There is convincing evidence of a straight-line relationship with positive slope between serving speeds using the old and new rackets.

2) Interpret the least-squares line in context.

Because the slope is approximately 1 and the y-intercept is 8.76, the regression line indicates that using the new racket increases serving speed by about 8.76 mph on average, regardless of the old-racket speed. Players with lower and higher old-racket speeds experience, on average, the same numerical (rather than percentage) increase when using the new racket.

Exam Focus

Typical question patterns:

  • “Do these data provide convincing evidence of a linear relationship? Perform a test for the population slope.”
  • “Given computer output (t value and p-value), state hypotheses and draw a conclusion.”
  • “Decide whether to use a one- or two-sided alternative and justify.”

Common mistakes:

  • Misinterpreting the p-value as the probability the null hypothesis is true.
  • Forgetting to connect the conclusion back to the population mean of y changing with x.
  • Using the wrong tail (two-sided vs one-sided) without context justification.

Reading and Using Regression Computer Output

On many AP questions, you are given regression output from a calculator or software. Your job is to connect the output to inference procedures and interpret results in context.

Common pieces of output

Typical output includes:

  • Coefficients: sample intercept and sample slope.
  • Standard errors: standard errors of intercept and slope.
  • t statistics and p-values: tests of whether each coefficient differs from 0.
  • Residual standard deviation.
  • The coefficient of determination (r-squared): proportion of variation in y explained by the linear model.

In this unit, the inference focus is usually the row corresponding to the explanatory variable (the slope row).

How the slope test appears in output

Most outputs effectively perform:

H_0: \beta_1 = 0

They provide the slope estimate, its standard error, the t statistic, and the p-value. If asked to “show the test statistic,” you can compute:

t = \frac{b_1}{SE_{b_1}}

Interpreting r-squared versus interpreting slope

These answer different questions:

  • Slope: how much the predicted (or mean) y changes per 1 unit of x; units matter.
  • r-squared: what proportion of the variability in y is explained by a linear relationship with x; it is unitless.

A slope can be statistically significant with a small r-squared (real trend but lots of scatter). A high r-squared does not imply causation.
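The "significant slope, small r-squared" situation is easy to manufacture. The simulation below (all parameter choices are illustrative assumptions) builds data with a real slope of 0.5 buried in heavy scatter; the fitted slope is clearly positive while r-squared stays small.

```python
import random

random.seed(4)
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [0.5 * x + random.gauss(0, 5) for x in xs]   # true slope 0.5, lots of noise

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
b1 = sxy / sxx                  # sample slope: real positive trend
b0 = ybar - b1 * xbar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - ybar) ** 2 for y in ys)
r2 = 1 - sse / sst              # proportion of variation in y explained: small
```

With 200 points, a trend this size would typically be statistically significant, yet the linear model explains only a small share of the variation in y.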

Example: interpreting output (fuel efficiency and weight)

Suppose output for predicting fuel efficiency y (mpg) from vehicle weight x (in thousands of pounds) reports: slope estimate -6.8, slope SE 1.1, t -6.18, and p-value less than 0.001.

Slope interpretation: For each additional 1000 pounds of vehicle weight, the model predicts that the population mean fuel efficiency decreases by about 6.8 mpg, on average.

Inference conclusion: Because the p-value is very small, there is convincing evidence that the population slope is negative, meaning heavier vehicles are associated with lower mean fuel efficiency in the population.

Don’t ignore the intercept, but interpret carefully

The intercept is the predicted y when x is 0. Sometimes x equal to 0 is outside the data’s meaningful range (for example, a vehicle weight of 0). In that case, the intercept may not be meaningful to interpret even though it is needed to write the regression equation.

Exam Focus

Typical question patterns:

  • “Identify the slope estimate and interpret it.”
  • “Use the output to test the null hypothesis of slope 0 and state a conclusion.”
  • “Construct a confidence interval for the population slope using output values.”

Common mistakes:

  • Interpreting r-squared as “percent of points on the line” or confusing it with slope.
  • Interpreting the intercept when x equal to 0 is not meaningful.
  • Treating a tiny p-value as evidence of a strong relationship (strength is about effect size and scatter, not just significance).

Communicating Conclusions Responsibly: Scope, Extrapolation, Outliers, and Transformations

Even with perfect mechanics, you can lose points by overclaiming or misinterpreting what regression inference can tell you.

Association versus causation

A significant slope indicates evidence of a linear association in the population.

  • If x was manipulated in a randomized experiment with random assignment, a significant slope can support a causal conclusion (within the experiment’s scope).
  • If the data are observational, you should explicitly avoid causal language because lurking variables may be responsible.

A strong habit is to add a design sentence such as: “Because this was an observational study, we cannot conclude that changing x causes y to change.”

Extrapolation

Regression (and slope inference) is only trustworthy within the range of x values actually observed. Predicting far outside that range is extrapolation, and even a significant slope does not make extrapolated predictions reliable.

Mean-response language

Inference for slope is about the population mean response:

\mu_y

Your interpretation should sound like “the population mean response changes by …,” not “an individual’s response changes by …” Individual outcomes vary; that variability is exactly what residuals measure.

Outliers and influential points

An influential point can change the slope estimate, its standard error, the p-value, and even the direction of a conclusion. When plots show a suspicious point, it is appropriate to warn that inference may not be reliable without further investigation.

Transformations

If the scatterplot or residual plot shows curvature or a fan shape, transforming variables (for example, using a logarithm) can sometimes make the relationship more linear and stabilize variance. You are expected to recognize that slope inference depends on linearity and roughly constant variance, and to suggest that transformations or different models may be needed when those conditions fail.

Exam Focus

Typical question patterns:

  • “Is it reasonable to interpret the slope as cause-and-effect? Explain based on study design.”
  • “Is this an extrapolation? Is the prediction trustworthy?”
  • “How would an influential point affect the slope and inference?”

Common mistakes:

  • Claiming causation from observational data.
  • Forgetting mean-response language.
  • Ignoring clear violations of linearity or constant variance shown in residual plots.

Full AP-Style Inference Write-Up (Putting Everything Together)

On free-response questions, you are graded on communication as well as calculations. A strong solution reads like a structured argument.

Example prompt (generic)

A random sample of n observations is collected to investigate the relationship between an explanatory variable and a response variable. A regression analysis is performed.

Question: “Do the data provide convincing evidence of a linear relationship between x and y in the population? Perform an appropriate test at significance level alpha.”

Step 1: Identify the parameter

State clearly that the parameter of interest is the true slope of the population regression line relating x to the mean of y.

Step 2: State hypotheses

Most commonly:

H_0: \beta_1 = 0

H_a: \beta_1 \ne 0

Use one-sided alternatives only with context justification.

Step 3: Check conditions

Write in context, referencing plots and design:

  • Linear: scatterplot/residual plot shows a roughly linear pattern.
  • Independent: random sample (and 10% condition if sampling without replacement) or randomized experiment.
  • Normal: residuals approximately normal (based on residual histogram/normal plot).
  • Equal variance: residual plot shows roughly constant spread.
  • No extreme outliers/influential points evident (or explicitly address them).

Step 4: Calculate test statistic and p-value

Use:

t = \frac{b_1 - 0}{SE_{b_1}}

with:

df = n - 2

Then obtain the p-value from the t distribution.

Step 5: Decide and conclude in context

Decision rule:

  • If p-value is less than alpha, reject the null hypothesis.
  • Otherwise, fail to reject the null hypothesis.

Conclusion template: “There is (or is not) convincing evidence that the population slope is (positive/negative/different from 0), so there (is/is not) evidence of a linear relationship between x and the population mean of y.” If observational, add a sentence warning against causal claims.
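Steps 4 and 5 can be bundled into a small helper. The sketch below takes the slope estimate, its standard error, n, and a critical value t* supplied from a table or technology (the Python standard library has no t distribution, so the p-value itself is not computed here); the function name and inputs are illustrative.

```python
def slope_test_decision(b1, se_b1, n, t_star):
    """Two-sided t test of H0: beta1 = 0 via the rejection-region approach."""
    t = (b1 - 0) / se_b1          # Step 4: test statistic
    df = n - 2                    # Step 4: degrees of freedom
    reject = abs(t) > t_star      # Step 5: compare to the critical value
    return t, df, reject

# the study-time example from earlier: slope 2.40, SE 0.75, n = 18, t* = 2.12
t, df, reject = slope_test_decision(2.40, 0.75, 18, t_star=2.12)
```

Rejecting via |t| > t* at level alpha matches rejecting via p-value < alpha for a two-sided test.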

What goes wrong most often

  • Conditions are listed without referencing actual graphs.
  • The conclusion talks about the sample slope instead of the population slope.
  • The conclusion is about individuals rather than the population mean.
  • The alternative hypothesis doesn’t match the p-value used (one-sided vs two-sided).

Exam Focus

Typical question patterns:

  • “Write a complete significance test for slope including conditions, calculations, and conclusion.”
  • “Explain what the p-value means in the context of slope.”
  • “Use a confidence interval to support a conclusion about significance.”

Common mistakes:

  • Skipping conditions or checking the wrong plots.
  • Using regression output numbers without identifying what they represent.
  • Writing an interpretation that is not in context or lacks units.

Additional Worked Problem: Using Output to Build Both a Test and an Interval

This example shows how the same regression output can support multiple inference tasks.

Scenario

A researcher studies the relationship between outside temperature x (degrees) and daily electricity usage y (kilowatt-hours) for a random sample of 25 days. Technology gives: slope estimate -1.50 and slope standard error 0.60. Assume plots indicate a roughly linear relationship, residuals are roughly normal, and spread is roughly constant.

(A) Test for a negative linear relationship at alpha 0.05

Hypotheses:

H_0: \beta_1 = 0

H_a: \beta_1 < 0

Test statistic:

t = \frac{-1.50}{0.60} = -2.50

Degrees of freedom:

df = 25 - 2

The p-value for t equal to -2.50 with 23 degrees of freedom is around 0.01 (between 0.01 and 0.02). Since the p-value is less than 0.05, reject the null hypothesis. There is convincing evidence that the population slope is negative; as temperature increases, the population mean daily electricity usage tends to decrease.

(B) Construct and interpret a 95% confidence interval

Critical value for 95% confidence with 23 degrees of freedom:

t^* \approx 2.07

Margin of error:

ME = 2.07(0.60) = 1.242

Interval:

-1.50 \pm 1.242

So the interval is approximately -2.742 to -0.258. Interpretation: You are 95% confident that for each 1-degree increase in temperature, the population mean daily electricity usage decreases by between about 0.258 and 2.742 kilowatt-hours, on average.

Connection check: the interval is entirely negative, so a two-sided test at alpha 0.05 would reject a slope of 0, consistent with the test evidence.
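A quick numeric check of parts (A) and (B), using the values stated above (t* ≈ 2.07 is taken as given):

```python
# values from the scenario: slope -1.50, SE 0.60, n = 25, t* = 2.07 (df = 23)
b1, se_b1, n, t_star = -1.50, 0.60, 25, 2.07

t = (b1 - 0) / se_b1            # part (A): test statistic
df = n - 2
me = t_star * se_b1             # part (B): margin of error
lower, upper = b1 - me, b1 + me

# the interval is entirely negative, agreeing with the one-sided test
```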

Exam Focus

Typical question patterns:

  • “Use the same regression output to do a test and then a confidence interval.”
  • “Show that the test and interval agree about whether 0 is plausible.”
  • “Interpret a negative slope interval correctly in context.”

Common mistakes:

  • Dropping the negative sign when interpreting a negative slope.
  • Treating the confidence interval as a probability statement about the true slope.
  • Mixing up units (temperature units vs usage units).

Interpreting Results in Context: What Your Conclusion Should Sound Like

Inference is ultimately about communication. Two students can compute the same correct statistic, but only one earns full credit if the interpretation is correct and contextual.

Strong slope interpretation (estimate)

When interpreting the sample slope, mention predicted/mean-response language, direction, and units.

Template: “For each 1-unit increase in x, the predicted value of y increases/decreases by about (slope estimate) units of y, on average.”

Strong inference conclusion (test)

Template: “Because the p-value is (less/greater) than alpha, we (reject/fail to reject) the null hypothesis. The data (do/do not) provide convincing evidence that the population slope is (positive/negative/different from 0). Therefore, there (is/is not) evidence of a linear relationship between x and the population mean of y.”

Strong inference interpretation (interval)

Template: “We are C% confident that for each 1-unit increase in x, the population mean of y changes by between (lower) and (upper) units of y, on average.”

A quick but powerful self-check

Before finalizing, ask:

  • Did I refer to the population slope parameter for inference?
  • Did I specify population mean response (not individuals)?
  • Are the variables in the right order?
  • Are the units consistent?

Exam Focus

Typical question patterns:

  • “Interpret the slope / interpret the p-value / interpret the confidence interval.”
  • “Write a conclusion in context that matches the decision.”
  • “Explain the difference between statistical significance and practical importance.”

Common mistakes:

  • Writing a conclusion about individual outcomes rather than the mean response.
  • Forgetting context entirely (answering with symbols only).
  • Confusing “statistically significant” with “strong association” or “large effect.”