Inference
Using sample data to make a justified claim about a population parameter, accounting for sampling variability (uncertainty).
Parameter
A numerical characteristic of a population (e.g., the population mean μ).
Statistic
A numerical characteristic computed from a sample (e.g., the sample mean x̄), used to estimate a parameter.
Population Mean (μ)
The true average of a quantitative variable for the entire population of interest.
Sample Mean (x̄)
The average of the sample data; a point estimate of the population mean μ.
Population Standard Deviation (σ)
The true standard deviation of the population; usually unknown in real applications.
Sample Standard Deviation (s)
The standard deviation computed from sample data; used to estimate the unknown population standard deviation σ.
Standard Error (SE)
An estimate of the standard deviation of a statistic; for a one-sample mean, SE = s/√n.
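To make the SE formula concrete, here is a minimal Python sketch using a made-up sample (the data values are purely illustrative):

```python
import math

# Hypothetical sample of n = 16 measurements (illustrative values only)
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0,
          5.1, 4.9, 5.2, 5.0, 4.8, 5.1, 5.0, 4.9]

n = len(sample)
xbar = sum(sample) / n                                          # sample mean x̄
s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))   # sample SD s (divide by n − 1)
se = s / math.sqrt(n)                                           # standard error SE = s/√n
```

Note the n − 1 divisor in s, matching the sample standard deviation definition above.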
z Statistic
Standardized statistic for a mean when σ is known: z = (x̄ − μ)/(σ/√n), which follows the standard normal model under conditions.
t Statistic
Standardized statistic for a mean when σ is unknown: t = (x̄ − μ)/(s/√n).
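A quick sketch of the t statistic computed from summary statistics; the numbers here are hypothetical, chosen only to illustrate the formula:

```python
import math

# Hypothetical one-sample summaries (illustrative values)
xbar, s, n = 5.02, 0.18, 16   # sample mean, sample SD, sample size
mu0 = 5.00                    # hypothesized population mean under H0

t = (xbar - mu0) / (s / math.sqrt(n))   # t = (x̄ − μ0)/(s/√n)
```

Replacing s with a known σ in the denominator gives the z statistic instead.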
Student’s t Distribution
A family of symmetric, bell-shaped distributions centered at 0 with heavier tails than the normal; used for inference about means when σ is unknown.
Degrees of Freedom (df)
A parameter that determines the exact shape/spread of a t distribution; smaller df gives heavier tails.
One-Sample Degrees of Freedom (n − 1)
For one-sample t procedures, df = n − 1.
Heavier Tails
A distribution feature giving more probability to extreme values; t distributions have heavier tails than normal because s varies from sample to sample.
Critical Value (t*)
The cutoff from a t distribution used to create a confidence interval, based on confidence level and df.
Confidence Interval
An interval of plausible values for a population parameter, computed from sample data.
Confidence Level (C%)
The long-run success rate of the interval method: about C% of intervals from repeated random samples would capture the true parameter.
Margin of Error (ME)
How far the confidence interval extends from the point estimate; for a one-sample mean, ME = t*·(s/√n).
One-Sample t Confidence Interval
Interval for a population mean μ (σ unknown): x̄ ± t*·(s/√n), with df = n − 1.
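Putting the margin-of-error and interval formulas together, a minimal sketch with hypothetical summary statistics (the t* value 2.131 is the standard table entry for 95% confidence with df = 15):

```python
import math

# Hypothetical summaries (illustrative values); df = n − 1 = 15
xbar, s, n = 5.02, 0.18, 16
t_star = 2.131                      # t* for 95% confidence, df = 15 (t table)

me = t_star * (s / math.sqrt(n))    # margin of error ME = t*·(s/√n)
ci = (xbar - me, xbar + me)         # confidence interval x̄ ± ME
```

A higher confidence level or smaller n (smaller df) gives a larger t*, hence a wider interval.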
Significance Test
A procedure that uses sample data to evaluate evidence against a claim about a population parameter (e.g., a claim about μ).
Null Hypothesis (H0)
The default claim tested, stated with equality (e.g., H0: μ = μ0).
Alternative Hypothesis (Ha)
The competing claim you seek evidence for (e.g., Ha: μ ≠ μ0, μ > μ0, or μ < μ0).
Two-Sided Alternative (μ ≠ μ0)
An alternative hypothesis looking for a difference in either direction; leads to a two-sided p-value.
One-Sided Alternative (μ > μ0 or μ < μ0)
An alternative hypothesis specifying a direction; p-value is computed in the corresponding tail.
p-Value
Assuming H0 is true, the probability of getting a test statistic at least as extreme as the observed one (in the direction(s) of Ha).
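As an illustration of "area in the tail(s)," the sketch below computes p-values for a z statistic (σ known), since the standard normal CDF is available in Python's standard library; a t-based p-value works the same way but needs a t distribution CDF. The z value is hypothetical:

```python
from statistics import NormalDist

# Hypothetical observed z statistic (illustrative)
z = 2.05

p_upper = 1 - NormalDist().cdf(z)                  # one tail, for Ha: μ > μ0
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))   # both tails, for Ha: μ ≠ μ0
```

The two-sided p-value doubles the tail area because "at least as extreme" includes both directions.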
Significance Level (α)
The cutoff probability for deciding statistical significance (commonly 0.05); compared to the p-value.
Reject H0
Decision when p ≤ α; conclude the data provide convincing evidence against H0 (supporting Ha).
Fail to Reject H0
Decision when p > α; conclude the data do not provide convincing evidence against H0 (not the same as “accept H0”).
Type I Error
Rejecting a true null hypothesis (a “false positive”).
Type II Error
Failing to reject a false null hypothesis (a “false negative”).
Power (1 − β)
The probability of rejecting H0 when H0 is false; power = 1 − (probability of Type II error).
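Power can be estimated by simulation: draw many samples from a specific alternative and count how often the test rejects H0. A sketch for a one-sided z test with made-up parameters (μ true = 0.5, σ = 1 known, n = 25, α = 0.05):

```python
import random
import statistics

random.seed(3)  # reproducible illustration

# Hypothetical setup: H0: μ = 0 vs Ha: μ > 0, with σ = 1 known
mu_true, sigma, n, alpha = 0.5, 1.0, 25, 0.05
z_crit = statistics.NormalDist().inv_cdf(1 - alpha)   # reject H0 when z > z_crit

reps, rejections = 2000, 0
for _ in range(reps):
    xbar = statistics.fmean(random.gauss(mu_true, sigma) for _ in range(n))
    z = (xbar - 0) / (sigma / n ** 0.5)
    if z > z_crit:
        rejections += 1

power_hat = rejections / reps   # estimated power ≈ P(reject H0 | μ = 0.5)
```

The estimate lands near 0.80 here; larger n, larger α, or a bigger true effect all raise power.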
Random Condition
Requirement that data come from a random sample or randomized experiment; supports generalization (random sampling) or cause-and-effect (random assignment).
Independence
Condition that observations do not influence each other; often supported by sampling design and the 10% condition when sampling without replacement.
10% Condition
When sampling without replacement, require n ≤ 0.10(population size) to justify independence.
Normal/Large Sample Condition
For t procedures, the sampling distribution is approximately normal if the population is roughly normal or n is large enough for CLT; watch for strong skewness/outliers.
Central Limit Theorem (CLT)
For sufficiently large n, the sampling distribution of x̄ is approximately normal even if the population is not (often summarized by n ≥ 30, but outliers/skewness can still matter).
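The CLT can be checked directly by simulation: sample repeatedly from a clearly non-normal population and look at the distribution of x̄. A sketch using an exponential population (mean 1, σ = 1), with parameters chosen for illustration:

```python
import random
import statistics

random.seed(1)  # reproducible illustration

# Non-normal population: exponential with mean 1 (so σ = 1)
n, reps = 40, 2000
means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

# CLT predicts the x̄'s center near μ = 1 with SD near σ/√n ≈ 0.158
center = statistics.fmean(means)
spread = statistics.stdev(means)
```

Even though each exponential draw is strongly right-skewed, the 2000 sample means are approximately normal with the predicted center and spread.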
Outlier
An extreme data value that can strongly affect x̄ and s, especially when n is small, potentially undermining t procedures.
Robustness
The idea that t procedures often work reasonably well even when normality is not perfect, especially for moderate/large n without extreme outliers.
Statistical Significance
A result considered unlikely under H0 (typically p ≤ α); indicates evidence against H0, not necessarily a large or important effect.
Practical Importance
Whether an effect size is meaningful in context; a result can be statistically significant but practically trivial (especially with large n).
Two-Sample t Procedures
Inference methods for comparing means of two independent groups using x̄1 − x̄2 and an unpooled (Welch) standard error when σ1 and σ2 are unknown.
Parameter (μ1 − μ2)
The true difference in population means between group 1 and group 2 (often defined as population 1 minus population 2).
Statistic (x̄1 − x̄2)
The observed difference between sample means, used to estimate μ1 − μ2.
Two-Sample Standard Error
Estimated SD of x̄1 − x̄2 when σ’s are unknown: SE = √(s1²/n1 + s2²/n2).
Welch–Satterthwaite Approximation
A formula, usually evaluated by technology, that approximates the df for two-sample t procedures; the df depends on s1, s2, n1, and n2 and is generally not a whole number.
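The two-sample SE and the Welch–Satterthwaite df are straightforward to compute once the summaries are known. A sketch with hypothetical group summaries:

```python
import math

# Hypothetical two-sample summaries (illustrative values)
s1, n1 = 2.4, 25
s2, n2 = 3.1, 30

v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
se = math.sqrt(v1 + v2)   # SE = √(s1²/n1 + s2²/n2)

# Welch–Satterthwaite df approximation
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
```

The resulting df always falls between min(n1 − 1, n2 − 1) and n1 + n2 − 2, which is why the conservative hand-calculation shortcut uses the smaller of n1 − 1 and n2 − 1.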
Paired (Matched-Pairs) Data
Data where observations come in natural pairs (e.g., before/after on the same person); the two measurements within a pair are not independent.
Difference Variable (d)
For paired data, compute one value per pair: d = (first measurement) − (second measurement); inference is then done on the d’s.
Mean of Differences (μd)
The population mean of the paired differences; the parameter for paired t procedures.
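A paired t analysis reduces to a one-sample t on the differences. A sketch with made-up before/after scores for eight subjects:

```python
import math

# Hypothetical before/after scores for 8 subjects (illustrative values)
before = [72, 68, 75, 80, 66, 71, 78, 69]
after = [75, 70, 74, 84, 70, 73, 82, 71]

d = [a - b for a, b in zip(after, before)]   # one difference per pair
n = len(d)
dbar = sum(d) / n                            # mean difference (estimates μd)
sd = math.sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))
t = dbar / (sd / math.sqrt(n))               # one-sample t on the d's, testing H0: μd = 0
```

Note that inference uses only the n = 8 differences, with df = n − 1 = 7, not the 16 raw measurements.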
Simulation-Based p-Value
An estimated p-value found by simulating the null model many times and counting the proportion of simulated statistics at least as extreme as the observed statistic.
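One common simulation scheme (a sketch, not the only valid null model): recenter the observed sample at μ0 so H0 holds, resample with replacement many times, and count how often the simulated mean is at least as far from μ0 as the observed mean. All data values here are hypothetical:

```python
import random
import statistics

random.seed(2)  # reproducible illustration

# Hypothetical observed sample; H0: population mean equals mu0
sample = [5.3, 5.1, 5.4, 5.0, 5.6, 5.2, 5.5, 5.1, 5.3, 5.4]
mu0 = 5.0
obs = statistics.fmean(sample) - mu0   # observed distance from mu0

# Null model: shift the sample so its mean is mu0, then resample
recentered = [x - statistics.fmean(sample) + mu0 for x in sample]
reps, count = 5000, 0
for _ in range(reps):
    sim = [random.choice(recentered) for _ in range(len(sample))]
    if abs(statistics.fmean(sim) - mu0) >= abs(obs):   # at least as extreme (two-sided)
        count += 1

p_hat = count / reps   # estimated p-value
```

Here the observed mean is so far from μ0 that essentially no simulated statistic matches it, giving an estimated p-value near 0.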
Median Absolute Deviation (MAD)
A variability measure: the median of the absolute deviations from the median; can be used with simulations to assess unusual variability.
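The MAD is two nested medians, which makes it resistant to outliers. A sketch on an illustrative data set containing one extreme value:

```python
import statistics

# Illustrative data with one outlier (42)
data = [4, 5, 5, 6, 7, 7, 8, 42]

med = statistics.median(data)                          # median of the data
mad = statistics.median(abs(x - med) for x in data)    # median of absolute deviations
```

The outlier barely moves the MAD, whereas it would inflate the standard deviation dramatically; that resistance is what makes the MAD useful as a simulation statistic for variability.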