Distribution
The pattern of values a variable takes and how often each value occurs; it can be shown with a list, frequency table, histogram, dotplot, or boxplot.
Mean
The average of a data set, found by adding all values and dividing by the number of values, n; it is sensitive to outliers.
Median
The middle value in an ordered data set; if there is an even number of values, it is the average of the two middle values. It is resistant to outliers.
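A small sketch of how one outlier affects the mean and the median, using Python's standard `statistics` module. The data values are made up for illustration.

```python
from statistics import mean, median

# Hypothetical data set, then the same data with one extreme value added.
data = [2, 3, 4, 5, 6]
with_outlier = data + [100]

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # mean 4, median 4
print(mean_after, median_after)    # mean 20, median 4.5
```

The single outlier pulls the mean from 4 to 20, while the median only moves from 4 to 4.5.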
Interquartile Range (IQR)
A measure of spread for the middle 50% of the data, found by IQR = Q3 - Q1; it is resistant to outliers.
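A minimal sketch of IQR = Q3 - Q1. Quartile conventions differ between textbooks and software; this one assumes Q1 and Q3 are the medians of the lower and upper halves of the ordered data. The helper name `iqr` and the data are hypothetical.

```python
from statistics import median

def iqr(values):
    """Return Q3 - Q1, taking Q1 and Q3 as medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2  # the middle value is left out when n is odd
    q1 = median(ordered[:half])
    q3 = median(ordered[len(ordered) - half:])
    return q3 - q1

# Hypothetical data; the helper needs at least two values.
data = [1, 3, 4, 6, 7, 9, 12, 15]
print(iqr(data))  # Q1 = 3.5, Q3 = 10.5, so the IQR is 7.0

# Replacing the largest value with an extreme one leaves the IQR unchanged.
print(iqr([1, 3, 4, 6, 7, 9, 12, 1000]))  # still 7.0
```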
Standard Deviation
A measure of the typical distance of data values from the mean; a larger standard deviation means more spread, and a smaller one means values cluster near the mean.
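To compare a spread-out data set with a clustered one, here is a short sketch using `statistics.pstdev`, which computes the population standard deviation (dividing by n). The card does not say which version it means; `statistics.stdev` gives the sample version, which divides by n - 1. Both data sets are made up.

```python
from statistics import mean, pstdev

# Two hypothetical data sets with the same mean (5) but different spread.
spread_out = [2, 4, 4, 4, 5, 5, 7, 9]
clustered = [4, 5, 5, 5, 5, 5, 5, 6]

print(mean(spread_out), pstdev(spread_out))  # mean 5, standard deviation 2.0
print(mean(clustered), pstdev(clustered))    # mean 5, standard deviation 0.5
```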
Skewed Right vs. Skewed Left
Skewed right means the right tail is longer and usually mean > median. Skewed left means the left tail is longer and usually mean < median.
Boxplot
A graph that shows the minimum, Q1, median, Q3, and maximum of a data set; the length of the box is the IQR.
Population vs. Sample
A population is the entire group of interest, while a sample is the subset of that population that is actually measured.
Parameter vs. Statistic
A parameter describes a population, such as the population mean mu, while a statistic describes a sample, such as the sample mean x-bar.
Simple Random Sample (SRS)
A sampling method in which every possible sample of size n has an equal chance of being selected, so every individual in the population also has an equal chance of being chosen.
Stratified Sample
A sample made by dividing the population into important groups, called strata, and then randomly sampling within each group.
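A rough sketch of the difference between a simple random sample and a stratified sample, using `random.sample` from the standard library. The population, the strata ("junior" and "senior"), and the sample sizes are all hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 juniors and 40 seniors.
population = [("junior", i) for i in range(60)] + [("senior", i) for i in range(40)]

# Simple random sample: draw 10 individuals from the whole population at once.
srs = rng.sample(population, 10)

# Stratified sample: split into strata, then sample randomly within each stratum.
strata = {}
for grade, student in population:
    strata.setdefault(grade, []).append((grade, student))
stratified = rng.sample(strata["junior"], 6) + rng.sample(strata["senior"], 4)

print(sum(1 for grade, _ in srs if grade == "junior"))         # varies from draw to draw
print(sum(1 for grade, _ in stratified if grade == "junior"))  # always 6
```

The stratified sample always contains 6 juniors and 4 seniors, while the split in the simple random sample is left to chance.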
Convenience Sample
A sample chosen because it is easy to reach; it is often biased and may not represent the population well.
Bias
A systematic error that pushes results away from the truth; common types include selection bias, nonresponse bias, response bias, and wording bias.
Observational Study vs. Experiment
An observational study records outcomes without assigning treatments and can show association. An experiment assigns treatments and can support causal conclusions when well designed.
Random Sample vs. Random Assignment
Random sampling helps you generalize results to a population, while random assignment helps create comparable groups and supports cause-and-effect conclusions in an experiment.
Confounding Variable
A variable related to both the explanatory variable and the response variable that can create a misleading relationship between them.
Scatterplot
A graph of paired (x, y) data used to study the relationship between two variables, including direction, form, strength, and outliers.
Correlation
A number r between -1 and 1 that measures the strength and direction of a linear relationship between two variables; it does not imply causation.
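The card does not give a formula for r. As a sketch, the usual Pearson formula can be computed by hand as below; the helper name `correlation` and the data are hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r for paired data of equal length."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired data.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # perfectly linear, increasing
falling = [10, 8, 6, 4, 2]   # perfectly linear, decreasing

print(correlation(hours, rising))   # close to 1
print(correlation(hours, falling))  # close to -1
```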
Linear Regression Model
A best-fit line used to predict y from x, written as y-hat = mx + b, where y-hat is the predicted value of y, m is the slope, and b is the y-intercept.
Residual
The difference between an actual value and a predicted value from a model, found by residual = y - y-hat.
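A minimal sketch that fits a line and computes residuals, assuming "best-fit" means the least-squares line (the cards do not name the fitting method). The helper name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return the least-squares slope m and intercept b for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Hypothetical paired data.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

m, b = fit_line(xs, ys)
predicted = [m * x + b for x in xs]                 # y-hat for each x
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]  # actual minus predicted

print(m, b)
print(residuals)
```

For this data the slope is about 1.9 and the intercept about 0, giving residuals of roughly 0.1, 0.2, -0.7, and 0.4. A positive residual means the actual value lies above the line.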
Interpolation vs. Extrapolation
Interpolation predicts within the observed range of x-values, while extrapolation predicts beyond the observed range and is riskier because the pattern may change.
Conditional Probability
The probability of event B given that event A has happened, found by P(B|A) = P(A and B) / P(A), where P(A) > 0; the denominator counts only the outcomes in which A occurred.
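A worked sketch of the conditional probability formula with exact fractions. The counts and the events (A: plays a sport, B: is in band) are hypothetical.

```python
from fractions import Fraction

# Hypothetical counts for 100 students.
total = 100
in_a = 40        # event A: plays a sport
in_a_and_b = 10  # events A and B together: plays a sport and is in band

p_a = Fraction(in_a, total)
p_a_and_b = Fraction(in_a_and_b, total)

# P(B|A) = P(A and B) / P(A)
p_b_given_a = p_a_and_b / p_a

print(p_b_given_a)  # 1/4
```

The same answer comes from counting within the given group directly: 10 of the 40 students who play a sport are in band.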
Permutation vs. Combination
A permutation counts selections where order matters, while a combination counts selections where order does not matter.
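As an illustration, choosing 2 letters from 4 can be listed with `itertools` and counted with `math.perm` and `math.comb` (both in the standard library since Python 3.8). The letters are arbitrary.

```python
from itertools import combinations, permutations
from math import comb, perm

letters = "ABCD"

# Order matters: ("A", "B") and ("B", "A") are counted separately.
ordered = list(permutations(letters, 2))

# Order does not matter: ("A", "B") is counted once.
unordered = list(combinations(letters, 2))

print(len(ordered), perm(4, 2))    # 12 12
print(len(unordered), comb(4, 2))  # 6 6
```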
z-score
A standardized value that tells how many standard deviations a data point is from the mean, found by z = (x - mu) / sigma.
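A short sketch of the z-score formula. The helper name `z_score` and the numbers (mean 70, standard deviation 10) are hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical test scores with mean 70 and standard deviation 10.
print(z_score(85, 70, 10))  # 1.5, so 85 is 1.5 standard deviations above the mean
print(z_score(55, 70, 10))  # -1.5, so 55 is 1.5 standard deviations below the mean
```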
Empirical Rule (68-95-99.7)
For a normal distribution, about 68% of data lie within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
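The three percentages can be checked against the standard normal curve with `statistics.NormalDist` from the standard library. This is a quick verification sketch, not a derivation.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution within k standard deviations of the mean.
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(k, round(share, 4))
```

The shares come out to roughly 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.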