Individual
The object described by the data (e.g., a person, product, day, school); also called an observational unit.
Variable
A characteristic recorded for each individual.
One-variable data
Data where each individual contributes one measurement (quantitative) or one category value (categorical).
Categorical (qualitative) variable
A variable whose values are category names or group labels; arithmetic on the values is not meaningful (e.g., blood type, brand, ZIP code).
Quantitative variable
A variable with numerical values that represent a measured or counted quantity; arithmetic on the values is meaningful (e.g., height, number of visits).
Discrete quantitative variable
A quantitative variable that takes a finite or countably infinite set of values with gaps between possible values (often counts).
Continuous quantitative variable
A quantitative variable that can take infinitely many values in an interval (often measurements like height or weight).
Distribution
The pattern of values a variable takes, including what values occur, how often they occur, and overall features (shape, center, spread, unusual features).
Variability
How much data values differ from each other; a central idea in statistics and what distributions are meant to reveal.
Descriptive statistics
Methods for organizing, summarizing, and displaying data (typical values, variability, shape, relative standing).
Inferential statistics
Methods for drawing conclusions about a broader situation (population/process) from limited data (a sample).
Frequency
A count of observations in a category (categorical data) or in a bin/class interval (quantitative data).
Relative frequency
A proportion (or percent) of observations in a category/bin: c/n, where c is the count and n is the total.
Frequency table
A table listing each category (or bin) and its count (frequency).
Relative frequency table
A table listing each category (or bin) and its proportion or percent of the total (relative frequency).
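As a minimal sketch of how a frequency table and a relative frequency table relate, the snippet below counts each category and divides by the total n. The blood-type data and the function name `relative_frequency_table` are hypothetical, chosen only for illustration.

```python
from collections import Counter

def relative_frequency_table(values):
    """Map each category to (count, proportion of the total)."""
    counts = Counter(values)
    n = len(values)
    return {category: (c, c / n) for category, c in counts.items()}

# Hypothetical categorical data: blood types of 10 individuals.
blood_types = ["O", "A", "O", "B", "A", "O", "AB", "O", "A", "O"]
table = relative_frequency_table(blood_types)

for category, (count, proportion) in sorted(table.items()):
    print(f"{category:>2}  count={count}  proportion={proportion:.2f}")
```

The proportions always sum to 1 (or 100%), which is a quick check on any relative frequency table.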
Bar chart
A graph for categorical data that uses separated bars to show counts or relative frequencies for each category.
Dotplot
A quantitative display that places a dot above each data value on a number line (stacking repeats), showing individual values clearly.
Stemplot (stem-and-leaf plot)
A display that splits each value into a stem (leading digits) and a leaf (last digit) to organize data while preserving exact values.
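One possible way to build a stemplot in code is sketched below. It assumes non-negative whole-number data with the last digit as the leaf; the data and the function name `stemplot` are hypothetical.

```python
def stemplot(values):
    """Group sorted values by stem (all but the last digit) and leaf (last digit).

    Assumes non-negative integers. Empty stems are kept so gaps stay visible.
    """
    first_stem = min(values) // 10
    last_stem = max(values) // 10
    stems = {stem: [] for stem in range(first_stem, last_stem + 1)}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return stems

# Hypothetical data set.
data = [21, 12, 38, 15, 50, 21, 34]
plot = stemplot(data)

for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Keeping the empty stem (here, 4) matters because a stemplot should show gaps in the distribution, not hide them.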
Histogram
A quantitative display that groups data into intervals (bins) and uses touching bars to show frequencies or relative frequencies.
Bin width
The size of each histogram interval; changing bin width can make the same data look smoother or more jagged.
Relative frequency histogram
A histogram with the vertical axis showing proportions (frequency ÷ total) instead of counts; the shape matches the frequency histogram.
Cumulative relative frequency plot (ogive)
A graph showing, for each value/class boundary, the proportion of observations at or below that value; useful for estimating medians, quartiles, and other percentiles.
Boxplot (box-and-whisker plot)
A graph based on the five-number summary; the box spans Q1 to Q3 with the median marked, whiskers extend to the most extreme values that are not outliers, and outliers may be plotted separately.
SOCS
A standard way to describe quantitative distributions: Shape, Outliers (and other unusual features), Center, Spread.
Bimodal
A distribution with two distinct peaks (often indicating two clusters or subgroups).
Skewed right
A distribution with a long tail to the right (toward larger values); the mean is often greater than the median.
Skewed left
A distribution with a long tail to the left (toward smaller values); the mean is often less than the median.
Cluster
A region of the distribution where many values are concentrated, suggesting a natural subgroup.
Gap
A noticeable interval in the distribution where no data values occur.
Outlier
An observation unusually far from the rest of the data; may indicate error, a special case, or a different process and can strongly affect mean and SD.
1.5×IQR rule
A method to flag potential outliers: values < Q1 − 1.5(IQR) or > Q3 + 1.5(IQR).
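The rule can be sketched as follows. Quartile conventions differ between textbooks and software; this sketch assumes Q1 and Q3 are the medians of the lower and upper halves of the ordered data (excluding the middle value when n is odd). The data and function names are hypothetical.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    Assumes at least 2 values. The middle value is excluded when n is odd.
    """
    xs = sorted(data)
    half = len(xs) // 2
    return median(xs[:half]), median(xs[-half:])

def flag_outliers(data):
    """Return the values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Hypothetical data: Q1 = 4.5, Q3 = 8.5, IQR = 4, fences at -1.5 and 14.5.
scores = [2, 4, 5, 5, 6, 7, 8, 9, 30]
outliers = flag_outliers(scores)
print(outliers)  # [30]
```

A flagged value is only a potential outlier; whether to keep it depends on context.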
Mean (sample mean, x̄)
The arithmetic average of a sample: x̄ = (1/n)∑x_i; uses all values but is not resistant to outliers/skew.
Population mean (μ)
The mean of an entire population, typically denoted by the Greek letter mu (μ).
Median
The middle value in ordered data (or the average of the two middle values if n is even); resistant to outliers.
Resistant statistic
A numerical summary that is not strongly affected by extreme values (e.g., median and IQR are resistant; mean and SD are not).
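A quick way to see resistance is to add one extreme value and compare how the mean and median respond. The salary figures below are hypothetical (in thousands).

```python
from statistics import mean, median

# Hypothetical salaries, in thousands.
salaries = [40, 45, 50, 55, 60]
with_outlier = salaries + [500]

# Without the extreme value, mean and median agree (both 50).
print(mean(salaries), median(salaries))

# One extreme value pulls the mean to 125 but moves the median only to 52.5.
print(mean(with_outlier), median(with_outlier))
```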
Quartiles (Q1, Q2, Q3)
Values that split ordered data into four roughly equal parts: Q1 ≈ 25th percentile, Q2 = median (50th), Q3 ≈ 75th percentile.
Five-number summary
Minimum, Q1, median, Q3, maximum.
Interquartile range (IQR)
A resistant measure of spread for the middle 50% of the data: IQR = Q3 − Q1.
Variance (sample variance, s^2)
Average squared deviation from the sample mean (using n−1): s^2 = (1/(n−1))∑(x_i − x̄)^2.
Standard deviation (sample, s)
The square root of variance: s = √[(1/(n−1))∑(x_i − x̄)^2]; a typical distance from the mean (not resistant).
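The formulas for s^2 and s can be checked by hand in a few lines. This is a sketch with hypothetical data; the function name `sample_variance` is made up, and the result is compared against Python's built-in `statistics.variance` and `statistics.stdev`, which also divide by n−1.

```python
import math
import statistics

def sample_variance(data):
    """s^2 = sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Hypothetical data: mean is 5, squared deviations sum to 32, so s^2 = 32/7.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(data)
s = math.sqrt(s2)

print(f"s^2 = {s2:.3f}, s = {s:.3f}")
print(f"library check: {statistics.variance(data):.3f}, {statistics.stdev(data):.3f}")
```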
Percentile (percentile rank)
The percent of observations at or below a given value (e.g., 80th percentile means about 80% are at or below).
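Using the "at or below" definition above, a percentile rank can be computed directly by counting. The scores and the function name `percentile_rank` are hypothetical.

```python
def percentile_rank(data, value):
    """Percent of observations in data that are at or below value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

# Hypothetical test scores: 8 of the 10 scores are at or below 90.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank = percentile_rank(scores, 90)
print(rank)  # 80.0
```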
z-score
A standardized value giving the number of standard deviations an observation is from the mean: z = (x−μ)/σ (population) or z = (x−x̄)/s (sample).
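A short sketch of the z-score formula, with hypothetical numbers: a score of 82 in a distribution with mean 70 and standard deviation 8.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical values: (82 - 70) / 8 = 1.5 standard deviations above the mean.
z = z_score(82, mean=70, sd=8)
print(z)  # 1.5
```

The same function works for population values (μ, σ) or sample values (x̄, s); only the inputs change.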
Linear transformation (y = a + bx)
A shift-and-rescale transformation. Mean transforms as ȳ = a + b x̄; standard deviation transforms as s_y = |b| s_x. Shifts change centers but not spreads; rescaling changes both.
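The two transformation rules can be verified numerically. The sketch below uses a hypothetical scoring scheme, y = 100 − 2x (x = questions missed), so that b is negative and the absolute value in the standard deviation rule matters.

```python
import statistics

# Hypothetical data: x = number of questions missed, y = 100 - 2x = test score.
a, b = 100, -2
x = [1, 2, 3, 4, 5]
y = [a + b * xi for xi in x]

mean_x, sd_x = statistics.mean(x), statistics.stdev(x)
mean_y, sd_y = statistics.mean(y), statistics.stdev(y)

# Mean follows the full transformation; SD is scaled by |b| and ignores a.
print(mean_y, a + b * mean_x)
print(round(sd_y, 4), round(abs(b) * sd_x, 4))
```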
Density curve
A smooth model of a distribution where total area under the curve equals 1 and area over an interval represents a proportion.
Normal distribution (N(μ, σ))
A bell-shaped, symmetric density model determined by mean μ and standard deviation σ; mean = median = mode in a perfect Normal model.
Empirical Rule (68–95–99.7)
For (approximately) Normal data: about 68% within μ±σ, 95% within μ±2σ, and 99.7% within μ±3σ.
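The 68–95–99.7 percentages are rounded areas under the Normal curve. As a check, the sketch below computes them with `statistics.NormalDist` from Python's standard library (Python 3.8+); the variable names are arbitrary.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```

The computed areas are approximately 0.6827, 0.9545, and 0.9973, which round to the familiar 68%, 95%, and 99.7%.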
Standard Normal distribution (N(0,1))
The Normal distribution with mean 0 and standard deviation 1; often denoted Z. Any Normal X can be standardized to Z using z-scores.
Normal probability plot (Normal quantile plot)
A plot of ordered data against expected Normal quantiles; points near a straight line suggest a Normal model is reasonable, curves suggest non-Normality.
normalcdf
Calculator/technology command that finds a Normal probability (area under the Normal curve) for an interval, e.g., P(a ≤ X ≤ b).
invNorm
Calculator/technology command that finds a Normal percentile (cutoff value x) for a given left-tail probability (area to the left of x); equivalently, x = μ + zσ, where z is the standard Normal value with that left-tail area.
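For readers without a graphing calculator, the two commands can be approximated with `statistics.NormalDist` from Python's standard library (Python 3.8+). This is a sketch, not the calculator's implementation: the function names `normalcdf` and `inv_norm` are stand-ins, and the N(64, 2.5) height model is hypothetical.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """P(lower <= X <= upper) for X ~ N(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

def inv_norm(area, mu=0.0, sigma=1.0):
    """Cutoff x with the given left-tail area under N(mu, sigma)."""
    return NormalDist(mu, sigma).inv_cdf(area)

# Hypothetical model: heights ~ N(64, 2.5), in inches.
p = normalcdf(60, 68, mu=64, sigma=2.5)   # area between z = -1.6 and z = 1.6
cutoff = inv_norm(0.90, mu=64, sigma=2.5)  # 90th percentile

print(f"P(60 <= X <= 68) = {p:.4f}")
print(f"90th percentile = {cutoff:.2f}")
```

The cutoff agrees with x = μ + zσ, using z ≈ 1.2816 for a left-tail area of 0.90.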