10% Condition
A guideline stating that when sampling without replacement, the sample size should be no more than 10% of the population size for the observations to be treated as approximately independent.
Similar definitions: 10% rule
Example: "The 10% condition was satisfied because the sample of 200 was less than 10% of the 5,000-student school."
Addition Rule
For any two events, P(A or B) = P(A) + P(B) − P(A and B). For mutually exclusive events, P(A or B) = P(A) + P(B).
Similar definitions: sum rule
Example: "The addition rule was applied to find the probability that a card drawn was a heart or a face card."
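The card example can be worked through with exact fractions. This is a minimal sketch that assumes a standard 52-card deck and counts jack, queen, and king as the face cards; the variable names are illustrative only.

```python
from fractions import Fraction

# Assumed standard deck: 13 hearts, 12 face cards (J, Q, K in each of 4 suits),
# and 3 cards that are both hearts and face cards.
p_heart = Fraction(13, 52)
p_face = Fraction(12, 52)
p_heart_and_face = Fraction(3, 52)

# General addition rule: subtract the overlap so it is not counted twice.
p_heart_or_face = p_heart + p_face - p_heart_and_face
print(p_heart_or_face)  # 11/26
```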
Alternative Hypothesis
The hypothesis that contradicts the null hypothesis and represents what the researcher is trying to find evidence for. It can be one-sided or two-sided.
Similar definitions: research hypothesis, Hₐ
Example: "The alternative hypothesis stated that students using the new software would score higher than those using traditional methods."
Bar Chart
A graphical display using rectangular bars to show the frequency or relative frequency of each category in a categorical variable.
Similar definitions: bar graph
Example: "The bar chart displayed the number of students in each grade, with each grade represented by a separate bar."
Bias
A systematic error in a study design or estimation method that causes a statistic to consistently overestimate or underestimate the true population parameter.
Similar definitions: systematic error
Example: "Because the sample excluded lower-income households, the estimate of average income contained bias."
Binomial Distribution
A probability distribution modeling the number of successes in a fixed number of independent trials, each with the same probability of success.
Similar definitions: binomial model
Example: "The number of heads in 10 coin flips follows a binomial distribution with n = 10 and p = 0.5."
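As a rough illustration of the coin-flip example, the sketch below computes a binomial probability directly from the counting formula C(n, k) p^k (1 − p)^(n − k). The function name `binomial_pmf` is a hypothetical helper, not something from the flashcard set.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 5 heads in 10 fair coin flips.
p_five_heads = binomial_pmf(5, 10, 0.5)
print(p_five_heads)  # 0.24609375
```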
Blinding
An experimental design technique in which subjects, researchers, or both are unaware of which treatment a subject is receiving, in order to reduce bias in measurement.
Similar definitions: masking
Example: "The study used double blinding so that neither patients nor evaluators knew who received the real drug."
Blocking
A technique in experimental design that groups similar experimental subjects together into blocks before random assignment, in order to control for variation from a known source.
Similar definitions: block design
Example: "The researcher used blocking by separating male and female participants before randomly assigning treatments."
Boxplot
A graphical display showing the five-number summary (minimum, Q1, median, Q3, maximum) of a dataset, useful for comparing distributions and identifying outliers.
Similar definitions: box-and-whisker plot
Example: "Comparing boxplots for the two classes revealed that Class A had a higher median but greater spread."
Categorical Data
Data that consists of distinct groups or categories such as gender, race, or political affiliation. Categorical variables can be nominal (no inherent order) or ordinal (with a natural ranking).
Similar definitions: qualitative data
Example: "The researcher recorded favorite music genre as categorical data because the values were labels rather than numbers."
Census
A study that attempts to collect data from every member of the population rather than from a sample.
Similar definitions: complete enumeration
Example: "The national census counted every resident in the country to determine population totals."
Central Limit Theorem
A theorem stating that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution, provided n is sufficiently large.
Similar definitions: CLT
Example: "Because of the Central Limit Theorem, the distribution of sample means from a skewed population became approximately normal when n = 50."
Chi-Square Goodness of Fit Test
A hypothesis test that compares the observed frequencies of a single categorical variable to expected frequencies based on a hypothesized distribution.
Similar definitions: GoF test
Example: "A chi-square goodness of fit test was used to determine whether the observed counts of M&M colors matched the company's stated distribution."
Chi-Square Statistic
A test statistic calculated as χ² = Σ [(O − E)² / E], summed over all cells or categories, where O is the observed frequency and E is the expected frequency for each cell or category.
Similar definitions: χ² statistic
Example: "The chi-square statistic of 12.4 with 3 degrees of freedom produced a p-value small enough to reject the null hypothesis."
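The formula can be sketched in a few lines. The observed and expected counts below are made up purely for illustration (four categories, so 3 degrees of freedom in a goodness-of-fit setting), and `chi_square_statistic` is a hypothetical helper name.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells or categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for four categories (not from any real study).
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]

stat = chi_square_statistic(observed, expected)
print(round(stat, 2))  # 12.32
```

Each term is divided by its own expected count before summing, which is why a single badly fitting category (here the last one) can dominate the statistic.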
Chi-Square Test of Homogeneity
A hypothesis test that compares the distribution of a categorical variable across two or more distinct populations to determine if the distributions are the same.
Similar definitions: test of equal distributions
Example: "The chi-square test of homogeneity was used to compare the political party preferences of voters in three different states."
Chi-Square Test of Independence
A hypothesis test that determines whether two categorical variables measured on the same group of individuals are associated or independent.
Similar definitions: test of association
Example: "The chi-square test of independence revealed a statistically significant association between gender and preferred study method."
Coefficient of Determination
A value from 0 to 1 that measures the proportion of variation in the response variable that is explained by the linear regression model with the explanatory variable.
Similar definitions: r-squared, R²
Example: "An R² of 0.81 meant that 81% of the variation in test scores was explained by hours of study."
Complement
The set of all outcomes in the sample space that are NOT in the event of interest. The probability of the complement equals 1 minus the probability of the event.
Similar definitions: complementary event
Example: "The complement of rolling a 6 is rolling any number from 1 to 5, with probability 5/6."
Conditional Distribution
The distribution of one variable in a two-way table restricted to a particular value or category of the other variable.
Similar definitions: conditional proportion
Example: "The conditional distribution of subject preference among females showed that 60% preferred science over humanities."
Conditional Probability
The probability that an event A occurs given that another event B has already occurred, written as P(A|B).
Similar definitions: given probability
Example: "The conditional probability of drawing a king given that the first card drawn was an ace was calculated using the reduced sample space."
Confidence Interval
A range of plausible values for a population parameter, constructed from sample data at a specified confidence level (e.g., 95%), meaning that in repeated sampling the interval would capture the true parameter that percentage of the time.
Similar definitions: interval estimate
Example: "The 95% confidence interval for the mean was (42.1, 47.9), suggesting the true population mean was likely between those values."
Confidence Level
The long-run percentage of confidence intervals, constructed using the same method from repeated random samples, that would contain the true population parameter. Common levels are 90%, 95%, and 99%.
Similar definitions: confidence percentage
Example: "A confidence level of 95% means that if the sampling procedure were repeated many times, 95% of the resulting intervals would contain the true parameter."
Confounding Variable
An extraneous variable that is associated with both the explanatory and response variables, making it difficult to determine the true cause-and-effect relationship.
Similar definitions: confounder
Example: "Ice cream sales and drowning rates both increase in summer, making temperature a confounding variable that explains the apparent association."
Continuous Random Variable
A random variable that can take on any value within an interval or range, with no gaps between possible values.
Similar definitions: continuous variable
Example: "The exact weight of a randomly selected apple is a continuous random variable because it can take any value within a range."
Control Group
The group in an experiment that does not receive the treatment, serving as a baseline for comparison with the treatment group.
Similar definitions: baseline group, comparison group
Example: "The control group received a sugar pill so that any difference in outcomes could be attributed to the actual drug."
Convenience Sample
A non-random sample consisting of individuals who are easiest to reach or most readily available, which typically produces biased results.
Similar definitions: haphazard sample
Example: "Asking only students in the front row for opinions is an example of a convenience sample because it excludes most of the class."
Correlation Coefficient
A numerical value ranging from -1 to 1 that measures the strength and direction of the linear association between two quantitative variables. Values near ±1 indicate a strong relationship; values near 0 indicate a weak or no linear relationship.
Similar definitions: Pearson r, r-value
Example: "With a correlation coefficient of 0.92, there was a strong positive linear relationship between study hours and exam scores."
Critical Value
The value from a reference distribution (z* or t*) that corresponds to the desired confidence level or significance level, used to construct confidence intervals or define rejection regions.
Similar definitions: z*, t*
Example: "For a 95% confidence interval using the normal distribution, the critical value is z* = 1.96."
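The z* = 1.96 figure can be recovered from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; `z_critical` is a hypothetical helper name, and the approach covers z* only, not t*.

```python
from statistics import NormalDist

def z_critical(confidence_level: float) -> float:
    """z* that leaves (1 - confidence_level) / 2 in each tail."""
    tail_area = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail_area)

z_star = z_critical(0.95)
print(round(z_star, 2))  # 1.96
```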
Degrees of Freedom
A parameter controlling the shape of the t-distribution or chi-square distribution. For a one-sample t-test, degrees of freedom equal n − 1.
Similar definitions: df
Example: "With a sample size of 20, the degrees of freedom for the one-sample t-test was 19."
Discrete Random Variable
A random variable that can take on a countable number of distinct values, typically whole numbers.
Similar definitions: discrete variable
Example: "The number of customers entering a store each hour is a discrete random variable because it can only be a whole number."
Dot Plot
A simple graph that places a dot above a number line for each data value, useful for showing the shape, center, and spread of small datasets.
Similar definitions: dot chart
Example: "The dot plot made it easy to see that most students scored between 75 and 85."
Double-Blind Experiment
An experiment in which neither the subjects nor the researchers measuring outcomes know which treatment each subject is receiving, which reduces bias from both subjects' and researchers' expectations.
Similar definitions: double-masked study
Example: "The clinical trial was a double-blind experiment because neither the patients nor the doctors evaluating them knew who received the active drug."
Empirical Rule
A rule stating that for a normal distribution, approximately 68% of data falls within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
Similar definitions: 68-95-99.7 rule, three-sigma rule
Example: "Using the empirical rule, a teacher predicted that about 95% of students would score within 2 standard deviations of the mean."
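The 68-95-99.7 percentages are rounded areas under the normal curve. As a quick check, the sketch below computes them from the standard normal CDF using the standard library; the `areas` dictionary is just an illustrative name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area within k standard deviations of the mean, for k = 1, 2, 3.
areas = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, area in areas.items():
    print(k, round(area, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```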
Event
A subset of the sample space containing one or more outcomes of interest in a probability experiment.
Similar definitions: outcome set
Example: "Getting an even number when rolling a die is an event consisting of the outcomes {2, 4, 6}."
Expected Frequency
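The count predicted for a cell in a chi-square test under the null hypothesis. For a two-way table (tests of independence or homogeneity), it is calculated as (row total × column total) / grand total; for a goodness of fit test, it is the sample size multiplied by the hypothesized proportion for that category.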
Similar definitions: expected count
Example: "The expected frequency for the cell representing female students who preferred math was 24."
Expected Value
The long-run average value of a random variable over many repetitions of the random process, calculated as the sum of each value multiplied by its probability.
Similar definitions: mean of a random variable, E(X)
Example: "The expected value of the number of heads in 4 coin flips is 2."
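The coin-flip figure can be verified by building the distribution and summing value × probability. This is a small sketch assuming a fair coin and independent flips; it uses exact fractions so the result is not affected by rounding.

```python
from fractions import Fraction
from math import comb

n = 4                 # number of flips
p = Fraction(1, 2)    # assumed probability of heads on each flip

# Probability of k heads for k = 0..4 (binomial probabilities).
distribution = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# Expected value: sum of each value multiplied by its probability.
expected_heads = sum(k * prob for k, prob in distribution.items())
print(expected_heads)  # 2
```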
Experiment
A study in which researchers deliberately impose a treatment on subjects and observe the response in order to establish cause-and-effect relationships.
Similar definitions: controlled experiment, randomized experiment
Example: "In the experiment, the researchers randomly assigned subjects to receive either the new drug or a placebo."
Explanatory Variable
The input variable (plotted on the x-axis) that is used to explain or predict changes in the response variable. Also called the independent variable.
Similar definitions: independent variable, predictor variable
Example: "In the study of fertilizer and plant growth, the amount of fertilizer applied was the explanatory variable."
Extrapolation
Using a regression model to make predictions for values of the explanatory variable outside the range of the observed data, which can produce unreliable or misleading results.
Similar definitions: out-of-range prediction
Example: "Predicting a student's college GPA from a kindergarten test score would be an example of dangerous extrapolation."
Five-Number Summary
A set of five descriptive statistics — minimum, first quartile (Q1), median, third quartile (Q3), and maximum — that describe the distribution of a dataset.
Similar definitions: quartile summary
Example: "The five-number summary for the test scores was {52, 71, 83, 91, 100}."
Geometric Distribution
A probability distribution modeling the number of trials needed to obtain the first success in a sequence of independent Bernoulli trials, each with the same probability of success.
Similar definitions: geometric model
Example: "The number of rolls of a die needed to roll a 6 for the first time follows a geometric distribution."
Histogram
A bar graph used for quantitative data in which each bar represents the frequency or relative frequency of data values falling within a specified interval (bin).
Similar definitions: frequency histogram
Example: "The histogram of exam scores revealed a roughly symmetric, bell-shaped distribution."
Independence Condition
A condition for inference requiring that individual observations are independent of one another. It is satisfied by random sampling or random assignment and verified using the 10% condition when sampling without replacement.
Similar definitions: independence requirement
Example: "The independence condition was met because students were randomly selected and the sample was less than 10% of the population."
Independent Events
Two events are independent if the occurrence of one does not affect the probability of the other.
Similar definitions: statistically independent events
Example: "Flipping a coin and rolling a die are independent events because the result of one does not change the probability of the other."
Inference for Regression
The use of hypothesis tests and confidence intervals to determine whether a linear relationship exists in the population and to estimate the population slope.
Similar definitions: regression inference
Example: "Using inference for regression, the student tested whether there was a significant linear relationship between temperature and ice cream sales."
Influential Point
A data point that has a large impact on the position or slope of the regression line, typically a point with an extreme x-value.
Similar definitions: high-leverage point
Example: "Removing the single influential point changed the slope of the regression line from 0.8 to 0.3."
Intercept (Regression)
The predicted value of the response variable when the explanatory variable equals zero. It is the point where the regression line crosses the y-axis, denoted b₀ in the sample and β₀ in the population.
Similar definitions: y-intercept, b₀
Example: "The intercept of 15 suggested that even with zero hours of study, the model predicted a score of 15."
Interquartile Range (IQR)
The difference between the third quartile (Q3) and the first quartile (Q1), representing the spread of the middle 50% of data. It is resistant to outliers.
Similar definitions: IQR, midspread
Example: "The interquartile range of 18 indicated that the middle 50% of scores spanned 18 points."
Large Counts Condition
A condition for using the normal approximation for proportions, requiring that both np ≥ 10 and n(1 − p) ≥ 10.
Similar definitions: success-failure condition
Example: "With n = 100 and p = 0.4, the Large Counts condition was met since both 40 and 60 are at least 10."
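The check in this example amounts to two comparisons. A minimal sketch follows; `large_counts_met` is a hypothetical helper name, and the threshold of 10 is taken from the definition above.

```python
def large_counts_met(n: int, p: float, threshold: float = 10) -> bool:
    """True when both np and n(1 - p) are at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(large_counts_met(100, 0.4))  # True  (expected counts 40 and 60)
print(large_counts_met(20, 0.4))   # False (expected successes 8 < 10)
```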
Law of Large Numbers
A principle stating that as the number of trials of a random process increases, the relative frequency of an event approaches its true probability.
Similar definitions: LLN
Example: "The Law of Large Numbers explains why the proportion of heads approaches 0.5 as more coin flips are observed."
Least-Squares Regression Line
The line that minimizes the sum of the squared vertical distances (residuals) between the observed data points and the line, used to model the linear relationship between two quantitative variables.
Similar definitions: line of best fit, LSRL, regression line
Example: "The least-squares regression line ŷ = 3.2x + 10 was used to predict exam scores from hours of study."
Linear Regression Slope
The value representing the change in the predicted response variable for each one-unit increase in the explanatory variable, denoted b₁ in the sample and β₁ in the population.
Similar definitions: regression coefficient, b₁
Example: "The estimated slope of 4.5 means that each additional hour of study is associated with a 4.5-point increase in predicted exam score."
Lurking Variable
A variable that influences both the explanatory and response variables but is not included in the study, potentially creating a false impression of association.
Similar definitions: hidden variable
Example: "The apparent relationship between shoe size and reading ability in children was explained by age, a lurking variable."
Margin of Error
The maximum likely difference between the sample estimate and the true population parameter, equal to the critical value multiplied by the standard error.
Similar definitions: ME
Example: "The poll reported 54% support with a margin of error of ±3%, meaning the true proportion was likely between 51% and 57%."
Marginal Distribution
The distribution of one variable in a two-way table, found by looking at the row totals or column totals.
Similar definitions: marginal frequency, marginal proportion
Example: "The marginal distribution of preferred subject showed that 55% of all students preferred math over English."
Matched Pairs Design
An experimental or observational design in which subjects are paired based on similar characteristics and each member receives a different treatment, or in which each subject receives both treatments in random order. Differences within each pair are analyzed.
Similar definitions: paired design, within-subjects design
Example: "In the matched pairs design, each participant tasted both brands of coffee and the differences in ratings were analyzed."
Mean
The arithmetic average of a dataset, calculated by summing all values and dividing by the number of values. It is sensitive to extreme values and outliers.
Similar definitions: arithmetic mean, average
Example: "The mean salary of the employees was pulled upward by a few very high earners."
Median
The middle value in an ordered dataset that divides the distribution into two equal halves. It is resistant to outliers, making it preferable for skewed distributions.
Similar definitions: middle value
Example: "Because the data was right-skewed, the teacher reported the median rather than the mean as a better measure of center."
Multiplication Rule
For independent events A and B, P(A and B) = P(A) × P(B). For dependent events, P(A and B) = P(A) × P(B|A).
Similar definitions: product rule
Example: "The multiplication rule was used to find the probability of drawing two aces in a row without replacement."
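The two-aces example is a dependent-events case, so the second factor is a conditional probability. The sketch below assumes a standard 52-card deck with 4 aces and uses exact fractions.

```python
from fractions import Fraction

# First draw: 4 aces among 52 cards.
p_first_ace = Fraction(4, 52)

# Second draw, given the first was an ace: 3 aces left among 51 cards.
p_second_ace_given_first = Fraction(3, 51)

# Multiplication rule for dependent events: P(A and B) = P(A) * P(B|A).
p_two_aces = p_first_ace * p_second_ace_given_first
print(p_two_aces)  # 1/221
```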
Mutually Exclusive Events
Two or more events that cannot occur at the same time; if one event occurs, the other cannot.
Similar definitions: disjoint events
Example: "Rolling a 3 and rolling a 5 on a single die roll are mutually exclusive events because only one number can appear."
Nonresponse Bias
A type of bias that occurs when individuals selected for a survey who do not respond differ systematically from those who do respond, affecting the representativeness of results.
Similar definitions: nonparticipation bias
Example: "The survey suffered from nonresponse bias because the people most likely to skip the survey had very different opinions from those who responded."
Normal Distribution
A continuous, symmetric, bell-shaped probability distribution completely described by its mean and standard deviation. Many natural phenomena and sampling distributions are approximately normal.
Similar definitions: Gaussian distribution, bell curve
Example: "Adult heights follow a roughly normal distribution with a mean of 5'9" and a standard deviation of 3 inches."
Normal Probability Plot
A graph used to assess whether a dataset follows a normal distribution. If the points fall approximately along a straight line, the data can be considered approximately normal.
Similar definitions: normal quantile plot, Q-Q plot
Example: "The normal probability plot showed points falling close to a straight line, supporting the use of a t-procedure."
Null Hypothesis
The default assumption in a hypothesis test, typically stating that there is no effect, no difference, or no association between variables.
Similar definitions: H₀, H-naught, default hypothesis
Example: "The null hypothesis stated that the mean test score of students using the new curriculum was equal to 75."
Observational Study
A study in which researchers observe subjects and measure variables of interest without attempting to influence or manipulate the subjects.
Similar definitions: survey study
Example: "The researchers conducted an observational study by recording students' sleep habits and grades without assigning any sleep schedules."
One-Proportion z-Interval
A confidence interval used to estimate a population proportion, computed as p̂ ± z*(√(p̂(1−p̂)/n)), when the Large Counts and 10% conditions are met.
Similar definitions: one-sample z-interval for proportion
Example: "The researcher constructed a one-proportion z-interval to estimate the true proportion of voters who supported the measure."
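The interval formula p̂ ± z*·√(p̂(1 − p̂)/n) can be sketched as follows. The function name `one_proportion_z_interval` is hypothetical, the default z* = 1.96 assumes a 95% confidence level, and the counts (110 of 200 voters) are borrowed from the Sample Proportion entry in this set. The sketch does not check the Large Counts or 10% conditions.

```python
from math import sqrt

def one_proportion_z_interval(successes: int, n: int, z_star: float = 1.96):
    """Return (low, high) for p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error

low, high = one_proportion_z_interval(110, 200)
print(round(low, 3), round(high, 3))  # 0.481 0.619
```

Because the interval contains 0.5, this sample alone would not show that a majority of voters favor the measure.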
One-Proportion z-Test
A hypothesis test used to determine whether a population proportion differs from a specified value, using the z-distribution when the Large Counts and 10% conditions are met.
Similar definitions: one-sample z-test for proportion
Example: "The quality control manager used a one-proportion z-test to test whether the defect rate exceeded the claimed 2%."
One-Sample t-Interval
A confidence interval used to estimate a population mean when the population standard deviation is unknown, computed as x̄ ± t*(s/√n) with n − 1 degrees of freedom.
Similar definitions: t-interval for mean
Example: "The researcher used a one-sample t-interval to estimate the average commute time in the city based on a sample of 40 commuters."
One-Sample t-Test
A hypothesis test used to determine whether the mean of a single population differs from a specified value, using the t-distribution when the population standard deviation is unknown.
Similar definitions: single-sample t-test
Example: "The researcher used a one-sample t-test to determine whether the average daily calorie intake differed from the recommended 2,000."
Outlier
A data point that is significantly different from the rest of the dataset, typically identified using the 1.5 × IQR rule or z-scores.
Similar definitions: extreme value, anomaly
Example: "The value of 98 was flagged as an outlier because it fell more than 1.5 × IQR above Q3."
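The 1.5 × IQR rule reduces to computing two cutoffs, often called fences. The sketch below uses the quartiles from the Five-Number Summary entry in this set (Q1 = 71, Q3 = 91) purely as sample inputs; `outlier_fences` is a hypothetical helper name.

```python
def outlier_fences(q1: float, q3: float, multiplier: float = 1.5):
    """Return (lower_fence, upper_fence) for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr

lower, upper = outlier_fences(71, 91)
print(lower, upper)  # 41.0 121.0

# Any value below the lower fence or above the upper fence is flagged.
scores = [52, 71, 83, 91, 100]
flagged = [x for x in scores if x < lower or x > upper]
print(flagged)  # []
```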
P-Value
The probability of obtaining a test statistic as extreme as or more extreme than the one observed, assuming the null hypothesis is true. A small p-value provides evidence against the null hypothesis.
Similar definitions: observed significance level
Example: "With a p-value of 0.03, the researchers rejected the null hypothesis at the 0.05 significance level."
Paired t-Test
A hypothesis test for comparing two related measurements or treatments by analyzing the differences within each pair, treating them as a single sample.
Similar definitions: dependent samples t-test, matched pairs t-test
Example: "The researcher used a paired t-test to determine whether students scored differently on a test before and after a tutoring session."
Parameter
A numerical value that describes a characteristic of an entire population, such as the population mean (μ) or population standard deviation (σ).
Similar definitions: population characteristic
Example: "The average height of all adults in a country is a parameter because it describes the entire population."
Percentile
The value below which a given percentage of observations fall. For example, the 80th percentile is the value below which 80% of the data lies.
Similar definitions: centile
Example: "A score at the 90th percentile means that 90% of all scores were below that value."
Placebo
An inactive treatment (such as a sugar pill) given to the control group in an experiment to account for the psychological effect of receiving a treatment.
Similar definitions: inert treatment, dummy treatment
Example: "The placebo allowed researchers to separate the true drug effect from the psychological effect of receiving any treatment."
Placebo Effect
The phenomenon in which subjects show a measurable response to an inactive treatment simply because they believe they are receiving a real treatment.
Similar definitions: suggestion effect
Example: "The researchers accounted for the placebo effect by giving the control group a sugar pill that looked identical to the actual medication."
Pooled Proportion
A combined estimate of the common proportion under the null hypothesis in a two-sample proportion test, calculated by combining the successes and sample sizes from both groups.
Similar definitions: combined proportion
Example: "The pooled proportion was used as the estimate of p when calculating the standard error in the two-proportion z-test."
Population
The entire group of individuals, objects, or events that a researcher is interested in studying.
Similar definitions: universe of subjects
Example: "The population for the study was all registered voters in the United States."
Power
The probability that a hypothesis test correctly rejects a false null hypothesis. Power equals 1 − β and increases with larger sample sizes and larger effect sizes.
Similar definitions: statistical power
Example: "Increasing the sample size from 30 to 200 greatly increased the power of the test to detect a real difference."
Probability
A number between 0 and 1 that quantifies the likelihood of an event occurring, where 0 means impossible and 1 means certain.
Similar definitions: likelihood, chance
Example: "The probability of rolling a six on a fair die is 1/6."
Probability Distribution
A description (table, graph, or formula) of all possible values of a random variable along with the probability associated with each value.
Similar definitions: probability model
Example: "The probability distribution for a fair six-sided die assigns probability 1/6 to each of the values 1 through 6."
Quantitative Data
Data that involves numerical measurements or counts such as height, weight, or number of siblings. It can be discrete (whole numbers only) or continuous (any value within a range).
Similar definitions: numerical data
Example: "The students' exam scores were collected as quantitative data because they are numerical and can be compared meaningfully."
Quartile
Values that divide an ordered dataset into four equal parts. Q1 is the 25th percentile, Q2 is the median, and Q3 is the 75th percentile.
Similar definitions: quartile values
Example: "The first quartile (Q1) of 23 indicated that 25% of students scored below 23."
Random Assignment
The process of using chance to allocate subjects to treatment groups in an experiment, so that the groups are roughly equivalent before treatment and differences in outcomes can be attributed to the treatment rather than to preexisting differences.
Similar definitions: randomization
Example: "The researchers used random assignment to place participants into either the exercise or no-exercise group."
Random Variable
A variable whose value is determined by the outcome of a random process, assigning a numerical value to each outcome in a sample space.
Similar definitions: stochastic variable
Example: "Let X be the number of heads in three coin flips; X is a random variable because its value depends on chance."
Randomness Condition
A condition for inference requiring that the data were produced by a random sample or random assignment, ensuring results can be generalized and that inference is valid.
Similar definitions: random sample condition
Example: "The randomness condition was satisfied because participants were randomly selected from the school's enrollment list."
Range
The difference between the maximum and minimum values in a dataset. It provides a simple measure of total spread but is sensitive to outliers.
Similar definitions: total spread
Example: "The range of the test scores was 45, calculated by subtracting the lowest score from the highest."
Replication
The use of multiple subjects in each treatment group of an experiment to reduce the effect of chance variation and increase the reliability of results.
Similar definitions: repetition
Example: "The experiment used replication by assigning 50 subjects to each treatment group rather than just one."
Residual
The difference between an observed value of the response variable and the value predicted by the regression line. Positive residuals indicate underprediction; negative residuals indicate overprediction.
Similar definitions: error, prediction error
Example: "The student's actual score was 5 points above the predicted value, giving a residual of +5."
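A residual is observed minus predicted, which a short sketch makes concrete. The line ŷ = 3.2x + 10 is the one quoted in the Least-Squares Regression Line entry; the student's hours and actual score below are hypothetical values chosen to reproduce a residual of +5.

```python
def predict_score(hours: float) -> float:
    """Predicted exam score from the example line y-hat = 3.2x + 10."""
    return 3.2 * hours + 10

def residual(observed: float, predicted: float) -> float:
    """Residual = observed value minus predicted value."""
    return observed - predicted

hours_studied = 5   # hypothetical student
actual_score = 31   # hypothetical observed score

r = residual(actual_score, predict_score(hours_studied))
print(r)  # 5.0 (positive, so the line underpredicted this student)
```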
Residual Plot
A graph of residuals versus the explanatory variable or predicted values, used to check whether a linear model is appropriate. A random scatter with no pattern indicates a good fit.
Similar definitions: diagnostic plot
Example: "The residual plot showed a curved pattern, indicating that a linear model was not appropriate for the data."
Response Variable
The output variable (plotted on the y-axis) that is being explained or predicted by the explanatory variable. Also called the dependent variable.
Similar definitions: dependent variable, outcome variable
Example: "The plant's height was the response variable because it was what the researchers measured as the outcome."
Sample
A subset of the population that is selected for study and used to make inferences about the population.
Similar definitions: subset
Example: "The researchers selected a sample of 500 students from the school's enrollment list."
Sample Proportion
The fraction of individuals in a sample having a particular characteristic, calculated as the number of successes divided by the sample size (x/n).
Similar definitions: p-hat, p̂
Example: "In a sample of 200 voters, 110 favored the measure, giving a sample proportion of 0.55."
Sample Space
The set of all possible outcomes of a random experiment or process.
Similar definitions: outcome space, universal set
Example: "The sample space for flipping a coin twice is {HH, HT, TH, TT}."
Sampling Bias
A systematic error that occurs when some members of the population are more likely to be selected for the sample than others, leading to results that do not accurately represent the population.
Similar definitions: selection bias
Example: "Only surveying students in the cafeteria introduced sampling bias, since students who bring lunch from home were not represented."
Sampling Distribution
The probability distribution of a statistic (such as the sample mean or sample proportion) calculated from all possible samples of a given size taken from a population.
Similar definitions: distribution of a sample statistic
Example: "The sampling distribution of the sample mean becomes approximately normal as sample size increases, by the Central Limit Theorem."
Sampling Variability
The natural variation in sample statistics from one random sample to another, even when all samples are drawn from the same population.
Similar definitions: natural variation
Example: "The sampling variability in survey results means that different random samples of the same size will produce slightly different estimates."
Scatterplot
A graph that displays the relationship between two quantitative variables, with each point representing a pair of measurements for one individual.
Similar definitions: scatter diagram, scatter graph
Example: "The scatterplot revealed a moderately strong positive linear association between shoe size and height."
Significance Level
The threshold probability set before conducting a hypothesis test, below which the p-value leads to rejection of the null hypothesis. Commonly set at 0.05 or 0.01.
Similar definitions: alpha level, α
Example: "At a significance level of 0.05, any p-value below 0.05 would result in rejecting the null hypothesis."
Simple Random Sample (SRS)
A sampling method in which every member of the population has an equal chance of being selected, and every possible sample of a given size is equally likely.
Similar definitions: SRS, random sample
Example: "The teacher selected a simple random sample by numbering all students and using a random number generator to select 30."