Least Squares Regression Line (LSRL)
The line that minimizes the sum of the squares of the residuals between the observed and predicted values.
Inference
The process of drawing conclusions about a population based on sample data.
Population Parameters
Theoretical values that describe the true characteristics of a population.
Sample Statistics
Calculated values derived from sample data to estimate population parameters.
True Population Regression Line
The actual linear relationship that describes how the response variable changes with the explanatory variable in the entire population.
Residuals
The difference between observed values and predicted values from the regression line.
Standard Error (SE)
An estimate of the standard deviation of the sampling distribution of a statistic.
Sampling Distribution
The distribution of a statistic (like the sample slope) computed from all possible samples.
Shape of Sampling Distribution
Approximately Normal if the conditions for inference are met.
Center of Sampling Distribution
The mean of the sample slopes, which is equal to the true population slope ($\beta$).
Spread of Sampling Distribution
Described by the standard deviation of the sampling distribution of the slope, indicating how much the sample slope $b$ varies across samples.
Standard Error of the Slope (SE_b)
An estimate of the standard deviation of the sample slope, derived from sample data.
Degrees of Freedom (df)
In this context, calculated as $df = n - 2$ for the regression slope.
LINER
Conditions necessary for performing inference: Linear, Independent, Normal, Equal Variance, Random.
Linear Relationship
The relationship between $x$ and $y$ must be represented by a straight line.
Independent Observations
Each observation in the sample must be independent of the others.
Normality Condition
The residuals must be normally distributed around the true regression line.
Homogeneity of Variance (Homoscedasticity)
The standard deviation of $y$ should be constant across all levels of $x$.
Random Sampling
Data must come from a random sample or a randomized experiment.
P-Value
The probability of observing the test statistic or something more extreme under the null hypothesis.
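The t-test is the standard route to this p-value, but the idea can also be illustrated by simulation: a permutation test (an alternative to the t-test, sketched here with made-up data) shuffles $y$ to break any $x$–$y$ association, simulating the null hypothesis of no linear relationship.

```python
# Estimate a p-value for the slope by a permutation test.
import random

random.seed(0)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]   # illustrative data

def slope(xs, ys):
    # Least-squares slope: Sxy / Sxx.
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(xs, ys))
    sxx = sum((a - xbar) ** 2 for a in xs)
    return sxy / sxx

b_obs = slope(x, y)
trials = 10_000
count = sum(
    abs(slope(x, random.sample(y, len(y)))) >= abs(b_obs)
    for _ in range(trials)
)
p_value = count / trials   # fraction of shuffles at least as extreme
print(f"p ~= {p_value:.4f}")
```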
Null Hypothesis (H_0)
A statement that there is no effect or no difference, used for hypothesis testing.
Alternative Hypothesis (H_a)
The statement that contradicts the null hypothesis, indicating evidence of an effect.
Test Statistic (t)
A standardized value measuring how far the sample statistic falls from the value specified by the null hypothesis, in standard-error units.
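For a slope the statistic is $t = (b - \beta_0)/SE_b$, where $\beta_0$ is the null value (usually 0). A minimal sketch, with made-up numbers standing in for regression output:

```python
# Standardized test statistic for a regression slope.
b = 1.96        # sample slope (illustrative)
se_b = 0.35     # standard error of the slope (illustrative)
beta_0 = 0      # null-hypothesis value of the slope

t = (b - beta_0) / se_b
print(f"t = {t:.2f}")   # compare to a t-distribution with n - 2 df
```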
Confidence Interval for the Slope
A range of values constructed to estimate the true slope of the population regression line.
Critical Value (t*)
The value from the t-distribution used to calculate the margin of error in confidence intervals.
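Putting the two entries above together, the interval is $b \pm t^{*} \cdot SE_b$. A minimal sketch with illustrative numbers; in practice $t^{*}$ comes from a t-table with $n - 2$ degrees of freedom:

```python
# Confidence interval for the slope: b +/- t* . SE_b.
b = 1.96            # sample slope (illustrative)
se_b = 0.35         # standard error of the slope (illustrative)
t_star = 3.182      # e.g. 95% critical value for df = 3 (n = 5)

margin = t_star * se_b
lo, hi = b - margin, b + margin
print(f"({lo:.3f}, {hi:.3f})")
```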
Interpretation of CI
Expresses a level of confidence that the true parameter lies within the calculated interval.
Standard Deviation of Residuals (S)
A measure of how much the observed values deviate from the predicted values.
Coefficient of Determination (r^2)
A statistic that measures the proportion of variance for the dependent variable that's explained by the independent variable.
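One common way to compute it is $r^2 = 1 - SSE/SST$. A minimal sketch, assuming the made-up data and fitted line $\hat{y} = 0.14 + 1.96x$ used for illustration:

```python
# r^2 as 1 - SSE/SST for a fitted line.
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]           # illustrative data
y_hat = [0.14 + 1.96 * xi for xi in x]  # predictions from the fitted line

sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
sst = sum((yi - mean(y)) ** 2 for yi in y)             # total variation
r_sq = 1 - sse / sst
print(f"r^2 = {r_sq:.4f}")
```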
Error Term ($\varepsilon$)
The part of the model that accounts for the variation in the response variable not explained by the predictor.
Scatterplot
A graphical representation showing the relationship between two quantitative variables.
Residual Plot
A graphical representation of the residuals to check for patterns that might indicate non-linearity.
Standard Error of the Slope Formula
$SE_b = \frac{s}{s_x\sqrt{n-1}}$, where $s$ is the standard deviation of the residuals and $s_x$ is the standard deviation of $x$.
Misinterpretation of Confidence Intervals
Claiming that the sample slope falls in the interval, rather than the true population slope $\beta$.
Common Mistakes in Hypothesis Testing
Confusing sample statistics with population parameters and misreading regression output.
Parameter Estimation
Using sample data to estimate population parameters.
Regression Coefficients
Parameters that represent the relationship between the independent variable and the dependent variable.
Slope ($b$)
Represents the change in the response variable for a one-unit change in the explanatory variable.
Intercept ($a$)
The expected value of the response variable when the explanatory variable is zero.
Statistical Significance
A mathematical indication that the relationship observed in data is unlikely to have occurred by chance.
Uniform Distribution of Residuals
Residuals that show roughly constant spread across the range of fitted values.
Statistical Power
The probability of correctly rejecting a false null hypothesis.
Assumptions in Linear Regression
Conditions that must be met for the results of linear regression to be valid.
Predicted Values ($\hat{y}$)
The expected outcomes computed from the regression coefficients for given values of the explanatory variable.
Multicollinearity
A scenario in regression analysis where two or more predictors are highly correlated.
Outlier's Impact
The influence of unusual observations on the regression model, which can lead to misleading results.
Statistical Software Output Interpretation
Extracting and understanding key statistics from regression analysis conducted by software.
Identifying Key Statistics
Locating essential values in computerized regression output such as coefficients and their standard errors.
Residual Normality
Checking whether the distribution of residuals follows a normal distribution.
Testing Conditions Verification
Assessing whether conditions for applying inference methods are satisfied before analysis.
Sample Size Effect on Inference
Increasing sample size typically results in more reliable and trustworthy estimation of population parameters.