Unit 6: Inference for Categorical Data: Proportions

Last updated 2:11 AM on 3/12/26

50 Terms

1

Categorical variable

A variable that places individuals into categories (e.g., yes/no, defective/not defective).

2

Success

The category of interest in a two-category (binary) setting; a label that does not imply “good.”

3

Failure

The other category in a two-category (binary) setting; a label that does not imply “bad.”

4

Population proportion (p)

The true fraction of the entire population that falls in the “success” category.

5

Sample proportion (p-hat)

The proportion of successes in a sample; computed as p̂ = x/n.

6

Inference

Using sample data to draw conclusions about a population parameter (here, a proportion).

7

Sampling variability

The natural variation in a statistic (like p̂) from one random sample to another, even when the true p is fixed.

8

Sampling distribution

The distribution of a statistic’s values over many repeated random samples from the same population.

9

Normal approximation for p-hat

When the sample is large enough (expected counts of successes and failures both at least 10), the sampling distribution of p̂ is approximately Normal (bell-shaped).
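A minimal simulation sketch of sampling variability and the sampling distribution of p̂. The values p = 0.30, n = 200, and 5000 repetitions, and the function name `simulate_p_hats`, are illustrative assumptions and not part of the card set.

```python
import random
import statistics

def simulate_p_hats(p, n, reps, seed=0):
    """Draw `reps` random samples of size n and return each sample's p-hat."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    p_hats = []
    for _ in range(reps):
        # Count successes in one sample: each draw is a success with probability p.
        x = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(x / n)
    return p_hats

p_hats = simulate_p_hats(p=0.30, n=200, reps=5000)

# The p-hats differ from sample to sample (sampling variability), but they
# center near the true p and spread out by roughly sqrt(p(1-p)/n).
print("mean of p-hats:", round(statistics.mean(p_hats), 3))
print("SD of p-hats:  ", round(statistics.pstdev(p_hats), 3))
```

A histogram of `p_hats` would look roughly bell-shaped here, because the expected counts (60 successes, 140 failures) are both well above 10.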

10

Association vs causation

Randomized experiments can support cause-and-effect conclusions; observational studies generally support only association due to possible confounding.

11

Number of successes (x)

The count of observations in the sample that fall in the “success” category.

12

Sample size (n)

The number of individuals/observations in the sample.

13

Null value (p0)

The claimed population proportion used in a hypothesis test (the value assumed under H0).

14

Parameter

A fixed (usually unknown) numerical characteristic of a population, such as p, p1, or p2.

15

Statistic

A numerical value computed from sample data that varies from sample to sample, such as p̂.

16

Independence assumption

Observations in a sample (and between groups, if applicable) must be independent for standard inference methods to apply.

17

Random sampling

Selecting individuals so each sample of a given size has an equal chance of being chosen; helps justify independence.

18

Random assignment

Assigning individuals to treatments by chance; helps create comparable groups and supports causal conclusions.

19

10% condition (10% Rule)

When sampling without replacement, independence is reasonable if n ≤ 0.10N, where N is population size.

20

Normality assumption (for proportions)

Using a Normal model for the sampling distribution of p̂; justified when expected success and failure counts are large enough.

21

q = 1 − p

The population proportion of failures (the complement of the success proportion p).

22

Large counts condition

A check that the expected numbers of successes and failures, np and n(1−p), are both at least 10, supporting the Normal approximation (use p̂ in place of p for a confidence interval and p0 for a test).
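The 10% condition and the large counts condition can each be checked with one comparison. The sketch below assumes hypothetical helper names (`ten_percent_ok`, `large_counts_ok`) and made-up sample and population sizes.

```python
def ten_percent_ok(n, N):
    """10% condition: the sample is at most 10% of the population."""
    return n <= 0.10 * N

def large_counts_ok(n, p, threshold=10):
    """Large counts: expected successes and failures are both at least `threshold`.

    Pass p-hat for a confidence interval or p0 for a significance test.
    """
    return n * p >= threshold and n * (1 - p) >= threshold

# Illustrative numbers only: a sample of 200 from a population of 5000.
print(ten_percent_ok(n=200, N=5000))    # 200 <= 500
print(large_counts_ok(n=200, p=0.30))   # expected counts 60 and 140
print(large_counts_ok(n=50, p=0.12))    # expected successes 6, too few
```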

23

Binomial model

A model for the number of successes x when trials are independent and the success probability is constant: x ~ Binomial(n, p).
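As a small worked sketch of the Binomial model, the probability of exactly x successes is C(n, x) p^x (1−p)^(n−x). The function name `binomial_pmf` and the example values (n = 10, p = 0.5) are assumptions chosen for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) when X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Probability of exactly 5 successes in 10 independent trials with p = 0.5.
print(binomial_pmf(5, 10, 0.5))  # 252/1024, about 0.246

# The probabilities over x = 0, 1, ..., n add up to 1.
print(sum(binomial_pmf(x, 10, 0.5) for x in range(11)))
```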

24

Unbiased estimator

An estimator whose expected value equals the true parameter; E(p̂) = p under random sampling.

25

Standard deviation of p-hat (σ_p̂)

The true SD of the sampling distribution of p̂: σ_p̂ = sqrt(p(1−p)/n).
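A short sketch of the formula σ_p̂ = sqrt(p(1−p)/n), using assumed values p = 0.5 and n = 100 or 400. It also illustrates a consequence of the formula: quadrupling n cuts the standard deviation in half.

```python
from math import sqrt

def sd_p_hat(p, n):
    """True standard deviation of the sampling distribution of p-hat."""
    return sqrt(p * (1 - p) / n)

print(sd_p_hat(0.5, 100))  # about 0.05
print(sd_p_hat(0.5, 400))  # about 0.025, half as large with 4 times the sample
```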

26

Standard error (SE)

An estimate of a sampling distribution’s standard deviation, computed using sample data or the null value.

27

SE for one-proportion confidence interval

Estimated SD of p̂ for a CI: SE = sqrt(p̂(1−p̂)/n).

28

SE under the null (one-proportion test)

SD used in a one-proportion z test, computed with p0: sqrt(p0(1−p0)/n).

29

One-proportion z confidence interval

An interval estimating p: p̂ ± z* sqrt(p̂(1−p̂)/n).

30

Critical value (z*)

The z-score multiplier that matches a chosen confidence level (e.g., about 1.96 for 95%).
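One way to look up z* without a table is Python's standard-library `statistics.NormalDist`. This is a sketch; the helper name `critical_value` is an assumption. For a confidence level C, z* leaves (1 − C)/2 in each tail of the standard Normal curve.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """z* for a two-sided interval at the given confidence level (e.g. 0.95)."""
    tail_area = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(level, round(critical_value(level), 3))  # about 1.645, 1.960, 2.576
```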

31

Margin of error (ME)

The “plus/minus” amount in a confidence interval: ME = z* × SE.
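The pieces p̂, SE, z*, and ME combine into a one-proportion z interval as sketched below. The function name `one_prop_z_interval` and the data (120 successes out of 400) are made up for illustration, and the conditions for inference are assumed to have been checked already.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(x, n, confidence_level=0.95):
    """Return (lower, upper, margin_of_error) for a one-proportion z interval."""
    p_hat = x / n
    se = sqrt(p_hat * (1 - p_hat) / n)                      # SE uses p-hat for a CI
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    me = z_star * se                                        # ME = z* x SE
    return p_hat - me, p_hat + me, me

lower, upper, me = one_prop_z_interval(x=120, n=400)
print(round(lower, 3), round(upper, 3), round(me, 3))  # roughly 0.255 to 0.345
```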

32

Confidence level

The long-run success rate of the CI method (e.g., 95% of such intervals capture the true parameter).
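The "long-run success rate" idea can be checked by simulation: build many 95% intervals from samples with a known p and count how often they capture it. This is a sketch under assumed values (p = 0.30, n = 200, 2000 repetitions); the function name `coverage_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(p, n, reps, confidence_level=0.95, seed=0):
    """Fraction of simulated one-proportion z intervals that contain the true p."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    hits = 0
    for _ in range(reps):
        x = sum(1 for _ in range(n) if rng.random() < p)
        p_hat = x / n
        me = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - me <= p <= p_hat + me:
            hits += 1
    return hits / reps

# Expect a value close to 0.95; it will not be exact because this is a
# simulation and the Normal model is only an approximation.
print(coverage_rate(p=0.30, n=200, reps=2000))
```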

33

Correct confidence interval interpretation

A statement about being C% confident that the true population proportion (parameter) lies between the interval bounds, in context.

34

Significance test

A procedure that assesses whether data provide convincing evidence against a specific null claim about a parameter.

35

Null hypothesis (H0)

The claim assumed true for the test, stated as an equality for proportions (e.g., H0: p = p0).

36

Alternative hypothesis (Ha)

The competing claim, stated as an inequality (p ≠ p0, p > p0, or p < p0), chosen from the question context.

37

One-proportion z test statistic

A standardized measure for testing p: z = (p̂ − p0) / sqrt(p0(1−p0)/n).

38

p-value

Assuming H0 is true, the probability of getting a result at least as extreme as the observed statistic (in the direction of Ha).
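A sketch of the one-proportion z test that computes the test statistic and then the p-value in the direction of Ha. The function name `one_prop_z_test`, the `alternative` labels, and the data (120 successes out of 400, p0 = 0.25) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0 against the chosen alternative."""
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)          # SE under the null uses p0, not p-hat
    z = (p_hat - p0) / se0
    std_normal = NormalDist()
    if alternative == "greater":           # Ha: p > p0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":            # Ha: p < p0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":       # Ha: p != p0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value

z, p_value = one_prop_z_test(x=120, n=400, p0=0.25)
print(round(z, 2), round(p_value, 3))  # z is about 2.31; p-value is about 0.02
```

With α = 0.05, a p-value near 0.02 would lead to rejecting H0 in this made-up example.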

39

Significance level (α)

A chosen cutoff for deciding whether a p-value is “small” (often 0.05); also the probability of a Type I error when H0 is true.

40

Type I error

Rejecting H0 when H0 is actually true; its probability is controlled by α.

41

Type II error

Failing to reject H0 when Ha is actually true.

42

Beta (β)

The probability of a Type II error for a specific alternative value of the parameter.

43

Power

1 − β; the probability of rejecting a false H0 when a particular alternative is true.
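Power and β can be approximated with the Normal model for one specific alternative. The sketch below handles only the one-sided case Ha: p > p0; the function name `power_one_sided` and the values (p0 = 0.5, alternative p = 0.6, n = 100) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided(p0, p_alt, n, alpha=0.05):
    """Approximate power of a one-proportion z test of H0: p = p0 vs Ha: p > p0."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)   # smallest p-hat that rejects H0
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)            # SD of p-hat if p = p_alt
    beta = std_normal.cdf((cutoff - p_alt) / sd_alt)  # P(fail to reject | p = p_alt)
    return 1 - beta

print(round(power_one_sided(0.5, 0.6, n=100), 2))  # about 0.64
print(round(power_one_sided(0.5, 0.6, n=400), 2))  # larger n gives higher power
```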

44

Two independent samples design

A design with separate random samples from two populations; used to compare p1 and p2.

45

Randomized experiment design

A design where subjects are randomly assigned to treatments; differences in outcomes can be attributed to the treatment, at least for subjects like those in the study.

46

Two population proportions (p1, p2)

p1 and p2 are the true success proportions in populations or treatment groups 1 and 2.

47

Difference in sample proportions (p̂1 − p̂2)

A statistic that estimates the parameter p1 − p2; computed from two samples/groups.

48

Unpooled SE for two-proportion confidence interval

SE for a CI for p1 − p2: sqrt( p̂1(1−p̂1)/n1 + p̂2(1−p̂2)/n2 ).
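A sketch of a two-proportion z interval for p1 − p2 built on the unpooled SE. The function name `two_prop_z_interval` and the counts (60 of 100 versus 45 of 100) are made up for illustration, and the conditions are assumed to hold for both groups.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence_level=0.95):
    """Return (lower, upper) for a confidence interval for p1 - p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Unpooled SE: each group keeps its own sample proportion.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

lower, upper = two_prop_z_interval(60, 100, 45, 100)
print(round(lower, 3), round(upper, 3))  # roughly 0.013 to 0.287
```

Because this interval does not contain 0, it is consistent with a real difference between the two proportions in the made-up data.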

49

Pooled proportion (p̂c)

Combined estimate used in two-proportion tests under H0: p̂c = (x1 + x2)/(n1 + n2).

50

Two-proportion z test statistic

For H0: p1 = p2, z = ((p̂1 − p̂2) − 0) / sqrt( p̂c(1−p̂c)(1/n1 + 1/n2) ).
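A sketch of the two-proportion z test with the pooled proportion p̂c, shown for a two-sided alternative only. The function name `two_prop_z_test` and the counts (60 of 100 versus 45 of 100) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Return (z, two_sided_p_value) for H0: p1 = p2 vs Ha: p1 != p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)   # combined estimate, valid under H0
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = ((p1_hat - p2_hat) - 0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = two_prop_z_test(60, 100, 45, 100)
print(round(z, 2), round(p_value, 3))  # z is about 2.12; p-value is about 0.034
```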
