What Is a Z-Test for Proportions?
A proportion is the fraction of a group possessing a specific attribute — the recovery rate among patients, the pass rate of candidates, or the conversion rate of web visitors. The Z-test for proportions is a parametric hypothesis test that determines whether an observed proportion differs meaningfully from a theoretical value (one-sample), or whether two independently observed proportions differ from each other (two-sample).
The test is grounded in the Central Limit Theorem: for sufficiently large samples, the sampling distribution of a proportion is approximately normal with mean p and standard deviation √(p(1−p)/n). This justifies using the standard normal (Z) distribution as the reference distribution for inference.
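The quality of the normal approximation can be checked directly, without simulation. The sketch below sums exact binomial probabilities to find how often p̂ falls within 1.96 standard errors of p; the function name `exact_coverage` and the values n = 200, p = 0.3 are illustrative assumptions, not figures from this article.

```python
from math import comb, sqrt

def exact_coverage(n: int, p: float, z: float = 1.96) -> float:
    """Exact binomial probability that p-hat lies within z standard errors of p."""
    se = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion
    total = 0.0
    for k in range(n + 1):
        if abs(k / n - p) <= z * se:
            # Binomial probability of observing exactly k successes in n trials
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total

# Illustrative values (assumed): n = 200 observations, true proportion 0.3
coverage = exact_coverage(n=200, p=0.3)
print(f"P(|p_hat - p| <= 1.96 SE) = {coverage:.4f}  (normal theory: about 0.95)")
```

If the approximation holds, the printed probability should sit close to 0.95; rerunning with a small n or a p near 0 or 1 shows where it starts to drift.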
Two Variants of the Test
One-sample example: Does our school's 68% graduation rate differ significantly from the national rate of 62%?
Two-sample example: Does a treatment group (42% recovered) differ significantly from a control group (31% recovered)?
The Formulas — Fully Explained
One-Sample
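Given a random sample of n observations where x have the characteristic of interest, the sample proportion is \(\hat{p} = x/n\) and the test statistic is:
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \]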
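The denominator \(\sqrt{p_0(1-p_0)/n}\) is the standard error under the null hypothesis. It uses p₀, not p̂, because the statistic is evaluated under H₀, where the true proportion is assumed to equal p₀. Substituting the observed p̂ produces a different (Wald) statistic whose null distribution is less well approximated by the standard normal, particularly in small samples or when p₀ is near 0 or 1.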
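A minimal sketch of the one-sample calculation, using only the Python standard library. The function name `one_sample_z_test` and the sample size n = 250 are assumptions made for illustration; the 68% and 62% rates echo the graduation example, which does not state a sample size.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(x: int, n: int, p0: float) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: p = p0."""
    p_hat = x / n                      # observed sample proportion
    se0 = sqrt(p0 * (1 - p0) / n)      # standard error computed under H0 (uses p0)
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 170 of 250 students graduated (68%), national rate 62%
z, p_value = one_sample_z_test(x=170, n=250, p0=0.62)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```

With this assumed sample size the result lands very close to the conventional 0.05 threshold, which illustrates how strongly the conclusion depends on n for a fixed difference in proportions.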
Two-Sample
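The pooled proportion \(\hat{p}_c = (x_1 + x_2)/(n_1 + n_2)\) is the combined estimate of the common proportion assuming H₀: p₁ = p₂ is true, and the test statistic is:
\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]
Because both groups share one proportion under H₀, pooling uses all n₁ + n₂ observations to estimate it and yields a standard error calibrated to the null hypothesis. The confidence interval for the difference, however, uses unpooled variance (no H₀ assumption needed for estimation).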
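The pooled test and the unpooled interval can be sketched side by side. The function name `two_sample_z_test` and the group sizes of 200 are assumptions for illustration; the 42% and 31% recovery rates come from the two-sample example, which does not give sample sizes.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(x1: int, n1: int, x2: int, n2: int, conf_level: float = 0.95):
    """Return (z, two-sided p-value, CI for p1 - p2)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Test statistic: pooled standard error, valid under H0: p1 = p2
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Confidence interval: unpooled standard error, no H0 assumption
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return z, p_value, ci

# Hypothetical counts: 84 of 200 treated (42%) vs 62 of 200 controls (31%)
z, p_value, ci = two_sample_z_test(84, 200, 62, 200)
print(f"z = {z:.3f}, p = {p_value:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Keeping the two standard errors as separate variables makes it harder to use the pooled one for the interval by accident.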
Confidence Interval
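As a sketch, the usual large-sample (Wald) intervals take the following form, where \(z_{\alpha/2}\) denotes the standard normal critical value (1.96 for 95% confidence). The notation is an assumption chosen to match the symbols used elsewhere in this article.

\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \]

Both intervals use the observed proportions in the standard error, because estimation does not assume that H₀ is true.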
Effect Size — Cohen's h
Cohen's h uses the arcsine transformation, \(h = 2\arcsin\sqrt{p_1} - 2\arcsin\sqrt{p_2}\), which approximately stabilises the variance of a proportion across the [0,1] range. The magnitude of h can therefore be interpreted independently of where on the scale the proportions fall, an advantage over raw differences, which can mislead near 0 or 1.
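The effect of the transformation is easiest to see by comparing two pairs of proportions with the same raw difference. This is a minimal sketch; the function name `cohens_h` and the example proportions are illustrative assumptions.

```python
from math import asin, sqrt

def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# Same raw difference of 0.10, at different places on the scale
h_mid = cohens_h(0.55, 0.45)   # near the middle of [0, 1]
h_edge = cohens_h(0.15, 0.05)  # near the lower boundary

print(f"h for 0.55 vs 0.45: {h_mid:.3f}")
print(f"h for 0.15 vs 0.05: {h_edge:.3f}")
```

The pair near the boundary yields the larger h, reflecting that a 10-point gap is a bigger effect when the proportions are small.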
Hypothesis Types
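The alternative hypothesis determines which tail of the standard normal distribution contributes to the p-value. The sketch below covers the three usual cases (H₁: p ≠ p₀, p > p₀, p < p₀); the function name and the labels "two-sided", "greater" and "less" are naming choices made here, not terms defined in this article.

```python
from statistics import NormalDist

def p_value_from_z(z: float, alternative: str = "two-sided") -> float:
    """Convert a Z statistic to a p-value for the chosen alternative hypothesis."""
    cdf = NormalDist().cdf(z)
    if alternative == "two-sided":   # H1: p != p0, both tails count
        return 2 * min(cdf, 1 - cdf)
    if alternative == "greater":     # H1: p > p0, upper tail only
        return 1 - cdf
    if alternative == "less":        # H1: p < p0, lower tail only
        return cdf
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

for alt in ("two-sided", "greater", "less"):
    print(f"{alt:>9}: p = {p_value_from_z(1.96, alt):.4f}")
```

The direction should be fixed before looking at the data; choosing the tail after seeing the sign of z understates the p-value.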
Significance Level (α) and Error Types
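The significance level α is the probability of a Type I error (rejecting a true H₀); a Type II error is failing to reject a false H₀. As a sketch, the critical value of Z that corresponds to a chosen α can be computed from the standard normal quantile function; the function name `critical_value` is an assumption.

```python
from statistics import NormalDist

def critical_value(alpha: float, two_tailed: bool = True) -> float:
    """Z value beyond which H0 is rejected at significance level alpha."""
    tail_area = alpha / 2 if two_tailed else alpha  # split alpha across both tails if two-tailed
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: two-tailed {critical_value(alpha):.3f}, "
          f"one-tailed {critical_value(alpha, two_tailed=False):.3f}")
```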
Critical Assumptions
- Random sampling. Observations are drawn by a random mechanism. Convenience samples invalidate population inference.
- Independence of observations. Each individual's outcome is unrelated to others'. For two-sample tests, the groups must also be independent of each other.
- Large-sample normality. One-sample: np₀ ≥ 5 and n(1−p₀) ≥ 5. Two-sample: n₁p̂c ≥ 5, n₁(1−p̂c) ≥ 5, n₂p̂c ≥ 5, n₂(1−p̂c) ≥ 5.
- 10% condition. Each sample must be less than 10% of its population to ensure near-independence when sampling without replacement.
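The two numerical conditions in this list can be checked before running a test. The sketch below is one possible helper for the one-sample case; the function name, the returned dictionary keys and the example figures are all assumptions. Random sampling and independence cannot be verified from the numbers alone and must be judged from the study design.

```python
from typing import Optional

def check_one_sample_conditions(n: int, p0: float,
                                population_size: Optional[int] = None,
                                threshold: float = 5) -> dict:
    """Check the large-sample and 10% conditions for a one-sample Z-test."""
    result = {
        # Both expected counts under H0 must reach the threshold
        "large_sample": n * p0 >= threshold and n * (1 - p0) >= threshold,
    }
    if population_size is not None:
        # The sample must be less than 10% of the population
        result["ten_percent"] = n < 0.10 * population_size
    return result

# Hypothetical scenario: 200 sampled from a population of 1,500, p0 = 0.05
print(check_one_sample_conditions(n=200, p0=0.05, population_size=1500))
```

For the two-sample test the same idea applies to each group, with the pooled proportion p̂c in place of p₀.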
Interpreting the p-Value Correctly
The p-value is the probability of observing a test statistic at least as extreme as the one obtained, assuming H₀ is true. It is not the probability that H₀ is true, nor the probability that the finding occurred by chance. A p-value below α means the data is inconsistent with H₀ — it justifies rejecting H₀, but does not prove H₁, and does not quantify practical importance. Always report effect size and confidence intervals alongside the p-value.
Frequently Asked Questions
When should I use a Z-test instead of a chi-square test?
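For testing one proportion against a known value, use the Z-test. For two independent proportions arranged in a 2×2 contingency table, the square of the pooled two-sample Z statistic equals the Pearson chi-square statistic with 1 degree of freedom (computed without a continuity correction), so the two-sided tests give the same p-value. The Z-test is preferred when direction matters, since it supports one-tailed tests, whereas the chi-square statistic discards the sign of the difference.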
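The equivalence can be confirmed numerically. The sketch below computes the pooled Z statistic and the Pearson chi-square statistic (no continuity correction) by hand for the same 2×2 table; the function names and counts are hypothetical.

```python
from math import sqrt, isclose

def pooled_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sample Z statistic using the pooled proportion."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

def pearson_chi_square(table: list[list[int]]) -> float:
    """Pearson chi-square statistic without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical counts: 84/200 successes in group 1, 62/200 in group 2
x1, n1, x2, n2 = 84, 200, 62, 200
z = pooled_z(x1, n1, x2, n2)
chi2 = pearson_chi_square([[x1, n1 - x1], [x2, n2 - x2]])
print(f"z^2 = {z**2:.6f}, chi-square = {chi2:.6f}, equal: {isclose(z**2, chi2)}")
```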
What is the minimum sample size needed?
The minimum follows from the large-sample condition: n ≥ 5 / min(p₀, 1−p₀). For p₀ = 0.05, you need n ≥ 100. For two-sample tests, each group must satisfy the condition separately, using the pooled proportion as the estimate.
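The rule translates into a short calculation. This sketch uses exact fractions so that boundary cases such as p₀ = 0.05 are not affected by floating-point rounding; the function name `min_sample_size` is an assumption.

```python
from fractions import Fraction
from math import ceil

def min_sample_size(p0: float, threshold: int = 5) -> int:
    """Smallest n satisfying n*p0 >= threshold and n*(1 - p0) >= threshold."""
    p = Fraction(str(p0))  # exact decimal value, e.g. 0.05 -> 1/20
    if not 0 < p < 1:
        raise ValueError("p0 must lie strictly between 0 and 1")
    return ceil(threshold / min(p, 1 - p))

for p0 in (0.05, 0.3, 0.5, 0.95):
    print(f"p0 = {p0}: n >= {min_sample_size(p0)}")
```

This is only the minimum for the normal approximation to be reasonable; detecting a small difference with adequate power typically requires a much larger sample.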
Can I compare two dependent (paired) proportions?
Not with this test. The two-sample Z-test requires independent groups. For paired proportions, such as before-and-after measures on the same individuals, use McNemar's test instead.
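For reference, a minimal sketch of the large-sample form of McNemar's test, without continuity correction. It uses only the discordant pairs: b (yes before, no after) and c (no before, yes after). The function name and counts are hypothetical, and an exact binomial version is generally preferred when b + c is small.

```python
from math import sqrt
from statistics import NormalDist

def mcnemar_test(b: int, c: int) -> tuple[float, float]:
    """Return (chi-square statistic, two-sided p-value) from discordant pair counts."""
    if b + c == 0:
        raise ValueError("no discordant pairs: the test is undefined")
    z = (b - c) / sqrt(b + c)                      # normal form of the statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # equals the chi-square (1 df) p-value
    return z * z, p_value

# Hypothetical paired data: 25 changed yes -> no, 10 changed no -> yes
chi2, p_value = mcnemar_test(b=25, c=10)
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")
```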