The chi-square density is x^(k/2−1)e^(−x/2) divided by 2^(k/2)Γ(k/2), for x≥0 with k degrees of freedom.
Chi-square shows up whenever a statistic is built from squared z-scores. That’s why it keeps turning up in variance work, contingency tables, and goodness-of-fit checks.
This article gives the formula, names every symbol, and shows how to turn it into numbers you can report.
What the distribution is and where it comes from
Start with k independent standard normal variables: Z₁, …, Zₖ. Square each one and add them up:
X = Z₁² + Z₂² + … + Zₖ²
That sum is a chi-square random variable with k degrees of freedom. Degrees of freedom is the count of squared standard normals in the sum. It’s the single parameter that sets the curve.
Small k gives a steep rise near 0 and a long right tail. Bigger k shifts the mass right and makes the curve less lopsided.
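A quick simulation makes the definition concrete. This is a minimal sketch using only Python's standard library; the helper name `chi_square_draw`, the seed, and the choice k = 4 are illustrative assumptions, not part of any library.

```python
import random

def chi_square_draw(k, rng):
    """One chi-square draw: the sum of k squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(12345)  # fixed seed so the run is repeatable
k = 4                       # illustrative choice of degrees of freedom
draws = [chi_square_draw(k, rng) for _ in range(50_000)]

sample_mean = sum(draws) / len(draws)
share_below_k = sum(d < k for d in draws) / len(draws)

print(f"k = {k}, simulated mean = {sample_mean:.3f}")   # should land near k
print(f"smallest draw = {min(draws):.4f}")              # never negative
print(f"share of draws below k = {share_below_k:.3f}")  # above 0.5: long right tail
```

More than half the draws fall below k because the long right tail pulls the mean above the median.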
Chi-square distribution formula with degrees of freedom
For k>0 and x≥0, the probability density function (pdf) is:
f(x; k) = [1 / (2^(k/2) Γ(k/2))] · x^(k/2 − 1) · e^(−x/2)
For x<0, the density is 0. If you want a published cross-check, the NIST Engineering Statistics Handbook lists the same pdf form.
What each symbol means
- x: a non-negative value where you evaluate the density.
- k: degrees of freedom (often written as ν or df).
- Γ(k/2): the gamma function at k/2.
- 2^(k/2): part of the normalizing constant.
- e^(−x/2): the exponential tail term.
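The formula translates almost symbol for symbol into code. The sketch below uses only Python's `math` module; the function name `chi2_pdf` is a hypothetical helper, and working on the log scale is a design choice made here so that Γ(k/2) doesn't overflow when k is large.

```python
import math

def chi2_pdf(x, k):
    """Chi-square density f(x; k) for k > 0; returns 0 for x < 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    if x < 0:
        return 0.0
    if x == 0:
        # Boundary limit: infinite for k < 2, 1/2 for k = 2, 0 for k > 2.
        if k < 2:
            return math.inf
        return 0.5 if k == 2 else 0.0
    half_k = k / 2
    # log f = (k/2 - 1) ln x - x/2 - (k/2) ln 2 - ln Γ(k/2)
    log_f = ((half_k - 1) * math.log(x) - x / 2
             - half_k * math.log(2) - math.lgamma(half_k))
    return math.exp(log_f)

# For k = 2 the formula reduces to 0.5 * exp(-x/2), which gives an easy check.
print(chi2_pdf(2.0, 2), 0.5 * math.exp(-1.0))
```

The x = 0 branch is there because the x^(k/2 − 1) factor behaves differently at the boundary depending on whether k is below, at, or above 2.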
Why Γ(k/2) appears
The chi-square distribution is a gamma distribution with shape k/2 and scale 2. That link is why the normalizing constant uses Γ(k/2). You won’t compute Γ by hand in routine work, yet knowing it’s a gamma case helps you spot missing powers of 2 or a flipped exponent.
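You can confirm the gamma link numerically. The sketch below writes a generic gamma density in the shape/scale form and compares it with the chi-square formula at a few points. Both function names are hypothetical helpers, and the last check shows what a shape/scale mix-up looks like.

```python
import math

def gamma_pdf(x, shape, scale):
    """Gamma density in the shape/scale parameterization, for x > 0."""
    return (x ** (shape - 1) * math.exp(-x / scale)
            / (math.gamma(shape) * scale ** shape))

def chi2_pdf(x, k):
    """Chi-square density written term by term from the formula, for x > 0."""
    return (x ** (k / 2 - 1) * math.exp(-x / 2)
            / (2 ** (k / 2) * math.gamma(k / 2)))

# Shape k/2 with scale 2 reproduces the chi-square density.
for k in (1, 3, 6):
    for x in (0.5, 2.0, 7.5):
        assert math.isclose(gamma_pdf(x, k / 2, 2), chi2_pdf(x, k), rel_tol=1e-12)

# Using 1/2 as the scale (a rate/scale mix-up) does not match.
assert not math.isclose(gamma_pdf(2.0, 1.5, 0.5), chi2_pdf(2.0, 3), rel_tol=1e-6)
print("gamma(shape=k/2, scale=2) matches the chi-square pdf at every test point")
```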
How to get probabilities and cut points
The pdf gives heights, not areas. Probabilities come from the cumulative distribution function F(x; k), which is the integral of the pdf from 0 to x.
Right-tail areas are common in testing: P(X ≥ x) = 1 − F(x; k). Cut points flip that around: find c so that P(X ≥ c)=α, then c is an upper α critical value.
If you want to confirm the algebra against a neutral source, the NIST chi-square distribution page lists the pdf and its parameter definitions. For computation, Python’s SciPy chi2 documentation shows the functions for pdf, cdf, survival function (right tail), and inverse cdf. In R, the manual page for dchisq, pchisq, and qchisq maps to the same tasks.
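If you'd like to see what sits behind those library calls, here is a standard-library sketch of the cdf, the right tail, and an upper critical value. It assumes a power series for the regularized incomplete gamma function plus plain bisection. The function names are hypothetical, and the approach is meant for moderate x and everyday α, not extreme tails, where the SciPy or R functions mentioned above are the safer route.

```python
import math

def chi2_cdf(x, k):
    """F(x; k) = P(X <= x), via a power series for the regularized incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = k / 2, x / 2
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-16:       # stop once new terms no longer matter
        term *= z / (a + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(a * math.log(z) - z - math.lgamma(a)))

def chi2_sf(x, k):
    """Right-tail area P(X >= x) = 1 - F(x; k). Loses precision for very small tails."""
    return 1.0 - chi2_cdf(x, k)

def chi2_upper_critical(alpha, k):
    """Find c with P(X >= c) = alpha by bisection, for 0 < alpha < 1."""
    lo, hi = 0.0, max(1.0, float(k))
    while chi2_sf(hi, k) > alpha:     # widen the bracket until the tail is small enough
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf(mid, k) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# With k = 2 the right tail is exactly exp(-x/2), which gives a closed-form check.
print(chi2_sf(3.0, 2), math.exp(-1.5))
print(chi2_upper_critical(0.05, 1))   # printed tables list about 3.841 for df = 1
```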
Properties that help you check your work
- Mean: E[X] = k
- Variance: Var(X) = 2k
- Mode (for k≥2): k − 2
- Additivity: independent χ² values add by adding their df
That last bullet is the reason sums of squared pieces keep landing back on a chi-square curve.
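The first three properties above can be checked straight from the pdf. The sketch below integrates with a plain midpoint rule and finds the mode by grid search. The helper names, k = 6, the integration limit, and the step count are all illustrative assumptions.

```python
import math

def chi2_pdf(x, k):
    """Chi-square density for x > 0, computed on the log scale."""
    return math.exp((k / 2 - 1) * math.log(x) - x / 2
                    - (k / 2) * math.log(2) - math.lgamma(k / 2))

def moments_and_mode(k, upper=100.0, steps=100_000):
    """Midpoint-rule estimates of mean and variance, plus a grid search for the mode."""
    h = upper / steps
    mean = second = 0.0
    best_x, best_f = 0.0, -1.0
    for i in range(steps):
        x = (i + 0.5) * h             # midpoint of the i-th slice
        f = chi2_pdf(x, k)
        mean += x * f * h
        second += x * x * f * h
        if f > best_f:                # track the highest point of the curve
            best_x, best_f = x, f
    return mean, second - mean ** 2, best_x

k = 6
mean, var, mode = moments_and_mode(k)
print(f"mean     ~ {mean:.4f}  (k = {k})")
print(f"variance ~ {var:.4f}  (2k = {2 * k})")
print(f"mode     ~ {mode:.4f}  (k - 2 = {k - 2})")
```

The upper limit of 100 is far enough out for k = 6 that the truncated tail is negligible; a much larger k would need a wider range.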
Where the formula gets used
Two families of tasks cover most real uses: variance inference for normal samples and χ² statistics built from count tables.
Normal-sample variance
If a sample is normal with variance σ², the scaled sample variance follows:
(n−1)S² / σ² ~ χ²(n−1)
This relationship powers confidence intervals for σ² and σ, and it explains why df is n−1 in that setting.
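Here is how that relationship turns into an interval for σ². Rearranging c1 ≤ (n−1)S²/σ² ≤ c2 gives (n−1)S²/c2 ≤ σ² ≤ (n−1)S²/c1, where c1 and c2 are the α/2 and 1−α/2 quantiles of χ²(n−1). The sketch below is standard-library Python with hypothetical helper names and made-up data, and it assumes the sample really is normal.

```python
import math
import statistics

def chi2_cdf(x, k):
    """F(x; k) via a power series for the regularized incomplete gamma (moderate x)."""
    if x <= 0:
        return 0.0
    a, z = k / 2, x / 2
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-16:
        term *= z / (a + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(a * math.log(z) - z - math.lgamma(a)))

def chi2_quantile(p, k):
    """Inverse cdf by bisection: the c with F(c; k) = p, for 0 < p < 1."""
    lo, hi = 0.0, max(1.0, float(k))
    while chi2_cdf(hi, k) < p:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def variance_interval(sample, alpha=0.05):
    """Two-sided interval for sigma^2, assuming a normal sample."""
    df = len(sample) - 1
    s2 = statistics.variance(sample)          # sample variance, n - 1 divisor
    c1 = chi2_quantile(alpha / 2, df)         # lower cut point
    c2 = chi2_quantile(1 - alpha / 2, df)     # upper cut point
    return df * s2 / c2, df * s2 / c1

# Made-up measurements, for illustration only.
sample = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5, 10.0, 9.7, 10.3]
low, high = variance_interval(sample)
print(f"S^2 = {statistics.variance(sample):.4f}")
print(f"95% interval for sigma^2: [{low:.4f}, {high:.4f}]")
print(f"95% interval for sigma:   [{math.sqrt(low):.4f}, {math.sqrt(high):.4f}]")
```

The larger quantile lands in the denominator of the lower limit, which is the step that most often gets flipped.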
Counts, goodness-of-fit, and independence
For categorical data, the statistic is a sum of squared deviations between observed and expected counts, each divided by the expected count:
χ² = Σ (O − E)² / E
Degrees of freedom comes from how many independent cells remain after totals and any fitted parameters are fixed. Penn State’s chi-square test notes give a straight explanation of that df idea.
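A short sketch of the goodness-of-fit case: compute the statistic from observed and expected counts, then count df as cells minus 1 for the fixed total, minus one more for each parameter estimated from the data. The function names, the `n_fitted_params` argument, and the die-roll counts are all hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over matching cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of cells")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def goodness_of_fit_df(n_cells, n_fitted_params=0):
    """Cells, minus 1 for the fixed total, minus each parameter fitted from the data."""
    return n_cells - 1 - n_fitted_params

# Made-up example: 120 rolls of a die, tested against a fair-die model.
observed = [18, 22, 16, 25, 19, 20]
expected = [sum(observed) / 6] * 6        # 20 per face under the fair-die model

stat = chi_square_statistic(observed, expected)
df = goodness_of_fit_df(len(observed))    # no parameters estimated, so df = 5
print(f"chi-square = {stat:.2f}, df = {df}")
```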
Reference table for common tasks
The same few calculations repeat across textbooks and software output. This table links each task to the quantity you need and the inputs you’ll pass in.
| Task | What You Need | Typical Inputs |
|---|---|---|
| Right-tail p-value | P(X ≥ x) | x, df |
| Left-tail probability | P(X ≤ x) | x, df |
| Upper critical value | c where P(X ≥ c)=α | α, df |
| Lower critical value | c where P(X ≤ c)=α | α, df |
| Two-sided variance interval | [c1, c2] with tails α/2 | α, df |
| Goodness-of-fit statistic | Σ (O−E)²/E | O cells, E cells |
| Independence in r×c table | df=(r−1)(c−1) | r, c |
| Normal-sample variance link | (n−1)S²/σ² | n, S², σ² |
Mini walkthrough: a chi-square test from start to finish
This is the clean path for a standard independence test on an r×c contingency table.
- Compute each expected count with E = (row total × column total) / n.
- Compute χ² = Σ (O − E)² / E across all cells.
- Set df = (r−1)(c−1).
- Get the right-tail p-value: p = 1 − F(χ²; df).
- Read the result in context of the data collection and any planned follow-ups.
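The steps above can be sketched in a few lines of standard-library Python. The function names and the counts are made up for illustration. To stay self-contained, the p-value helper covers only df = 1 (a 2×2 table), where X = Z² gives the exact tail erfc(√(x/2)); other df values need a general χ² right-tail function. The sketch applies no continuity correction and assumes every expected count is positive.

```python
import math

def independence_test(table):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Step 1: expected count = (row total * column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Step 2: sum (O - E)^2 / E across all cells
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    # Step 3: df = (r - 1)(c - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

def right_tail_df1(x):
    """P(X >= x) for df = 1 only: P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2))

# Made-up 2 x 2 counts, for illustration only.
table = [[30, 20],
         [20, 30]]
chi2, df, expected = independence_test(table)

# Step 4: right-tail p-value (valid here because df == 1)
p = right_tail_df1(chi2)
print(f"expected counts = {expected}")
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.4f}")
```

For this table every expected count is 25, so the statistic is 4 and the p-value matches the familiar two-sided normal tail at |z| = 2.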
If you ever see a negative χ², stop. A squared-difference sum can’t go below zero, so a math or data step is off.
Table of report-style quantiles and software prompts
Reports often cite cut points at α = 0.10, 0.05, 0.01. Values depend on df, so this table sticks to the form and the matching software call.
| Report Item | Math Form | Software Prompt |
|---|---|---|
| Upper 0.05 cut point | c = F⁻¹(0.95; k) | qchisq(0.95, df=k) or chi2.ppf(0.95, k) |
| Upper 0.01 cut point | c = F⁻¹(0.99; k) | qchisq(0.99, df=k) or chi2.ppf(0.99, k) |
| Two-sided variance interval cuts | c1=F⁻¹(α/2; k), c2=F⁻¹(1−α/2; k) | qchisq(c(α/2, 1−α/2), df=k) |
| Right-tail p-value | p = 1 − F(x; k) | 1 − pchisq(x, df=k) or chi2.sf(x, k) |
| Left-tail probability | p = F(x; k) | pchisq(x, df=k) or chi2.cdf(x, k) |
Fast checks that prevent common mistakes
- Df: n−1 for normal-sample variance. (r−1)(c−1) for a plain r×c table.
- Tail: standard χ² tests use the right tail.
- Scale: χ² values are non-negative.
- Sense check: x near df usually means a middling right-tail area, not a tiny one.
If those checks pass, your formula, your df, and your software output should line up.
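The sense check about x near df is easy to see with numbers. For even k the right tail has a closed form, P(X ≥ x) = e^(−x/2) Σ (x/2)^j / j! for j from 0 to k/2 − 1, so no library is needed. The sketch below assumes even k only, and the function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, k):
    """P(X >= x) for positive even integer k only, via the finite closed-form sum."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this closed form needs a positive even k")
    half_x = x / 2
    return math.exp(-half_x) * sum(half_x ** j / math.factorial(j)
                                   for j in range(k // 2))

# Right-tail area when the statistic equals its own df.
tail_at_df = {k: chi2_sf_even_df(k, k) for k in (2, 4, 10, 30)}
for k, p in tail_at_df.items():
    print(f"df = {k:2d}: P(X >= df) = {p:.3f}")
```

Each printed area sits between roughly 0.37 and 0.5, so a tiny p-value reported for a statistic close to its df is a sign to recheck the df or the tail.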
References & Sources
- NIST/SEMATECH. “Chi-Square Distribution.” Provides the pdf and definition from sums of squared standard normals.
- SciPy. “scipy.stats.chi2.” Shows methods for pdf, cdf, survival function, and inverse cdf for χ².
- R Project (R Manual at ETH Zürich mirror). “The (non-central) Chi-Squared Distribution.” Documents dchisq, pchisq, qchisq, and related definitions used for χ² calculations in R.
- Penn State. “Chi-Square Test – Normal Distribution.” Explains degrees of freedom and where chi-square tests show up with count data.