Chi Squared Distribution Formula | Get It Right Every Time

The chi-square density is x^(k/2−1)e^(−x/2) divided by 2^(k/2)Γ(k/2), for x≥0 with k degrees of freedom.

Chi-square shows up whenever a statistic is built from squared z-scores. That’s why it keeps turning up in variance work, contingency tables, and goodness-of-fit checks.

This article gives the formula, names every symbol, and shows how to turn it into numbers you can report.

What the distribution is and where it comes from

Start with k independent standard normal variables: Z₁, …, Zₖ. Square each one and add them up:

X = Z₁² + Z₂² + … + Zₖ²

That sum is a chi-square random variable with k degrees of freedom. Degrees of freedom is the count of squared standard normals in the sum. It’s the single parameter that sets the curve.
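
To make the construction concrete, here is a minimal simulation sketch using only Python's standard library. The function name sample_chi_square, the seed, the choice k = 3, and the number of draws are all illustrative assumptions, not part of any library.

```python
import random
import statistics

def sample_chi_square(k, rng):
    """Draw one chi-square(k) value as a sum of k squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(12345)  # fixed seed so the run is repeatable
k = 3                       # illustrative degrees of freedom
draws = [sample_chi_square(k, rng) for _ in range(20000)]

print("smallest draw:", min(draws))             # a sum of squares is never negative
print("sample mean:", statistics.fmean(draws))  # should land near k
```

The sample mean settling near k matches the mean listed under the properties further down.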

Small k piles the mass near 0 and leaves a long right tail. For k ≤ 2 the density is highest at 0 and only falls from there. Bigger k shifts the mass right and makes the curve more symmetric.

Chi Squared Distribution Formula with degrees of freedom

For k>0 and x≥0, the probability density function (pdf) is:

f(x; k) = [1 / (2^(k/2) Γ(k/2))] · x^(k/2 − 1) · e^(−x/2)

For x<0, the density is 0. If you want a published cross-check, the NIST Engineering Statistics Handbook lists the same pdf form.
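
As a sketch of how the formula translates into code, the function below evaluates the pdf on the log scale with math.lgamma, which avoids overflow in Γ(k/2) for large k. The name chi2_pdf and the boundary handling at x = 0 are choices made here for illustration.

```python
import math

def chi2_pdf(x, k):
    """Chi-square density f(x; k) for k > 0; returns 0 for x < 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    if x < 0:
        return 0.0
    if x == 0:
        # Boundary: infinite for k < 2, exactly 1/2 for k == 2, otherwise 0.
        if k < 2:
            return math.inf
        return 0.5 if k == 2 else 0.0
    half_k = k / 2.0
    # log f = (k/2 - 1) log x - x/2 - (k/2) log 2 - log Gamma(k/2)
    log_f = ((half_k - 1.0) * math.log(x) - x / 2.0
             - half_k * math.log(2.0) - math.lgamma(half_k))
    return math.exp(log_f)

print(chi2_pdf(2.0, 2))  # equals 0.5 * exp(-1), since k = 2 is an exponential curve
```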

What each symbol means

  • x: a non-negative value where you evaluate the density.
  • k: degrees of freedom (often written as ν or df).
  • Γ(k/2): the gamma function at k/2.
  • 2^(k/2): part of the normalizing constant.
  • e^(−x/2): the exponential tail term.

Why Γ(k/2) appears

The chi-square distribution is a gamma distribution with shape k/2 and scale 2. That link is why the normalizing constant uses Γ(k/2). You won’t compute Γ by hand in routine work, yet knowing it’s a gamma case helps you spot missing powers of 2 or a flipped exponent.
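
The substitution can be written out in one line. The symbols a (shape) and θ (scale) are notation assumed here for the general gamma density; they do not appear elsewhere in this article.

```latex
g(x; a, \theta) = \frac{x^{a-1} e^{-x/\theta}}{\Gamma(a)\,\theta^{a}}
\quad\xrightarrow{\;a = k/2,\ \theta = 2\;}\quad
f(x; k) = \frac{x^{k/2-1} e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}
```

If your version of the formula has 2^k instead of 2^(k/2), or e^(−2x) instead of e^(−x/2), this substitution is where the slip shows up.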

How to get probabilities and cut points

The pdf gives heights, not areas. Probabilities come from the cumulative distribution function F(x; k), which is the integral of the pdf from 0 to x.

Right-tail areas are common in testing: P(X ≥ x) = 1 − F(x; k). Cut points flip that around: find c so that P(X ≥ c) = α, which means c = F⁻¹(1 − α; k). That c is the upper-α critical value.
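
Here is a rough standard-library sketch of those two steps. It computes F(x; k) from a series for the regularized lower incomplete gamma function and finds the cut point by bisection. The names chi2_cdf, chi2_sf, and chi2_upper_critical are made up for this example, and the series is only one of several possible methods: it is fine for moderate x but slow for large x, and 1 − F loses precision for very small tails. Library routines are the safer choice for real reports.

```python
import math

def chi2_cdf(x, k):
    """F(x; k) via a series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, h = k / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of sum_n h^n / (a (a+1) ... (a+n))
    total = term
    n = 1
    while term > 1e-16 * total and n < 100000:
        term *= h / (a + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(-h + a * math.log(h) - math.lgamma(a)))

def chi2_sf(x, k):
    """Right-tail area P(X >= x) = 1 - F(x; k)."""
    return 1.0 - chi2_cdf(x, k)

def chi2_upper_critical(alpha, k):
    """Find c with P(X >= c) = alpha by bisection on the right-tail area."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, k) > alpha:   # widen the bracket until the tail is small enough
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, k) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(chi2_upper_critical(0.05, 2))  # for k = 2 this should match -2 * ln(0.05)
```

The k = 2 case is a handy self-check because the right-tail area reduces to e^(−x/2) there.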

For computation, Python’s SciPy chi2 documentation covers the functions for pdf, cdf, survival function (right tail), and inverse cdf. In R, the manual page for dchisq, pchisq, and qchisq covers the same tasks.

Properties that help you check your work

  • Mean: E[X] = k
  • Variance: Var(X) = 2k
  • Mode (for k≥2): k − 2
  • Additivity: independent χ² values add by adding their df

That last bullet is the reason sums of squared pieces keep landing back on a chi-square curve.
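
If you want to see the first three properties fall out of the pdf itself, a crude numerical check is enough. The sketch below uses a midpoint rule and a grid search. The helper names, the integration limits, and the step sizes are arbitrary choices for illustration, picked to be adequate for k = 5.

```python
import math

def chi2_pdf(x, k):
    """Chi-square density for x > 0, computed on the log scale."""
    half_k = k / 2.0
    return math.exp((half_k - 1.0) * math.log(x) - x / 2.0
                    - half_k * math.log(2.0) - math.lgamma(half_k))

def numeric_moments(k, upper=100.0, steps=100000):
    """Midpoint-rule estimates of the mean and variance of chi-square(k)."""
    h = upper / steps
    m1 = m2 = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        w = chi2_pdf(x, k) * h   # probability mass in this small slice
        m1 += x * w
        m2 += x * x * w
    return m1, m2 - m1 * m1

def numeric_mode(k, upper=50.0, step=0.01):
    """Grid point where the density is largest."""
    grid = [step * i for i in range(1, int(upper / step) + 1)]
    return max(grid, key=lambda x: chi2_pdf(x, k))

k = 5
mean, var = numeric_moments(k)
mode = numeric_mode(k)
print(mean, var, mode)  # compare with k, 2k, and k - 2
```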

Where the formula gets used

Two families of tasks cover most real uses: variance inference for normal samples and χ² statistics built from count tables.

Normal-sample variance

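If X₁, …, Xₙ are independent draws from a normal distribution with variance σ², and S² is the sample variance computed with the n−1 divisor, the scaled sample variance follows: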

(n−1)S² / σ² ~ χ²(n−1)

This relationship powers confidence intervals for σ² and σ, and it explains why df is n−1 in that setting.
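
The interval comes from trapping the scaled variance between two quantiles and solving for σ². In the sketch below, χ²_{p, n−1} stands for the lower-tail quantile F⁻¹(p; n−1), a notation assumed here for compactness.

```latex
P\!\left( \chi^2_{\alpha/2,\,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{1-\alpha/2,\,n-1} \right) = 1 - \alpha
\quad\Longrightarrow\quad
\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}} \;\le\; \sigma^2 \;\le\; \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}}
```

Note that the larger quantile lands in the lower limit. Taking square roots of both limits gives the matching interval for σ.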

Counts, goodness-of-fit, and independence

For categorical data, the statistic is a sum of squared deviations between observed and expected counts, each divided by the expected count:

χ² = Σ (O − E)² / E

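Degrees of freedom count how many cells remain free to vary after totals and any fitted parameters are fixed. For a goodness-of-fit test with m categories and p parameters estimated from the data, that gives df = m − 1 − p. Penn State’s chi-square test notes give a straightforward explanation of that df idea.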
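
A small goodness-of-fit example shows the arithmetic. The die-roll counts below are invented for illustration, and chi_square_statistic is a hypothetical helper, not a library function. No parameters are estimated from the data, so df is the number of categories minus 1.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over matching cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 16, 25, 19, 20]       # made-up counts from 120 rolls of a die
n = sum(observed)
expected = [n / 6] * 6                    # fair die: equal expected count per face
stat = chi_square_statistic(observed, expected)
df = len(observed) - 1                    # one constraint: the counts sum to n

print(stat, df)  # statistic is about 2.5 on 5 degrees of freedom
```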

Reference table for common tasks

The same few calculations repeat across textbooks and software output. This table links each task to the quantity you need and the inputs you’ll pass in.

| Task | What You Need | Typical Inputs |
|---|---|---|
| Right-tail p-value | P(X ≥ x) | x, df |
| Left-tail probability | P(X ≤ x) | x, df |
| Upper critical value | c where P(X ≥ c)=α | α, df |
| Lower critical value | c where P(X ≤ c)=α | α, df |
| Two-sided variance interval | [c1, c2] with tails α/2 | α, df |
| Goodness-of-fit statistic | Σ (O−E)²/E | O cells, E cells |
| Independence in r×c table | df=(r−1)(c−1) | r, c |
| Normal-sample variance link | (n−1)S²/σ² | n, S², σ² |

Mini walkthrough: a chi-square test from start to finish

This is the clean path for a standard independence test on an r×c contingency table.

  1. Compute each expected count with E = (row total × column total) / n.
  2. Compute χ² = Σ (O − E)² / E across all cells.
  3. Set df = (r−1)(c−1).
  4. Get the right-tail p-value: p = 1 − F(χ²; df).
  5. Compare p with the significance level you chose in advance, then read the result in the context of how the data were collected and any planned follow-ups.

If you ever see a negative χ², stop. A squared-difference sum can’t go below zero, so a math or data step is off.
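
The numbered steps above can be sketched end to end with the standard library. The 2×3 table is invented, independence_test is a hypothetical helper, and the right-tail function uses a simple series that is adequate for moderate statistics but is not a substitute for a library routine.

```python
import math

def chi2_sf(x, k):
    """Right-tail area 1 - F(x; k) from a series for the incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, h = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total and n < 100000:
        term *= h / (a + n)
        total += term
        n += 1
    cdf = total * math.exp(-h + a * math.log(h) - math.lgamma(a))
    return max(0.0, 1.0 - cdf)

def independence_test(table):
    """Return expected counts, chi-square statistic, df, and right-tail p-value."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Step 1: E = (row total * column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Step 2: sum (O - E)^2 / E over every cell
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    # Step 3: df = (r - 1)(c - 1)
    df = (len(table) - 1) * (len(col_totals) - 1)
    # Step 4: right-tail p-value
    return expected, stat, df, chi2_sf(stat, df)

table = [[20, 30, 50],
         [30, 30, 40]]   # made-up counts: 2 groups by 3 outcomes
expected, stat, df, p = independence_test(table)
print(expected)
print(stat, df, p)
```

With df = 2 the right tail is e^(−χ²/2), which gives a quick hand check on the printed p-value.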

Table of report-style quantiles and software prompts

Reports often cite cut points at α = 0.10, 0.05, 0.01. Values depend on df, so this table sticks to the form and the matching software call.

| Report Item | Math Form | Software Prompt |
|---|---|---|
| Upper 0.05 cut point | c = F⁻¹(0.95; k) | qchisq(0.95, df=k) or chi2.ppf(0.95, k) |
| Upper 0.01 cut point | c = F⁻¹(0.99; k) | qchisq(0.99, df=k) or chi2.ppf(0.99, k) |
| Two-sided variance interval cuts | c1 = F⁻¹(α/2; k), c2 = F⁻¹(1−α/2; k) | qchisq(c(α/2, 1−α/2), df=k) |
| Right-tail p-value | p = 1 − F(x; k) | 1 − pchisq(x, df=k) or chi2.sf(x, k) |
| Left-tail probability | p = F(x; k) | pchisq(x, df=k) or chi2.cdf(x, k) |

Fast checks that prevent common mistakes

  • Df: n−1 for normal-sample variance. (r−1)(c−1) for a plain r×c table.
  • Tail: standard χ² tests use the right tail.
  • Scale: χ² values are non-negative.
  • Sense check: the mean equals df, so x near df usually means a middling right-tail area, not a tiny one.

If those checks pass, your formula, your df, and your software output should line up.

References & Sources