Chi Squared Test Equation | Make Counts Make Sense

The chi-square formula compares observed and expected counts, then sums each squared gap divided by expectation.

The chi-square test turns a messy set of category counts into one statistic: χ². That number tells you how far your observed counts sit from the counts you would expect if the null claim were true.

Use this test for counts in categories, not averages, scores, or measurements. It fits questions about colors picked by customers, survey choices by age group, pass/fail counts by training type, or whether a die acts fair across six faces.

What The Equation Measures

The Formula In Plain Terms

The equation is:

χ² = ∑ (O − E)² / E

O means observed count. E means expected count. The symbol ∑ means you add the contribution from each category or each cell in a table.

The logic is direct. Subtract expected from observed, square the gap so negative and positive gaps both count, divide by expected so large categories do not drown out small ones, then add the pieces.
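As a minimal sketch, the steps above can be written as a few lines of Python. The function name chi_square_statistic and the sample counts are illustrative assumptions, not part of any library.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        if e <= 0:
            raise ValueError("expected counts must be positive")
        gap = o - e                # subtract expected from observed
        total += gap * gap / e     # square the gap, then scale by expected
    return total


# Hypothetical counts for three categories with an even expected split.
observed = [18, 22, 20]
expected = [20, 20, 20]
print(chi_square_statistic(observed, expected))
```

The function refuses zero or negative expected counts, since the division would be meaningless for a category the null claim says can never occur.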

Why Squaring The Gap Matters

Squaring changes how the test treats misses. If two cells have the same expected count, a miss of 10 adds four times as much as a miss of 5. The test reacts more strongly when the table has one or two large gaps instead of many tiny ones.

The division by expected count matters too. A gap of 10 against an expected count of 20 is much larger in relative terms than a gap of 10 against an expected count of 200. That adjustment keeps the statistic fair across small and large categories.
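The two effects can be checked with a single-cell calculation. This is only a sketch: cell_contribution is a hypothetical helper, and the counts are made up to match the numbers in the paragraphs above.

```python
def cell_contribution(observed, expected):
    """One cell's share of the statistic: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


# Same expected count (50): a miss of 10 versus a miss of 5.
miss_10 = cell_contribution(60, 50)   # 100 / 50
miss_5 = cell_contribution(55, 50)    # 25 / 50
print(miss_10 / miss_5)               # 4.0

# Same gap of 10 against different expected counts.
small_cell = cell_contribution(30, 20)     # 100 / 20
large_cell = cell_contribution(210, 200)   # 100 / 200
print(small_cell, large_cell)              # 5.0 0.5
```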

A tiny χ² value means the observed counts sit close to the null claim. A larger value means the gaps are harder to explain as random sampling noise alone. The test does not prove the null claim wrong. It gives a probability-based reason to reject it or not reject it.

When The Test Fits Your Data

Before using the equation, check the shape of your data. The test works with frequencies, not percentages on their own. If you only have percentages, you need the counts behind them.

Use a chi-square test when you have:

One categorical variable compared with a claimed split, called a goodness-of-fit test.
Two categorical variables in a table, called a test of independence or association.
Separate groups compared across the same response categories, often called a homogeneity test.

The counts also need to come from a sensible sampling process. One person should not appear in two cells unless the design was built for paired data, which this equation does not handle. Expected counts should not be tiny either, because the p-value comes from a chi-square curve that only approximates the behavior of sparse tables. A common rule of thumb is an expected count of at least 5 in each cell.

For two-way tables, Penn State’s chi-square test statistic notes show the expected count rule: row total times column total, divided by the full sample size.
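The expected count rule for a two-way table can be sketched as follows. The function name expected_counts and the 2x3 table are assumptions made for illustration.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2x3 table: rows are two groups, columns are three answers.
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
for row in expected_counts(observed):
    print(row)   # [25.0, 30.0, 45.0] for each row
```

Both rows get the same expected counts here only because the two hypothetical groups have the same size; unequal row totals would scale each row differently.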

Chi Squared Test Equation With Clean Inputs

Good results start before any math happens. Build the table with raw counts, label each category, and write the null claim in plain language. If the null claim says all categories are equal, split the total count evenly. If it gives named proportions, multiply the total count by each proportion.
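A small sketch of both ways to build expected counts from the null claim is shown here. The helper expected_from_null and its parameter names are hypothetical, and the proportions are invented for the example.

```python
def expected_from_null(total, proportions=None, categories=None):
    """Expected counts from the null claim.

    Pass `categories` for an even split, or `proportions` for a named split.
    """
    if proportions is None:
        if not categories:
            raise ValueError("give either proportions or a category count")
        return [total / categories] * categories
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [total * p for p in proportions]   # keep decimals, do not round


print(expected_from_null(100, categories=4))
print(expected_from_null(200, proportions=[0.5, 0.3, 0.2]))
```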

For a goodness-of-fit test, NIST’s goodness-of-fit test page explains how categories, fitted parameters, and degrees of freedom work together. That detail matters when you estimate values from the same data before running the test.

Item To Check	What To Enter	Why It Matters
Observed Count	The count you actually saw in each category	This is the raw evidence used by the equation
Expected Count	The count predicted by the null claim	Each gap is measured against this value
Cell Gap	Observed minus expected	Shows direction before the gap is squared
Squared Gap	The cell gap multiplied by itself	Makes large misses count more than small misses
Cell Contribution	Squared gap divided by expected count	Shows which category drives the statistic
Degrees Of Freedom	Category count minus constraints	Sets the curve used for the p-value
P-Value	Probability, under the null claim, of a χ² at least this large on the matching chi-square curve	Helps decide whether the gap is too large for the null claim
Alpha Level	Your chosen cutoff, often 0.05	Sets the rule before you read the result

Worked Count Example

Say 100 shoppers choose among four package colors, and the claim says all colors should be chosen evenly. The expected count is 25 for each color. The observed counts are 35, 25, 20, and 20.

The contributions are 4, 0, 1, and 1. Add them and the statistic is χ² = 6. With four categories, the degrees of freedom are 3. A calculator gives a p-value near 0.11, so at a 0.05 cutoff, the sample does not give enough reason to reject equal preference.

That wording is careful. It does not say that buyers in general prefer the colors equally. It says this sample did not sit far enough from equal counts to reject the claim under the test rule.
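The shopper example can be reproduced with the standard library alone. In practice a statistics library such as SciPy (scipy.stats.chi2.sf) would normally supply the p-value; the hand-rolled chi_square_sf below is a hypothetical helper, limited to whole-number degrees of freedom, and is meant as a sketch, not a reference implementation.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square curve, integer df.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / gamma(a + 1)
    on the regularized upper incomplete gamma function, with z = x / 2.
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z) = exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z) = erfc(sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


observed = [35, 25, 20, 20]
expected = [sum(observed) / len(observed)] * len(observed)   # 25 each

contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)
df = len(observed) - 1
p_value = chi_square_sf(statistic, df)

print(contributions)        # [4.0, 0.0, 1.0, 1.0]
print(statistic, df)        # 6.0 3
print(round(p_value, 2))    # 0.11
```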

Degrees Of Freedom And Expected Counts

Degrees of freedom tell the test which chi-square curve to use. In a goodness-of-fit test with no fitted parameters, use the number of categories minus one, and subtract one more for each parameter estimated from the same data. With a two-way table, multiply (rows − 1) by (columns − 1).
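The two degrees-of-freedom rules are short enough to sketch directly. The function names are illustrative, and fitted_parameters is an assumed name for the count of values estimated from the same data.

```python
def goodness_of_fit_df(categories, fitted_parameters=0):
    """Categories minus one, minus any parameters estimated from the data."""
    df = categories - 1 - fitted_parameters
    if df < 1:
        raise ValueError("not enough categories for this test")
    return df


def two_way_table_df(rows, columns):
    """(rows - 1) * (columns - 1) for a two-way table."""
    return (rows - 1) * (columns - 1)


print(goodness_of_fit_df(4))     # 3
print(two_way_table_df(2, 3))    # 2
```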

Expected counts deserve close care. If several expected cells are small, combine sensible categories or choose another method. The issue is not the equation itself. The issue is that the p-value comes from a curve that may not fit sparse data well.
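A quick check for sparse cells might look like the sketch below. The cutoff of 5 is a widely used rule of thumb and an assumption here, not a value taken from the cited sources; the helper name and the null split are also made up.

```python
def small_expected_cells(expected, minimum=5.0):
    """Return (index, value) for each expected count below `minimum`."""
    return [(i, e) for i, e in enumerate(expected) if e < minimum]


# Hypothetical null split of 60 responses across five categories.
expected = [60 * p for p in (0.40, 0.30, 0.20, 0.05, 0.05)]
flagged = small_expected_cells(expected)
print(flagged)
if flagged:
    print("Consider merging categories or using an exact method.")
```

Here the last two categories each expect about 3 responses, so both are flagged.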

Result Pattern	Plain Reading	Next Step
Small χ², large p-value	Observed counts are close to expected counts	Do not reject the null claim
Large χ², small p-value	Observed counts sit far from expected counts	Reject the null claim under your cutoff
One large cell contribution	One category drives much of the statistic	Review that category and the data entry
Many small expected counts	The p-value may be shaky	Merge categories or use an exact method
Low p-value with huge sample	A tiny real-world gap can test as non-random	Add effect size and plain meaning

Common Mistakes That Break The Calculation

The math is short, but bad inputs can ruin it. Most errors come from mixing data types, rounding too early, or reading the p-value as a measure of size.

Using percentages alone: Convert back to counts before calculating.
Rounding expected counts early: Keep decimals through the calculation.
Testing averages: Use another test for means or measured values.
Changing alpha after seeing results: Pick the cutoff before the test.
Calling non-rejection proof: A large p-value is not proof that the null claim is true.
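Two of these mistakes are easy to see in numbers. The sketch below uses invented counts: the first part shows how rounding expected counts early shifts the statistic, and the second shows why percentages alone are not enough, since the same split gives a different statistic at each sample size.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Rounding expected counts early: 100 responses, three equal categories.
observed = [40, 30, 30]
exact = [100 / 3] * 3      # 33.33... kept as decimals
rounded = [33, 33, 33]     # rounded too soon, no longer sums to 100
print(chi_square_statistic(observed, exact))     # about 2.0
print(chi_square_statistic(observed, rounded))   # about 2.03

# Percentages alone: the same 40/30/30 split at two sample sizes.
for n in (100, 1000):
    counts = [0.40 * n, 0.30 * n, 0.30 * n]
    print(n, chi_square_statistic(counts, [n / 3] * 3))
```

The second loop gives a statistic near 2 for n = 100 and near 20 for n = 1000, even though the percentages never change.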

How To Report The Result

A clean report gives the test type, sample size, degrees of freedom, statistic, p-value, and plain reading. Use a short line like this:

Chi-square test of independence, n = 240, χ²(2) = 8.41, p = 0.015. The counts differ by group under the 0.05 cutoff.
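If the same report line is needed for many tables, a small formatter keeps the wording consistent. This is a sketch only: format_report and its parameters are hypothetical, and the verdict sentence is one possible phrasing.

```python
def format_report(test_name, n, df, statistic, p_value, alpha=0.05):
    """Build a one-line report with a plain verdict at the chosen cutoff."""
    line = (f"{test_name}, n = {n}, χ²({df}) = {statistic:.2f}, "
            f"p = {p_value:.3f}.")
    verdict = "Reject" if p_value < alpha else "Do not reject"
    return f"{line} {verdict} the null claim at the {alpha} cutoff."


report = format_report("Chi-square test of independence", 240, 2, 8.41, 0.015)
print(report)
```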

If the test is for a business or class project, add the largest cell contribution. That tells readers where the gap came from, not just whether the full table crossed a cutoff. For a two-way table, a small effect-size note can also help, since a huge sample can make a tiny difference test as non-random.
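Both additions can be sketched in a few lines. Cramér's V is used here as one common effect size for two-way tables; the source does not name a specific measure, so that choice is an assumption, as are the function names and the 2x3 table.

```python
import math


def largest_contribution(observed, expected):
    """Return (row, column, value) for the cell with the largest (O - E)^2 / E."""
    best = None
    for i, (obs_row, exp_row) in enumerate(zip(observed, expected)):
        for j, (o, e) in enumerate(zip(obs_row, exp_row)):
            value = (o - e) ** 2 / e
            if best is None or value > best[2]:
                best = (i, j, value)
    return best


def cramers_v(statistic, n, rows, columns):
    """Cramér's V: sqrt(chi-square / (n * min(rows - 1, columns - 1)))."""
    return math.sqrt(statistic / (n * min(rows - 1, columns - 1)))


# Hypothetical 2x3 table with expected counts from the row/column totals.
observed = [[10, 25, 25], [40, 25, 75]]
expected = [[15.0, 15.0, 30.0], [35.0, 35.0, 70.0]]
print(largest_contribution(observed, expected))   # cell (0, 1) drives the gap

# Effect size for the report line above, assuming a 2x3 table (df = 2).
print(round(cramers_v(8.41, 240, 2, 3), 2))       # 0.19
```

A V near 0.19 would usually be read as a modest association, which is the kind of plain-meaning note that keeps a small p-value in proportion.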

Use The Equation With Care

The chi-square equation is useful because it turns category gaps into a single statistic, but the setup decides whether the result means anything. Start with counts, build expected values from the null claim, check degrees of freedom, then read the p-value with modest wording.

When the statistic is large, the table is telling you that observed counts and expected counts are far apart. When it is small, the table is telling you that the sample counts sit within the range chance alone would produce if the null claim were true. Either way, the equation works best when the data are clean and the claim is stated before the math begins.

References & Sources

Penn State Statistics Online. “Chi-Square Tests.” Used for the chi-square statistic and expected count rule for two-way tables.
National Institute Of Standards And Technology. “Chi-Square Goodness-Of-Fit Test.” Used for goodness-of-fit setup, category handling, and degrees-of-freedom notes.