Mean vs Median

The mean and median both describe the "center" of a dataset, but they measure it differently and can tell different stories. Choosing the wrong one can badly misrepresent your data, and the wrong choice shows up regularly in news, business, and research.

1 What Each One Measures

Mean (arithmetic average): add all values and divide by the count. If five people earn $30k, $35k, $40k, $45k, and $50k, the mean is ($30k+$35k+$40k+$45k+$50k)/5 = $40k.

Median: the middle value when data is sorted. For the same five incomes sorted ($30k, $35k, $40k, $45k, $50k), the median is $40k — the middle value. With an even number of values, the median is the mean of the two middle values.
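The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `median_of` is a hypothetical helper written here to expose the sorting logic, and the four-value list is an assumed variant of the example used only to show the even-count case.

```python
import statistics

def median_of(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

incomes = [30, 35, 40, 45, 50]  # thousands of dollars, from the example above

mean_income = sum(incomes) / len(incomes)  # add all values, divide by the count
median_income = median_of(incomes)         # middle of the five sorted values

# Hypothetical even-sized variant: drop the top income, leaving four values.
even_median = median_of(incomes[:4])       # mean of the two middle values, 35 and 40

# The hand-written helper agrees with the standard library.
assert median_income == statistics.median(incomes)
assert even_median == statistics.median(incomes[:4])
print(mean_income, median_income, even_median)
```

In practice `statistics.mean` and `statistics.median` from the standard library do the same job; the helper is spelled out only to make the "sort, then take the middle" step visible.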

Same result, different logic

In a perfectly symmetric distribution with a single peak, mean = median = mode. In real data, they almost always differ. That difference tells you something important about the shape of your data.

2 How Outliers Affect Them Differently

The mean is sensitive to outliers. The median is resistant to them. This is the single most important difference.

Outlier Effect
Five employees earn $30k, $35k, $40k, $45k, and $200k (the CEO).
1. Mean = ($30k+$35k+$40k+$45k+$200k) / 5 = $350k / 5 = $70k
2. Median: sorted = $30k, $35k, $40k, $45k, $200k → middle value = $40k
3. The mean ($70k) is higher than 4 out of 5 salaries
4. The median ($40k) represents what a typical employee actually earns
Answer: Mean is pulled up by the outlier; median is unaffected
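As a rough check of the worked example, the sketch below compares the five incomes from section 1 with the five salaries here, where the top value becomes the CEO's $200k. The variable names are illustrative assumptions; the numbers come from the text.

```python
import statistics

# Salaries in thousands of dollars.
typical = [30, 35, 40, 45, 50]    # section 1 example, no outlier
with_ceo = [30, 35, 40, 45, 200]  # same staff, top earner replaced by the CEO

for label, salaries in [("no outlier", typical), ("with CEO", with_ceo)]:
    print(label, "mean:", statistics.mean(salaries), "median:", statistics.median(salaries))

# How far each measure moves when the single top value changes.
mean_shift = statistics.mean(with_ceo) - statistics.mean(typical)
median_shift = statistics.median(with_ceo) - statistics.median(typical)
print("mean shift:", mean_shift, "median shift:", median_shift)
```

Only one value changed between the two lists, yet the mean moves by $30k while the median does not move at all.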

Add one more employee at $400k and the mean jumps to $125k ($750k / 6), while the median barely moves, to $42.5k (the mean of the two middle values, $40k and $45k). The median describes the typical experience; the mean reflects the mathematical average including extreme values.

3 When to Use Mean vs Median

Use the mean when: data is roughly symmetric and has no extreme outliers. Examples include test scores in a large class, heights of adults, and daily temperatures. The mean uses all values and is more efficient statistically when the data is well-behaved.

Use the median when: data is skewed or has significant outliers. Examples include income distributions (right-skewed), home prices, and response times, or any situation where a few extreme values would distort the average and make it unrepresentative of the typical case.
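One rough way to turn this guidance into a check is to measure how far apart the mean and median are relative to the spread of the data. The sketch below is only a heuristic: `suggest_center` is a hypothetical helper, the 0.2 threshold is an arbitrary assumption, and the height values are made up for illustration. It is no substitute for looking at the data.

```python
import statistics

def suggest_center(values, threshold=0.2):
    """Hypothetical heuristic: suggest 'median' when mean and median disagree noticeably.

    The gap is measured in units of the sample standard deviation.
    The default threshold of 0.2 is an arbitrary assumption, not a standard cutoff.
    """
    mean = statistics.fmean(values)
    median = statistics.median(values)
    spread = statistics.stdev(values)  # sample standard deviation; needs at least 2 values
    if spread == 0:
        return "mean"  # all values identical, so mean and median coincide
    gap = abs(mean - median) / spread
    return "median" if gap > threshold else "mean"

heights_cm = [160, 165, 170, 175, 180]  # made-up symmetric data
salaries_k = [30, 35, 40, 45, 200]      # the salary example with the CEO outlier

print(suggest_center(heights_cm))
print(suggest_center(salaries_k))
```

With these inputs the symmetric heights come back as "mean" and the salaries with the outlier come back as "median", matching the rule of thumb above.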

'Average' is ambiguous

In everyday language, 'average' almost always means 'mean.' But a statement like 'the average American household income is $X' could be reporting either measure. If the figure is higher than what most households earn, it is probably the mean, not the median. Always ask which one.

4 Real-World Examples Where the Choice Matters

Housing prices: median home price is reported rather than mean because a few extremely expensive homes would pull the mean above what most buyers experience. "Median home price" accurately represents the market for typical buyers.

Income: median household income (roughly $75k in the US) is meaningfully lower than mean household income (roughly $102k) because the income distribution is heavily right-skewed. The median better represents the typical household. Politicians can cite whichever figure better supports their argument.

Medical research: survival times after diagnosis are typically reported as medians. If most patients survive 6 months but a few survive 30 years, the mean survival time might be 18 months — suggesting typical patients survive much longer than they actually do. The median gives the better clinical picture.
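To make the survival-time scenario concrete, here is a small sketch with a made-up cohort. The patient counts (57 and 2) are assumptions chosen only so that the mean works out to the 18 months mentioned above; they are not real clinical data.

```python
import statistics

# Hypothetical cohort: 57 patients survive 6 months, 2 survive 30 years (360 months).
survival_months = [6] * 57 + [360] * 2

mean_survival = statistics.mean(survival_months)      # (57*6 + 2*360) / 59
median_survival = statistics.median(survival_months)  # middle patient of 59

# Fraction of patients who survive less than the mean suggests.
share_below_mean = sum(m < mean_survival for m in survival_months) / len(survival_months)

print("mean:", mean_survival, "months")
print("median:", median_survival, "months")
print("share below mean:", round(share_below_mean, 3))
```

In this made-up cohort the mean is three times the median, and nearly every patient falls below it, which is why the median gives the better clinical picture.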

5 Where Mode Fits In

Mode is the most frequently occurring value. It's the only measure of center that makes sense for categorical data. The most common shoe size, the most popular political party, the most frequently ordered menu item — these are modal questions.

For continuous numerical data, the mode is often not useful (many values occur only once). But for discrete data with repeated values — number of children per household, number of doctor visits per year — mode can be informative alongside mean and median.
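A short sketch of the mode for both kinds of data, using Python's standard `statistics` module. All three lists are made-up illustrations, not real survey data.

```python
import statistics

# Categorical data: only the mode is meaningful here.
menu_orders = ["burger", "salad", "burger", "pasta", "burger", "salad"]  # made-up orders
top_item = statistics.mode(menu_orders)

# Discrete numeric data: the mode can sit alongside mean and median.
children_per_household = [0, 2, 1, 2, 3, 2, 0, 1, 2, 4]  # made-up survey answers
mode_children = statistics.mode(children_per_household)
mean_children = statistics.mean(children_per_household)
median_children = statistics.median(children_per_household)

# When several values tie for most frequent, multimode returns all of them.
tied = statistics.multimode([1, 1, 2, 2, 3])

print(top_item, mode_children, mean_children, median_children, tied)
```

Note that `statistics.mode` returns a single value even when there is a tie (the first one encountered in Python 3.8 and later), so `statistics.multimode` is the safer call when ties are possible.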

Practice Problems

1. Five test scores: 72, 85, 88, 91, 94. Find the mean and median.
Answer: Mean = (72+85+88+91+94)/5 = 430/5 = 86. Median = middle value = 88.

2. A dataset has mean 50 and median 30. What does this tell you about the distribution?
Answer: The mean is much higher than the median, which means the data is right-skewed: there are high outliers pulling the mean up. Most values are closer to 30.

3. Home prices in a neighborhood: $280k, $310k, $295k, $320k, $1.8M. Which is more representative of typical home prices?
Answer: The median ($310k) is more representative. The mean ($601k) is heavily distorted by the $1.8M outlier and doesn't reflect what most homes cost.
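The numeric answers to the test-score and home-price problems can be checked with a few lines of Python. This is just a verification sketch; the variable names are illustrative, and prices are written in thousands of dollars.

```python
import statistics

# Test scores problem.
scores = [72, 85, 88, 91, 94]
scores_mean = statistics.mean(scores)
scores_median = statistics.median(scores)

# Home prices problem, in thousands of dollars ($1.8M = 1800).
home_prices = [280, 310, 295, 320, 1800]
prices_mean = statistics.mean(home_prices)
prices_median = statistics.median(home_prices)  # sorting happens inside median()

print("scores:", scores_mean, scores_median)
print("home prices:", prices_mean, prices_median)
```

The home prices are listed unsorted in the problem, so the median is the middle of the sorted list ($280k, $295k, $310k, $320k, $1.8M), not the third value as written.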