
Mean vs Median

The mean and median both describe the "center" of a dataset, but they measure it differently and can tell different stories. Choosing the wrong one can seriously misrepresent your data, and the wrong choice is made all the time in news, business, and research.

1 What Each One Measures

The mean is what most people mean when they say "average": add everything up and divide by the number of values. If five people earn $30k, $35k, $40k, $45k, and $50k, the mean is ($30k + $35k + $40k + $45k + $50k) / 5 = $200k / 5 = $40k.

The median is the middle value when you sort the data. For the same five salaries in sorted order ($30k, $35k, $40k, $45k, $50k), the median is $40k, the middle value. With an even number of values, the median is the mean of the two middle values.
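Both calculations can be sketched in a few lines of Python with the standard-library statistics module. The variable names are illustrative, and the sixth salary of $55k is an assumed value added only to show the even-count rule.

```python
from statistics import mean, median

# The five salaries from the example, in thousands of dollars.
salaries = [30, 35, 40, 45, 50]

print(mean(salaries))    # 40
print(median(salaries))  # 40

# Even number of values: add a hypothetical sixth salary of $55k.
# The two middle values are now 40 and 45, so the median is their mean.
salaries_even = salaries + [55]
print(median(salaries_even))  # 42.5
```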

Same result, different logic

In a perfectly symmetric distribution with a single peak, mean = median = mode. In real data, they almost always differ, and the size and direction of that difference tell you something important about the shape of your data.

2 How Outliers Affect Them Differently

Here is the practical difference that matters: the mean gets pulled around by extreme values, while the median does not. Add one CEO earning $2 million to the original group of five employees and the mean jumps to about $367k, which is higher than every actual employee salary. The median barely moves, from $40k to $42.5k. Which one accurately represents what a typical person earns?

Outlier Effect
Five employees earn $30k, $35k, $40k, $45k, and $200k (the CEO).
1. Mean = ($30k + $35k + $40k + $45k + $200k) / 5 = $350k / 5 = $70k
2. Median: sorted = $30k, $35k, $40k, $45k, $200k → middle value = $40k
3. The mean ($70k) is higher than 4 out of 5 salaries
4. The median ($40k) represents what a typical employee actually earns
Answer: Mean is pulled up by the outlier; median is unaffected

Add one more employee at $400k and the mean jumps to $125k ($750k / 6), while the median moves only slightly, to $42.5k. The median describes the typical experience; the mean reflects the mathematical average, including the extreme values.
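The outlier effect is easy to check numerically. The sketch below, using Python's standard-library statistics module, compares the original five salaries with the two CEO scenarios from this section; the list names and labels are illustrative.

```python
from statistics import mean, median

# All figures in thousands of dollars.
baseline = [30, 35, 40, 45, 50]     # the original five salaries
with_200k = [30, 35, 40, 45, 200]   # the worked example: top earner is a $200k CEO
with_2m = baseline + [2000]         # a $2 million CEO joins the original five

scenarios = [
    ("baseline", baseline),
    ("CEO at $200k", with_200k),
    ("CEO at $2M", with_2m),
]

for label, data in scenarios:
    print(f"{label}: mean = {mean(data):.1f}k, median = {median(data):.1f}k")

# baseline: mean = 40.0k, median = 40.0k
# CEO at $200k: mean = 70.0k, median = 40.0k
# CEO at $2M: mean = 366.7k, median = 42.5k
```

The mean moves by hundreds of thousands of dollars across the scenarios, while the median shifts by at most $2.5k.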

3 When to Use Mean vs Median

Use the mean when the data is roughly symmetric and has no extreme outliers: test scores in a large class, heights of adults, daily temperatures. The mean uses every value and is statistically more efficient when the data is well-behaved.

Use the median when the data is skewed or has significant outliers: income distributions (right-skewed), home prices, response times, or any situation where a few extreme values would distort the mean and make it unrepresentative of the typical case.
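One rough way to apply this guidance is to compute both measures and compare them. The helper below is a hypothetical sketch, not a formal skewness test: the function name skew_hint and its messages are assumptions made for illustration, and on small samples the comparison can mislead.

```python
from statistics import mean, median

def skew_hint(data):
    """Hypothetical helper: compare mean and median as a rough skew signal."""
    m, med = mean(data), median(data)
    if m > med:
        return "mean > median: possible right skew, consider the median"
    if m < med:
        return "mean < median: possible left skew, consider the median"
    return "mean == median: no skew signal from this check"

print(skew_hint([30, 35, 40, 45, 50]))   # symmetric salaries, in $k
print(skew_hint([30, 35, 40, 45, 200]))  # salaries with a $200k outlier
```

The check only compares two numbers, so treat its result as a prompt to look at the data, not as a verdict.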

'Average' is ambiguous

In everyday language, 'average' almost always means 'mean.' But when someone says 'the average American household income is $X' and that figure is higher than what most households earn, they're probably reporting the mean, not the median. Always ask which one.

4 Real-World Examples Where the Choice Matters

Home prices are usually reported as medians for this exact reason. A neighborhood with 99 houses worth $300k and one mansion worth $10 million has a mean price of $397k, which is meaningless for anyone trying to understand what homes actually cost there. The median of $300k is the honest number, because a few extremely expensive homes pull the mean above what most buyers experience. "Median home price" accurately represents the market for typical buyers.
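The neighborhood example can be reproduced as a short sketch with Python's standard-library statistics module. The variable names are illustrative; the count of homes priced below the mean is an extra check added here to show how unrepresentative the mean is.

```python
from statistics import mean, median

# 99 houses at $300k plus one $10 million mansion, in thousands of dollars.
prices = [300] * 99 + [10_000]

avg = mean(prices)
mid = median(prices)

# How many homes are priced below the mean?
below_mean = sum(p < avg for p in prices)

print(avg, mid, below_mean)  # 397 300.0 99
```

Ninety-nine of the 100 homes sit below the mean, which is why the median is the more useful figure here.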

Income: median household income (about $75k in the US) is meaningfully lower than mean household income (about $102k) because the income distribution is heavily right-skewed. The median better represents the typical household. Politicians can choose whichever figure supports their argument.

Medical research: survival times after diagnosis are typically reported as medians. If most patients survive 6 months but a few survive 30 years, the mean survival time might be 18 months, suggesting that typical patients survive much longer than they actually do. The median gives the better clinical picture.

5 Where Mode Fits In

The mode is the most frequently occurring value. It's the only measure of center that makes sense for categorical data. The most common shoe size, the most popular political party, the most frequently ordered menu item: these are all questions about the mode.

For continuous numerical data, the mode is often not useful (many values occur only once). But for discrete data with repeated values (number of children per household, number of doctor visits per year), the mode can be informative alongside the mean and median.
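A short sketch of the three cases, using mode and multimode from Python's standard-library statistics module (multimode requires Python 3.8 or later). All three datasets are made up for illustration.

```python
from statistics import mean, median, mode, multimode

# Categorical data (made-up menu orders): only the mode applies.
orders = ["pizza", "salad", "pizza", "soup", "pizza", "salad"]
print(mode(orders))  # pizza

# Discrete data (made-up children-per-household counts):
# the mode is informative alongside the median and mean.
children = [0, 1, 2, 2, 0, 3, 2, 1, 0, 2]
print(mode(children), median(children), mean(children))  # 2 1.5 1.3

# Continuous data (made-up heights in cm): every value occurs once,
# so every value ties for "most frequent" and the mode says nothing.
heights = [170.2, 165.8, 181.4, 175.1]
print(multimode(heights))  # [170.2, 165.8, 181.4, 175.1]
```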

Practice Problems

1. Five test scores: 72, 85, 88, 91, 94. Find the mean and median.
Answer: Mean = (72 + 85 + 88 + 91 + 94) / 5 = 430 / 5 = 86. Median = middle value = 88.
2. A dataset has mean 50 and median 30. What does this tell you about the distribution?
Answer: The mean is much higher than the median, which suggests the data is right-skewed: a few high values are pulling the mean up. At least half of the values are 30 or below.
3. Home prices in a neighborhood: $280k, $310k, $295k, $320k, $1.8M. Which is more representative of typical home prices?
Answer: The median ($310k, the middle of the sorted values $280k, $295k, $310k, $320k, $1.8M) is more representative. The mean ($3,005k / 5 = $601k) is heavily distorted by the $1.8M outlier and doesn't reflect what most homes cost.
