What is Standard Deviation?

Standard deviation sounds intimidating, but the underlying idea is something you already understand intuitively. It measures how spread out a set of numbers is. That's really it. The formula just makes "spread out" precise enough to calculate and compare.

✔ The short version

Standard deviation tells you how far values typically stray from the average. A small standard deviation means most values cluster tightly around the mean. A large one means they're scattered widely. If a class averages 75 on a test with an SD of 2 and the scores are roughly bell-shaped, about two-thirds of students scored between 73 and 77, and nearly everyone scored between 69 and 81. An SD of 20 means scores were all over the place.

What it actually measures

Say two basketball players both average 20 points per game. Player A scores 18, 22, 19, 21, 20 across five games. Player B scores 5, 35, 10, 38, 12. Same average. Completely different players.

Standard deviation is the number that captures that difference. Player A's SD would be small: their scores cluster near 20. Player B's SD would be large: their scores jump around wildly. Two datasets can have identical averages and tell completely different stories. The standard deviation tells you which story is which.

In practical terms, SD roughly answers the question: if I pick one value from this dataset at random, how far from the average should I expect it to be? A standard deviation of 5 means a typical value sits about 5 units away from the mean. An SD of 50 means a typical value sits about 50 units away, so any single value tells you much less about where the average is.
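To put numbers on the two players, here is a small sketch using Python's built-in `statistics` module. It treats the five games as a sample of each player's season, so it uses the sample formula (`statistics.stdev`); that choice is an assumption, and the variable names are only for illustration.

```python
import statistics

# Points per game for the two players described above.
player_a = [18, 22, 19, 21, 20]
player_b = [5, 35, 10, 38, 12]

mean_a = statistics.mean(player_a)
mean_b = statistics.mean(player_b)

# stdev() is the sample standard deviation (it divides by n-1).
sd_a = statistics.stdev(player_a)
sd_b = statistics.stdev(player_b)

print(f"Player A: mean = {mean_a}, SD = {sd_a:.2f}")  # mean 20, SD about 1.58
print(f"Player B: mean = {mean_b}, SD = {sd_b:.2f}")  # mean 20, SD about 15.31
```

Same mean, but Player B's SD comes out roughly ten times larger than Player A's.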

How to actually calculate it

There are two versions: population standard deviation (when you have data for every single member of a group) and sample standard deviation (when you have data for a subset). In almost every real situation (surveys, experiments, research) you're working with a sample, so use the sample formula. The only difference is dividing by n−1 instead of n.

The steps are the same either way:

Calculating standard deviation step by step
Dataset: 4, 7, 13, 2, 1, 9
1. Find the mean: (4+7+13+2+1+9) ÷ 6 = 36 ÷ 6 = 6
2. Subtract the mean from each value and square it: (4−6)²=4, (7−6)²=1, (13−6)²=49, (2−6)²=16, (1−6)²=25, (9−6)²=9
3. Add those squared differences: 4+1+49+16+25+9 = 104
4. Divide by n−1 (sample): 104 ÷ 5 = 20.8. This is the variance.
5. Take the square root: √20.8 ≈ 4.56
Standard deviation ≈ 4.56. Values typically stray about 4.56 from the mean of 6.
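The same five steps can be sketched in Python. This is just one way to write it; the variable names are made up for the example, and the last line cross-checks the hand calculation against the standard library's `statistics.stdev`.

```python
import math
import statistics

data = [4, 7, 13, 2, 1, 9]

# Step 1: find the mean.
mean = sum(data) / len(data)                      # 6.0

# Step 2: subtract the mean from each value and square the result.
squared_diffs = [(x - mean) ** 2 for x in data]   # [4.0, 1.0, 49.0, 16.0, 25.0, 9.0]

# Step 3: add the squared differences.
total = sum(squared_diffs)                        # 104.0

# Step 4: divide by n-1 (sample formula) to get the variance.
variance = total / (len(data) - 1)                # 20.8

# Step 5: take the square root.
sd = math.sqrt(variance)                          # about 4.56

# Cross-check against the built-in sample standard deviation.
assert math.isclose(sd, statistics.stdev(data))
print(f"variance = {variance}, SD = {sd:.2f}")
```

To get the population version instead, divide by `len(data)` in step 4 (or call `statistics.pstdev`).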

You might wonder why we square the differences instead of just taking the absolute value. Both approaches make every difference positive, so that alone doesn't settle it. Squaring does two extra things: it weights outliers more heavily, and it gives variance the nicer mathematical properties that other formulas rely on. A value that's 10 away from the mean contributes 100 to the sum; a value that's 2 away contributes only 4. Squaring makes the formula more sensitive to extreme values, which is often what you want.

The rule that makes standard deviation actually useful

For data that's roughly bell-shaped (which covers a lot of real-world data), there's a pattern called the 68-95-99.7 rule:

About 68% of values fall within one standard deviation of the mean. About 95% fall within two. About 99.7% fall within three. This means if you know the mean and standard deviation, you can make pretty accurate predictions about where values will land.

IQ scores are designed with mean 100 and SD 15. So about 68% of people score between 85 and 115. About 95% score between 70 and 130. Scoring above 145 puts you in roughly the top 0.15%, three standard deviations above the mean.

This rule only applies to roughly normal distributions, but that covers enough real situations (heights, test scores, measurement errors) that it's extremely practical.
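If you'd like to check the rule rather than memorize it, the sketch below uses `statistics.NormalDist` (Python 3.8+) with the IQ numbers from above. It assumes scores follow an exactly normal curve, which real data only approximates, and `share_within` is a helper name invented for this example.

```python
from statistics import NormalDist

# IQ scores as described above: mean 100, SD 15, assumed exactly normal.
iq = NormalDist(mu=100, sigma=15)

def share_within(dist, k):
    """Fraction of a normal distribution within k SDs of its mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    low, high = iq.mean - k * iq.stdev, iq.mean + k * iq.stdev
    print(f"within {k} SD ({low:.0f} to {high:.0f}): {share_within(iq, k):.1%}")

# Share of scores above 145 (three SDs above the mean).
top_share = 1 - iq.cdf(145)
print(f"above 145: {top_share:.2%}")
```

The three shares come out near 68%, 95%, and 99.7%. The share above 145 is about 0.13%; the 0.15% figure comes from rounding 99.7% and splitting the leftover 0.3% in half.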

Where this actually shows up

Finance: Investment risk is commonly measured using the standard deviation of returns. Two funds might both average 8% annual returns. One has an SD of 3%, the other has an SD of 20%. The first behaves like a boring but reliable bond fund. The second could easily return 28% or lose 12% in a given year, and that's only one standard deviation either side of the average. Same average, completely different risk. Standard deviation is one of the standard ways financial risk is quantified.
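The fund comparison is just "mean plus or minus some number of SDs". A minimal sketch, with a made-up helper name `typical_range` and the two example funds from the paragraph above:

```python
def typical_range(mean, sd, k=1):
    """Return (low, high) for the band k standard deviations around the mean."""
    return mean - k * sd, mean + k * sd

# (average annual return %, SD of annual returns %) for the two example funds.
funds = {"steady fund": (8, 3), "volatile fund": (8, 20)}

for name, (mean, sd) in funds.items():
    low, high = typical_range(mean, sd)
    print(f"{name}: {low}% to {high}% within one SD")
```

This prints 5% to 11% for the steady fund and -12% to 28% for the volatile one. Reading these bands as "about 68% of years" assumes returns are roughly bell-shaped, which real markets only loosely satisfy.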

Manufacturing: A factory producing bolts to a target diameter uses standard deviation to monitor consistency. Low SD means the production line is reliable. High SD means parts are coming out inconsistent, which leads to quality failures.

Medicine: Lab reference ranges (the "normal" values on a blood test) are typically set at mean ± 2 standard deviations for healthy populations. If your result falls outside that range, it's more extreme than about 95% of healthy people's results, which is why it gets flagged.

Things people mix up

Variance vs standard deviation: variance is SD squared. It's used in formulas because it has nicer mathematical properties. Standard deviation is what you report to humans because it's in the original units. If you're measuring heights in inches, SD is in inches. Variance would be in square inches, which doesn't mean anything to anyone.

Low standard deviation is not always better. A factory consistently producing defective parts has low standard deviation. Consistency and quality are different things. What matters is whether the mean is in the right place and whether the spread is appropriate for the context.

The population vs sample distinction matters more than people think. Using n instead of n−1 on sample data tends to underestimate the true population standard deviation, and the effect is biggest for small samples. This isn't a pedantic detail: it can meaningfully affect conclusions in research. If in doubt, use n−1.
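You can watch the underestimate happen with a quick simulation. Everything here is an assumption chosen for illustration: a normal population with SD 10 (variance 100), samples of size 5, 10,000 trials, and a fixed random seed. The sketch compares variances, since that is the quantity the n−1 correction is designed to fix, and SD is just its square root.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_SD = 10             # assumed population SD, so true variance is 100
SAMPLE_SIZE = 5
TRIALS = 10_000

var_n = []           # variance estimates dividing by n
var_n_minus_1 = []   # variance estimates dividing by n-1

for _ in range(TRIALS):
    sample = [rng.gauss(0, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    var_n.append(statistics.pvariance(sample))         # divides by n
    var_n_minus_1.append(statistics.variance(sample))  # divides by n-1

avg_var_n = statistics.fmean(var_n)
avg_var_n_minus_1 = statistics.fmean(var_n_minus_1)

print(f"true variance:         {TRUE_SD ** 2}")
print(f"average using n:       {avg_var_n:.1f}")
print(f"average using n-1:     {avg_var_n_minus_1:.1f}")
```

With these settings the n version should average somewhere near 80, well short of 100, while the n−1 version should land close to 100. The gap shrinks as the sample size grows.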

Practice Problems

A dataset has values 10, 20, 30, 40, 50. What is the mean, and approximately what would you expect the standard deviation to be: small, medium, or large relative to the mean?
Mean = (10+20+30+40+50)/5 = 30. The values are spread across a range of 40 with the mean in the middle. The squared differences add up to 1000, so the sample standard deviation is √(1000 ÷ 4) ≈ 15.8 (the population formula gives √(1000 ÷ 5) ≈ 14.1). That's medium-large relative to the mean, reflecting the wide spread.
Two funds both return 8% annually. Fund A has SD of 2%, Fund B has SD of 15%. Which is riskier and what does that mean practically?
Fund B is much riskier. In a bad year, Fund B could return 8−30 = −22% or worse (2 SDs down). Fund A's bad year is around 8−4 = 4%. Same average return, dramatically different risk profile.
What percentage of data falls within 2 standard deviations of the mean in a normal distribution?
About 95%, from the 68-95-99.7 rule. This means only about 5% of values are more than 2 standard deviations from the mean.