Descriptive Statistics Calculator

Enter a dataset to compute more than 25 descriptive statistics covering central tendency, dispersion, shape, and distribution.

The calculator reports the following statistics:

Central Tendency: Mean (x̄) · Median (x̃) · Mode · Mid-Range

Dispersion: Std Dev (s) · Variance (s²) · Range (R) · IQR · Sum of Squares · MAD · RMS · Std Error (SEM) · CV · RSD

Distribution Summary: Count (n) · Sum · Minimum · Maximum

Quartiles & Outliers: Q1 (25th percentile) · Q2 / Median · Q3 (75th percentile) · Outliers (Tukey 1.5×IQR)

Shape: Skewness (γ₁) · Kurtosis (β₂) · Excess Kurtosis (α₄)

Frequency Table: count and percentage share of each value

📖 What are Descriptive Statistics?

Descriptive statistics are numerical measures that summarise the key features of a dataset. Unlike inferential statistics, which draw conclusions about a broader population from a sample, descriptive statistics only describe the data at hand. They are the first step in most data analyses and are used across fields such as business, medicine, education, science, and finance.

Descriptive statistics fall into three main categories. Measures of central tendency (mean, median, mode, mid-range) describe where the data is centred. Measures of dispersion (range, variance, standard deviation, IQR, MAD) describe how spread out the data is. Measures of shape (skewness, kurtosis) describe the asymmetry and tail behaviour of the distribution.

A complete descriptive analysis answers several questions at once: What is a typical value? How much do values vary? Is the distribution symmetric or skewed? Are there any outliers that distort the picture? This calculator answers all of these with a single click, computing 25+ statistics from your dataset.

Descriptive statistics are essential for quality control (is the process within specification?), financial analysis (how volatile is this investment?), academic research (what is the spread of test scores?), and data science (understanding a feature before modelling). Skewness helps you choose between the mean and the median as the central measure; comparing the IQR with the standard deviation shows whether outliers may be inflating the standard deviation.

📐 Formulas

Mean: x̄ = (Σxᵢ) / n

Median: Middle value when sorted. For even n, average of two middle values.

Mode: Most frequently occurring value(s). A dataset can have zero, one, or many modes.

Mid-Range: MR = (min + max) / 2

Sample Variance: s² = Σ(xᵢ − x̄)² / (n − 1)

Sample Std Dev: s = √s²

Range: R = max − min

Quartiles (Q1, Q2, Q3): Split the sorted data at the median (Q2); Q1 = median of lower half, Q3 = median of upper half. For odd n, the median itself is left out of both halves.

IQR: Q3 − Q1

Sum of Squares: SS = Σ(xᵢ − x̄)²

MAD (Mean Absolute Deviation): Σ|xᵢ − x̄| / n

RMS (Root Mean Square): √(Σxᵢ² / n)

SEM (Std Error of Mean): s / √n

Skewness (adjusted Fisher-Pearson, Excel SKEW): [n / ((n−1)(n−2))] × Σ[(xᵢ − x̄)/s]³

Excess Kurtosis (Excel KURT): [(n(n+1))/((n−1)(n−2)(n−3))] × Σ[(xᵢ − x̄)/s]⁴ − [3(n−1)²/((n−2)(n−3))]

Kurtosis (β₂): Excess kurtosis + 3. Normal distribution = 3.

CV (Coefficient of Variation): s / x̄

RSD (Relative Std Dev): (s / x̄) × 100%

Outliers: Tukey fences - values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR.
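As a rough cross-check of the central tendency and dispersion formulas above, here is a minimal Python sketch. The function name `summarize` and the dictionary keys are illustrative and not part of the calculator; reporting an empty mode list when every value occurs once is an assumption about how "zero modes" is handled.

```python
import math
import statistics
from collections import Counter


def summarize(data):
    """Central tendency and dispersion measures, using sample (n - 1) formulas."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values for sample statistics")

    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)   # sum of squares
    variance = ss / (n - 1)                   # sample variance
    sd = math.sqrt(variance)                  # sample standard deviation

    # Assumed convention: no mode when every value occurs exactly once.
    counts = Counter(data)
    top = max(counts.values())
    modes = [] if top == 1 else [v for v, c in counts.items() if c == top]

    return {
        "mean": mean,
        "median": statistics.median(data),
        "modes": modes,
        "mid_range": (min(data) + max(data)) / 2,
        "range": max(data) - min(data),
        "sum_of_squares": ss,
        "variance": variance,
        "std_dev": sd,
        "mad": sum(abs(x - mean) for x in data) / n,    # mean absolute deviation
        "rms": math.sqrt(sum(x * x for x in data) / n),
        "sem": sd / math.sqrt(n),
        "cv": sd / mean if mean != 0 else math.nan,     # undefined for zero mean
    }


stats = summarize([10, 20, 30, 40, 50])
print(stats["variance"])           # 250.0
print(round(stats["std_dev"], 3))  # 15.811
print(stats["mad"])                # 12.0
```

RSD is the same quantity as CV expressed as a percentage, so it can be read off as `stats["cv"] * 100`.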

📖 How to Use This Calculator

1. Enter your numbers in the text box, separated by commas, spaces, or new lines. You can paste directly from a spreadsheet column.
2. Click Calculate All Statistics. Results appear instantly grouped by category: central tendency, dispersion, quartiles, shape, and frequency.
3. Review the Frequency Table at the bottom to see how many times each value appears, with its percentage share of the total dataset.
4. Use Copy result to copy all statistics to the clipboard, or Print for a clean printout.
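The input handling and frequency table described in these steps could be sketched in Python roughly as follows. This is an illustration under assumptions, not the calculator's actual code: `parse_numbers` and `frequency_table` are hypothetical names, and the sketch assumes every token is a plain number.

```python
import re
from collections import Counter


def parse_numbers(text):
    """Split on commas, spaces, or new lines and convert each token to float."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return [float(t) for t in tokens]


def frequency_table(values):
    """Return (value, count, percent of total) rows, sorted by value."""
    n = len(values)
    counts = Counter(values)
    return [(v, c, 100 * c / n) for v, c in sorted(counts.items())]


values = parse_numbers("42, 45 44\n46,43 47 44 200")
for value, count, percent in frequency_table(values):
    print(f"{value:g}\t{count}\t{percent:.1f}%")
```

With this input, the value 44 appears twice out of eight values, so its row shows a count of 2 and a share of 25.0%.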

💡 Example Calculations

Example 1 - Simple dataset: 10, 20, 30, 40, 50

1. Central tendency: Mean = 30 · Median = 30 · Mode = none (every value occurs once) · Mid-range = 30
2. Dispersion: Std Dev = 15.811 · Variance = 250 · Range = 40 · IQR = 30
3. Other measures: Sum of Squares = 1000 · MAD = 12 · RMS = 33.166 · SEM = 7.071
4. Shape: Skewness = 0 (perfectly symmetric) · Excess Kurtosis = −1.2 (platykurtic, flatter than normal)
5. Quartiles: Q1 = 15 · Q2 = 30 · Q3 = 45 · Outliers: None
6. CV = 0.527 (52.7%): high relative variability, because values span from 10 to 50 around a mean of 30.

Example 2 - Monthly sales data with outlier: 42, 45, 44, 46, 43, 47, 44, 200

1. Mean = 63.875, badly distorted by the outlier 200. Median = 44.5, much more representative of a typical month.
2. Std Dev = 55.03, inflated by the outlier. IQR = 3 (Q1 = 43.5, Q3 = 46.5), unaffected by the size of the outlier.
3. Outlier detected: 200. Tukey rule: Q3 (46.5) + 1.5 × IQR (3) = 51; 200 > 51 → flagged.
4. Lesson: When outliers are present, use median and IQR instead of mean and standard deviation for a more honest summary of the data.
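The Tukey check in this example can be sketched in Python, assuming the median-of-halves quartile rule from the Formulas section. The function names are illustrative. Quartile methods that interpolate between values produce slightly different Q1, Q3, and fences, but 200 is flagged under either approach.

```python
import statistics


def quartiles_median_of_halves(data):
    """Return (Q1, Q2, Q3) by splitting the sorted data at the median.

    For odd n, the middle value is left out of both halves.
    """
    xs = sorted(data)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return statistics.median(lower), statistics.median(xs), statistics.median(upper)


def tukey_outliers(data, k=1.5):
    """Return (outliers, (lower fence, upper fence)) using k × IQR fences."""
    q1, _, q3 = quartiles_median_of_halves(data)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high], (low, high)


sales = [42, 45, 44, 46, 43, 47, 44, 200]
outliers, fences = tukey_outliers(sales)
print(outliers)  # [200]
print(fences)    # (39.0, 51.0)
```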

Frequently Asked Questions

What are descriptive statistics?
Descriptive statistics are numerical measures that summarise a dataset. They fall into three groups: measures of central tendency (mean, median, mode), measures of dispersion (range, variance, standard deviation, IQR), and measures of shape (skewness, kurtosis). Together they give a complete picture of any dataset without needing to look at every individual value.
What is the difference between population and sample standard deviation?
Population standard deviation (σ) divides by N - use this when your data is the entire population. Sample standard deviation (s) divides by N−1 (Bessel's correction) - use this when your data is a sample estimating a larger population. This calculator uses sample formulas (N−1) for variance, standard deviation, skewness, kurtosis, and SEM, matching Excel and Google Sheets defaults.
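The two divisors can be compared directly with Python's standard `statistics` module, where `stdev` divides by N−1 and `pstdev` divides by N. The dataset is the one from Example 1; this illustrates the formulas and is not the calculator's own code.

```python
import statistics

data = [10, 20, 30, 40, 50]

sample_sd = statistics.stdev(data)       # divides by N - 1 (Bessel's correction)
population_sd = statistics.pstdev(data)  # divides by N

print(f"sample s     = {sample_sd:.3f}")      # 15.811
print(f"population σ = {population_sd:.3f}")  # 14.142
```

The gap between the two shrinks as N grows, because N−1 and N become relatively closer.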
How are quartiles calculated?
This calculator uses the median-of-halves (textbook) quartile method. The dataset is sorted and split at the median; Q1 is the median of the lower half and Q3 is the median of the upper half. For odd n, the median itself belongs to neither half. The interquartile range (IQR = Q3 − Q1) covers the middle 50% of the data. Different software tools may give slightly different quartile values depending on the method used.
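To see how much the method matters, Python's standard `statistics.quantiles` (Python 3.8+) offers two interpolating methods. The sketch below applies both to the Example 1 dataset; it illustrates method differences in general and makes no claim about which method any particular tool uses.

```python
import statistics

data = [10, 20, 30, 40, 50]

# "exclusive" treats the data as a sample that may not contain the extremes.
exclusive = statistics.quantiles(data, n=4, method="exclusive")

# "inclusive" treats the minimum and maximum as the 0th and 100th percentiles.
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print(exclusive)  # [15.0, 30.0, 45.0]
print(inclusive)  # [20.0, 30.0, 40.0]
```

The same five values give an IQR of 30 under one method and 20 under the other, so it is worth stating the quartile method when reporting an IQR.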
What does skewness tell me?
Skewness measures the asymmetry of the distribution. A value near 0 means the data is roughly symmetric. Positive skewness (right-skewed) means a longer right tail - typical of income data. Negative skewness (left-skewed) means a longer left tail. As a rule of thumb: |skewness| < 0.5 is approximately symmetric; 0.5–1.0 is moderately skewed; > 1.0 is highly skewed.
What is the coefficient of variation and when is it useful?
The coefficient of variation (CV = SD / Mean) expresses standard deviation as a proportion of the mean. It lets you compare variability across datasets with different units or scales. For example, blood pressure readings (mean ~120 mmHg) and blood glucose levels (mean ~5 mmol/L) are measured in different units, so their raw standard deviations cannot be compared directly, while their CVs can.
How does outlier detection work?
This calculator uses Tukey's method (1.5×IQR rule): any value below Q1 − 1.5×IQR or above Q3 + 1.5×IQR is flagged as an outlier. This is the same method used by box plots. It is robust because it is based on the interquartile range, not the mean, so it is not distorted by the very outliers it is trying to detect.
What is skewness and what does it mean for a dataset?
Skewness measures the asymmetry of a distribution. Positive skewness (right-skewed): the tail extends to the right, and the mean is typically greater than the median. This is common in income data and asset prices, where a few very high values pull the mean up. Negative skewness (left-skewed): the tail extends to the left, and the mean is typically less than the median. A perfectly symmetric distribution has a skewness of 0. Skewness between -0.5 and +0.5 is generally considered approximately symmetric.
What is kurtosis and why does it matter?
Kurtosis measures the heaviness of the tails of a distribution relative to a normal distribution. High kurtosis (leptokurtic, kurtosis > 3) means heavier tails: extreme values occur more often than under a normal distribution. Low kurtosis (platykurtic, kurtosis < 3) means thinner tails. In finance, fat tails mean extreme events occur more frequently than a normal distribution predicts. This matters for risk management and is often cited as one reason models assuming normality underestimated the risk of events such as the 2008 financial crisis.
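The skewness and excess kurtosis formulas from the Formulas section can be sketched in Python as below. The function names are illustrative, and the sketch assumes the sample standard deviation is non-zero.

```python
import math


def _z_scores(data):
    """Standardise with the sample (n - 1) standard deviation."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    if s == 0:
        raise ValueError("all values are identical; shape is undefined")
    return [(x - mean) / s for x in data]


def sample_skewness(data):
    """Adjusted Fisher-Pearson skewness: n / ((n-1)(n-2)) × Σ z³."""
    n = len(data)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    return n / ((n - 1) * (n - 2)) * sum(z ** 3 for z in _z_scores(data))


def excess_kurtosis(data):
    """Sample excess kurtosis in the form given in the Formulas section."""
    n = len(data)
    if n < 4:
        raise ValueError("kurtosis needs at least four values")
    z4 = sum(z ** 4 for z in _z_scores(data))
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction


symmetric = [10, 20, 30, 40, 50]
print(round(sample_skewness(symmetric), 6))  # 0.0
print(round(excess_kurtosis(symmetric), 6))  # -1.2

with_outlier = [42, 45, 44, 46, 43, 47, 44, 200]
print(sample_skewness(with_outlier) > 1)     # True: highly right-skewed
print(excess_kurtosis(with_outlier) > 0)     # True: heavier tails than normal
```

Adding 3 to the excess kurtosis gives kurtosis (β₂), which is the scale used for the "greater than 3" and "less than 3" comparisons above.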