What is the Correlation Coefficient Calculator?
The Pearson correlation coefficient (r) is a number between -1 and +1 that measures how strongly two continuous variables move together in a linear pattern. A value of +1 means a perfect positive linear relationship (when X increases, Y increases by a proportional amount). A value of -1 means a perfect negative linear relationship (when X increases, Y decreases proportionally). A value of 0 means no linear relationship exists between the two variables.
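The three landmark values can be reproduced with a minimal sketch of the textbook deviation-form definition. The function name `pearson_r` and the toy data are illustrative assumptions, not part of the calculator. The third case is worth noting: Y depends perfectly on X through a parabola, yet r is 0, because r only measures the linear part of a relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson r: sum of co-deviations over the product of the spreads."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]
up = pearson_r(xs, [2 * x + 1 for x in xs])      # exact line, positive slope
down = pearson_r(xs, [-3 * x + 4 for x in xs])   # exact line, negative slope
curve = pearson_r(xs, [x * x for x in xs])       # symmetric parabola

print(up, down, curve)  # 1.0 -1.0 0.0
```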
Correlation analysis is used in virtually every quantitative field. Medical researchers use it to study the relationship between a risk factor (e.g. blood pressure) and an outcome (e.g. incidence of stroke). Economists study the correlation between GDP growth and unemployment rates (Okun's Law). Engineers correlate operating temperature with equipment failure rates. Market analysts look at the correlation between two asset prices to assess diversification benefits. Education researchers measure the correlation between study hours and exam performance. In each case, the goal is to quantify how reliably one variable predicts the other.
This calculator computes Pearson r along with four additional outputs that together give a complete picture of the linear relationship. R-squared (r²) shows the proportion of variance in Y explained by X. The least-squares regression line y = mx + b gives the best linear prediction of Y from X and can be used to make forecasts. The t-statistic and two-tailed p-value test whether the observed correlation is statistically significant or could have occurred by chance. All five outputs are computed automatically from raw data pairs.
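For reference, these are the standard textbook forms of those outputs, written in terms of the sums used in the worked examples further down. The calculator's internal implementation is not shown on this page, so treat this as the conventional definition rather than a description of its code. The p-value is the two-tailed tail area of a Student t distribution with n − 2 degrees of freedom.

```latex
\begin{aligned}
r &= \frac{n\sum xy - \sum x \sum y}
          {\sqrt{\bigl(n\sum x^2 - (\sum x)^2\bigr)\bigl(n\sum y^2 - (\sum y)^2\bigr)}} \\[4pt]
R^2 &= r^2 \\[4pt]
m &= \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2},
\qquad b = \frac{\sum y - m\sum x}{n} \\[4pt]
t &= \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \qquad \mathrm{df} = n - 2
\end{aligned}
```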
The Summary Statistics mode is useful when you already have pre-computed sums from a textbook problem or research paper. Instead of re-entering raw data, you enter n, Σx, Σy, Σx², Σy², and Σxy to compute r, the regression line, and the significance test directly. This matches the hand-calculation procedure taught in introductory statistics courses and lets you verify published results or complete homework problems efficiently.
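The hand-calculation procedure can be sketched in a few lines. This is an illustration under the usual textbook formulas, not the calculator's actual source; the function name `from_summary` and its parameter names are assumptions made for the example. It returns r, the slope, the intercept, and the t-statistic; the p-value is left out because it needs the Student t distribution, which is not in the Python standard library.

```python
import math

def from_summary(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return (r, slope, intercept, t) from the six summary values."""
    if n < 3:
        raise ValueError("the t-statistic needs at least 3 pairs (df = n - 2)")
    sp = n * sum_xy - sum_x * sum_y      # n*Sxy - Sx*Sy
    ss_x = n * sum_x2 - sum_x ** 2       # n*Sx2 - (Sx)^2
    ss_y = n * sum_y2 - sum_y ** 2       # n*Sy2 - (Sy)^2
    if ss_x <= 0 or ss_y <= 0:
        raise ValueError("sums are inconsistent or describe a constant variable")
    r = sp / math.sqrt(ss_x * ss_y)
    slope = sp / ss_x
    intercept = (sum_y - slope * sum_x) / n
    if abs(r) >= 1:
        # A perfect correlation makes the denominator zero; t is unbounded.
        t = math.copysign(math.inf, r)
    else:
        t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return r, slope, intercept, t

# Values from Example 3 below (exercise hours vs GPA).
r, slope, intercept, t = from_summary(25, 87.5, 82.5, 340, 295, 312)
print(round(r, 4))  # 0.8391
```

Working from the sums keeps the function independent of how the data was collected, which is why the same six numbers are enough to check a published result.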
Example Calculations
Example 1 - Very Strong Positive Correlation (Income vs Spending)
Weekly income ($000s): 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 and spending ($000s): 3, 7, 8, 14, 15, 19, 21, 25, 27, 30
1. n = 10. Σx = 110, Σy = 169, Σx² = 1540, Σy² = 3599, Σxy = 2352.
2. r = (10 × 2352 − 110 × 169) ÷ √((10×1540−110²) × (10×3599−169²)) = 4930 ÷ √(3300 × 7429) = 4930 ÷ 4951.3 = 0.9957
3. R² = 0.9914. Slope = 4930/3300 = 1.4939. Intercept = (169 − 1.4939×110)/10 = 0.467. Regression: y = 1.4939x + 0.467. t = 0.9957×√8/√(1−0.9914) = 30.4, p < 0.0001.
r = 0.9957, R² = 0.9914. Very strong positive linear relationship. Highly significant.
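The six sums in step 1 are the only part of this example that depends on the raw data, so they are worth checking on their own. A short sketch, where `six_sums` is a hypothetical helper name chosen for illustration:

```python
def six_sums(xs, ys):
    """Collect n and the five sums that the r formula needs."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of values")
    return {
        "n": len(xs),
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_x2": sum(x * x for x in xs),
        "sum_y2": sum(y * y for y in ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
    }

income = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
spending = [3, 7, 8, 14, 15, 19, 21, 25, 27, 30]
sums = six_sums(income, spending)
print(sums)
# {'n': 10, 'sum_x': 110, 'sum_y': 169, 'sum_x2': 1540, 'sum_y2': 3599, 'sum_xy': 2352}
```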
Example 2 - Strong Negative Correlation (Temperature vs Heating Oil)
Temperature (°C): 5, 8, 12, 15, 20, 22, 25, 28, 30, 35 and heating oil used (liters): 90, 85, 75, 68, 55, 50, 38, 30, 22, 10
1. As temperature rises, heating oil usage falls, so we expect a negative r.
2. n = 10. Σx = 200, Σy = 523, Σx² = 4876, Σy² = 34027, Σxy = 8050. nΣxy − ΣxΣy = 80500 − 104600 = −24100.
3. denom = √((48760−40000)×(340270−273529)) = √(8760×66741) = √(584,651,160) = 24179.6. r = −24100/24179.6 = −0.9967.
r = −0.9967, R² = 0.9934. Very strong negative linear relationship. Highly significant (t ≈ −34.8, df = 8, p < 0.0001).
Example 3 - Summary Statistics Mode (Exercise vs GPA)
A study of 25 students gave: n = 25, Σx = 87.5, Σy = 82.5, Σx² = 340, Σy² = 295, Σxy = 312
1. X = weekly exercise hours, Y = GPA on a 4.0 scale (scaled to match units). Switch to Summary Stats mode and enter the six values.
2. nΣxy − ΣxΣy = 25×312 − 87.5×82.5 = 7800 − 7218.75 = 581.25.
3. ssX = 25×340 − 87.5² = 8500 − 7656.25 = 843.75. ssY = 25×295 − 82.5² = 7375 − 6806.25 = 568.75. r = 581.25 / √(843.75×568.75) = 581.25/692.74 = 0.8391.
r = 0.8391, R² = 0.7040. Strong positive linear relationship. Statistically significant (t ≈ 7.40, df = 23, p < 0.0001).
Example 4 - Weak Correlation (Age vs Job Satisfaction)
Ages: 30, 42, 55, 28, 65, 38, 51, 74, 45, 60 and job satisfaction (0-100): 72, 85, 78, 90, 82, 68, 75, 80, 71, 88
1. We are testing whether age predicts job satisfaction. The data has a lot of scatter around any potential trend.
2. n = 10. Σx = 488, Σy = 789, Σx² = 25844, Σy² = 62751, Σxy = 38674. nΣxy − ΣxΣy = 386740 − 385032 = 1708.
3. ssX = 258440 − 488² = 20296, ssY = 627510 − 789² = 4989. r = 1708 / √(20296×4989) = 1708 / 10062.6 = 0.1697. t = 0.1697×√8/√(1−0.0288) = 0.480/0.985 = 0.487. p = 0.64 (not significant).
r = 0.1697, R² = 0.0288. Negligible positive correlation. Not statistically significant.
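The step from t to a two-tailed p-value is the one part of these examples that is hard to do by hand. One way to reproduce it with only the Python standard library is to integrate the Student t density numerically. The sketch below uses Simpson's rule; this is a technique swapped in for illustration, not necessarily how the calculator computes p, and the names `t_pdf` and `two_tailed_p` are assumptions. It is meant for finite t and moderate degrees of freedom, since `math.gamma` overflows for very large arguments.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p = 1 - 2 * (area under the density between 0 and |t|)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even count
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)       # clamp tiny negative round-off

# Example 4: n = 10 pairs, so df = 8.
n = 10
r = 1708 / math.sqrt(20296 * 4989)
t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
p = two_tailed_p(t, n - 2)
print(round(p, 2))  # 0.64
```

A p-value this large means a correlation of this size would be unremarkable in a sample of 10 even if the true correlation were zero.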