Linear Regression Calculator
Fit a straight line to your data and get slope, intercept, R², and predictions instantly.
📖 What is Linear Regression?
Linear regression is the most fundamental statistical modelling technique. It models the relationship between a continuous dependent variable (Y) and one or more independent variables (X) by fitting a straight line through the data. The goal is to find the line that best represents the underlying relationship - specifically, the line that minimises the sum of squared vertical distances from each data point to the line.
Developed in the early 19th century by Gauss and Legendre, the method of least squares underpins an enormous range of applications: predicting house prices from square footage, estimating the effect of advertising spend on sales, calibrating scientific instruments, and quantifying the relationship between a drug dose and patient response.
The fitted line y = mx + b relies on several assumptions, most importantly linearity (the true relationship is roughly linear), homoscedasticity (the variance of the residuals is roughly constant across X values), and independence of the errors. When these are violated, alternatives like polynomial regression, weighted regression, or non-linear regression are needed.
This calculator performs simple linear regression (one predictor) using the ordinary least squares method and reports the regression equation, R², correlation coefficient, standard error of regression, and residuals.
📐 Formulas
Slope: m = [n·Σ(xy) − Σx·Σy] / [n·Σ(x²) − (Σx)²]
Intercept: b = ȳ − m·x̄
R² (coefficient of determination): R² = 1 − SS_res/SS_tot
where SS_res = Σ(yᵢ − ŷᵢ)² and SS_tot = Σ(yᵢ − ȳ)²
Correlation coefficient: r = [n·Σ(xy) − Σx·Σy] / √{[n·Σx² − (Σx)²][n·Σy² − (Σy)²]}
Note: in simple regression R² = r², so r = ±√R², taking the sign of the slope m.
Standard error of regression: SER = √[SS_res / (n−2)]
Residual for point i: eᵢ = yᵢ − ŷᵢ = yᵢ − (mxᵢ + b)
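The formulas above translate directly into code. Here is a minimal from-scratch sketch in Python (the function name and return layout are illustrative, not part of the calculator itself):

```python
import math

def linear_regression_ols(xs, ys):
    """Simple linear regression by ordinary least squares.

    Returns slope m, intercept b, R², correlation r, standard error
    of regression, and the residuals e_i = y_i - (m*x_i + b).
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Slope: m = [n·Σ(xy) − Σx·Σy] / [n·Σ(x²) − (Σx)²]
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    # Intercept: b = ȳ − m·x̄
    b = sy / n - m * sx / n
    residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
    ss_res = sum(e * e for e in residuals)
    ybar = sy / n
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    # In simple regression r = ±√R², with the sign of the slope
    r = math.copysign(math.sqrt(r2), m)
    ser = math.sqrt(ss_res / (n - 2))
    return m, b, r2, r, ser, residuals
```

Calling it on any of the example datasets below reproduces the reported statistics.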
📖 How to Use This Calculator
📝 Example Calculations
Example 1 - Study Hours vs Exam Score
X (hours): 1, 2, 3, 4, 5. Y (score): 55, 62, 70, 75, 82.
Σx=15, Σy=344, Σx²=55, Σxy=1099, n=5
m = (5×1099 − 15×344)/(5×55 − 225) = (5495−5160)/(275−225) = 335/50 = 6.70
b = 344/5 − 6.70×15/5 = 68.8 − 20.1 = 48.7
Equation: ŷ = 6.70x + 48.7. R² ≈ 0.996 - excellent fit.
Example 2 - Advertising vs Sales
X (ad spend £000): 10, 20, 30, 40, 50. Y (sales £000): 120, 145, 175, 195, 230.
m = 2.70, b = 92.0. Equation: ŷ = 2.70x + 92.0
R² ≈ 0.995. Prediction at x = 35: ŷ = 2.70×35 + 92.0 = 186.5 (£186,500 expected sales).
Example 3 - Temperature vs Ice Cream Sales
X (°C): 20, 25, 30, 35, 40. Y (units/day): 80, 120, 160, 200, 250.
m = 8.40, b = −90.0. Equation: ŷ = 8.40x − 90.0
R² ≈ 0.998. At 38°C: ŷ = 8.40×38 − 90.0 = 229.2 ≈ 229 units expected.
Example 4 - Poor Fit Example
X: 1, 2, 3, 4, 5. Y: 10, 40, 20, 45, 15. Random scatter with no trend.
R² ≈ 0.02 - the linear model explains almost none of the variation. A different model (or accepting that there is no relationship) is needed.