Residual Calculator
Analyse regression fit quality by computing residuals, SSR, RMSE, and standardised residuals.
📖 What are Residuals in Regression?
A residual is the difference between an observed data point and the value predicted by a statistical model: eᵢ = yᵢ − ŷᵢ. In linear regression, the model predicts an output value ŷ for each input x. The residual for observation i is the vertical distance between the observed y value and the fitted regression line. A positive residual means the point is above the regression line; a negative residual means it is below.
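As a small illustration of the definition eᵢ = yᵢ − ŷᵢ, the sketch below computes residuals for a line whose intercept and slope are already known. The data points and the coefficients (b0 = 1, b1 = 2) are made up for the example and are not taken from any real dataset.

```python
def residuals(xs, ys, b0, b1):
    """Return e_i = y_i - (b0 + b1 * x_i) for each observation."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


# Hypothetical observations and a hypothetical line y-hat = 1 + 2x
xs = [1.0, 2.0, 3.0]
ys = [3.5, 4.0, 7.0]

e = residuals(xs, ys, b0=1.0, b1=2.0)

for x, y, r in zip(xs, ys, e):
    # Sign of the residual tells us which side of the line the point is on
    position = "above" if r > 0 else "below" if r < 0 else "on"
    print(f"x={x}, y={y}, residual={r:+.2f} ({position} the line)")
```

The first point sits above the line, the second below it, and the third exactly on it.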
Residuals are one of the most important diagnostic tools in statistics. They reveal whether the model is appropriate for the data, whether the assumptions of linear regression hold, and whether any observations are unusually far from the fitted line. Ordinary Least Squares (OLS) regression finds the line that minimises the sum of squared residuals (SSR = Σeᵢ²); this is literally what "least squares" means.
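The least-squares idea can be checked numerically. The sketch below fits a line with the standard closed-form OLS estimates for one predictor (slope = Sxy / Sxx, intercept = ȳ − slope · x̄), then confirms that nudging either coefficient away from the fitted value increases SSR. The function names and the data are assumptions made for illustration.

```python
def fit_ols(xs, ys):
    """Closed-form OLS estimates (b0, b1) for simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def ssr(xs, ys, b0, b1):
    """Sum of squared residuals for the line y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, roughly following y = 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_ols(xs, ys)
best = ssr(xs, ys, b0, b1)
print(f"b0={b0:.3f}, b1={b1:.3f}, SSR={best:.4f}")

# Any small change to the intercept or slope should make SSR larger
for db0, db1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    worse = ssr(xs, ys, b0 + db0, b1 + db1)
    print(f"shift ({db0:+.2f}, {db1:+.2f}) -> SSR={worse:.4f}")
```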
The sum of squared residuals measures the total variation in y that the model leaves unexplained. The RMSE (root mean square error) is the typical size of a prediction error, expressed in the original units of y. Standardised residuals, which are residuals divided by the standard error of the residuals (SER), put all residuals on a common scale so that outliers can be identified regardless of the measurement units. A standardised residual beyond ±2 marks a point more than two standard errors away from its predicted value; if the errors are normally distributed, this happens for only about 5% of observations.
Residual analysis is an essential step after fitting any regression model. Random scatter in residual plots is consistent with an appropriate model; systematic patterns signal problems such as non-linearity, heteroscedasticity, or omitted variables that require model revision.
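To make "systematic pattern" concrete, the sketch below fits a straight line to data that actually follow a curve (y = x²) and prints the sign of each residual in order of x. The data and helper function are assumptions chosen for the illustration; a real check would normally use a residual plot.

```python
def fit_ols(xs, ys):
    """Closed-form OLS estimates (b0, b1) for simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Hypothetical non-linear data: y = x squared
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]

b0, b1 = fit_ols(xs, ys)
res = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# One character per observation: "+" above the line, "-" below, "0" on it
signs = "".join("+" if r > 0 else "-" if r < 0 else "0" for r in res)
print(res)    # [2.0, -1.0, -2.0, -1.0, 2.0]
print(signs)  # +---+
```

The residuals are positive at both ends and negative in the middle. A run of signs like this, rather than a random mix, is the kind of pattern that suggests a straight line is the wrong shape for the data.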
📐 Formulas
Predicted value (from regression equation): ŷᵢ = b₀ + b₁·xᵢ, where b₀ = intercept, b₁ = slope.
Sum of Squared Residuals (SSR): SSR = Σᵢ eᵢ² = Σᵢ (yᵢ − ŷᵢ)²
RMSE (Root Mean Square Error): RMSE = √(SSR / n)
Standard Error of Residuals (SER): SER = √(SSR / (n − 2)) for simple linear regression with one predictor.
Standardised Residual: zᵢ = eᵢ / SER
Outlier flag: |zᵢ| > 2 (potential outlier); |zᵢ| > 3 (strong outlier).
Mean Residual (OLS property): ē = (1/n) Σeᵢ = 0 when regression includes an intercept.
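All variables: xᵢ = input (predictor) value; yᵢ = observed value; ŷᵢ = predicted value; eᵢ = residual; zᵢ = standardised residual; b₀ = intercept; b₁ = slope; n = number of observations; SSR = sum of squared residuals; SER = standard error of residuals.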
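The formulas above can be combined into one routine. This is a minimal sketch, not the calculator's actual implementation: the function name `residual_summary`, the returned dictionary keys, the flag labels, and the sample data are all assumptions. The intercept and slope are passed in as given, so the mean residual is only guaranteed to be zero when they come from an OLS fit with an intercept.

```python
import math


def residual_summary(xs, ys, b0, b1):
    """Compute residuals, SSR, RMSE, SER and standardised residuals
    for the line y-hat = b0 + b1 * x (simple linear regression)."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 3:
        raise ValueError("need at least 3 observations so that n - 2 > 0")

    e = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    ssr = sum(r * r for r in e)
    rmse = math.sqrt(ssr / n)
    ser = math.sqrt(ssr / (n - 2))

    # With a perfect fit SER is 0; report all standardised residuals as 0
    z = [r / ser if ser > 0 else 0.0 for r in e]

    def flag(value):
        if abs(value) > 3:
            return "strong outlier"
        if abs(value) > 2:
            return "potential outlier"
        return "ok"

    return {
        "residuals": e,
        "mean_residual": sum(e) / n,
        "ssr": ssr,
        "rmse": rmse,
        "ser": ser,
        "standardised": z,
        "flags": [flag(v) for v in z],
    }


# Hypothetical data: points near y-hat = 1 + 2x, with the last one pushed up
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, 1.0]
ys = [1.0 + 2.0 * x + d for x, d in zip(xs, noise)]

summary = residual_summary(xs, ys, b0=1.0, b1=2.0)
print(f"SSR={summary['ssr']:.3f}  RMSE={summary['rmse']:.3f}  SER={summary['ser']:.3f}")
for x, z, f in zip(xs, summary["standardised"], summary["flags"]):
    print(f"x={x}: z={z:+.2f}  {f}")
```

Note that with this simple standardisation no residual can exceed √(n − 2) in absolute value, because eᵢ² can never be larger than SSR. With very few observations the ±2 and ±3 thresholds may therefore never trigger.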