Polynomial Regression Calculator

Fit a polynomial of any degree from 1 to 6 to your data using least squares and get all coefficients, R², and predictions.


📖 What is Polynomial Regression?

Polynomial regression is a powerful form of regression analysis that fits a polynomial of degree n - y = a₀ + a₁x + a₂x² + ··· + aₙxⁿ - to observed data. It is a generalisation of linear regression: degree 1 gives a straight line, degree 2 a parabola, degree 3 a cubic, and so on up to degree 6 in this calculator.

The key insight is that although the model is non-linear in x (it involves powers of x), it is still linear in the coefficients a₀, a₁, ..., aₙ. This means that the same ordinary least-squares principle applies: minimise the sum of squared residuals with respect to the coefficients, which yields a system of linear equations - the normal equations - that can be solved exactly using Gaussian elimination.

Polynomial regression underpins a huge variety of real-world applications. Analytical chemists use degree-3 to degree-5 polynomials for spectrophotometer calibration curves. Meteorologists fit degree-4 polynomials to hourly temperature profiles over 24 hours. Economists use cubic polynomials for cost functions. Signal processing engineers use high-degree polynomial baselines for background subtraction in spectroscopy. Epidemiologists use polynomial curves to model the shape of infection waves.

This calculator handles degrees 1 through 6, solving the Vandermonde normal equations by Gaussian elimination with partial pivoting. Results include all coefficients, R², adjusted R² (which penalises extra parameters), and a prediction field for evaluating the fitted polynomial at any X.

A critical principle: always prefer the simplest adequate model. Higher degree = more flexibility but higher risk of overfitting. Start with degree 1 or 2, check residuals, and increase degree only when there is clear systematic curvature remaining in the residuals.

📐 Formulas

ŷ = a₀ + a₁x + a₂x² + ··· + aₙxⁿ

Normal equations (Vandermonde system, (n+1) × (n+1)):

The matrix A has entries A[i][j] = Σₖ xₖ^(i+j), for i, j = 0, 1, ..., n.

The right-hand side vector b has entries b[i] = Σₖ yₖ · xₖⁱ.

Solve A·coeff = b for [a₀, a₁, ..., aₙ] using Gaussian elimination with partial pivoting.
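The construction and solve steps above can be sketched in a few lines of Python - a minimal illustration of the method described, not the calculator's actual source code:

```python
def polyfit_normal(xs, ys, degree):
    """Least-squares polynomial fit via the Vandermonde normal equations.

    A[i][j] = sum_k x_k^(i+j), b[i] = sum_k y_k * x_k^i; solve A.a = b
    by Gaussian elimination with partial pivoting.
    """
    m = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]

    for col in range(m):  # forward elimination with partial pivoting
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, m):
            f = A[row][col] / A[col][col]
            A[row] = [arj - f * acj for arj, acj in zip(A[row], A[col])]
            b[row] -= f * b[col]

    a = [0.0] * m  # back substitution
    for i in range(m - 1, -1, -1):
        a[i] = (b[i] - sum(A[i][j] * a[j] for j in range(i + 1, m))) / A[i][i]
    return a  # [a0, a1, ..., an]
```

For data that is exactly polynomial, the residual is zero and the original coefficients are recovered to floating-point precision.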

R²: R² = 1 − SS_res / SS_tot, where SS_res = Σ(yᵢ − ŷᵢ)², SS_tot = Σ(yᵢ − ȳ)²

Adjusted R²: R²_adj = 1 − (1 − R²) · (n − 1) / (n − p − 1), where p = degree
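Both goodness-of-fit measures take only a few lines; a minimal sketch:

```python
def r2_scores(ys, y_hats, degree):
    """R² and adjusted R² for a polynomial fit of degree p (= degree)."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_res = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))   # residual SS
    ss_tot = sum((y - mean_y) ** 2 for y in ys)                # total SS
    r2 = 1 - ss_res / ss_tot
    adj = 1 - (1 - r2) * (n - 1) / (n - degree - 1)            # penalised
    return r2, adj
```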

Prediction: ŷ = a₀ + a₁x + a₂x² + ··· + aₙxⁿ evaluated at any chosen X using Horner's method for numerical efficiency.
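Horner's method rewrites the polynomial as nested multiplications, a₀ + x(a₁ + x(a₂ + ···)), so evaluation costs only n multiplies and n adds; a minimal sketch:

```python
def horner(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x^n with n multiplies and n adds."""
    result = 0.0
    for a in reversed(coeffs):  # start at the highest-degree coefficient
        result = result * x + a
    return result
```

For instance, `horner([1, 3, 1], 2)` evaluates 1 + 3x + x² at x = 2, giving 11.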

📖 How to Use This Calculator

1. Enter the X values and Y values as comma-separated numbers. Both lists must have the same count, and you need at least degree + 2 points for a meaningful fit.
2. Select the polynomial degree from the dropdown. Begin with degree 1 or 2; only increase the degree if the residuals show clear systematic curvature.
3. Click Calculate Polynomial Regression. All coefficients a₀ through aₙ, R², and adjusted R² are displayed, and the full equation is shown in the equation box.
4. Enter any X in the Prediction field to get the predicted Y from the fitted polynomial. Be cautious extrapolating far beyond the observed X range - polynomial curves can behave unpredictably outside the data.
5. Compare R² and adjusted R² across degrees. If adjusted R² decreases when you increase the degree, the extra terms are not justified by the data - revert to the lower degree.

📝 Example Calculations

Example 1 - Degree 3: Economic Cubic Cost Function

Output Q: 1, 2, 3, 4, 5, 6, 7, 8. Total Cost TC (£000): 8, 11, 13, 17, 25, 40, 65, 104.

Degree 3 fit: a₀ ≈ 0.357, a₁ ≈ 10.55, a₂ ≈ −3.521, a₃ ≈ 0.477. R² ≈ 0.9999. Adj. R² ≈ 0.9999.

Equation: ŷ = 0.357 + 10.55x − 3.521x² + 0.477x³.

Prediction at Q = 9: ŷ = 0.357 + 10.55(9) − 3.521(81) + 0.477(729) ≈ 158, i.e. about £158,000.

Result = ŷ = 0.357 + 10.55x − 3.521x² + 0.477x³, R² ≈ 0.9999
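As a sanity check, the cubic can be evaluated against the raw cost data in pure Python (coefficients rounded to three decimals, so treat the residuals as approximate):

```python
def poly(x, a):
    """Evaluate a0 + a1*x + a2*x^2 + ... at x."""
    return sum(ak * x ** k for k, ak in enumerate(a))

a = [0.357, 10.55, -3.521, 0.477]        # [a0, a1, a2, a3], rounded
q = [1, 2, 3, 4, 5, 6, 7, 8]
tc = [8, 11, 13, 17, 25, 40, 65, 104]    # total cost, £000

worst = max(abs(y - poly(x, a)) for x, y in zip(q, tc))  # largest residual
pred = poly(9, a)                         # forecast cost at Q = 9, in £000
```

The largest residual stays below 0.5 (£500) and the Q = 9 forecast lands near 158.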

Example 2 - Degree 4: Hourly Temperature Profile

Hour X: 0, 3, 6, 9, 12, 15, 18, 21, 24. Temp °C: 12, 10, 9, 16, 26, 29, 25, 18, 13.

Degree 4 fit: a₀ ≈ 13.03, a₁ ≈ −4.57, a₂ ≈ 1.00, a₃ ≈ −0.0565, a₄ ≈ 0.00094. R² ≈ 0.94. Adj. R² ≈ 0.88.

The degree-4 model captures the morning dip, afternoon peak, and evening decline. A degree-2 fit (R² ≈ 0.56) misses the morning cooling entirely - the extra terms are clearly justified.

Prediction at hour 14: ŷ ≈ 27.0 °C - consistent with the expected mid-afternoon maximum.

Result = Degree 4 fit, R² ≈ 0.94, Adj. R² ≈ 0.88

Example 3 - Degree 2 vs Degree 3: When Higher Degree is Not Needed

X: 0, 1, 2, 3, 4, 5, 6. Y: 1, 5, 11, 19, 29, 41, 55.

Degree 2: a₀ ≈ 1.00, a₁ ≈ 3.00, a₂ ≈ 1.00. R² = 1.000. Adj. R² = 1.000.

Degree 3: a₀ ≈ 1.00, a₁ ≈ 3.00, a₂ ≈ 1.00, a₃ ≈ 0. R² = 1.000. Adj. R² stays at 1.000, but the cubic coefficient is essentially zero.

Conclusion: the data follows an exact quadratic (y = x² + 3x + 1). Adding a cubic term explains nothing extra. Prefer degree 2 for this data.

Result = ŷ = x² + 3x + 1, R² = 1.000 (exact quadratic)
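Because this dataset is generated by an exact quadratic, the fit can be checked in a couple of lines of pure Python:

```python
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 5, 11, 19, 29, 41, 55]

# The quadratic y = x^2 + 3x + 1 reproduces every point exactly,
# so SS_res = 0 and R^2 = 1 with no rounding involved.
fitted = [x ** 2 + 3 * x + 1 for x in xs]
assert fitted == ys
```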

Example 4 - Degree 5: Calibration Curve in Analytical Chemistry

Concentration (μg/mL): 0, 2, 4, 6, 8, 10, 12, 14, 16. Absorbance: 0.000, 0.082, 0.197, 0.348, 0.496, 0.621, 0.718, 0.781, 0.820.

The curve flattens at high concentrations (detector saturation). Degree 1 R² ≈ 0.980 - apparently high, but the residuals show strong systematic curvature. Degree 2 R² ≈ 0.990. Degree 3 R² ≈ 0.9993. Degree 5 pushes R² ≈ 0.9999, capturing essentially all of the curvature.

For a required absorbance of 0.650, solving the degree-3 equation gives concentration ≈ 10.6 μg/mL, whereas the straight-line calibration would predict ≈ 11.6 μg/mL - a full unit off.

Result = Degree 5 fit, R² ≈ 0.9999 (detector saturation captured)

Example 5 - Demonstrating Overfitting (Degree 6 on 8 Points)

X: 1, 2, 3, 4, 5, 6, 7, 8. Y (noisy data): 3, 10, 18, 22, 21, 15, 8, 2.

Degree 2: R² ≈ 0.962, Adj. R² ≈ 0.947. Clean inverted-parabola fit.

Degree 3: R² ≈ 0.969, Adj. R² ≈ 0.945 - adjusted R² actually falls, so the cubic term is not justified.

Degree 6: R² ≈ 0.9998, Adj. R² ≈ 0.9986 - the curve passes nearly through all 8 points but oscillates wildly near x = 0 and x = 9. A prediction at x = 9 from the degree-6 fit could be badly wrong despite the near-perfect R².

Lesson: for this data, degree 2 is the right model. Degree 6 merely memorises the noise in the 8 training points.

Result = Degree 2: R² ≈ 0.962 (best model - degree 6 overfits)
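The arithmetic behind the adjusted-R² penalty makes the problem explicit; a small sketch using this example's sample size of n = 8:

```python
def adj_r2(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 8
dof = n - 6 - 1               # degree 6 leaves just 1 residual degree of freedom
adj6 = adj_r2(0.9998, n, 6)   # barely penalised, yet the df are exhausted
adj2 = adj_r2(0.962, n, 2)    # degree 2 keeps 5 residual degrees of freedom
```

With one residual degree of freedom the penalty factor (n − 1)/(n − p − 1) = 7, so even a tiny unexplained fraction is magnified sevenfold - and any further noise memorisation goes undetected by R² alone.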

❓ Frequently Asked Questions

What is polynomial regression?
Polynomial regression fits a polynomial of degree n - y = a₀ + a₁x + a₂x² + ··· + aₙxⁿ - to a dataset using the method of least squares. It is a generalisation of linear regression (degree 1), quadratic regression (degree 2), and cubic regression (degree 3) to any degree. The coefficients a₀, a₁, ..., aₙ are found by solving a (n+1) × (n+1) system of normal equations. This calculator supports degrees 1 through 6.
How does polynomial regression work mathematically?
The normal equations are derived by minimising SS_res = Σ(yᵢ − Σ aₖxᵢᵏ)² over a₀, a₁, ..., aₙ. The result is a symmetric (n+1)×(n+1) Gram matrix system: Aᵢⱼ = Σxₖ^(i+j) and bᵢ = Σyₖxₖⁱ for i, j = 0..n. This is also known as the Vandermonde normal equations. The system is solved here using Gaussian elimination with partial pivoting.
How do I choose the right polynomial degree?
Start with a scatter plot of your data to visually assess how many turning points the relationship has. A linear trend → degree 1. One turning point → degree 2 (quadratic). Two turning points → degree 3 (cubic). Then fit successive degrees and compare adjusted R²: if it rises meaningfully when you add a term, the higher degree is justified; if it falls, revert to the lower degree. Beyond the natural complexity of the data, adding more terms captures noise rather than signal (overfitting). As a rule of thumb, keep the number of fitted coefficients below about n/3, where n is your sample size.
What is the difference between polynomial regression and polynomial interpolation?
Polynomial interpolation (e.g. Lagrange) passes the curve through every data point exactly by using a polynomial of degree n−1 for n points. Polynomial regression with fewer parameters than data points finds the best-fit polynomial that minimises squared errors but does not pass through all points. For noisy data, regression is always preferred as interpolation overfits noise and oscillates wildly (Runge's phenomenon). Regression uses Gaussian elimination on the normal equations; interpolation uses different methods (divided differences, etc.).
What is overfitting in polynomial regression?
Overfitting occurs when the polynomial degree is too high relative to the amount of data. A degree-6 polynomial fitted to 8 points will achieve a very high R², but the curve will oscillate sharply between data points and make poor predictions for new X values. Symptoms: very high R² but wild oscillations between data points; coefficients are very large and unstable; removing one data point drastically changes all coefficients. The fix is to use a lower degree, cross-validation, or regularised regression (Ridge/LASSO).
What is R-squared in polynomial regression?
R² = 1 − SS_res/SS_tot measures the proportion of variance in Y explained by the polynomial model. Importantly, R² can never decrease when you add more polynomial terms - it will always be at least as high as the lower-degree R². This means you should not simply choose the degree that maximises R². Instead, compare models using adjusted R² (which penalises for extra parameters) or formal F-tests for the added terms.
What is adjusted R-squared?
Adjusted R² = 1 − (1−R²)(n−1)/(n−p−1), where n is the number of data points and p is the number of predictors (= degree). Adjusted R² penalises for adding more terms and can actually decrease if extra terms do not improve the fit enough to justify the added complexity. It is a better criterion than R² for comparing polynomial models of different degrees with the same data.
What are the numerical issues with high-degree polynomial regression?
High-degree polynomial regression can suffer from numerical ill-conditioning: powers like x⁵ and x⁶ can be enormous numbers, leading to catastrophic cancellation in the normal equations. Practical remedies: (1) centre X by subtracting x̄ before fitting; (2) scale X to unit variance; (3) use orthogonal polynomials (Chebyshev or Legendre basis) instead of the monomial basis. This calculator uses partial-pivoting Gaussian elimination which helps considerably, but for degree 5–6 with large X values, centering X is strongly recommended.
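A small illustration of why centering helps - the year values and variable names here are hypothetical, not from the calculator:

```python
xs = [2000, 2005, 2010, 2015, 2020]   # hypothetical X values (years)

# A sum-of-x^6 entry of the normal matrix, as needed for a degree-3 fit:
raw_entry = sum(x ** 6 for x in xs)               # ~3e20 before centering

xbar = sum(xs) / len(xs)
centered_entry = sum((x - xbar) ** 6 for x in xs)  # ~2e6 after centering

# After centering, the normal-matrix entries span far fewer orders of
# magnitude, so Gaussian elimination loses far less floating-point precision.
```

Fit in the centered variable u = x − x̄, then expand the polynomial back to x afterwards if the original-scale coefficients are needed.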
When would I use degree 4, 5, or 6?
Degree 4 or higher is useful for: (1) modelling complex periodic-like data over a limited range (e.g. hourly data over one day); (2) fitting calibration curves in analytical chemistry with known non-linearity; (3) approximating smooth functions where a closed-form is unavailable; (4) exploratory data analysis to identify the underlying complexity. In most practical cases, degrees 1–3 are sufficient. Degrees 5–6 are rarely physically motivated and should be used with caution.