The Least Squares Method
Residuals, the Normal Equations, and the Line That Best Fits — A TLDR Primer
Least squares regression shows up in statistics class, AP courses, college math, and data science — and most textbooks bury the core idea under pages of theory before you ever see a single worked number. This guide cuts straight to what you need.
**The Least Squares Method: A TLDR Primer** walks you through linear regression from start to finish, concisely and without the bloat. You'll start with the underlying problem: real data is messy, and you need a principled way to draw the best possible line through it. From there, the guide explains exactly why squaring residuals is the right move (not absolute values, not signed sums), derives the slope and intercept formulas using basic calculus, and then works through a complete numerical example by hand, including residuals, R-squared, and a prediction.
The guide doesn't stop at the simple case. It names the failure modes that trip students up — outliers, nonlinearity, heteroscedasticity — explains what each one actually does to your results, and points to the standard fixes. The final section generalizes to multiple regression through the matrix normal equations, giving you a clean on-ramp to the statistics and machine learning courses that come next.
This guide is written for high school students in statistics or precalculus, college freshmen and sophomores meeting regression for the first time, and anyone who needs a fast, honest explanation of how the line of best fit actually works. It is short by design: every section earns its place.
If least squares is on your next exam or assignment, start here.
- Define residuals and explain why we minimize their squares rather than their absolute values or signed sums
- Derive and apply the formulas for the slope and intercept of the least-squares regression line
- Compute a best-fit line by hand from a small data set and interpret slope, intercept, and R-squared
- Recognize when least squares is appropriate and when outliers, nonlinearity, or heteroscedasticity break it
- Connect the one-variable case to the general normal equations used in multiple regression
- 1. **The Problem: Fitting a Line to Messy Data.** Sets up the core problem least squares solves and introduces residuals and the cost function.
- 2. **Why Squares? The Logic Behind the Choice.** Explains why we square residuals instead of taking absolute values or signed sums, with both geometric and statistical reasons.
- 3. **Deriving the Slope and Intercept Formulas.** Uses calculus to minimize the sum of squared residuals and derive the closed-form formulas for slope and intercept.
- 4. **A Worked Example from Scratch.** Computes a best-fit line by hand on a small data set, including residuals, R-squared, and prediction.
- 5. **When Least Squares Fails (and What to Do About It).** Names the common failure modes (outliers, nonlinearity, heteroscedasticity) and the standard fixes.
- 6. **Beyond One Variable: The General Picture.** Generalizes to multiple regression via the matrix normal equations and points to where this leads in statistics and machine learning.
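
For a taste of what sections 3 and 4 cover, here is a minimal Python sketch of the standard closed-form slope and intercept formulas plus R-squared. The data set and the function names `fit_line` and `r_squared` are made up for illustration; this is not the guide's own worked example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Return 1 - SSE/SST, the share of variation in y explained by the line."""
    y_bar = sum(ys) / len(ys)
    # Residual = observed y minus predicted y.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data, chosen only so the arithmetic is easy to check by hand.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.2f}")
# prints: slope=0.60 intercept=2.20 R^2=0.60
```

With these numbers the fitted line is roughly y = 2.2 + 0.6x, so the prediction at x = 6 is about 5.8.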