Perform simple linear regression analysis on data points. Calculate the best-fit line equation, correlation coefficient, R-squared, and make predictions with step-by-step solutions.
Calculate the equation of the best-fit line for your data using least squares regression. Get slope, intercept, correlation coefficient (r), R-squared, standard error, and make predictions. Includes scatter plot visualization with regression line.
Linear regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. Simple linear regression finds the best-fitting straight line (ŷ = b₀ + b₁x) through data points by minimizing the sum of squared residuals. It's fundamental to predictive modeling and data analysis.
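The least-squares fit described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own implementation: the function names `fit_line` and `predict` and the sample data are assumptions made for the example. It uses the standard closed-form solution, in which the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the intercept makes the line pass through the point of means.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (b0, b1) for the line y_hat = b0 + b1 * x that minimizes
    the sum of squared residuals. Hypothetical helper for illustration."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept: the line passes through the means
    return b0, b1


def predict(b0, b1, x):
    """Evaluate the fitted line at x."""
    return b0 + b1 * x


# Made-up sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = fit_line(xs, ys)
print(f"y_hat = {b0:.2f} + {b1:.2f}x")
print(f"prediction at x = 6: {predict(b0, b1, 6):.2f}")
```

For this sample data the slope comes out to about 0.6 and the intercept to about 2.2, so the prediction at x = 6 is about 5.8.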
Regression Equation
ŷ = b₀ + b₁x
Predict future sales based on advertising spend, time, or other factors.
Analyze experimental data and quantify relationships between variables.
Model stock returns, estimate asset values, or analyze economic trends.
Study relationships between study hours and test scores, or other educational metrics.
R-squared, also called the coefficient of determination, indicates how well the regression line fits your data. An R² of 0.85 means 85% of the variance in Y is explained by X. Values closer to 1 indicate a better fit, while values near 0 suggest the model doesn't explain the data well.
The correlation coefficient (r) measures the strength and direction of the linear relationship between two variables and ranges from -1 to 1. In simple linear regression, R² equals r squared and gives the proportion of variance explained. Unlike r, R² is never negative and doesn't indicate direction.
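The relationship between r and R² can be checked numerically. The sketch below is illustrative only: the function name `r_and_r_squared` and the sample data are assumptions. It computes r from the deviation sums and, separately, R² as one minus the ratio of the residual sum of squares to the total sum of squares. Reversing the y-values flips the sign of r but leaves R² unchanged.

```python
from math import sqrt
from statistics import mean


def r_and_r_squared(xs, ys):
    """Return (r, R^2) for a simple linear regression of ys on xs.
    Hypothetical helper for illustration; assumes xs and ys both vary."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)   # total sum of squares
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    # Pearson correlation coefficient: carries the sign of the relationship.
    r = sxy / sqrt(sxx * syy)

    # R^2 computed independently from the fitted line's residuals.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_res / syy
    return r, r_squared


# Made-up sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 5, 4, 5]
ys_down = list(reversed(ys_up))

r_up, r2_up = r_and_r_squared(xs, ys_up)
r_down, r2_down = r_and_r_squared(xs, ys_down)
print(f"increasing trend: r = {r_up:.4f}, R^2 = {r2_up:.4f}")
print(f"decreasing trend: r = {r_down:.4f}, R^2 = {r2_down:.4f}")
```

Both data sets give an R² of about 0.6, while r is roughly 0.77 for the increasing trend and roughly -0.77 for the decreasing one.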
While you technically need at least 2 points to fit a line, meaningful statistical analysis requires more. As a rule of thumb, aim for at least 10-20 data points, and preferably more, to get reliable results. The standard error calculation requires n > 2 because it divides by n - 2 degrees of freedom.
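To see why more than two points are needed, here is a minimal sketch that assumes "standard error" means the standard error of the estimate (the residual standard error), computed as the square root of the residual sum of squares divided by n - 2. The function name `standard_error_of_estimate` and the sample data are hypothetical. With exactly two points the line fits perfectly and the divisor is zero, so the function rejects that input.

```python
from math import sqrt
from statistics import mean


def standard_error_of_estimate(xs, ys):
    """Return sqrt(SS_res / (n - 2)) for a simple linear regression.
    Hypothetical helper; assumes the residual standard error is meant."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n <= 2:
        # Two parameters (b0, b1) are estimated, leaving n - 2 degrees of freedom.
        raise ValueError("standard error requires more than 2 data points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(ss_res / (n - 2))


# Made-up sample data, for illustration only.
se = standard_error_of_estimate([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"standard error of the estimate: {se:.4f}")
```

For this sample data the residual sum of squares is about 2.4 with 3 degrees of freedom, giving a standard error of roughly 0.89.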
Don't use linear regression when: the relationship isn't linear (check the scatter plot), there are significant outliers, the data shows heteroscedasticity (non-constant variance), or the observations aren't independent. In these cases, consider other methods or data transformations.