Ch. 12 Linear Regression
Sections covered: 12.1, 12.2, 12.5
12.2 Estimating Model Parameters
Formulas to know from p. 498:
\(b_1 = \dfrac{\sum(x_i -\overline{x})(y_i - \overline{y})}{\sum(x_i - \overline{x})^2} = \frac{S_{xy}}{S_{xx}}\) and \(b_0 = \overline{y} - b_1 \overline{x}\)
Formula to know from p. 502:
\(SSE = \sum(y_i - \hat{y}_i)^2\)
Formulas to know from p. 504:
\(SST = \sum(y_i - \overline{y})^2\) and \(r^2 = 1 - \frac{SSE}{SST}\)
Formulas to know from p. 505:
\(SSR = \sum(\hat{y}_i - \overline{y})^2\) and \(SSE + SSR = SST\)
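The identity \(SSE + SSR = SST\) follows from splitting each deviation \(y_i - \overline{y}\) at the fitted value. A short sketch of the argument, writing \(e_i = y_i - \hat{y}_i\) for the residuals (the symbol \(e_i\) is introduced here for convenience) and assuming the line is the least squares fit with an intercept:

```latex
\begin{aligned}
SST &= \sum (y_i - \overline{y})^2
     = \sum \bigl[(y_i - \hat{y}_i) + (\hat{y}_i - \overline{y})\bigr]^2 \\
    &= \underbrace{\sum (y_i - \hat{y}_i)^2}_{SSE}
     + \underbrace{\sum (\hat{y}_i - \overline{y})^2}_{SSR}
     + 2\sum e_i(\hat{y}_i - \overline{y}) \\
\text{and}\quad
\sum e_i(\hat{y}_i - \overline{y})
    &= b_1 \sum e_i (x_i - \overline{x})
     = b_1\Bigl(\sum e_i x_i - \overline{x}\sum e_i\Bigr) = 0,
\end{aligned}
```

since \(\hat{y}_i - \overline{y} = b_1(x_i - \overline{x})\) and the least squares conditions give \(\sum e_i = 0\) and \(\sum e_i x_i = 0\).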
Resources
Interactive Visualization: Linear Regression. Try fitting the least squares line to a set of random data and check your answer (and another one).
Video: Regression I: What is regression? | SSE, SSR, SST | R-squared | Errors (ε vs. e)
Textbook p. 507 #17
Researchers fitted a simple linear regression model to explain how \(Y=\) porosity (%) is related to \(X=\) unit weight (pcf) in concrete specimens. Consider the following representative data:
x <- c(99.0, 101.1, 102.7, 103.0, 105.4, 107.0, 108.7, 110.8, 112.1, 112.4, 113.6, 113.8, 115.1, 115.4, 120.0)
y <- c(28.8, 27.9, 27.0, 25.2, 22.8, 21.5, 20.9, 19.6, 17.1, 18.9, 16.0, 16.7, 13.0, 13.6, 10.8)
Using R to find:
Model coefficients, \(b_0\) and \(b_1\):
mod <- lm(y ~ x)
mod$coefficients
## (Intercept) x
## 118.9099168 -0.9047307
(\(b_0\) is listed under (Intercept) and \(b_1\) is listed under x.)
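As a check on the p. 498 formulas, the same coefficients can be computed directly from \(S_{xy}\) and \(S_{xx}\). This is a minimal sketch; the variable names `x_bar`, `y_bar`, `Sxy`, and `Sxx` are chosen here for illustration and are not from the textbook.

```r
# Data from Textbook p. 507 #17
x <- c(99.0, 101.1, 102.7, 103.0, 105.4, 107.0, 108.7, 110.8, 112.1, 112.4, 113.6, 113.8, 115.1, 115.4, 120.0)
y <- c(28.8, 27.9, 27.0, 25.2, 22.8, 21.5, 20.9, 19.6, 17.1, 18.9, 16.0, 16.7, 13.0, 13.6, 10.8)

x_bar <- mean(x)
y_bar <- mean(y)

# Sums of cross products and squares about the means
Sxy <- sum((x - x_bar) * (y - y_bar))
Sxx <- sum((x - x_bar)^2)

# Least squares slope and intercept
b1 <- Sxy / Sxx
b0 <- y_bar - b1 * x_bar

# These should agree with lm(y ~ x)$coefficients
c(b0 = b0, b1 = b1)
```

The two printed values should match the (Intercept) and x entries reported by `lm()`.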
Residuals
mod$residuals
## 1 2 3 4 5 6 7
## -0.5415817 0.4583527 1.0059218 -0.5226590 -0.7513055 -0.6037364 0.3343057
## 8 9 10 11 12 13 14
## 0.9342401 -0.3896101 1.6818091 -0.1325141 0.7484321 -1.7754181 -0.9039989
## 15
## 0.4577621
rounded:
round(mod$residuals, 2)
## 1 2 3 4 5 6 7 8 9 10 11 12 13
## -0.54 0.46 1.01 -0.52 -0.75 -0.60 0.33 0.93 -0.39 1.68 -0.13 0.75 -1.78
## 14 15
## -0.90 0.46
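Each residual is the observed value minus the fitted value, \(y_i - \hat{y}_i\). The sketch below rebuilds them from the fitted coefficients and compares them with `mod$residuals`; the names `y_hat` and `e` are illustrative choices.

```r
# Data from Textbook p. 507 #17
x <- c(99.0, 101.1, 102.7, 103.0, 105.4, 107.0, 108.7, 110.8, 112.1, 112.4, 113.6, 113.8, 115.1, 115.4, 120.0)
y <- c(28.8, 27.9, 27.0, 25.2, 22.8, 21.5, 20.9, 19.6, 17.1, 18.9, 16.0, 16.7, 13.0, 13.6, 10.8)
mod <- lm(y ~ x)

# Fitted values from the estimated line: y_hat = b0 + b1 * x
b <- coef(mod)
y_hat <- b[["(Intercept)"]] + b[["x"]] * x

# Residuals: observed minus fitted
e <- y - y_hat

# Compare with the residuals stored in the model object
all.equal(e, unname(mod$residuals))

# For a least squares line with an intercept, the residuals sum to
# zero up to floating-point rounding
sum(e)
```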
SSR, SSE, SST
SSR (regression sum of squares):
SSR <- sum((mod$fitted.values - mean(y))^2)
SSR
## [1] 426.6185
SSE (error sum of squares):
SSE <- sum(mod$residuals^2)
SSE
## [1] 11.43883
SST (total sum of squares):
SST <- sum((y - mean(y))^2)
SST
## [1] 438.0573
(Check that SSR + SSE = SST)
SSR + SSE
## [1] 438.0573
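The same sums of squares can also be read from R's ANOVA table for the model, which may be a useful cross-check. In this sketch, the "Sum Sq" column of `anova(mod)` holds the regression sum of squares in the row for x and the error sum of squares in the Residuals row; the names `ss`, `SSR_tab`, and `SSE_tab` are illustrative.

```r
# Data from Textbook p. 507 #17
x <- c(99.0, 101.1, 102.7, 103.0, 105.4, 107.0, 108.7, 110.8, 112.1, 112.4, 113.6, 113.8, 115.1, 115.4, 120.0)
y <- c(28.8, 27.9, 27.0, 25.2, 22.8, 21.5, 20.9, 19.6, 17.1, 18.9, 16.0, 16.7, 13.0, 13.6, 10.8)
mod <- lm(y ~ x)

# Sum-of-squares column of the ANOVA table: first entry is for x
# (regression), second is for Residuals (error)
ss <- anova(mod)[["Sum Sq"]]
SSR_tab <- ss[1]
SSE_tab <- ss[2]

# Their total should equal SST computed from the definition
SST_def <- sum((y - mean(y))^2)
all.equal(SSR_tab + SSE_tab, SST_def)
```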
- What proportion of observed variation in porosity can be attributed to the approximate linear relationship between unit weight and porosity?
Method #1: SSR/SST
SSR/SST
## [1] 0.9738874
Method #2: \(r^2\)
cor(x, y)^2
## [1] 0.9738874
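Two further routes to the same proportion, sketched under the same data: the p. 504 formula \(r^2 = 1 - SSE/SST\), and the value stored by `summary(mod)`. The sketch also computes \(r\) itself as \(S_{xy}/\sqrt{S_{xx}S_{yy}}\), where \(S_{yy} = \sum(y_i - \overline{y})^2\) is notation assumed here by analogy with \(S_{xx}\); its sign matches the sign of the slope \(b_1\).

```r
# Data from Textbook p. 507 #17
x <- c(99.0, 101.1, 102.7, 103.0, 105.4, 107.0, 108.7, 110.8, 112.1, 112.4, 113.6, 113.8, 115.1, 115.4, 120.0)
y <- c(28.8, 27.9, 27.0, 25.2, 22.8, 21.5, 20.9, 19.6, 17.1, 18.9, 16.0, 16.7, 13.0, 13.6, 10.8)
mod <- lm(y ~ x)

# Route A: r^2 = 1 - SSE/SST
SSE <- sum(mod$residuals^2)
SST <- sum((y - mean(y))^2)
r2_formula <- 1 - SSE / SST

# Route B: value stored in the model summary
r2_summary <- summary(mod)$r.squared

# Sample correlation from the sums of squares and cross products
Sxy <- sum((x - mean(x)) * (y - mean(y)))
Sxx <- sum((x - mean(x))^2)
Syy <- sum((y - mean(y))^2)
r <- Sxy / sqrt(Sxx * Syy)

# r is negative here because the fitted slope is negative
c(r2_formula = r2_formula, r2_summary = r2_summary, r = r)
```

Because \(r\) is negative while \(r^2\) is close to 1, the data show a strong decreasing linear relationship between unit weight and porosity.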
12.5 Correlation
Skip: p. 530 “Inferences About the Population Correlation Coefficient” to the end of the section.
Resources
Interactive visualization: Correlation Coefficient (add and remove points)
Interactive visualization: Interpreting Correlations