## Checking Lack of Fit

The response plot may look good while the residual plot suggests that the unimodal MLR model can be improved. Examining plots to find model violations is called checking for lack of fit. Again assume that $n \geq 5 p$.

The unimodal MLR model often provides a useful model for the data, but the following assumptions do need to be checked.

i) Is the MLR model appropriate?
ii) Are outliers present?
iii) Is the error variance constant or nonconstant? The constant variance assumption $\operatorname{VAR}\left(e_i\right) \equiv \sigma^2$ is known as homoscedasticity. The nonconstant variance assumption $\operatorname{VAR}\left(e_i\right)=\sigma_i^2$ is known as heteroscedasticity.
iv) Are any important predictors left out of the model?
v) Are the errors $e_1, \ldots, e_n$ iid?
vi) Are the errors $e_i$ independent of the predictors $\boldsymbol{x}_i$?

Make the response plot and the residual plot to check i), ii), and iii). An MLR model is reasonable if the plots look like Figures 1.2, 1.3, 1.4, and 2.1. A response plot that looks like Figure 13.7 suggests that the model is not linear. If the plotted points in the residual plot do not scatter about the $r=0$ line with no other pattern (i.e., if the cloud of points is not ellipsoidal or rectangular with zero slope), then the unimodal MLR model is not sustained.

The $i$th residual $r_i$ is an estimator of the $i$th error $e_i$. The constant variance assumption may have been violated if the variability of the point cloud in the residual plot depends on the value of $\hat{Y}$. Often the variability of the residuals increases as $\hat{Y}$ increases, resulting in a right-opening megaphone shape. (Figure 4.1b has this shape.) Often the variability of the residuals decreases as $\hat{Y}$ increases, resulting in a left-opening megaphone shape. Sometimes the variability decreases then increases again, and sometimes the variability increases then decreases again (like a stretched or compressed football).
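As a rough illustration of the right-opening megaphone, the following Python sketch simulates data whose error standard deviation grows with the predictor, fits a least squares line, and compares the residual spread in the lower and upper halves of the fitted values. Everything here is an assumption made for illustration: the single predictor, the coefficients, the error SD of $0.5x$, and the helper name `fit_line` do not come from the text.

```python
import random
import statistics


def fit_line(x, y):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 400
x = [rng.uniform(1, 10) for _ in range(n)]
# Hypothetical heteroscedastic errors: the error SD is proportional to x,
# so the variability grows with the fitted value.
y = [2 + 3 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

b0, b1 = fit_line(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Compare the residual spread below and above the median fitted value.
cut = statistics.median(fitted)
sd_low = statistics.pstdev([r for f, r in zip(fitted, resid) if f <= cut])
sd_high = statistics.pstdev([r for f, r in zip(fitted, resid) if f > cut])
print(f"residual SD, lower half of fitted values: {sd_low:.2f}")
print(f"residual SD, upper half of fitted values: {sd_high:.2f}")
```

A clearly larger spread in the upper half is the numerical counterpart of a right-opening megaphone. Swapping the error SD for one that shrinks as $x$ grows would give the left-opening shape.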

## Residual Plots

Remark 2.3. Residual plots magnify departures from the model while the response plot emphasizes how well the MLR model fits the data.

Since the residuals $r_i=\hat{e}_i$ are estimators of the errors, the residual plot is used to visualize the conditional distribution $e \mid \mathrm{SP}$ of the errors given the sufficient predictor $\mathrm{SP}=\boldsymbol{x}^T \boldsymbol{\beta}$, where $\mathrm{SP}$ is estimated by $\widehat{Y}=\boldsymbol{x}^T \hat{\boldsymbol{\beta}}$. For the unimodal MLR model, there should not be any pattern in the residual plot: as a narrow vertical strip is moved from left to right, the behavior of the residuals within the strip should show little change.
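The moving strip idea can be approximated numerically. The sketch below is only one possible reading of it: it sorts the points by fitted value, cuts them into equal-count strips (an assumption; equal-width strips would also work), and reports the mean and SD of the residuals in each strip. The helper names `fit_line`, `strip_summary`, and `residual_strips`, and both simulated data sets, are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def strip_summary(fitted, resid, n_strips=5):
    """Return (mean, SD) of the residuals in each equal-count strip,
    with strips ordered from the smallest to the largest fitted values."""
    pairs = sorted(zip(fitted, resid))
    size = len(pairs) // n_strips
    out = []
    for k in range(n_strips):
        # The last strip absorbs any leftover points.
        stop = (k + 1) * size if k < n_strips - 1 else len(pairs)
        r = [p[1] for p in pairs[k * size:stop]]
        out.append((statistics.fmean(r), statistics.pstdev(r)))
    return out


def residual_strips(x, y, n_strips=5):
    """Fit a line, then summarize its residuals strip by strip."""
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return strip_summary(fitted, resid, n_strips)


rng = random.Random(1)  # fixed seed so the run is reproducible
x = [rng.uniform(0, 10) for _ in range(500)]
y_linear = [1 + 2 * xi + rng.gauss(0, 1) for xi in x]        # linear model holds
y_curved = [1 + 0.5 * xi ** 2 + rng.gauss(0, 1) for xi in x]  # linear model fails

good = residual_strips(x, y_linear)
bad = residual_strips(x, y_curved)
for label, strips in (("linear", good), ("curved", bad)):
    print(label)
    for k, (m, s) in enumerate(strips, 1):
        print(f"  strip {k}: mean {m:6.2f}  SD {s:5.2f}")
```

When the linear model holds, the strip means stay near zero and the strip SDs stay roughly equal. For the curved data the strip means swing from positive to negative and back, which is the kind of pattern the residual plot is meant to expose.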

Notation. A rule of thumb is a rule that often but not always works well in practice.

Rule of thumb 2.1. If the residual plot would look good after several points have been deleted, and if these deleted points were not gross outliers (points far from the point cloud formed by the bulk of the data), then the residual plot is probably good. Beginners often find too many things wrong with a good model. For practice, use the lregpack function MLRsim to generate several MLR data sets, and make the response and residual plots for these data sets: type MLRsim(nruns=10) in R and right click Stop for each plot (20 times) to generate 10 pairs of response and residual plots. This exercise will help show that the plots can have considerable variability even when the MLR model is good. See Problem 2.30.
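For readers without R at hand, the same point can be seen numerically. The following Python sketch is not the lregpack MLRsim function and makes no claim about how that function works. It is a hypothetical stand-in that simulates ten small data sets from a correct linear model with constant error variance. For each one it prints the ratio of the residual SD in the upper half of the fitted values to the residual SD in the lower half. The sample size, coefficients, and the helper names `fit_line` and `spread_ratio` are all assumptions.

```python
import random
import statistics


def fit_line(x, y):
    """Least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def spread_ratio(n, rng):
    """Simulate one data set from a correct linear model and return
    (residual SD in upper half of fitted values) / (SD in lower half)."""
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [1 + 2 * xi + rng.gauss(0, 1) for xi in x]  # constant error variance
    b0, b1 = fit_line(x, y)
    # Pair each fitted value with its residual and sort by fitted value.
    pairs = sorted((b0 + b1 * xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y))
    half = n // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.pstdev(high) / statistics.pstdev(low)


rng = random.Random(2)  # fixed seed so the run is reproducible
ratios = [spread_ratio(30, rng) for _ in range(10)]
print(" ".join(f"{r:.2f}" for r in ratios))
```

Even though every data set comes from a model with constant variance, the ratios scatter around 1 rather than equaling it, which is why a modest apparent pattern in a single residual plot should not be over-interpreted.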

Rule of thumb 2.2. If the plotted points in the residual plot look like a left- or right-opening megaphone, the first model violation to check for is nonconstant variance, i.e., a failure of the constant variance assumption. (This is a rule of thumb because it is possible that such a residual plot results from another model violation such as nonlinearity, but nonconstant variance is much more common.)

