## Estimation and Practical Use of $\sigma^{2}$

The parameter $\sigma^{2}$ is perhaps the most important parameter of a regression model because it measures prediction accuracy. As shown previously, another way to write the model is $Y=\beta_{0}+\beta_{1} x+\varepsilon$, where $\varepsilon \sim \mathrm{N}\left(0, \sigma^{2}\right)$. Thus the prediction error terms are the $\varepsilon$ values, and these differ from zero with a variance of $\sigma^{2}$.

If the $\beta$'s were known (as is true in simulations but not in reality), you could calculate the errors $\varepsilon_{i}=\left\{Y_{i}-\left(\beta_{0}+\beta_{1} x_{i}\right)\right\}$ and obtain an unbiased estimate of $\sigma^{2}$ as:
(An unbiased estimator of $\sigma^{2}$ ) $=\frac{1}{n} \sum_{i=1}^{n} \varepsilon_{i}^{2}$
This estimator is unbiased because each individual $\varepsilon_{i}^{2}$ is an unbiased estimator of $\sigma^{2}$, which you can see as follows:
$$\mathrm{E}\left(\varepsilon_{i}^{2}\right)=\mathrm{E}\left\{\left(Y_{i}-\beta_{0}-\beta_{1} x_{i}\right)^{2} \mid X=x_{i}\right\}=\operatorname{Var}\left(Y_{i} \mid X=x_{i}\right)=\sigma^{2} .$$
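The step from the individual terms to the average relies only on linearity of expectation. Written out (a short worked step, using the same symbols as above):

$$\mathrm{E}\left(\frac{1}{n} \sum_{i=1}^{n} \varepsilon_{i}^{2}\right)=\frac{1}{n} \sum_{i=1}^{n} \mathrm{E}\left(\varepsilon_{i}^{2}\right)=\frac{1}{n} \cdot n \sigma^{2}=\sigma^{2} .$$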

However, in practice, you cannot use this estimator because the $\beta$'s are unknown; thus the $\varepsilon$'s are unknown (or unobservable) as well. But you can use a similar estimator based on the residuals $e_{i}=\left\{Y_{i}-\left(\hat{\beta}_{0}+\hat{\beta}_{1} x_{i}\right)\right\}$, which are observable:
(Another estimator of $\sigma^{2}$ ) $=\frac{1}{n} \sum_{i=1}^{n} e_{i}^{2}$
This is, in fact, the maximum likelihood estimator, as given in Chapter 2. However, this estimator is biased: Recall that the values $\hat{\beta}_{0}, \hat{\beta}_{1}$ are chosen to minimize SSE; that is, $\mathrm{SSE}=\sum_{i=1}^{n} e_{i}^{2}$ is a minimum. In particular, $\mathrm{SSE}=\sum_{i=1}^{n} e_{i}^{2} \leq \sum_{i=1}^{n} \varepsilon_{i}^{2}$, which means that the estimator $\frac{1}{n} \sum_{i=1}^{n} e_{i}^{2}=\mathrm{SSE} / n$ is biased low since $\frac{1}{n} \sum_{i=1}^{n} \varepsilon_{i}^{2}$ is unbiased.

In basic statistics, you learned that the variance estimator uses "$n-1$" in the denominator instead of "$n$" to remove similar bias; the quantity "$n-1$" is sometimes called degrees of freedom. You may have also heard that you lose a degree of freedom for every parameter you estimate. In regression, these parameters refer to the $\beta$'s, so in simple regression, you lose two degrees of freedom. This leads to the following estimator of $\sigma^{2}$:
$$\hat{\sigma}^{2}=\frac{1}{n-2} \sum_{i=1}^{n} e_{i}^{2}=\frac{\mathrm{SSE}}{n-2}$$
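The bias argument above can be checked by simulation. The following is a minimal sketch, not taken from the text: the parameter values (`beta0`, `beta1`, `sigma`), the sample size, and the evenly spaced `x` values are all illustrative assumptions. It averages three quantities over many simulated data sets: the infeasible estimator based on the true errors, $\mathrm{SSE}/n$, and the degrees-of-freedom-corrected $\mathrm{SSE}/(n-2)$.

```python
import numpy as np

def simulate_sigma2_estimators(n=10, beta0=1.0, beta1=1.5, sigma=2.0,
                               n_sims=20000, seed=0):
    """Average three estimators of sigma^2 over many simulated data sets.

    All default values are illustrative assumptions, not values from the text.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 10.0, n)        # fixed X data, reused in every data set
    xc = x - x.mean()                    # centered x values
    sxx = np.sum(xc ** 2)

    # Each row is one simulated data set from Y = beta0 + beta1*x + eps.
    eps = rng.normal(0.0, sigma, size=(n_sims, n))
    y = beta0 + beta1 * x + eps

    # OLS slope and intercept for every data set at once.
    b1 = (y - y.mean(axis=1, keepdims=True)) @ xc / sxx
    b0 = y.mean(axis=1) - b1 * x.mean()
    resid = y - (b0[:, None] + b1[:, None] * x)

    sse = np.sum(resid ** 2, axis=1)     # sum of squared residuals e_i
    sum_eps2 = np.sum(eps ** 2, axis=1)  # sum of squared true errors eps_i
    return {
        "true_errors_over_n": float(np.mean(sum_eps2 / n)),
        "sse_over_n": float(np.mean(sse / n)),
        "sse_over_n_minus_2": float(np.mean(sse / (n - 2))),
        # OLS minimizes the sum of squares, so SSE <= sum(eps^2) in every data set
        # (a tiny tolerance guards against floating-point rounding).
        "always_sse_le_sum_eps2": bool(np.all(sse <= sum_eps2 + 1e-9)),
    }

results = simulate_sigma2_estimators()
for name, value in results.items():
    print(name, value)
```

With these assumed settings the true value is $\sigma^{2}=4$. The first and third averages should land close to 4, while the average of $\mathrm{SSE}/n$ should land near $\sigma^{2}(n-2)/n=3.2$, which is the downward bias described above.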

## Standard Errors

The Gauss-Markov theorem states that the OLS estimator has minimum variance among linear unbiased estimators. What does "variance" of the OLS estimator refer to? Please look at Figure 3.1 again: You can see that there is variability in the possible values of $\hat{\beta}_{1}$, ranging from 1.0 to 2.0. Variance of the estimator $\hat{\beta}_{1}$, denoted symbolically by $\operatorname{Var}\left(\hat{\beta}_{1}\right)$, refers to the variance of the distribution $p\left(\hat{\beta}_{1}\right)$ that is shown in Figure 3.1.

If the assumptions of the Gauss-Markov model are true, then the following formula gives the exact variance of the OLS estimator $\hat{\beta}_{1}$.

Variance of the OLS estimator $\hat{\beta}_{1}$:
$$\operatorname{Var}\left(\hat{\beta}_{1}\right)=\frac{\sigma^{2}}{(n-1) s_{x}^{2}}$$
In the formula for $\operatorname{Var}\left(\hat{\beta}_{1}\right)$, note that $s_{x}^{2}=\sum\left(x_{i}-\hat{\mu}_{x}\right)^{2} /(n-1)$ is the usual estimate of the variance of $X$. Note that the $\operatorname{Var}\left(\hat{\beta}_{1}\right)$ formula is conditional on the observed values of the $X$ data; this is apparent because $s_{x}^{2}$ is specifically a function of the observed $X$ data.
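Because $(n-1) s_{x}^{2}=\sum\left(x_{i}-\hat{\mu}_{x}\right)^{2}$, the same variance can be written without the sample-variance notation:

$$\operatorname{Var}\left(\hat{\beta}_{1}\right)=\frac{\sigma^{2}}{(n-1) s_{x}^{2}}=\frac{\sigma^{2}}{\sum_{i=1}^{n}\left(x_{i}-\hat{\mu}_{x}\right)^{2}}$$

As a small worked example with made-up numbers (not from the text), take $x=(1,2,3,4,5)$ and assume $\sigma^{2}=4$. Then $\hat{\mu}_{x}=3$, $\sum\left(x_{i}-\hat{\mu}_{x}\right)^{2}=4+1+0+1+4=10$, so $s_{x}^{2}=10/4=2.5$ and

$$\operatorname{Var}\left(\hat{\beta}_{1}\right)=\frac{4}{(5-1)(2.5)}=\frac{4}{10}=0.4 .$$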

When coupled with unbiasedness of $\hat{\beta}_{1}$, smaller $\operatorname{Var}\left(\hat{\beta}_{1}\right)$ implies a more accurate estimate, i.e., an estimate that tends to be closer to $\beta_{1}$. Hence, we have the following interesting conclusions regarding the accuracy of the OLS estimate $\hat{\beta}_{1}$: The OLS estimate $\hat{\beta}_{1}$ of $\beta_{1}$ is more accurate when:

• $n$ is larger, and/or
• $s_{x}^{2}$ is larger, and/or
• $\sigma^{2}$ is smaller.

As mentioned above, the formula given for $\operatorname{Var}\left(\hat{\beta}_{1}\right)$ can be mathematically derived from the assumptions of the Gauss-Markov model. Violation of assumptions renders the formula incorrect. In particular, violation of the homoscedasticity assumption is the rationale for using heteroscedasticity-consistent standard errors, which are covered in Chapter 12. Strangely enough, the mathematics needed to prove the variance formula is easier in the multiple regression model, so we will prove it later in Chapter 7. But for now, you should understand the assumptions that imply the result (e.g., the classical model) and the result itself (the formula for $\operatorname{Var}\left(\hat{\beta}_{1}\right)$ ) by using simulation: If you simulate many thousands of data sets from the same model, with the same sample size, and with the same $X$ data, then the sample variance of the resulting thousands of $\hat{\beta}_{1}$ estimates will be (within simulation error) equal to $\sigma^{2} /\left\{(n-1) s_{x}^{2}\right\}$. The simulation also clarifies the "conditional on observed values of the $X$ data" interpretation because the $X$ data are the same for every simulated data set.
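The simulation described above can be sketched as follows. This is one possible implementation, not the text's own code: the model parameters, the sample size of 20, the uniform draw that generates the fixed `x_fixed` values, and the function name `simulate_slope_variance` are all illustrative assumptions. The key design point is that `x` is generated once and reused for every simulated data set, which is what "conditional on the observed $X$ data" means in practice.

```python
import numpy as np

def simulate_slope_variance(x, beta0=1.0, beta1=1.5, sigma=2.0,
                            n_sims=50000, seed=1):
    """Compare the formula for Var(beta1_hat) with a simulation-based value.

    Parameter defaults are illustrative assumptions, not values from the text.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()                    # centered x values
    sxx = np.sum(xc ** 2)                # equals (n - 1) * s_x^2

    # Same model, same n, same X data in every row; only the errors change.
    eps = rng.normal(0.0, sigma, size=(n_sims, n))
    y = beta0 + beta1 * x + eps

    # OLS slope for each data set: sum((x_i - xbar) * y_i) / sxx.
    b1 = y @ xc / sxx

    theoretical = sigma ** 2 / ((n - 1) * np.var(x, ddof=1))
    simulated = float(np.var(b1, ddof=1))  # sample variance of the slope estimates
    return float(theoretical), simulated, float(np.mean(b1))

# Fixed X data, generated once and then held constant (an assumed design).
x_fixed = np.random.default_rng(123).uniform(0.0, 10.0, size=20)
theory, sim, mean_b1 = simulate_slope_variance(x_fixed)
print("formula   Var(beta1_hat):", theory)
print("simulated Var(beta1_hat):", sim)
print("mean of beta1_hat       :", mean_b1)
```

The two variance values should agree up to simulation error, and the mean of the slope estimates should sit close to the assumed `beta1`, reflecting unbiasedness. Rerunning with a larger `n`, more spread-out `x` values, or a smaller `sigma` should shrink both variances, in line with the three bullet points above.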


