## Consistency of the NLS Estimator

A univariate “nonlinear regression model” has up to now been expressed in the form
$$\boldsymbol{y}=\boldsymbol{x}(\boldsymbol{\beta})+\boldsymbol{u}, \quad \boldsymbol{u} \sim \operatorname{IID}\left(\mathbf{0}, \sigma^2 \mathbf{I}_n\right), \tag{5.08}$$
where $\boldsymbol{y}$, $\boldsymbol{x}(\boldsymbol{\beta})$, and $\boldsymbol{u}$ are $n$-vectors for some sample size $n$. The model parameters are therefore $\boldsymbol{\beta}$ and either $\sigma$ or $\sigma^2$. The regression function $x_t(\boldsymbol{\beta})$, which is the $t^{\text{th}}$ element of $\boldsymbol{x}(\boldsymbol{\beta})$, will in general depend on a row vector of variables $\boldsymbol{Z}_t$. The specification of the vector of error terms $\boldsymbol{u}$ is not complete, since the distribution of the $u_t$'s has not been specified. Thus, for a sample of size $n$, the model $\mathbb{M}$ described by (5.08) is the set of all DGPs generating samples $\boldsymbol{y}$ of size $n$ such that the expectation of $y_t$ conditional on some information set $\Omega_t$ that includes $\boldsymbol{Z}_t$ is $x_t(\boldsymbol{\beta})$ for some parameter vector $\boldsymbol{\beta} \in \mathbb{R}^k$, and such that the differences $y_t-x_t(\boldsymbol{\beta})$ are independently distributed error terms with common variance $\sigma^2$, usually unknown.

It will be convenient to generalize this specification of the DGPs in $\mathbb{M}$ a little, in order to be able to treat dynamic models, that is, models in which there are lagged dependent variables. Therefore, we explicitly recognize the possibility that the regression function $x_t(\boldsymbol{\beta})$ may include among its (until now implicit) dependences an arbitrary but bounded number of lags of the dependent variable itself. Thus $x_t$ may depend on $y_{t-1}, y_{t-2}, \ldots, y_{t-l}$, where $l$ is a fixed positive integer that does not depend on the sample size. When the model uses time-series data, we will therefore take $x_t(\boldsymbol{\beta})$ to mean the expectation of $y_t$ conditional on an information set that includes the entire past of the dependent variable, which we can denote by $\left\{y_s\right\}_{s=1}^{t-1}$, and also the entire history of the exogenous variables up to and including period $t$, that is, $\left\{\boldsymbol{Z}_s\right\}_{s=1}^t$. The requirements on the disturbance vector $\boldsymbol{u}$ are unchanged.

For asymptotic theory to be applicable, we must next provide a rule for extending (5.08) to samples of arbitrarily large size. For models which are not dynamic (including models estimated with cross-section data, of course), so that there are no time trends or lagged dependent variables in the regression functions $x_t$, there is nothing to prevent the simple use of the fixed-in-repeated-samples notion that we discussed in Section 4.4. Specifically, we consider only sample sizes that are integer multiples of the actual sample size $m$ and then assume that $x_{N m+t}(\boldsymbol{\beta})=x_t(\boldsymbol{\beta})$ for $N>1$. This assumption makes the asymptotics of nondynamic models very simple compared with those for dynamic models.
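
The fixed-in-repeated-samples rule can be illustrated numerically. The following is a minimal sketch, not part of the original text: the regression function $x_t(\beta)=\exp(\beta z_t)$, the parameter values, the helper names `ssr` and `nls_exp`, and the Gauss-Newton routine with step halving are all assumptions chosen for illustration. It stacks copies of a fixed block of $m=20$ regressor values, so that the regression functions repeat exactly, and reports how far the NLS estimate lies from $\beta_0$ as the number of copies grows.

```python
import numpy as np

rng = np.random.default_rng(42)
beta0, sigma = 0.5, 0.5              # assumed true parameter and error s.d.
z_m = np.linspace(0.1, 2.0, 20)      # the "actual" sample: m = 20 fixed values


def ssr(beta, y, z):
    """Sum of squared residuals for the assumed model y_t = exp(beta * z_t) + u_t."""
    return float(np.sum((y - np.exp(beta * z)) ** 2))


def nls_exp(y, z, beta=0.0, tol=1e-10, max_iter=100):
    """Scalar Gauss-Newton with step halving (illustrative, not a general solver)."""
    for _ in range(max_iter):
        fitted = np.exp(beta * z)
        X = z * fitted                           # derivative of x_t(beta) w.r.t. beta
        step = (X @ (y - fitted)) / (X @ X)      # Gauss-Newton step
        # Halve the step until the sum of squares does not increase.
        while ssr(beta + step, y, z) > ssr(beta, y, z) and abs(step) > tol:
            step /= 2.0
        beta += step
        if abs(step) < tol:
            break
    return beta


abs_error = {}
for copies in (1, 10, 100, 1000):
    z = np.tile(z_m, copies)                     # regressors repeat block by block
    y = np.exp(beta0 * z) + rng.normal(0.0, sigma, size=z.size)
    abs_error[copies] = abs(nls_exp(y, z) - beta0)
    print(f"{copies:5d} copies, n = {z.size:6d}, "
          f"|beta_hat - beta0| = {abs_error[copies]:.5f}")
```

For a single set of draws the error need not fall monotonically, but it should be small for the largest samples, which is what consistency leads one to expect.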

## Asymptotic Normality of the NLS Estimator

In this section, we discuss the asymptotic normality of the nonlinear least squares estimator. For this, we will require a bit more regularity than was needed for consistency, as we will see. First, a formal definition of asymptotic normality:

Definition 5.4.
A consistent estimator $\hat{\boldsymbol{\beta}} \equiv\left\{\hat{\boldsymbol{\beta}}^n\right\}$ of the parameters of the asymptotically identified parametrized model $(\mathbb{M}, \boldsymbol{\beta})$ is asymptotically normal if, for every DGP $\mu_0 \in \mathbb{M}$, the sequence of random variables $\left\{n^{1 / 2}\left(\hat{\boldsymbol{\beta}}^n-\boldsymbol{\beta}_0\right)\right\}$ tends in distribution to a (multivariate) normal distribution with mean zero and finite covariance matrix.

The crucial difference between the property of asymptotic normality and that of consistency discussed in the preceding section is the factor of $n^{1 / 2}$. This factor "blows up" $\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}_0$, which, if $\hat{\boldsymbol{\beta}}$ is consistent for $\boldsymbol{\beta}_0$, tends to zero as $n$ tends to infinity. Thus the product $n^{1 / 2}\left(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}_0\right)$ tends to a vector of nonzero random variables. Asymptotic normality, when it holds, will of course imply consistency, since if $n^{1 / 2}\left(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}_0\right)$ is $O(1)$, it follows that $\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}_0$ must be $O\left(n^{-1 / 2}\right)$. If the estimator $\hat{\boldsymbol{\beta}}$ satisfies the latter property, it is said to be root-$n$ consistent, meaning that the difference between the estimator and the true value shrinks in proportion to $1 / \sqrt{n}$. An estimator that is root-$n$ consistent must also be weakly consistent, since $\operatorname{plim}\left(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}_0\right)=\mathbf{0}$. Not all consistent estimators are root-$n$ consistent, however.
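
The role of the factor $n^{1/2}$ can be seen in a small Monte Carlo experiment. This is a hedged sketch under assumed settings, not something taken from the text: the model $y_t=\exp(\beta z_t)+u_t$ with normal errors, the parameter values, the helper `nls_exp`, and the choice of 500 replications are all hypothetical. For simplicity the Gauss-Newton iteration is started at the true value, which is only possible in a simulation. The raw dispersion of $\hat{\beta}-\beta_0$ should shrink as $n$ grows, while the dispersion of $n^{1/2}(\hat{\beta}-\beta_0)$ should stay roughly constant.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, sigma = 0.5, 0.5              # assumed true parameter and error s.d.
z_m = np.linspace(0.1, 2.0, 20)      # fixed block of regressor values
reps = 500                           # Monte Carlo replications (assumption)


def nls_exp(y, z, start, tol=1e-10, max_iter=50):
    """Plain scalar Gauss-Newton for y_t = exp(beta * z_t) + u_t (illustrative)."""
    beta = start
    for _ in range(max_iter):
        fitted = np.exp(beta * z)
        X = z * fitted                           # derivative of x_t(beta) w.r.t. beta
        step = (X @ (y - fitted)) / (X @ X)
        beta += step
        if abs(step) < tol:
            break
    return beta


raw_sd, scaled_sd = {}, {}
for n in (100, 400, 1600):
    z = np.tile(z_m, n // z_m.size)              # regressors fixed in repeated samples
    draws = np.array([
        nls_exp(np.exp(beta0 * z) + rng.normal(0.0, sigma, size=n), z, start=beta0)
        for _ in range(reps)
    ])
    raw_sd[n] = float(np.std(draws - beta0))         # dispersion of beta_hat - beta0
    scaled_sd[n] = float(np.sqrt(n) * raw_sd[n])     # dispersion after scaling by sqrt(n)
    print(f"n = {n:5d}: sd = {raw_sd[n]:.4f}, sqrt(n) * sd = {scaled_sd[n]:.4f}")
```

Quadrupling $n$ should roughly halve the unscaled standard deviation, which is the root-$n$ rate, and leave the scaled one approximately unchanged up to simulation noise.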

As in the last section, we will first state a theorem which gives conditions sufficient for the asymptotic normality of the NLS estimator and then discuss the circumstances in which we may hope that the conditions are satisfied. First, some notation. As usual we let $\boldsymbol{X}_t(\boldsymbol{\beta}) \equiv D_\beta x_t(\boldsymbol{\beta})$ denote the row vector of partial derivatives of the regression function $x_t(\boldsymbol{\beta})$; then $\boldsymbol{A}_t(\boldsymbol{\beta}) \equiv D_{\beta \beta} x_t(\boldsymbol{\beta})$ will denote the Hessian of $x_t(\boldsymbol{\beta})$, and $\boldsymbol{H}_t\left(y_t, \boldsymbol{\beta}\right) \equiv D_{\beta \beta}\left(y_t-x_t(\boldsymbol{\beta})\right)^2$ will denote the Hessian of the contribution to the sum-of-squares function from observation $t$. This last is readily seen to be
$$\boldsymbol{H}_t\left(y_t, \boldsymbol{\beta}\right)=2\left(\boldsymbol{X}_t^{\top}(\boldsymbol{\beta}) \boldsymbol{X}_t(\boldsymbol{\beta})-\boldsymbol{A}_t(\boldsymbol{\beta})\left(y_t-x_t(\boldsymbol{\beta})\right)\right) .$$
Evidently, the Hessian $\boldsymbol{A}_t$ of the regression function will be a zero matrix if the regression function $x_t$ is linear, and $\boldsymbol{X}_t(\boldsymbol{\beta})$ will just be $\boldsymbol{X}_t$. In that case, $\boldsymbol{H}_t\left(y_t, \boldsymbol{\beta}\right)$ will simplify to $2\left(\boldsymbol{X}_t^{\top} \boldsymbol{X}_t\right)$, which is necessarily positive semidefinite.
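
The expression for $\boldsymbol{H}_t$ follows from two applications of the chain rule. As a short worked derivation using only the definitions of $\boldsymbol{X}_t$ and $\boldsymbol{A}_t$ given above, differentiating once and then once more gives

$$\begin{aligned}
D_\beta\left(y_t-x_t(\boldsymbol{\beta})\right)^2 &= -2\left(y_t-x_t(\boldsymbol{\beta})\right) \boldsymbol{X}_t(\boldsymbol{\beta}), \\
D_{\beta \beta}\left(y_t-x_t(\boldsymbol{\beta})\right)^2 &= 2 \boldsymbol{X}_t^{\top}(\boldsymbol{\beta}) \boldsymbol{X}_t(\boldsymbol{\beta})-2\left(y_t-x_t(\boldsymbol{\beta})\right) \boldsymbol{A}_t(\boldsymbol{\beta}) .
\end{aligned}$$

The first term of the second line comes from differentiating the factor $-\left(y_t-x_t(\boldsymbol{\beta})\right)$, the second from differentiating $\boldsymbol{X}_t(\boldsymbol{\beta})$. When $\boldsymbol{A}_t(\boldsymbol{\beta})$ is a zero matrix, only the first term remains.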
