## Transformations and Reparametrizations

In this and the subsequent sections of this chapter, we develop the classical theory of maximum likelihood estimation and, in particular, demonstrate the properties that make it a desirable estimation method. We will also point out that in some circumstances these properties fail. As we discussed in Section 8.1, the major desirable features of ML estimators are invariance, consistency, asymptotic normality, asymptotic efficiency, and computability. In this section, we will discuss the first of these, the invariance of ML estimators to reparametrization of the model.

The idea of invariance is an important one in econometric analysis. Let us denote by $\mathbb{M}$ the model in which we are interested. A parametrization of the model $\mathbb{M}$ is a mapping, say $\lambda$, from a parameter space $\Theta$ to $\mathbb{M}$. For any given model $\mathbb{M}$ there may in general exist an infinite number of parametrizations. There are, after all, few constraints on the parameter space $\Theta$, other than its dimensionality. A subset of $\mathbb{R}^k$ of full dimension can be mapped in a one-to-one and differentiable manner onto virtually any other subset of $\mathbb{R}^k$ of full dimension by such devices as translation, rotation, dilation, and so on, and subsequently any of these other subsets can perfectly well serve as the parameter space for the model $\mathbb{M}$. It is because of this fact that one appeals to invariance as a desirable property of estimators. “Invariance” is understood in this context as invariance under the sort of transformation we have been discussing, which we formally call reparametrization.

As an illustration of the fact that any model may be parametrized in an infinite number of ways, consider the case of the exponential distribution, which was discussed in Section 8.1. The likelihood function for a sample of independent drawings from this distribution was seen to be (8.03). If we make the definition $\theta \equiv \delta^\alpha$, we can define a whole family of parametrizations indexed by $\alpha$. We may choose $\alpha$ to be any finite, nonzero number. The likelihood function corresponding to this family of parametrizations is
$$L(\boldsymbol{y}, \delta)=\prod_{t=1}^n \delta^\alpha e^{-\delta^\alpha y_t}$$
Evidently, $\alpha=1$ corresponds to the $\theta$ parametrization of (8.02) and $\alpha=-1$ corresponds to the $\phi$ parametrization of (8.07).
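
As a worked check, assume that the density underlying (8.03) is $\theta e^{-\theta y_t}$, which is what the likelihood above reduces to when $\alpha=1$, and let $\bar{y}$ denote the sample mean (a symbol introduced here only for illustration). The loglikelihood in terms of $\delta$ and its first-order condition are then
$$\ell(\boldsymbol{y}, \delta)=n \alpha \log \delta-\delta^\alpha \sum_{t=1}^n y_t, \qquad \frac{\partial \ell}{\partial \delta}=\frac{n \alpha}{\delta}-\alpha \delta^{\alpha-1} \sum_{t=1}^n y_t=0 .$$
Solving the first-order condition gives
$$\hat{\delta}^\alpha=\frac{n}{\sum_{t=1}^n y_t}=\frac{1}{\bar{y}}, \qquad \hat{\delta}=\bar{y}^{-1 / \alpha} .$$
Whatever finite, nonzero value of $\alpha$ is chosen, the implied estimate of $\theta \equiv \delta^\alpha$ is therefore the same, namely $1 / \bar{y}$.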

It is easy to see that ML estimators are invariant to reparametrizations of the model. Let $\boldsymbol{\eta}: \Theta \rightarrow \Phi \subseteq \mathbb{R}^k$ denote a smooth mapping that transforms the vector $\boldsymbol{\theta}$ uniquely into another vector $\boldsymbol{\phi} \equiv \boldsymbol{\eta}(\boldsymbol{\theta})$. The likelihood function for the model $\mathbb{M}$ in terms of the new parameters $\boldsymbol{\phi}$, say $L^{\prime}$, is defined by the relation
$$L^{\prime}(\boldsymbol{y}, \boldsymbol{\phi})=L(\boldsymbol{y}, \boldsymbol{\theta}) \quad \text { for } \boldsymbol{\phi}=\boldsymbol{\eta}(\boldsymbol{\theta}) . \tag{8.23}$$
Equation (8.23) follows at once from the facts that a likelihood function is the density of a stochastic process and that $\boldsymbol{\theta}$ and $\boldsymbol{\phi}=\boldsymbol{\eta}(\boldsymbol{\theta})$ describe the same stochastic process. Let us define $\hat{\boldsymbol{\phi}}$ as $\boldsymbol{\eta}(\hat{\boldsymbol{\theta}})$ and $\boldsymbol{\phi}^*$ as $\boldsymbol{\eta}\left(\boldsymbol{\theta}^*\right)$. Then if
$$L(\boldsymbol{y}, \hat{\boldsymbol{\theta}})>L\left(\boldsymbol{y}, \boldsymbol{\theta}^*\right) \text { for all } \boldsymbol{\theta}^* \neq \hat{\boldsymbol{\theta}},$$
it follows that
$$L^{\prime}(\boldsymbol{y}, \hat{\boldsymbol{\phi}})=L^{\prime}(\boldsymbol{y}, \boldsymbol{\eta}(\hat{\boldsymbol{\theta}}))=L(\boldsymbol{y}, \hat{\boldsymbol{\theta}})>L\left(\boldsymbol{y}, \boldsymbol{\theta}^*\right)=L^{\prime}\left(\boldsymbol{y}, \boldsymbol{\phi}^*\right) \text { for all } \boldsymbol{\phi}^* \neq \hat{\boldsymbol{\phi}} .$$
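
The invariance argument can be checked numerically on the exponential example from earlier in this section. The following is only a minimal sketch under stated assumptions: the density is taken to be $\theta e^{-\theta y_t}$ with $\theta \equiv \delta^\alpha$, the sample is simulated with a hypothetical true value `theta_true = 2.0`, and the helper names (`loglik`, `maximize_over_delta`) and the golden-section search are illustrative choices, not anything prescribed by the text. For several values of $\alpha$ it maximizes the likelihood over $\delta$ and compares $\hat{\delta}^\alpha$ with the estimate of $\theta$ obtained directly.

```python
import math
import random


def loglik(delta, alpha, y):
    """Loglikelihood of an exponential sample when theta = delta**alpha."""
    theta = delta ** alpha
    return len(y) * math.log(theta) - theta * sum(y)


def maximize_over_delta(alpha, y, lo=-10.0, hi=10.0, tol=1e-10):
    """Golden-section search over u = log(delta); returns the maximizing delta.

    The loglikelihood is concave in u, so a simple bracketing search suffices.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if loglik(math.exp(c), alpha, y) > loglik(math.exp(d), alpha, y):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return math.exp((a + b) / 2.0)


random.seed(0)
theta_true = 2.0  # hypothetical value, used only to simulate data
y = [random.expovariate(theta_true) for _ in range(500)]

# Estimate in the theta parametrization (alpha = 1), in closed form: 1 / mean(y).
theta_hat = len(y) / sum(y)

# Estimates in the delta parametrizations, mapped back to theta = delta**alpha.
results = {}
for alpha in (1.0, -1.0, 2.0, 0.5):
    delta_hat = maximize_over_delta(alpha, y)
    results[alpha] = delta_hat ** alpha
    print(f"alpha={alpha:5.1f}  delta_hat={delta_hat:.6f}  "
          f"delta_hat**alpha={results[alpha]:.6f}  theta_hat={theta_hat:.6f}")
```

Up to the tolerance of the numerical search, every parametrization should imply the same estimate of $\theta$, which is what invariance asserts.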

## Asymptotic Efficiency of the ML Estimator

In this section, we will demonstrate the asymptotic efficiency of the ML estimator or, strictly speaking, of the Type 2 ML estimator. Asymptotic efficiency means that the variance of the asymptotic distribution of any consistent estimator of the model parameters differs from that of an asymptotically efficient estimator by a positive semidefinite matrix; see Definition 5.6. One says an asymptotically efficient estimator rather than the asymptotically efficient estimator because, since the property of asymptotic efficiency is a property only of the asymptotic distribution, there can (and do) exist many estimators that differ in finite samples but have the same, efficient, asymptotic distribution. An example can be taken from the nonlinear regression model, in which, as we will see in Section 8.10, NLS is equivalent to ML estimation if we assume normality of the error terms. As we saw in Section 6.6, there are nonlinear models that are just linear models with some nonlinear restrictions imposed on them. In such cases, one-step estimation starting from the estimates of the linear model was seen to be asymptotically equivalent to NLS, and hence asymptotically efficient. One-step estimation is possible in the general maximum likelihood context as well and can often provide an efficient estimator that is easier to compute than the ML estimator itself.
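
To make the idea of one-step estimation in a likelihood context concrete, here is a hedged sketch on a model that is not taken from the text: the location parameter of a Cauchy distribution with unit scale. The sample median serves as the initial consistent estimate, and a single scoring step uses the expected information, which is $n/2$ for this model. The data, the value `theta_true = 3.0`, and the function names are all hypothetical and chosen purely for illustration.

```python
import math
import random
import statistics


def score(theta, y):
    """Gradient of the Cauchy location loglikelihood evaluated at theta."""
    return sum(2.0 * (yt - theta) / (1.0 + (yt - theta) ** 2) for yt in y)


def one_step(theta0, y):
    """One scoring step from theta0, using the expected information n/2."""
    info = len(y) / 2.0
    return theta0 + score(theta0, y) / info


random.seed(1)
theta_true = 3.0  # hypothetical value, used only to simulate data
n = 5000

# Cauchy draws by inverting the CDF: location + tan(pi * (U - 1/2)).
y = [theta_true + math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]

theta0 = statistics.median(y)  # consistent but inefficient starting point
theta1 = one_step(theta0, y)   # one-step estimate

print(f"median (initial estimate): {theta0:.4f}")
print(f"one-step estimate:         {theta1:.4f}")
```

No iteration to convergence is needed here; the single step is the whole computation, which is the sense in which such an estimator can be easier to obtain than the ML estimator itself.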

We will begin our proof of the asymptotic efficiency of the ML estimator with a discussion applicable to any root-$n$ consistent and asymptotically unbiased estimator of the parameters of the model represented by the loglikelihood function $\ell(\boldsymbol{y}, \boldsymbol{\theta})$. Note that consistency by itself does not imply asymptotic unbiasedness without the imposition of various regularity conditions. Since every econometrically interesting consistent estimator that we are aware of is in fact asymptotically unbiased, we will deal only with such estimators here. Let such an estimator be denoted by $\hat{\boldsymbol{\theta}}(\boldsymbol{y})$, where the notation emphasizes the fact that the estimator is a random variable, dependent on the realized sample $\boldsymbol{y}$. Note that we have changed notation here, since $\hat{\boldsymbol{\theta}}(\boldsymbol{y})$ is in general not the ML estimator. Instead, the latter will be denoted $\tilde{\boldsymbol{\theta}}(\boldsymbol{y})$; the new notation is designed to be consistent with our treatment throughout the book of restricted and unrestricted estimators, since in an important sense the ML estimator corresponds to the former and the arbitrary consistent estimator $\hat{\boldsymbol{\theta}}(\boldsymbol{y})$ corresponds to the latter.
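
A small Monte Carlo experiment can illustrate the distinction between the ML estimator $\tilde{\boldsymbol{\theta}}(\boldsymbol{y})$ and an arbitrary root-$n$ consistent estimator $\hat{\boldsymbol{\theta}}(\boldsymbol{y})$. The sketch below uses a model assumed only for illustration: independent $N(\mu, 1)$ observations, for which the ML estimator of $\mu$ is the sample mean, with the sample median playing the role of the arbitrary consistent estimator. The sample size, number of replications, and variable names are hypothetical choices.

```python
import random
import statistics

random.seed(2)
mu_true = 0.0  # hypothetical value, used only to simulate data
n = 200        # observations per sample
reps = 2000    # Monte Carlo replications

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(mu_true, 1.0) for _ in range(n)]
    means.append(sum(sample) / n)              # ML estimator in this model
    medians.append(statistics.median(sample))  # another consistent estimator

# Variances scaled by n, approximating the asymptotic variances of
# root-n times the estimation error.
var_ml = n * statistics.pvariance(means)
var_other = n * statistics.pvariance(medians)

print(f"n * Var(mean):   {var_ml:.3f}")
print(f"n * Var(median): {var_other:.3f}")
print(f"ratio:           {var_other / var_ml:.3f}")
```

In this model the asymptotic variance of the median is $\pi/2$ times that of the mean, so the ratio should be well above one; the difference between the two scaled variances is the scalar analogue of the positive semidefinite matrix in the definition of asymptotic efficiency.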

