## SUPERPOPULATION APPROACH

With the fixed population approach considered so far, it is difficult, as we have just seen, to arrive at an optimal strategy, or an optimal estimator for $Y$ or $\bar{Y}$ based on a fixed sampling design. One alternative is to regard $\underline{Y}=\left(Y_{1}, \ldots, Y_{N}\right)^{\prime}$ as a particular realization of an $N$-dimensional random vector $\underline{\eta}=\left(\eta_{1}, \ldots, \eta_{N}\right)^{\prime}$, say, with real-valued coordinates. The probability distribution of $\underline{\eta}$ defines a population, called a superpopulation. A class of such distributions is called a superpopulation model, or simply a model. Our central objective remains to estimate the total (or mean) for the particular realization $\underline{Y}$ of $\underline{\eta}$, but the criteria for choosing a strategy $(p, t)$ may now be changed suitably.

We assume that the superpopulation model is such that the expectations and variances of $\eta_{i}$ and the covariances of $\eta_{i}, \eta_{j}$ exist. To simplify notation, we write $E_{m}, V_{m}, C_{m}$ for the expectation, variance, and covariance operators with respect to a model, and we write $Y_{i}$ for $\eta_{i}$, pretending that $\underline{Y}$ is itself a random vector.

Let $\left(p_{1}, t_{1}\right)$ and $\left(p_{2}, t_{2}\right)$ be two unbiased strategies for estimating $Y$, that is, $E_{p_{1}} t_{1}=E_{p_{2}} t_{2}=Y$. Assume that $p_{1}, p_{2}$ are suitably comparable, in the sense that they admit samples of comparable sizes with positive selection probabilities. They might have, for example, the same average effective sample size; that is,
$$\sum|s| p_{1}(s)=\sum|s| p_{2}(s)$$
where $\sum$ extends over all samples and $|s|$ is the cardinality of $s$.
Then, $\left(p_{1}, t_{1}\right)$ will be preferred to $\left(p_{2}, t_{2}\right)$ if
$$E_{m} V_{p_{1}}\left(t_{1}\right) \leq E_{m} V_{p_{2}}\left(t_{2}\right)$$
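
As a minimal sketch of this criterion, the Python below treats a design as a dictionary from samples to probabilities, checks that two designs share the same average effective sample size, and evaluates $E_{m} V_{p}$ exactly for the Horvitz-Thompson estimator. The choice of estimator, the four-unit size vector `x`, the pair probabilities in `pps`, and the model moments `mu` and `model_var` are all illustrative assumptions, not anything fixed by the text. The identity $E_{m}(Y^{\prime} A Y)=\mu^{\prime} A \mu+\sum_{i} A_{ii} V_{m}(Y_{i})$ used here further assumes that the $Y_{i}$ are independent under the model.

```python
from itertools import combinations
import numpy as np

def average_effective_size(design):
    """Sum over samples of |s| * p(s)."""
    return sum(len(s) * prob for s, prob in design.items())

def ht_variance_matrix(design, N):
    """Matrix A such that V_p(HT estimator) = Y' A Y for this design."""
    Pi = np.zeros((N, N))  # Pi[i, j] = P(i and j in s); diagonal holds pi_i
    for s, prob in design.items():
        idx = list(s)
        Pi[np.ix_(idx, idx)] += prob
    pi = np.diag(Pi)
    # A_ij = pi_ij / (pi_i * pi_j) - 1, with A_ii = 1 / pi_i - 1
    return Pi / np.outer(pi, pi) - 1.0

def model_expected_variance(A, mu, model_var):
    """E_m[Y' A Y] = mu' A mu + sum_i A_ii * V_m(Y_i), assuming independent Y_i."""
    return float(mu @ A @ mu + np.sum(np.diag(A) * model_var))

N = 4
x = np.array([1.0, 2.0, 3.0, 4.0])        # hypothetical size measures
pairs = list(combinations(range(N), 2))   # all samples of size 2

# p1: simple random sampling without replacement, n = 2
srs = {s: 1.0 / len(pairs) for s in pairs}
# p2: hand-picked pair probabilities giving pi_i = 2 * x_i / sum(x)
pps = dict(zip(pairs, [0.02, 0.06, 0.12, 0.12, 0.26, 0.42]))

mu = 2.0 * x         # assumed model means, E_m(Y_i) = 2 * x_i
model_var = 1.0 * x  # assumed model variances, V_m(Y_i) = x_i

ev_srs = model_expected_variance(ht_variance_matrix(srs, N), mu, model_var)
ev_pps = model_expected_variance(ht_variance_matrix(pps, N), mu, model_var)
print(average_effective_size(srs), average_effective_size(pps))
print(ev_srs, ev_pps)
```

Both designs have average effective sample size 2. Under these assumed moments the model-expected variance is about 10 for `pps` against about 36.7 for `srs`, so the criterion would prefer the second strategy in this toy setting.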

## Comparison of RHCE and HTE under Model

Incidentally, we have already noted that if a fixed sample-size design is employed with $\pi_{i} \propto Y_{i}$, then $V_{p}(\bar{t})=0$. But $\underline{Y}$ is unknown. So, if $\underline{X}=\left(X_{1}, \ldots, X_{i}, \ldots, X_{N}\right)^{\prime}$ is available such that $Y_{i}$ is approximately proportional to $X_{i}$, for example, $Y_{i}=\beta X_{i}+\varepsilon_{i}$, with $\beta$ an unknown constant, the $\varepsilon_{i}$'s small and unknown, and the $X_{i}$'s known and positive, then by taking $\pi_{i} \propto X_{i}$ one may expect to keep $V_{p}(\bar{t})$ under control. Any sampling design $p$ with $\pi_{i} \propto X_{i}$ is called an IPPS or $\pi$PS design, that is, an inclusion probability proportional to size design. Numerous schemes are available that satisfy or approximate this $\pi$PS criterion for $n \geq 2$. One may consult BREWER and HANIF (1983) and CHAUDHURI and VOS (1988) for a description of many of them, along with a discussion of their properties and limitations. We need not repeat them here.
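
The zero-variance remark can be checked numerically with a small sketch. The code below assumes a hypothetical four-unit population, $n=2$, and the Horvitz-Thompson estimator $\sum_{i \in s} Y_{i} / \pi_{i}$; the function names are illustrative only. It sets $\pi_{i}=n X_{i} / X$ and confirms that, when $Y_{i}=\beta X_{i}$ exactly, every sample of size $n$ returns the same estimate, namely the total $Y$.

```python
from itertools import combinations
import numpy as np

def pips_inclusion_probs(x, n):
    """Target inclusion probabilities pi_i = n * x_i / sum(x)."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("size measures must be positive")
    pi = n * x / x.sum()
    if np.any(pi > 1):
        raise ValueError("n * x_i / sum(x) exceeds 1 for some unit")
    return pi

def ht_estimate(y, pi, sample):
    """Horvitz-Thompson estimate of the total from one sample."""
    idx = list(sample)
    return float(np.sum(y[idx] / pi[idx]))

x = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical size measures
n = 2
pi = pips_inclusion_probs(x, n)

y = 2.0 * x  # exact proportionality: beta = 2 and all eps_i = 0 (assumed)
estimates = [ht_estimate(y, pi, s) for s in combinations(range(len(x)), n)]
print(pi, estimates, y.sum())
```

Because each estimate coincides with `y.sum()` up to rounding, any fixed-size design that attains these $\pi_{i}$ has zero design variance in this idealized case. With nonzero $\varepsilon_{i}$ the estimates would vary across samples.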

Suppose $n$ is the common fixed sample size and $N / n=1 / f$ is an integer. Let us compare $\bar{t}$ based on a $\pi$PS scheme with $t_{3}$ based on the RHC scheme, with $N / n$ as the common group size and $P_{i}=X_{i} / X$ as the normed size measures. For this we postulate a superpopulation model $\mathcal{M}_{2 \gamma}$:
$$Y_{i}=\beta X_{i}+\varepsilon_{i}, \quad E_{m}\left(\varepsilon_{i}\right)=0, \quad V_{m}\left(\varepsilon_{i}\right)=\sigma^{2} X_{i}^{\gamma}$$
where $\sigma, \gamma$ are non-negative unknown constants and the $Y_{i}$'s are supposed to be independently distributed. Then, with $\pi_{i}=n P_{i}=n X_{i} / X$,
$$\begin{aligned} E_{m}\left[V_{p}\left(t_{3}\right)-V_{p}(\bar{t})\right] ={} & E_{m}\left[\frac{N-n}{N-1} \frac{1}{n} \sum_{i<j} X_{i} X_{j}\left(\frac{Y_{i}}{X_{i}}-\frac{Y_{j}}{X_{j}}\right)^{2}\right. \\ & \left.-\sum_{i<j}\left(\pi_{i} \pi_{j}-\pi_{i j}\right)\left(\frac{Y_{i}}{\pi_{i}}-\frac{Y_{j}}{\pi_{j}}\right)^{2}\right] \end{aligned}$$
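
The expectation above can be evaluated numerically as a hedged sketch, not as the text's own derivation. Under the stated model with independent $Y_{i}$, $E_{m}\left(Y_{i} / X_{i}-Y_{j} / X_{j}\right)^{2}=\sigma^{2}\left(X_{i}^{\gamma-2}+X_{j}^{\gamma-2}\right)$, so $\beta$ drops out. For a fixed-size design, $\sum_{j \neq i} \pi_{i j}=(n-1) \pi_{i}$, which removes the $\pi_{i j}$ from the second sum. The function name, the six-unit size vector, and the values of $\sigma^{2}$ and $\gamma$ below are assumptions chosen for illustration.

```python
import numpy as np

def expected_variance_gap(x, n, sigma2=1.0, gamma=1.0):
    """E_m[V_p(t3) - V_p(t_bar)] under Y_i = beta*X_i + eps_i, V_m(eps_i) = sigma2 * X_i**gamma."""
    x = np.asarray(x, dtype=float)
    N = x.size
    X = x.sum()
    # a_i + a_j = E_m (Y_i/X_i - Y_j/X_j)^2 for independent Y_i
    a = sigma2 * x ** (gamma - 2)
    # RHC term: sum_{i<j} X_i X_j (a_i + a_j) = sum_i a_i X_i (X - X_i)
    rhc = (N - n) / ((N - 1) * n) * np.sum(a * x * (X - x))
    # HT term: pi_i = n X_i / X and sum_{j != i} (pi_i pi_j - pi_ij) = pi_i (1 - pi_i)
    pi = n * x / X
    ht = (X / n) ** 2 * np.sum(a * pi * (1 - pi))
    return float(rhc - ht)

x = np.arange(1.0, 7.0)  # hypothetical sizes 1..6, so N = 6 and N / n = 3 for n = 2
for g in (0.0, 1.0, 2.0):
    print(g, expected_variance_gap(x, n=2, gamma=g))
```

For this illustrative population the gap is negative at $\gamma=0$, zero up to rounding at $\gamma=1$, and positive ($10.5$) at $\gamma=2$. This indicates that under these assumptions the comparison between the two strategies depends on $\gamma$.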

