## Fixed or Random Effects

As mentioned in Chap. 1, fixed effects models in general (for designed experiments, regression models, genomic prediction models, etc.) are recommended when the levels under study (those collected by the scientist) are the only levels of interest, and the levels or quantities observed in the explanatory variables are treated as nonrandom. Accordingly, a fixed factor is defined as a categorical or classification variable, chosen to represent specific conditions, for which the researcher has included in the model all levels (or conditions) that are of interest for the study. Fixed effects are therefore unknown constant parameters associated with continuous covariates or with levels of categorical factors in any fixed effects or mixed effects (fixed + random effects) model. Estimating these parameters is generally of intrinsic interest, since they describe the relationship between the covariates and the response variable. Fixed effects can be associated with continuous covariates, such as the weight of an animal in kilograms, maize yield in tons per hectare, the score on a reference test, or socioeconomic level, which take a continuous range of values, or with categorical factors such as gender, hybrid, or treatment group. This implies that fixed effects are the appropriate choice when inference is restricted to the specific levels included in the study.

A random factor is a classification variable whose levels can be regarded as randomly sampled from a larger population of levels, such as classrooms, regions, cattle herds, or clinics. Not all possible levels of the random factor are present in the data set, yet the researcher intends to make inferences about the entire population of levels from the sampled ones. Random factors are included in an analysis so that the variation in the dependent variable across the levels of the random factor can be evaluated and the results generalized to all levels of the population. This means that random effects are represented by unobserved random variables, which are generally assumed to follow a particular distribution, most commonly the normal distribution. For this reason, random effects are suggested when we want to make inferences about all levels of the target population.

## BLUEs and BLUPs

This section presents the concepts and terminology of the best linear unbiased estimator (BLUE) and the best linear unbiased predictor (BLUP). Since both concepts are tied to the mixed model, we start from the linear mixed model
$$\boldsymbol{Y}=\boldsymbol{X} \boldsymbol{\beta}+\mathbf{Z} \boldsymbol{u}+\boldsymbol{\varepsilon},$$
where $\boldsymbol{Y}$ is the response vector of order $n \times 1$, $\boldsymbol{X}$ is the design matrix of fixed effects of order $n \times p$, $\boldsymbol{\beta}$ is the $p \times 1$ vector of beta coefficients, $\boldsymbol{Z}$ is the design matrix of random effects of order $n \times q$, $\boldsymbol{u}$ is the $q \times 1$ vector of random effects distributed as $N(\mathbf{0}, \boldsymbol{\Sigma})$, where $\boldsymbol{\Sigma}$ is the $q \times q$ variance-covariance matrix of the random effects, and $\boldsymbol{\varepsilon}$ is the vector of residuals distributed as $N(\mathbf{0}, \boldsymbol{R})$, where $\boldsymbol{R}$ is the $n \times n$ variance-covariance matrix of the residuals. The unconditional mean of $\boldsymbol{Y}$ is $E(\boldsymbol{Y})=\boldsymbol{X} \boldsymbol{\beta}$, while the conditional mean of $\boldsymbol{Y}$ given the random effects is $E(\boldsymbol{Y} \mid \boldsymbol{u})=\boldsymbol{X} \boldsymbol{\beta}+\boldsymbol{Z} \boldsymbol{u}$. A way to jointly "estimate" $\boldsymbol{\beta}$ and $\boldsymbol{u}$ was proposed by Henderson (1950, 1963, 1973, 1975, 1984); it consists in solving the mixed model equations (MME)
$$\left(\begin{array}{cc} \boldsymbol{X}^{\mathrm{T}} \boldsymbol{R}^{-1} \boldsymbol{X} & \boldsymbol{X}^{\mathrm{T}} \boldsymbol{R}^{-1} \boldsymbol{Z} \\ \boldsymbol{Z}^{\mathrm{T}} \boldsymbol{R}^{-1} \boldsymbol{X} & \boldsymbol{Z}^{\mathrm{T}} \boldsymbol{R}^{-1} \boldsymbol{Z}+\boldsymbol{\Sigma}^{-1} \end{array}\right)\left(\begin{array}{c} \widehat{\boldsymbol{\beta}} \\ \widehat{\boldsymbol{u}} \end{array}\right)=\left(\begin{array}{c} \boldsymbol{X}^{\mathrm{T}} \boldsymbol{R}^{-1} \boldsymbol{y} \\ \boldsymbol{Z}^{\mathrm{T}} \boldsymbol{R}^{-1} \boldsymbol{y} \end{array}\right),$$
where $\boldsymbol{y}$ denotes the observed value of $\boldsymbol{Y}$. The solution obtained for $\boldsymbol{\beta}$ is the BLUE and the solution obtained for $\boldsymbol{u}$ is the BLUP.
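To see how the MME simplify in a common setting, consider a worked special case that is not part of the original text. Assume independent, homoscedastic residuals and random effects, $\boldsymbol{R}=\sigma_e^2 \boldsymbol{I}_n$ and $\boldsymbol{\Sigma}=\sigma_u^2 \boldsymbol{I}_q$, where the symbols $\sigma_e^2$, $\sigma_u^2$, and $\lambda$ are introduced here only for illustration. Multiplying both sides of the MME by $\sigma_e^2$ gives
$$\left(\begin{array}{cc} \boldsymbol{X}^{\mathrm{T}} \boldsymbol{X} & \boldsymbol{X}^{\mathrm{T}} \boldsymbol{Z} \\ \boldsymbol{Z}^{\mathrm{T}} \boldsymbol{X} & \boldsymbol{Z}^{\mathrm{T}} \boldsymbol{Z}+\lambda \boldsymbol{I}_q \end{array}\right)\left(\begin{array}{c} \widehat{\boldsymbol{\beta}} \\ \widehat{\boldsymbol{u}} \end{array}\right)=\left(\begin{array}{c} \boldsymbol{X}^{\mathrm{T}} \boldsymbol{y} \\ \boldsymbol{Z}^{\mathrm{T}} \boldsymbol{y} \end{array}\right), \quad \lambda=\frac{\sigma_e^2}{\sigma_u^2}.$$
Under these assumptions the MME look like the ordinary least squares normal equations for the stacked design $(\boldsymbol{X}, \boldsymbol{Z})$, except that $\lambda \boldsymbol{I}_q$ is added to the random effects block. The larger the ratio $\lambda$, the more strongly $\widehat{\boldsymbol{u}}$ is shrunk toward zero.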

Although the MME may look complex, they are computationally convenient when the number of observations $n$ is larger than $p+q$, the number of fixed effects plus the number of random effects. Only $\boldsymbol{R}$ and $\boldsymbol{\Sigma}$ need to be inverted explicitly, and these often have a simple structure (for example, $\boldsymbol{R}$ is frequently diagonal). In addition, the coefficient matrix on the left-hand side that must be inverted to obtain $\widehat{\boldsymbol{\beta}}$ and $\widehat{\boldsymbol{u}}$ is of order $(p+q) \times(p+q)$, which in some applications is considerably smaller than the $n \times n$ matrix $\boldsymbol{V}=\boldsymbol{Z} \boldsymbol{\Sigma} \boldsymbol{Z}^{\mathrm{T}}+\boldsymbol{R}$. The matrix $\boldsymbol{V}$ offers an equivalent route to the same solutions, namely $\widehat{\boldsymbol{\beta}}=\left(\boldsymbol{X}^{\mathrm{T}} \boldsymbol{V}^{-1} \boldsymbol{X}\right)^{-1} \boldsymbol{X}^{\mathrm{T}} \boldsymbol{V}^{-1} \boldsymbol{y}$ and $\widehat{\boldsymbol{u}}=\boldsymbol{\Sigma} \boldsymbol{Z}^{\mathrm{T}} \boldsymbol{V}^{-1}(\boldsymbol{y}-\boldsymbol{X} \widehat{\boldsymbol{\beta}})$. Both routes assume that the covariance matrices are known; in practice they are replaced by estimates, and the results are known as the empirical BLUE (EBLUE) and empirical BLUP (EBLUP).
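The two routes to $\widehat{\boldsymbol{\beta}}$ and $\widehat{\boldsymbol{u}}$ can be compared numerically. The following is a minimal NumPy sketch, not code from the original text: the function names `solve_mme` and `solve_via_V`, the simulated random-intercept data, and the variance values are all assumptions made for illustration, and the covariance matrices are treated as known.

```python
import numpy as np

def solve_mme(X, Z, R, Sigma, y):
    """Solve Henderson's mixed model equations for (beta_hat, u_hat)."""
    p = X.shape[1]
    R_inv = np.linalg.inv(R)
    Sigma_inv = np.linalg.inv(Sigma)
    # (p+q) x (p+q) coefficient matrix of the MME
    lhs = np.block([
        [X.T @ R_inv @ X, X.T @ R_inv @ Z],
        [Z.T @ R_inv @ X, Z.T @ R_inv @ Z + Sigma_inv],
    ])
    rhs = np.concatenate([X.T @ R_inv @ y, Z.T @ R_inv @ y])
    sol = np.linalg.solve(lhs, rhs)
    return sol[:p], sol[p:]

def solve_via_V(X, Z, R, Sigma, y):
    """Same quantities through the n x n matrix V = Z Sigma Z^T + R."""
    V_inv = np.linalg.inv(Z @ Sigma @ Z.T + R)
    beta_hat = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
    u_hat = Sigma @ Z.T @ V_inv @ (y - X @ beta_hat)
    return beta_hat, u_hat

# Hypothetical toy data: n = 12 observations in q = 3 groups (random intercepts)
rng = np.random.default_rng(0)
n, q = 12, 3
groups = np.repeat(np.arange(q), n // q)
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one covariate
Z = np.eye(q)[groups]                                  # n x q group indicator matrix
Sigma = 2.0 * np.eye(q)                                # assumed known
R = 1.0 * np.eye(n)                                    # assumed known
u = rng.normal(scale=np.sqrt(2.0), size=q)
y = X @ np.array([1.0, 0.5]) + Z @ u + rng.normal(size=n)

beta_mme, u_mme = solve_mme(X, Z, R, Sigma, y)
beta_v, u_v = solve_via_V(X, Z, R, Sigma, y)
print("beta agree:", np.allclose(beta_mme, beta_v))
print("u agree:", np.allclose(u_mme, u_v))
```

In this sketch the MME route solves a $5 \times 5$ system, while the $\boldsymbol{V}$ route inverts a $12 \times 12$ matrix; the gap widens as $n$ grows relative to $p+q$. Explicit inverses are used only to mirror the formulas, and a production implementation would more likely rely on factorizations.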


