## Model 1-JLSM: Under Student’s $t$ Distribution

It is common for observations to come from a population with a heavy-tailed distribution. Student’s $t$ distribution is one of the basic models for describing heavy-tailed data, which arise in a wide variety of applications.

Let $y_{i} \in \mathbb{R}$, for $i=1,2, \ldots, n$, be independently distributed random variables and assume that, for each $i$, $y_{i}$ has the Student $t$ distribution $\left(y_{i} \sim t\left(\mu_{i}, \sigma_{i}^{2}, v\right)\right)$ with the following probability density function (pdf)
$$f\left(y_{i} ; \mu_{i}, \sigma_{i}, v\right)=\frac{c_{v}}{\sigma_{i}}\left(v+\frac{\left(y_{i}-\mu_{i}\right)^{2}}{\sigma_{i}^{2}}\right)^{-\frac{v+1}{2}}$$
where $\mu_{i} \in \mathbb{R}$ and $\sigma_{i}>0$ are the location and scale parameters, respectively. Here $v$ is the degrees of freedom, which can be seen as a robustness parameter that downweights the effect of outliers. As $v$ tends to infinity, this model reduces to the JLSM under the normal distribution. In this study, the parameter $v$ is regarded as known, and we take $v=3$ to achieve robustness (see Lange et al. 1989; Arslan and Genç 2003). Figure 1 shows plots of the pdf of the Student $t$ distribution for different degrees of freedom.
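
As a rough numerical illustration of the heavy-tail behaviour described above, the following sketch evaluates the density using only the Python standard library. The excerpt does not define $c_{v}$; the code assumes it is the usual normalising constant $c_{v}=\Gamma\left(\frac{v+1}{2}\right) v^{v / 2} /\left(\Gamma\left(\frac{v}{2}\right) \sqrt{\pi}\right)$, which makes the density integrate to one. The function names are illustrative and not taken from the source.

```python
import math


def log_c(v):
    # Assumed normalising constant (not defined in the text):
    # c_v = Gamma((v+1)/2) * v**(v/2) / (Gamma(v/2) * sqrt(pi)).
    return (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
            + (v / 2) * math.log(v) - 0.5 * math.log(math.pi))


def t_pdf(y, mu=0.0, sigma=1.0, v=3.0):
    # Density (c_v / sigma) * (v + z^2)^(-(v+1)/2), with z = (y - mu) / sigma.
    z = (y - mu) / sigma
    return math.exp(log_c(v) - math.log(sigma)
                    - (v + 1) / 2 * math.log(v + z * z))


def normal_pdf(y, mu=0.0, sigma=1.0):
    # Normal density, used only as the large-v reference.
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


if __name__ == "__main__":
    # Columns: y, t density with v = 3, t density with very large v, normal density.
    for y in (0.0, 2.0, 5.0):
        print(y, t_pdf(y, v=3), t_pdf(y, v=1e6), normal_pdf(y))
```

With $v=3$ the density far from the location is orders of magnitude larger than the normal density, while for very large $v$ the two are numerically close, in line with the limiting statement above.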
Now, let us consider the JLSM of the Student $t$ distribution given by
$$\left\{\begin{array}{c} y_{i} \sim t\left(\mu_{i}, \sigma_{i}^{2}, v\right) \\ \mu_{i}=\boldsymbol{x}_{i}^{T} \boldsymbol{\beta} \\ \log \sigma_{i}^{2}=z_{i}^{T} \boldsymbol{\gamma} \\ i=1,2, \ldots, n \end{array}\right.$$
where $\boldsymbol{x}_{i}=\left[x_{i 1}, x_{i 2}, \ldots, x_{i p}\right]^{T}$ and $z_{i}=\left[z_{i 1}, z_{i 2}, \ldots, z_{i q}\right]^{T}$ are the observed covariates corresponding to the response $y_{i}$, and $\boldsymbol{\beta}=\left[\beta_{1}, \beta_{2}, \ldots, \beta_{p}\right]^{T} \in \mathbb{R}^{p}$ and $\boldsymbol{\gamma}=\left[\gamma_{1}, \gamma_{2}, \ldots, \gamma_{q}\right]^{T} \in \mathbb{R}^{q}$ are the unknown parameter vectors in the location and scale models, respectively. We assume that $n>p+q$. Although we use two different sets of explanatory variables to model the location and scale parameters, in some problems a single set of explanatory variables may be used for both.
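
To make the model structure concrete, here is a minimal simulation sketch using only the Python standard library. It assumes the standard representation of a $t$ variate as a standard normal divided by $\sqrt{\chi_{v}^{2} / v}$; the design matrices, coefficient values and function name are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_jlsm_t(X, Z, beta, gamma, v=3.0, seed=0):
    """Draw y_i = mu_i + sigma_i * t_i, where t_i is standard Student t with v d.f."""
    rng = random.Random(seed)
    ys = []
    for x_i, z_i in zip(X, Z):
        mu_i = sum(x * b for x, b in zip(x_i, beta))         # mu_i = x_i^T beta
        log_var_i = sum(z * g for z, g in zip(z_i, gamma))   # log sigma_i^2 = z_i^T gamma
        sigma_i = math.exp(0.5 * log_var_i)
        # Standard t draw: N(0, 1) / sqrt(chi2_v / v), with chi2_v ~ Gamma(v/2, scale=2).
        chi2 = rng.gammavariate(v / 2, 2.0)
        t_i = rng.gauss(0.0, 1.0) / math.sqrt(chi2 / v)
        ys.append(mu_i + sigma_i * t_i)
    return ys


# Hypothetical design: intercept plus one covariate in both sub-models.
n = 200
cov = [i / n for i in range(n)]
X = [[1.0, c] for c in cov]
Z = [[1.0, c] for c in cov]
y = simulate_jlsm_t(X, Z, beta=[1.0, 2.0], gamma=[-1.0, 1.5], v=3.0, seed=42)
```

Here the same covariate enters both sub-models, which corresponds to the single-set case mentioned above; using different lists for `X` and `Z` gives the general case.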

## Under Skew-t Distribution

In addition to heavy tails, the data may exhibit strong skewness. To accommodate skewness and heavy tails together, the construction of flexible parametric skew distributions has received considerable attention in recent years, and numerous authors have developed various classes of these distributions. In this study, we use the skew-t distribution proposed by Azzalini (2003) as a generalization of Student’s $t$ distribution, which provides a wide and flexible family for modeling data that exhibit both skewness and heavy tails.

Let $y_{i} \in \mathbb{R}$, for $i=1,2, \ldots, n$, be independently distributed random variables and assume that, for each $i$, $y_{i}$ has the skew-t distribution $\left(y_{i} \sim S t\left(\mu_{i}, \sigma_{i}, \lambda_{i}, v\right)\right)$ with the following pdf
$$f_{S t}\left(y_{i} ; \mu_{i}, \sigma_{i}, \lambda_{i}, v\right)=\frac{2}{\sigma_{i}} t_{v}\left(y_{i 0}\right) T_{v+1}\left(\lambda_{i} y_{i 0} \sqrt{\frac{v+1}{v+y_{i 0}^{2}}}\right)$$
where $\mu_{i} \in \mathbb{R}$, $\sigma_{i}>0$ and $\lambda_{i} \in \mathbb{R}$ are the location, scale and skewness parameters, respectively. Here $y_{i 0}=\left(y_{i}-\mu_{i}\right) / \sigma_{i}$, $t_{v}(\cdot)$ denotes the pdf of the Student $t$ distribution with $v$ degrees of freedom, and $T_{v+1}(\cdot)$ denotes the cumulative distribution function (cdf) of the Student $t$ distribution with $v+1$ degrees of freedom (Azzalini 2003). Figure 2 shows plots of the pdf of the skew-t distribution for different values of $\lambda$ and $v$.
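
The skew-t pdf above can be evaluated with a short standard-library sketch. It makes two assumptions that are not in the source. First, $t_{v}$ is taken to be the standard Student $t$ density. Second, the cdf $T_{v+1}$ is computed by Simpson's rule after the substitution $u=\sqrt{v} \tan \theta$; this numerical choice was made here to avoid external dependencies and requires at least one degree of freedom. All function names are illustrative.

```python
import math


def t_pdf_std(z, v):
    # Standard Student t density with v degrees of freedom.
    log_norm = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                - 0.5 * math.log(v * math.pi))
    return math.exp(log_norm - (v + 1) / 2 * math.log1p(z * z / v))


def t_cdf(x, v, n=400):
    # Student t cdf via u = sqrt(v) * tan(theta), which gives
    # T_v(x) = 1/2 + K * integral_0^{atan(x / sqrt(v))} cos(theta)^(v-1) d(theta).
    # Composite Simpson rule with n (even) intervals; assumes v >= 1.
    k = math.exp(math.lgamma((v + 1) / 2) - math.lgamma(v / 2)) / math.sqrt(math.pi)
    upper = math.atan(x / math.sqrt(v))
    h = upper / n
    total = 1.0 + math.cos(upper) ** (v - 1)   # endpoints; integrand equals 1 at theta = 0
    for j in range(1, n):
        total += (4 if j % 2 else 2) * math.cos(j * h) ** (v - 1)
    return 0.5 + k * total * h / 3


def skew_t_pdf(y, mu=0.0, sigma=1.0, lam=0.0, v=3.0):
    # (2 / sigma) * t_v(y0) * T_{v+1}(lam * y0 * sqrt((v + 1) / (v + y0^2)))
    y0 = (y - mu) / sigma
    arg = lam * y0 * math.sqrt((v + 1) / (v + y0 * y0))
    return 2.0 / sigma * t_pdf_std(y0, v) * t_cdf(arg, v + 1)
```

Setting `lam=0.0` makes the cdf factor equal to one half, so the function returns the symmetric Student $t$ density; a positive `lam` shifts mass to the right of the location.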

Similar to the Student’s t distribution case, the JLSM under skew-t distribution is defined as follows.
$$\left\{\begin{array}{c} y_{i} \sim S t\left(\mu_{i}, \sigma_{i}^{2}, \lambda, v\right) \\ \mu_{i}=\boldsymbol{x}_{i}^{T} \boldsymbol{\beta} \\ \log \sigma_{i}^{2}=z_{i}^{T} \boldsymbol{\gamma} \\ i=1,2, \ldots, n \end{array}\right.$$
Note that the skewness parameter $\lambda$ does not vary across observations in this model. When $\lambda$ is equal to zero, this model reduces to the JLSM under the Student $t$ distribution. Moreover, when $\lambda=0$ and $v \rightarrow \infty$, this model reduces to the JLSM under the normal distribution. Here we assume that $n>p+q+1$. As in the Student’s $t$ distribution case, the parameter $v$ is regarded as known and is taken to be 3 to achieve robustness.

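We first obtain the ML estimates of the parameters of the JLSM under the skew-t distribution. Let $\boldsymbol{\theta}=\left(\theta_{1}, \theta_{2}, \ldots, \theta_{s_{2}}\right)=\left(\boldsymbol{\beta}^{T}, \boldsymbol{\gamma}^{T}, \lambda\right)$ with $s_{2}=p+q+1$ be the combined vector of unknown parameters. Given independent observations $y_{1}, y_{2}, \ldots, y_{n}$, the log-likelihood function of $\boldsymbol{\theta}$ corresponding to the JLSM under the skew-t distribution can be written as follows.
$$\ell(\boldsymbol{\theta} \mid y, \boldsymbol{x}, \boldsymbol{z})=n \log 2+n \log \left(c_{v}\right)-\frac{1}{2} \sum_{i=1}^{n} z_{i}^{T} \boldsymbol{\gamma}-\frac{v+1}{2} \sum_{i=1}^{n} \log \left(v+\frac{\left(y_{i}-\boldsymbol{x}_{i}^{T} \boldsymbol{\beta}\right)^{2}}{e^{z_{i}^{T} \boldsymbol{\gamma}}}\right)+\sum_{i=1}^{n} \log T_{v+1}\left(\lambda y_{i 0} \sqrt{\frac{v+1}{v+y_{i 0}^{2}}}\right)$$
where $y_{i 0}=\left(y_{i}-\boldsymbol{x}_{i}^{T} \boldsymbol{\beta}\right) / e^{z_{i}^{T} \boldsymbol{\gamma} / 2}$. The first four terms are the Student $t$ part; the terms $n \log 2$ and $\sum_{i} \log T_{v+1}(\cdot)$ come from the factor $2 T_{v+1}(\cdot)$ in the skew-t pdf and cancel when $\lambda=0$.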
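
As a sketch, the log-likelihood of the JLSM under the skew-t distribution can be evaluated by summing the logarithm of the skew-t pdf over the observations, with $\sigma_{i}^{2}=\exp \left(z_{i}^{T} \boldsymbol{\gamma}\right)$. The code below uses only the Python standard library. It assumes that $t_{v}$ is the standard Student $t$ density and computes $T_{v+1}$ numerically by Simpson's rule, an implementation choice made here rather than something stated in the source; the function names are hypothetical. For very large $|\lambda|$ the numerically computed cdf can lose accuracy in the far tail, so this is an illustration rather than a production routine.

```python
import math


def log_t_pdf_std(z, v):
    # Log density of the standard Student t distribution with v d.f.
    return (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
            - 0.5 * math.log(v * math.pi)
            - (v + 1) / 2 * math.log1p(z * z / v))


def t_cdf(x, v, n=400):
    # Student t cdf by Simpson's rule after substituting u = sqrt(v) * tan(theta):
    # T_v(x) = 1/2 + K * integral_0^{atan(x / sqrt(v))} cos(theta)^(v-1) d(theta).
    # Assumes v >= 1 and an even number of intervals n.
    k = math.exp(math.lgamma((v + 1) / 2) - math.lgamma(v / 2)) / math.sqrt(math.pi)
    upper = math.atan(x / math.sqrt(v))
    h = upper / n
    total = 1.0 + math.cos(upper) ** (v - 1)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * math.cos(j * h) ** (v - 1)
    return 0.5 + k * total * h / 3


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def loglik_jlsm_skew_t(beta, gamma, lam, y, X, Z, v=3.0):
    # Sum over i of log[(2 / sigma_i) * t_v(y_i0) * T_{v+1}(lam * y_i0 * w_i)],
    # where w_i = sqrt((v + 1) / (v + y_i0^2)) and a single lam is shared by all i.
    total = 0.0
    for y_i, x_i, z_i in zip(y, X, Z):
        log_sigma_i = 0.5 * dot(z_i, gamma)            # log sigma_i^2 = z_i^T gamma
        y0 = (y_i - dot(x_i, beta)) / math.exp(log_sigma_i)
        arg = lam * y0 * math.sqrt((v + 1) / (v + y0 * y0))
        total += (math.log(2.0) - log_sigma_i
                  + log_t_pdf_std(y0, v) + math.log(t_cdf(arg, v + 1)))
    return total
```

With `lam=0.0` the cdf factor is one half and cancels the leading 2, so the function returns the log-likelihood of the JLSM under the Student $t$ distribution. A function of this form could be handed to a general-purpose numerical optimiser to obtain ML estimates, although the excerpt does not say which algorithm the authors use.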

## Model 3-JLSSM: Under Skew-t Distribution

The JLSMs under Student’s $t$ and skew-t distributions address only heteroscedasticity. However, the skewness parameter is at least as important as the other parameters for modeling the data, and it may differ across observations and depend on some of the covariates, in which case the skewness must also be modeled. Since our main aim is to model all parameters, and hence the data, as well as possible, we consider a skewness model in addition to the location and scale models. For this purpose, the JLSM under the skew-t distribution can be extended to the JLSSM under the skew-t distribution. In this subsection, we consider this JLSSM, which accounts for the variability of the skewness parameter.
Let $y_{i} \in \mathbb{R}$, for $i=1,2, \ldots, n$, be independently distributed with $S t(\mu, \sigma, \lambda, v)$. In some cases, in addition to $\mu$ and $\sigma^{2}$, the skewness parameter $\lambda$ may also be different for each $y_{i}$, $i=1,2, \ldots, n$, and may be related to a number of variables. The JLSSM under the skew-t distribution is then defined as follows.
$$\left\{\begin{array}{c} y_{i} \sim S t\left(\mu_{i}, \sigma_{i}^{2}, \lambda_{i}, v\right) \\ \mu_{i}=\boldsymbol{x}_{i}^{T} \boldsymbol{\beta} \\ \log \sigma_{i}^{2}=z_{i}^{T} \boldsymbol{\gamma} \\ \lambda_{i}=v_{i}^{T} \boldsymbol{\alpha} \\ i=1,2, \ldots, n \end{array}\right.$$
where $v_{i}=\left[v_{i 1}, v_{i 2}, \ldots, v_{i r}\right]^{T}$ denotes the observed covariates (not to be confused with the degrees of freedom $v$) and $\boldsymbol{\alpha}=\left[\alpha_{1}, \alpha_{2}, \ldots, \alpha_{r}\right]^{T} \in \mathbb{R}^{r}$ is the unknown parameter vector in the skewness model. We assume that $n>p+q+r$.
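
The three linear predictors of the JLSSM map covariates to per-observation parameters. The following standard-library sketch computes them for a small, entirely hypothetical design; the covariate values, coefficient values and function name are illustrative assumptions.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def jlssm_parameters(X, Z, V, beta, gamma, alpha):
    """Return lists (mu, sigma, lam) implied by the three linear predictors."""
    mu = [dot(x_i, beta) for x_i in X]                       # mu_i = x_i^T beta
    sigma = [math.exp(0.5 * dot(z_i, gamma)) for z_i in Z]   # log sigma_i^2 = z_i^T gamma
    lam = [dot(v_i, alpha) for v_i in V]                     # lambda_i = v_i^T alpha
    return mu, sigma, lam


# Hypothetical covariates: intercept plus one regressor in each sub-model.
w = [0.0, 0.5, 1.0]
X = [[1.0, w_i] for w_i in w]
Z = [[1.0, w_i] for w_i in w]
V = [[1.0, w_i] for w_i in w]
mu, sigma, lam = jlssm_parameters(
    X, Z, V, beta=[1.0, 2.0], gamma=[0.0, 2.0], alpha=[-1.0, 3.0])
```

If `V` contains only a column of ones, every $\lambda_{i}$ equals the single coefficient in `alpha`, which is the constant-skewness JLSM; if `alpha` is all zeros, every $\lambda_{i}$ is zero and the model is the JLSM under the Student $t$ distribution.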

It is important to stress that the JLSSM under the skew-t distribution includes the previous models given in Eqs. (2) and (11) as special cases. If the skewness parameter has no variability, the JLSSM under the skew-t distribution reduces to the JLSM under the skew-t distribution. If the skewness parameter is equal to zero, the model reduces to the JLSM under the Student $t$ distribution. In addition, when $v \rightarrow \infty$, the JLSSM under the skew-t distribution reduces to the JLSSM under the skew-normal distribution. The advantage of the JLSSM under the skew-t distribution is that it may give a better fit for heavy-tailed and/or asymmetric data sets.

