
## Matrix Algebra Review

In this section, we provide the basic elements of linear algebra that are key to understanding the machinery behind the process of building statistical machine learning algorithms.

A matrix is a rectangular arrangement of numbers whose elements can be identified by the row and column in which they are located. For example, matrix $\boldsymbol{E}$, consisting of three rows and five columns, can be represented as follows:
$$\boldsymbol{E}=\left[\begin{array}{lllll} E_{11} & E_{12} & E_{13} & E_{14} & E_{15} \\ E_{21} & E_{22} & E_{23} & E_{24} & E_{25} \\ E_{31} & E_{32} & E_{33} & E_{34} & E_{35} \end{array}\right]$$
For example, replacing the entries with numbers, we have
$$\boldsymbol{E}=\left[\begin{array}{ccccc} 7 & 9 & 4 & 3 & 6 \\ 9 & 5 & 9 & 8 & 11 \\ 3 & 2 & 11 & 9 & 6 \end{array}\right]$$
where $E_{ij}$ is called the $ij$th element of the matrix; the first subscript gives the row in which the element is located and the second subscript gives the column, for example, $E_{32}=2$. The order of a matrix is its number of rows and columns. Therefore, a matrix with $r$ rows and $c$ columns has order $r \times c$. Matrix $\boldsymbol{E}$ has order $3 \times 5$ and is denoted $\boldsymbol{E}_{3 \times 5}$.

In R, a matrix is created with the function `matrix(...)`, called, for example, as `matrix(data = NA, nrow = 3, ncol = 5, byrow = FALSE)`, where `data` is the data for the matrix, `nrow` is the number of rows, `ncol` is the number of columns, and `byrow` controls whether the data fill the matrix by row or by column. By default `byrow` is `FALSE`, so the data fill the matrix by columns; if `byrow = TRUE` is specified, they fill the matrix by rows.
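The effect of `byrow` is easiest to see on a small case. The sketch below fills a $2 \times 3$ matrix with the integers 1 to 6 in both ways; the object names `by_col` and `by_row` are illustrative and do not come from the text.

```r
# Fill a 2 x 3 matrix with 1..6 column by column (the default, byrow = FALSE)
by_col <- matrix(data = 1:6, nrow = 2, ncol = 3, byrow = FALSE)

# Fill a matrix of the same shape row by row
by_row <- matrix(data = 1:6, nrow = 2, ncol = 3, byrow = TRUE)

print(by_col)  # first column holds 1 and 2
print(by_row)  # first row holds 1, 2 and 3
```

The same six values end up in different positions, which is why `byrow` has to match the order in which the data were typed.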
For example, the matrix $\boldsymbol{E}$ below can be built in R by passing its 15 entries to `matrix(...)` in row order and setting `byrow = TRUE`:
$$\boldsymbol{E}=\left[\begin{array}{ccccc} 7 & 9 & 4 & 3 & 6 \\ 9 & 5 & 9 & 8 & 11 \\ 3 & 2 & 11 & 9 & 6 \end{array}\right]$$
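One way to write such a script is sketched here. It is an illustration of the `matrix(...)` call described above, not necessarily the script used in the original text; it also checks the order of the matrix and the element $E_{32}$.

```r
# Entries of E listed row by row, exactly as they appear in the displayed matrix
E <- matrix(
  data = c(7, 9,  4, 3,  6,
           9, 5,  9, 8, 11,
           3, 2, 11, 9,  6),
  nrow = 3, ncol = 5, byrow = TRUE
)

print(E)
dim(E)    # order of the matrix: number of rows, then number of columns
E[3, 2]   # element in row 3, column 2
```

Because the entries are typed in row order, `byrow = TRUE` is needed; with the default `byrow = FALSE` the same vector would be laid out column by column and would give a different matrix.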

## Statistical Data Types

To use statistical learning methods correctly, it is very important to understand the classification of the types of data that exist. This is of paramount importance because data are the input to all statistical machine learning methods and because the data type determines the appropriate and valid analysis to be implemented; in addition, each statistical machine learning method is specific to a certain type of data. In general, data are most commonly classified as quantitative (numerical) or qualitative (categorical) (Fig. 1.4).

Quantitative (numerical) data are those for which the result of an observation or measurement is a number. They are classified as follows:
(a) Discrete. The variable can take only isolated numerical values, with no values in between; that is, it has a certain set of possible values and represents items that can be counted. Examples: number of household members, number of surgical interventions, number of reported cases of a certain pathology, number of accidents per month, etc. Examples in the context of plant breeding are panicle number per plant, seed number per panicle, weed count per plot, number of infected spikelets per spike, etc. Discrete values of this kind are also called count responses, and models based on the Poisson and negative binomial distributions are appropriate for this type of response.

(b) Continuous. They are usually the result of a measurement that is expressed in particular units, and values are measured based on a zero point and are treated as real numbers. There are many types of mathematical operations that can be performed on this type of data. The measurements can theoretically have an infinite set of possible values within a range and they do not need transformation. In practice, the possible values of the variable are limited by the accuracy of the measurement method or by the recording mode. Examples: plant height, age, weight, grain yield, $\mathrm{pH}$, blood cholesterol level, etc. The distinction between discrete and continuous data is important for deciding which statistical learning method to use for the analysis, since there are methods that assume that the data are continuous. Consider, for example, the age variable. Age is continuous, but if it is recorded in years, it turns out to be discrete. In studies with adults, in which the age ranges from 20 to 70 years, for example, there are no problems in treating age as continuous, since the number of possible values is large. But in the case of preschool children, if the age is recorded in years, it should be treated as discrete, while if it is recorded in months, it can be treated as continuous.
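The age example can be made concrete by counting how many distinct values each recording produces. The sketch below uses simulated ages only; the sample size, the seed, and the preschool range of 24 to 71 months are assumptions made for illustration and are not taken from the text.

```r
# Hypothetical ages, simulated for illustration only
set.seed(1)
adult_years      <- sample(20:70, size = 200, replace = TRUE)  # adults, in years
preschool_months <- sample(24:71, size = 200, replace = TRUE)  # children, in months
preschool_years  <- preschool_months %/% 12                    # same children, in whole years

# Number of distinct recorded values under each scale
n_distinct <- c(
  adult_years      = length(unique(adult_years)),
  preschool_months = length(unique(preschool_months)),
  preschool_years  = length(unique(preschool_years))
)
print(n_distinct)
```

Recorded in whole years, the preschool ages collapse to at most four values (2 to 5), which is why they are better treated as discrete. The adult ages in years and the preschool ages in months spread over many values and can reasonably be treated as continuous.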


