## Practical Advantages of Controlling Sampling Correctness

A continuous random variable can be defined as a variable whose cumulative distribution function is continuous. We use a stronger definition that ensures that a continuous random variable has a well-defined (probability) density function.

Definition 3.3.14 (Continuous random variable)
A random variable $X$ is continuous if its distribution function can be expressed as
$$F_X(x)=\int_{-\infty}^x f_X(u) d u \quad \text { for } x \in \mathbb{R}$$
for some integrable function $f_X: \mathbb{R} \longrightarrow[0, \infty)$. The function $f_X$ is called the (probability) density function of $X$.
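
As a worked illustration of Definition 3.3.14, consider an exponential distribution with rate $\lambda>0$. This distribution and the symbol $\lambda$ are assumptions introduced here for illustration; they are not part of the definition. The density is
$$f_X(u)= \begin{cases}\lambda e^{-\lambda u} & \text { for } u \geq 0 \\ 0 & \text { otherwise. }\end{cases}$$
Integrating the density up to $x$ gives the distribution function:
$$F_X(x)=\int_{-\infty}^x f_X(u) d u=\int_0^x \lambda e^{-\lambda u} d u=1-e^{-\lambda x} \quad \text { for } x \geq 0,$$
with $F_X(x)=0$ for $x<0$. Since $f_X$ is non-negative and integrable, $X$ is a continuous random variable in the sense of the definition.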

Definition 3.3.14 shows how the distribution function for a random variable can be found by integrating the density. We can also work in the opposite direction: the density is found by differentiating the cumulative distribution function.

Claim 3.3.15 (Density from cumulative distribution function)
For a continuous random variable $X$ with cumulative distribution function $F_X$, the density function is given by
$$f_X(x)=\left.\frac{d}{d u} F_X(u)\right|_{u=x}=F_X^{\prime}(x) \text { for all } x \in \mathbb{R} .$$

The basic properties of density functions are given by the following claim.

Claim 3.3.16 (Properties of continuous random variables)
If $f_X$ is a density function then

i. $f_X(x) \geq 0$ for all $x \in \mathbb{R}$.

ii. $\int_{-\infty}^{\infty} f_X(x) d x=1$.

The first property is a direct consequence of Definition 3.3.14 and the second property can be seen directly from part ii. of Proposition 3.2.3.

We use the same notation (lowercase $f$) for density functions as we do for mass functions. This serves to emphasise that the density plays the same role for a continuous variable as mass does for a discrete variable. However, there is an important distinction; it is legitimate to have density function values that are greater than one since values of a density function do not give probabilities. Probability is not associated with the values of the density function but with the area beneath the curve that the density function defines.

In order to work out the probability that a random variable $X$ takes a value between $a$ and $b$, we work out the area above the $x$-axis, beneath the density, and between the lines $x=a$ and $x=b$. As we might expect, given this interpretation, the total area beneath a density function is one, as stated in Claim 3.3.16. The general relationship between probability and density is given by the following proposition.
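
The area interpretation can be checked numerically. The following is a minimal sketch, assuming an exponential distribution with rate 3 (chosen only because its density exceeds one near zero; it does not come from the text). The helper names `density`, `cdf`, `area`, and `central_difference` are hypothetical. The sketch shows that a density value can be greater than one while the total area is still one, that the probability of an interval equals the area under the density over that interval, and that differentiating the distribution function recovers the density.

```python
import math

RATE = 3.0  # assumed rate parameter, chosen so that the density exceeds 1 near zero


def density(x: float) -> float:
    """Density of an exponential distribution with rate RATE (illustrative choice)."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def cdf(x: float) -> float:
    """Closed-form cumulative distribution function of the same distribution."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def area(f, a: float, b: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def central_difference(F, x: float, h: float = 1e-5) -> float:
    """Numerical derivative of F at x using a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # A density value may exceed one; it is not a probability.
    print("density at 0:", density(0.0))

    # Total area under the density (the tail beyond 20 is negligible here).
    print("total area:", area(density, 0.0, 20.0))

    # Probability that X lies between a and b, computed in two ways.
    a, b = 0.1, 0.4
    print("area between a and b:", area(density, a, b))
    print("F(b) - F(a):", cdf(b) - cdf(a))

    # Differentiating the distribution function recovers the density.
    print("F'(0.25):", central_difference(cdf, 0.25))
    print("density at 0.25:", density(0.25))
```

The midpoint rule is used only to keep the sketch free of third-party dependencies; any numerical integration routine would serve the same purpose.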

## Fundamental Statistical Concepts

The reader, particularly if he or she happens to be a statistician, may wonder if it is useful to introduce in this book a short course on probabilities (i.e., study of random variables prior to experimentation) and especially on statistics (i.e., study of random variables using data provided by experimentation). The answer is a definite yes and the reason is simple: the author, who has given more than 500 short courses on sampling, and who has been in contact with many clients having sampling problems, found that most people involved with sampling in quality assurance and quality control circles have no basic knowledge of descriptive statistics or long ago forgot this knowledge. Nevertheless, because much software is available and many articles cover the subject of sampling in a very short and superficial way, for better or worse, these people are using statistical concepts. Therefore, it is necessary to include this chapter on fundamental statistical concepts, so these wonderful tools are not misused.

The term “statistic” was used for the first time by the German professor Achenwall in 1748. In 1843, Cournot defined statistics as “a science having for objective the collection and coordination of numerous facts within a given category of events, thus obtaining quantified effects which are likely to be independent from happening only by accident.” Statistical concepts are necessary for the development, understanding, and use of the TOS because they give information unpredictable in any other way; furthermore, they strongly link theory with reality. Statistical concepts are the basic “tools” of modern processes and quality control programs. They prevent detrimental effects from accumulating dangerously for too long. They also make possible the anticipation of acceptable or unacceptable operating errors.

The concepts presented in this chapter have their limitations, which are often voluntarily or involuntarily forgotten by an operator, making the conclusions of his statistical evaluation not only naïve but also deprived of any scientific value. Let us imagine an operator who has collected two series of values from a certain experiment. He plots one series on the $y$-axis of a rectangular coordinate system and the other series on the corresponding $x$-axis. He thus obtains a series of experimental points in the $xy$-plane, through which he draws a continuous line to obtain a graph, and then calculates the equation of the graph. More often than not, however, our operator is confronted with numerous problems:

• Interpolation: The temptation is great to allow the simplest continuous curve to fit inside the “area of influence” of each point; however, this curve may or may not be represented by a simple equation, except when it can be approximated by a straight line. For this straight line to be a reality, it may become convenient to change a few variables, forcing a phenomenon to obey a preconceived law that happens to be convenient for the operator. When intervals between points become larger, interpolation may become dangerous because a large number of curves appear suitable, and the operator is likely to choose the curve that helps him to prove his preconceived idea.
• Extrapolation: With the exception of the immediate vicinities of points corresponding to the limits of an experiment, the graph representing all points obtained by the operator should not be extrapolated beyond these limits, because there is neither a scientific nor a legitimate justification to do so. Similarly, for various reasons such as lack of time or funds, it is not uncommon to see statistical evaluations based on the analysis of two, three, or four samples, and decisions taken that should never have been made. It is dangerous to extrapolate the information obtained from the analysis of very few samples to an infinite population of potential samples because there is neither a scientific nor a legitimate way to find out exactly what kind of probability distribution they belong to without extensive additional testing. This is especially true in the case of trace constituents. In fact, the common problem exposed here is a combination of extrapolation (i.e., shape of a probability distribution defined beyond experimental points) and interpolation (i.e., shape of a probability distribution defined between too few experimental points).
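
The two problems above can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the “true” law `true_process` (an exponential curve), the five experimental points, and the helper names `fit_line` and `line` are hypothetical and do not come from the text. Within the experimental limits a straight line looks like an excellent description of the points, yet the same line is badly wrong once it is extended well beyond those limits.

```python
import math


def true_process(x: float) -> float:
    """Hypothetical 'real' law, unknown to the operator: y = exp(x / 2)."""
    return math.exp(0.5 * x)


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Five experimental points collected between the limits x = 0 and x = 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [true_process(x) for x in xs]

slope, intercept = fit_line(xs, ys)


def line(x: float) -> float:
    """The straight line the operator would draw through the points."""
    return slope * x + intercept


# Inside the experimental limits the line looks like an excellent fit.
worst_inside = max(abs(line(x) - y) for x, y in zip(xs, ys))

# Far beyond the limits, the same line departs strongly from the true law.
x_far = 6.0
error_outside = abs(line(x_far) - true_process(x_far))

if __name__ == "__main__":
    print("largest error at the experimental points:", worst_inside)
    print("error when extrapolated to x = 6:", error_outside)
```

The experimental points alone give no warning of this: a straight line and the exponential curve are nearly indistinguishable between the limits, which is exactly why the choice between them cannot be settled without additional data.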


