## First asymptotic theorem and related results

In the previous chapter, for one particular example (see Sections 3.1 and 3.4), we showed that in calculating the maximum entropy (i.e. the capacity of a noiseless channel) the constraint $c(y) \leqslant a$ imposed on feasible realizations is equivalent, for a sufficiently long code sequence, to the constraint $\mathbb{E}[c(y)] \leqslant a$ on the mean value $\mathbb{E}[c(y)]$. In this chapter we prove (Section 4.3) that under certain assumptions this equivalence holds in the general case; this is the assertion of the first asymptotic theorem. Later we shall also consider the other two asymptotic theorems (Chapters 7 and 11), which are the most profound results of information theory. All three share one feature: ultimately they state that, for extremely large systems, the difference between discreteness and continuity disappears, and that the characteristics of a large collection of discrete objects can be calculated from a continuous functional dependence involving averaged quantities. For the first variational problem, this feature is expressed by the fact that the discrete function $H=\ln M$ of $a$, which arises under the constraint $c(y) \leqslant a$, is asymptotically replaced by a continuous function $H(a)$ obtained by solving the first variational problem.

As far as the proof is concerned, the first asymptotic theorem turns out to be related to the theorem on the stability of the canonical distribution (Section 4.2), which is very important in statistical thermodynamics and is in effect proved there when the canonical distribution is derived from the microcanonical one. Here we consider it in a more general and abstract form. The relationship between the first asymptotic theorem and the theorem on the canonical distribution once more underlines the intrinsic unity of the mathematical apparatus of information theory and statistical thermodynamics.
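The replacement of the discrete $H=\ln M$ by a continuous $H(a)$ can be illustrated numerically. The sketch below is not the example of Sections 3.1 and 3.4; it assumes a hypothetical toy setting in which $y$ is a binary sequence of length $n$, the cost $c(y)$ is the number of ones, and the constraint is written per symbol as $c(y) \leqslant a n$ with $a \leqslant 1/2$. Under these assumptions the maximum entropy per symbol with the averaged constraint is the binary entropy $-a \ln a-(1-a) \ln (1-a)$, and the function names are illustrative only.

```python
import math

def log_count(n, a):
    """ln M, where M counts binary sequences of length n with at most a*n ones."""
    k_max = math.floor(a * n)
    m = sum(math.comb(n, k) for k in range(k_max + 1))
    return math.log(m)  # math.log accepts arbitrarily large integers

def entropy_rate(a):
    """Maximum entropy per symbol (in nats) under the mean constraint, for a <= 1/2."""
    return -a * math.log(a) - (1 - a) * math.log(1 - a)

a = 0.25  # assumed per-symbol cost bound
gaps = []
for n in (40, 400, 4000):
    per_symbol = log_count(n, a) / n
    gap = entropy_rate(a) - per_symbol
    gaps.append(gap)
    print(f"n={n:5d}  ln(M)/n={per_symbol:.4f}  H(a)={entropy_rate(a):.4f}  gap={gap:.4f}")
```

In this toy case the gap between the exact count and the continuous function stays positive and shrinks as $n$ grows, which is the behaviour the first asymptotic theorem asserts in general.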

The potential $\Gamma(\alpha)$ and its properties are used in proving these theorems. The material on this potential is presented in Section 4.1 and is related to the content of Section 3.3. However, instead of the usual physical free energy $F$ we consider the dimensionless free energy, that is, the potential $\Gamma=-F / T$. Instead of the parameters $T, a_{2}, a_{3}, \ldots$ common in thermodynamics, we introduce the symmetrically defined parameters $\alpha_{1}=-1 / T, \alpha_{2}=a_{2} / T, \alpha_{3}=a_{3} / T, \ldots$ With this choice, the temperature is treated as an ordinary thermodynamic parameter on an equal footing with the others.

## Potential Γ or the cumulant generating function

Consider a thermodynamic or informational system for which formula (3.5.1) is relevant. For such a system we introduce symmetrically defined parameters $\alpha_{1}, \ldots, \alpha_{r}$ and the corresponding potential $\Gamma(\alpha)$, which is mathematically equivalent to the free energy $F$ but has the following advantage over it: $\Gamma$ is the cumulant generating function.

For a physical thermodynamic system the equilibrium distribution is usually represented by Gibbs’ formula:
$$P(d \zeta)=\exp \left[\frac{F-\mathcal{H}(\zeta, a)}{T}\right] d \zeta \tag{4.1.1}$$
where $\zeta=(p, q)$ are the dynamic variables (coordinates and momenta) and $\mathcal{H}(\zeta, a)$ is the Hamiltonian, which depends on the parameters $a_{2}, \ldots, a_{r}$. Formula (4.1.1) is an analog of formula (3.5.2). Now assume that the Hamiltonian is linear in these parameters (the superscripts on $F^{2}, \ldots, F^{r}$ are indices, not powers):
$$\mathcal{H}(\zeta, a)=\mathcal{H}_{0}(\zeta)-a_{2} F^{2}(\zeta)-\cdots-a_{r} F^{r}(\zeta) \tag{4.1.2}$$
Then (4.1.1) becomes
$$P(d \zeta)=\exp \left[\frac{F-\mathcal{H}_{0}(\zeta)+a_{2} F^{2}(\zeta)+\cdots+a_{r} F^{r}(\zeta)}{T}\right] d \zeta \tag{4.1.3}$$
Next we introduce the new parameters
$$\alpha_{1}=-1 / T \equiv-\beta ; \quad \alpha_{2}=a_{2} / T ; \quad \ldots \quad ; \quad \alpha_{r}=a_{r} / T$$
which we call canonical external parameters. The parameters
$$B^{1}=\mathcal{H}_{0} ; \quad B^{2}=F^{2} ; \quad \ldots \quad ; \quad B^{r}=F^{r}$$
are called random internal parameters, and
$$A^{1}=\mathbb{E}\left[\mathcal{H}_{0}\right] ; \quad A^{2}=\mathbb{E}\left[F^{2}\right] ; \quad \ldots \quad ; \quad A^{r}=\mathbb{E}\left[F^{r}\right]$$
are called canonical internal parameters conjugate to the external parameters $\alpha_{1}, \ldots, \alpha_{r}$. Moreover, we introduce the potential $\Gamma=-F / T$ and rewrite distribution (4.1.3) in canonical form:
$$P(d \zeta \mid \alpha)=\exp \left[-\Gamma(\alpha)+B^{1}(\zeta) \alpha_{1}+\cdots+B^{r}(\zeta) \alpha_{r}\right] d \zeta$$
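The statement that $\Gamma$ is the cumulant generating function can be checked numerically on a small case. The sketch below assumes a hypothetical four-state system with a single random internal parameter $B$ (so $r=1$) and a uniform reference measure, so that normalization of the canonical form gives $\Gamma(\alpha)=\ln \sum_{\zeta} \exp [\alpha B(\zeta)]$. The state values and function names are illustrative assumptions, not taken from the text. It compares finite-difference derivatives of $\Gamma$ with the mean and variance of $B$, i.e. with the first two cumulants.

```python
import math

# Hypothetical toy system: values of the internal parameter B in each of four states.
B_VALUES = [0.0, 1.0, 1.0, 3.0]

def gamma(alpha):
    """Potential Gamma(alpha) = ln sum_z exp(alpha * B(z)), fixed by normalization."""
    return math.log(sum(math.exp(alpha * b) for b in B_VALUES))

def canonical_probs(alpha):
    """Canonical distribution exp(-Gamma(alpha) + alpha * B(z)) over the states."""
    g = gamma(alpha)
    return [math.exp(-g + alpha * b) for b in B_VALUES]

def mean_and_variance(alpha):
    """Mean and variance of B under the canonical distribution."""
    p = canonical_probs(alpha)
    mean = sum(pi * b for pi, b in zip(p, B_VALUES))
    var = sum(pi * (b - mean) ** 2 for pi, b in zip(p, B_VALUES))
    return mean, var

alpha, h = -0.7, 1e-4  # arbitrary test point and finite-difference step
d1 = (gamma(alpha + h) - gamma(alpha - h)) / (2 * h)                   # ~ dGamma/dalpha
d2 = (gamma(alpha + h) - 2 * gamma(alpha) + gamma(alpha - h)) / h**2  # ~ d2Gamma/dalpha2
mean, var = mean_and_variance(alpha)
print(f"dGamma/dalpha   = {d1:.6f}   E[B]   = {mean:.6f}")
print(f"d2Gamma/dalpha2 = {d2:.6f}   Var[B] = {var:.6f}")
```

The first derivative reproduces the canonical internal parameter $A=\mathbb{E}[B]$ and the second reproduces the variance of $B$, up to finite-difference error.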

## Some asymptotic results of statistical thermodynamics

The deepest results of information theory and statistical thermodynamics are asymptotic in nature, i.e. they take the form of limit theorems for a composite system that grows in size. Before considering the first asymptotic theorem of information theory, we present a related result from statistical thermodynamics (the relationship is seen from the proof), namely, an important theorem on the stability of the canonical distribution. In the case of just one parameter this distribution has the form
$$P(\xi \mid \alpha)=\exp [-\Gamma(\alpha)+\alpha B(\xi)-\varphi(\xi)]$$
If $B(\xi)=\mathcal{H}(p, q)$ is interpreted as the energy of the system, i.e. the Hamiltonian, and $\varphi(\xi)$ is taken to be zero, then this distribution becomes the canonical Gibbs distribution:

$$\exp \left[\frac{F(T)-\mathcal{H}(p, q)}{T}\right] \quad\left(F(T)=-T \Gamma\left(-\frac{1}{T}\right)\right)$$
where $T=-1 / \alpha$ is the temperature. The theorem on the stability of this distribution (i.e. the fact that it arises from a ‘microcanonical’ distribution for a composite system that includes a thermostat) is called Gibbs’ theorem.
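The content of Gibbs’ theorem can be illustrated with a small numerical experiment. The sketch below rests on assumptions that are not in the text: each subsystem has three hypothetical energy levels $B \in\{0,1,2\}$, the composite system consists of $n$ such subsystems (one singled out, the remaining $n-1$ playing the role of the thermostat), and the microcanonical distribution is uniform over all configurations with a fixed total energy. The marginal distribution of the singled-out subsystem is then compared with the canonical distribution $\exp [-\Gamma(\alpha)+\alpha B]$ whose mean energy matches the energy per subsystem.

```python
import math

LEVELS = (0, 1, 2)  # hypothetical energy levels B of a single subsystem

def count_configs(m, e_max):
    """counts[e] = number of configurations of m subsystems with total energy e."""
    counts = [1] + [0] * e_max
    for _ in range(m):
        new = [0] * (e_max + 1)
        for e, c in enumerate(counts):
            if c:
                for b in LEVELS:
                    if e + b <= e_max:
                        new[e + b] += c
        counts = new
    return counts

def microcanonical_marginal(n, total):
    """Distribution of one subsystem when all n together have fixed total energy."""
    bath = count_configs(n - 1, total)           # thermostat of n - 1 subsystems
    weights = [bath[total - b] for b in LEVELS]  # ways the thermostat takes the rest
    z = sum(weights)
    return [w / z for w in weights]

def canonical(mean_energy):
    """Canonical distribution exp(alpha*B - Gamma), alpha fixed by the mean energy."""
    def mean(alpha):
        w = [math.exp(alpha * b) for b in LEVELS]
        return sum(b * wi for b, wi in zip(LEVELS, w)) / sum(w)
    lo, hi = -50.0, 50.0
    for _ in range(200):  # bisection: the mean energy increases with alpha
        mid = (lo + hi) / 2
        if mean(mid) < mean_energy:
            lo = mid
        else:
            hi = mid
    alpha = (lo + hi) / 2
    w = [math.exp(alpha * b) for b in LEVELS]
    z = sum(w)
    return alpha, [wi / z for wi in w]

alpha, p_can = canonical(0.5)  # assumed energy per subsystem: 0.5
errors = []
for n in (10, 40, 400):
    p_micro = microcanonical_marginal(n, n // 2)
    err = max(abs(p - q) for p, q in zip(p_micro, p_can))
    errors.append(err)
    print(f"n={n:4d}  max |P_micro - P_canonical| = {err:.5f}")
print(f"alpha = {alpha:.4f}, T = -1/alpha = {-1 / alpha:.4f}")
```

In this toy model the discrepancy between the microcanonical marginal and the canonical distribution decreases as the thermostat grows, which is the stability property the theorem describes.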

Adhering to the general and formal style of exposition adopted in this chapter, we formulate this theorem in abstract form.

First, we introduce several additional notions. We call the conditional distribution
$$P_{n}\left(\xi_{1}, \ldots, \xi_{n} \mid \alpha\right)$$
the $n$-th power of the distribution
$$P_{1}\left(\xi_{1} \mid \alpha\right) \tag{4.2.3}$$
if
$$P_{n}\left(\xi_{1}, \ldots, \xi_{n} \mid \alpha\right)=P_{1}\left(\xi_{1} \mid \alpha\right) \cdots P_{1}\left(\xi_{n} \mid \alpha\right)$$
Let the distribution (4.2.3) be canonical:
$$\ln P_{1}\left(\xi_{1} \mid \alpha\right)=-\Gamma_{1}(\alpha)+\alpha B_{1}\left(\xi_{1}\right)-\varphi_{1}\left(\xi_{1}\right)$$
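As a short worked step under these definitions, taking the logarithm of the product and writing $B_{1}$ for the single-system function in the exponent gives

$$\ln P_{n}\left(\xi_{1}, \ldots, \xi_{n} \mid \alpha\right)=\sum_{i=1}^{n} \ln P_{1}\left(\xi_{i} \mid \alpha\right)=-n \Gamma_{1}(\alpha)+\alpha \sum_{i=1}^{n} B_{1}\left(\xi_{i}\right)-\sum_{i=1}^{n} \varphi_{1}\left(\xi_{i}\right)$$

so the $n$-th power of a canonical distribution is again canonical with the same parameter $\alpha$. The shorthand $\Gamma_{n}(\alpha)=n \Gamma_{1}(\alpha)$, $B_{n}=\sum_{i} B_{1}\left(\xi_{i}\right)$ and $\varphi_{n}=\sum_{i} \varphi_{1}\left(\xi_{i}\right)$ used to summarize this is an assumption about notation made here for illustration.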

