## Examples of application of general methods for computation of channel capacity

Example 3.1. For simplicity let there be only two symbols initially: $m=2 ; y=1,2$, which correspond to different costs
$$c(1)=b-a, \quad c(2)=b+a \text {. }$$
In this case, the partition function (3.3.11) is equal to
$$Z=e^{-\beta b+\beta a}+e^{-\beta b-\beta a}=2 e^{-\beta b} \cosh \beta a .$$
By formula (3.3.12) the free energy
$$F=b-T \ln 2-T \ln \cosh \frac{a}{T}$$
corresponds to it. Applying formulae (3.3.12), (3.3.13) we find entropy and average energy
$$\begin{aligned} &C=H_{y}(T)=\ln 2+\ln \cosh \frac{a}{T}-\frac{a}{T} \tanh \frac{a}{T} \\ &R(T)=b-a \tanh \frac{a}{T} \end{aligned}$$
The graphs of these functions are presented in Figure 3.1, which also shows how to determine the channel capacity for a given level of costs $R=R_{0}$ in a simple graphical way.
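The graphical construction can also be carried out numerically. The following is a minimal sketch, assuming the closed forms for $H_{y}(T)$ and $R(T)$ given above with $a>0$. The function names (`log_cosh`, `entropy`, `avg_cost`, `capacity_at_cost`), the bisection bounds, and the sample values of $a$, $b$, $R_0$ are illustrative choices, not part of the source. Since $R(T)$ increases monotonically with $T$, bisection finds the temperature at which $R(T)=R_{0}$, and the capacity is the entropy at that temperature.

```python
import math

def log_cosh(x):
    """Overflow-safe ln cosh(x), so that very small T (large a/T) is handled."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)

def entropy(T, a):
    """H_y(T) = ln 2 + ln cosh(a/T) - (a/T) tanh(a/T), in nats."""
    x = a / T
    return math.log(2.0) + log_cosh(x) - x * math.tanh(x)

def avg_cost(T, a, b):
    """R(T) = b - a tanh(a/T)."""
    return b - a * math.tanh(a / T)

def capacity_at_cost(R0, a, b, lo=1e-3, hi=1e6, iters=200):
    """Solve R(T) = R0 by bisection on T, then return (T, H_y(T))."""
    if not (avg_cost(lo, a, b) < R0 < avg_cost(hi, a, b)):
        raise ValueError("R0 must lie strictly between R(lo) and R(hi)")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if avg_cost(mid, a, b) < R0:
            lo = mid   # average cost too low: raise the temperature
        else:
            hi = mid
    T = 0.5 * (lo + hi)
    return T, entropy(T, a)

# Hypothetical sample values: a = 1, b = 2, cost level R0 = 1.5.
T0, C0 = capacity_at_cost(R0=1.5, a=1.0, b=2.0)
print(f"T = {T0:.6f}, C = {C0:.6f} nats")
```

Bisection is used here only because it needs no derivative and cannot leave the bracket; any one-dimensional root finder would serve equally well.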

When the temperature changes from 0 to $\infty$, the entropy changes from 0 to $\ln 2$ and the energy goes from $b-a$ to $b$ (where $a>0$). More generally, as $T \rightarrow \infty$ the entropy $H_{y}$ tends to the limit $\ln m$. This value is the channel capacity in the absence of constraints, and it cannot be exceeded at any average cost.
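These limits can be checked numerically for an arbitrary finite set of costs. The sketch below assumes the optimal distribution is proportional to $e^{-c(y)/T}$, as in the example above; the helper `gibbs_entropy` and the cost values are hypothetical. Note that the entropy tends to 0 as $T \rightarrow 0$ only when the minimum cost is attained by a single symbol, as is the case here.

```python
import math

def gibbs_entropy(costs, T):
    """Entropy (in nats) of the distribution P(y) proportional to exp(-c(y)/T)."""
    c_min = min(costs)  # shift by the minimum cost to avoid underflow in every term
    weights = [math.exp(-(c - c_min) / T) for c in costs]
    Z = sum(weights)
    # Terms whose weight underflowed to zero contribute nothing to the entropy.
    return -sum((w / Z) * math.log(w / Z) for w in weights if w > 0.0)

costs = [1.0, 3.0, 4.5, 7.0]  # hypothetical costs for m = 4 symbols
for T in (0.01, 1.0, 100.0, 1e6):
    print(f"T = {T:>9}: H = {gibbs_entropy(costs, T):.6f}  (ln m = {math.log(len(costs)):.6f})")
```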

The additive parameter $b$ appearing in the cost function (3.4.1) is not essential: neither the entropy nor the probability distribution depends on it. This corresponds to the fact that the additive constant in the expression for energy (recall that $R$ is an analog of energy) can be chosen arbitrarily. In the example in question the optimal probability distribution has the form
$$P(1)=\frac{e^{a / T}}{2 \cosh (a / T)}, \quad P(2)=\frac{e^{-a / T}}{2 \cosh (a / T)}$$
according to (3.3.14). The functions (3.4.2) found above can also be used when the message is a sequence $y^{L}=\left(y_{1}, \ldots, y_{L}\right)$ of length $L$ composed of the symbols described above. If there are 2 distinct elementary symbols, then the number of distinct sequences is evidently $m=2^{L}$. Next we assume that the cost imposed on an entire sequence is the sum of the costs of the symbols that constitute it. Hence,
$$c\left(y^{L}\right)=\sum_{i=1}^{L} c\left(y_{i}\right)$$
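One consequence of additive costs, which the excerpt does not state explicitly, is that the partition function over sequences factorizes into the $L$-th power of the single-symbol partition function, and the optimal sequence distribution becomes a product of the single-symbol probabilities $e^{\pm a / T} / (2 \cosh (a / T))$. The brute-force sketch below checks both for a small $L$. All names and parameter values are hypothetical, and enumerating all $2^{L}$ sequences is feasible only for small $L$.

```python
import itertools
import math

a, b, T, L = 1.0, 2.0, 1.5, 6          # hypothetical sample values
symbol_costs = {1: b - a, 2: b + a}    # c(1) = b - a, c(2) = b + a

def symbol_prob(y):
    """Single-symbol optimal probability e^{+a/T} or e^{-a/T} over 2 cosh(a/T)."""
    sign = 1.0 if y == 1 else -1.0
    return math.exp(sign * a / T) / (2.0 * math.cosh(a / T))

# Enumerate all 2^L sequences and their additive costs.
sequences = list(itertools.product(symbol_costs, repeat=L))
seq_cost = {s: sum(symbol_costs[y] for y in s) for s in sequences}

Z_symbol = sum(math.exp(-c / T) for c in symbol_costs.values())
Z_sequence = sum(math.exp(-c / T) for c in seq_cost.values())

def sequence_prob(s):
    """Optimal probability of a whole sequence, computed directly from its cost."""
    return math.exp(-seq_cost[s] / T) / Z_sequence

# Largest deviation between the direct and the product form of the distribution.
max_gap = max(
    abs(sequence_prob(s) - math.prod(symbol_prob(y) for y in s))
    for s in sequences
)
print("ln Z_sequence   =", math.log(Z_sequence))
print("L * ln Z_symbol =", L * math.log(Z_symbol))
print("max |P(y^L) - prod P(y_i)| =", max_gap)
```

Because $\ln Z$ is additive in $L$, the free energy, entropy, and average cost of the sequence are $L$ times the single-symbol values, which is why the two-symbol functions carry over to sequences.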

## Methods of potentials in the case of a large number of parameters

The method of potentials described in Sections $3.3$ and $3.4$ can be generalized to more difficult cases in which there is a larger number of external parameters. In that setting the method resembles the methods usually used in statistical thermodynamics even more closely.

Here we outline possible ways to carry out this generalization and postpone a more detailed analysis to the next chapter.

Now let the cost function $c(y)=c(y, a)$ depend on a numeric parameter $a$ and be differentiable with respect to it. Then the free energy
$$F(T, a)=-T \ln \sum_{y} e^{-c(y, a) / T}$$
and other functions introduced in Section $3.4$ will be dependent not only on temperature $T$ (or parameter $\beta=1 / T$ ), but also on the value of $a$. The same holds true for the optimal distribution (3.3.14) now having the form
$$P(y \mid a)=\exp \left[\frac{F(T, a)-c(y, a)}{T}\right] .$$
Certainly, formula (3.3.15) and the other results from Section $3.3$ remain valid if we treat parameter $a$ as constant when varying $T$, i.e. if ordinary derivatives are replaced with partial ones.
Hence, entropy of distribution (3.5.2) is equal to
$$H_{y}=-\frac{\partial F(T, a)}{\partial T}$$
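The relation between entropy and the temperature derivative of the free energy can be checked by finite differences. This is a minimal sketch: the cost function `cost(y, a) = a * y`, the five-symbol alphabet, and the step size `h` are hypothetical choices made only for illustration.

```python
import math

Y = range(5)  # hypothetical alphabet y = 0, 1, ..., 4

def cost(y, a):
    """Hypothetical cost function, differentiable in the parameter a."""
    return a * y

def free_energy(T, a):
    """F(T, a) = -T ln sum_y exp(-c(y, a)/T)."""
    return -T * math.log(sum(math.exp(-cost(y, a) / T) for y in Y))

def distribution(T, a):
    """P(y | a) = exp[(F(T, a) - c(y, a)) / T]."""
    F = free_energy(T, a)
    return [math.exp((F - cost(y, a)) / T) for y in Y]

def entropy(T, a):
    """Entropy of P(y | a), computed directly from its definition."""
    return -sum(p * math.log(p) for p in distribution(T, a))

def dF_dT(T, a, h=1e-5):
    """Central-difference estimate of the partial derivative of F in T, with a fixed."""
    return (free_energy(T + h, a) - free_energy(T - h, a)) / (2.0 * h)

T, a = 1.2, 0.8
print("sum of P(y|a):", sum(distribution(T, a)))
print("entropy      :", entropy(T, a))
print("-dF/dT       :", -dF_dT(T, a))
```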
Now, in addition to these results, we can derive a formula for the partial derivatives with respect to the new parameter $a$.
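The excerpt breaks off before stating the formula itself. As a hedged sketch, assuming the sum over $y$ is finite (so that differentiation and summation commute), differentiating the free energy $F(T, a)=-T \ln \sum_{y} e^{-c(y, a) / T}$ at fixed $T$ gives one such relation. The source may present it in a different form or notation.
$$\begin{aligned} \frac{\partial F(T, a)}{\partial a} &=-T \, \frac{\sum_{y}\left(-\frac{1}{T} \frac{\partial c(y, a)}{\partial a}\right) e^{-c(y, a) / T}}{\sum_{y} e^{-c(y, a) / T}} \\ &=\sum_{y} \frac{\partial c(y, a)}{\partial a} P(y \mid a)=\mathbb{E}\left[\frac{\partial c(y, a)}{\partial a}\right] \end{aligned}$$
Here we used $P(y \mid a)=e^{-c(y, a) / T} / \sum_{y'} e^{-c(y', a) / T}$, which is the distribution (3.5.2) written without $F$. In words, the derivative of the free energy with respect to $a$ equals the average of $\partial c / \partial a$ over the optimal distribution.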

## Capacity of a noiseless channel with penalties

1. In Sections $3.2$ and $3.3$ we considered the capacity of a discrete channel without noise but with penalties. As was stated there, computing that capacity reduces to solving the first variational problem of information theory. The results derived there can be generalized to arbitrary random variables, in particular to those that assume continuous values. The formulae for the generalized version of entropy given in Section $1.6$ hint at how to do so.
We say that a noiseless channel (not necessarily discrete) is given if we are given a measurable space $(X, F)$, a random variable $\xi$ taking values in that space, an $F$-measurable function $c(\xi)$ called the cost function, and a measure $v$ on $(X, F)$ (normalization of $v$ is not required). We define the channel capacity $C(a)$ for the level of losses $a$ as the maximum value of the entropy (1.6.9):
$$H_{\xi}=-\int P(d \xi) \ln \frac{P(d \xi)}{v(d \xi)},$$
compatible with the constraint
$$\int c(\xi) P(d \xi)=a .$$
In essence, this variational problem is solved in the same way as in Section 3.3, except that partial derivatives are replaced with variational derivatives. After variational differentiation with respect to $P(d \xi)$ we obtain, instead of (3.3.4), the extremum condition:
$$\ln \frac{P(d \xi)}{v(d \xi)}=\beta F-\beta c(\xi)$$
where $\beta F=-1-\gamma$.
From here we obtain the extremum distribution
$$P(d \xi)=e^{\beta F-\beta c(\xi)} v(d \xi) .$$
Averaging (3.6.3) and taking into account (3.6.1), (3.6.2) we obtain that
$$H_{\xi}=\beta \mathbb{E}[c(\xi)]-\beta F, \quad C=\beta R-\beta F .$$
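As an illustration of the continuous case, consider a concrete channel that is not taken from the source: $X=[0, \infty)$, $v$ the Lebesgue measure, and cost $c(\xi)=\xi$. For this choice the extremum distribution is exponential, with $F=-T \ln T$, $R=T$, and $H_{\xi}=1+\ln T$, so that $C(a)=1+\ln a$. The sketch below checks these values and the relation $H_{\xi}=\beta R-\beta F$ by numerical integration. The midpoint rule, the truncation of the integration range at `upper_factor * T`, and all names are assumptions of this example.

```python
import math

def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def continuous_channel(T, upper_factor=50.0):
    """Return (F, R, H) for cost c(x) = x on [0, inf) with Lebesgue reference measure.

    The infinite range is truncated at upper_factor * T, where the density
    is already negligible.
    """
    beta = 1.0 / T
    hi = upper_factor * T
    Z = integrate(lambda x: math.exp(-beta * x), 0.0, hi)
    F = -T * math.log(Z)

    def density(x):
        # Extremum distribution: P(dx) = exp(beta*F - beta*c(x)) v(dx)
        return math.exp(beta * (F - x))

    R = integrate(lambda x: x * density(x), 0.0, hi)
    # ln(density) is beta*(F - x), so the entropy integrand needs no log call.
    H = -integrate(lambda x: density(x) * beta * (F - x), 0.0, hi)
    return F, R, H

T = 2.0
F, R, H = continuous_channel(T)
print("F =", F, " closed form:", -T * math.log(T))
print("R =", R, " closed form:", T)
print("H =", H, " beta*R - beta*F =", (R - F) / T, " closed form:", 1.0 + math.log(T))
```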

