It is common in information theory to look for single-letter formulas, or to dismiss a result as suboptimal if no single-letter formula is available.
Could someone clarify what a single-letter formula is?
In classical information theory (as initiated by Shannon), the answers to most problems (for example, compressing a source, i.e. source coding, or communicating over a noisy channel, i.e. channel coding) are given in terms of random coding schemes involving block codes.
Say you want to transmit bits (communicate) over a noisy channel. Block coding means that in designing your communication scheme, you don't transmit one bit at a time - instead, you transmit a large block of $n$ bits at once (for classical theoretical results, one takes $n\to\infty$) - this lets you exploit the law of large numbers and the AEP. While this block-coding approach is not terribly practical, it does allow us to derive tight answers for the rates of communication/compression needed for information-theoretic problems.
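To see why large blocks help, here is a minimal numerical sketch of the AEP: for an i.i.d. Bernoulli($p$) source, the normalized log-probability $-\frac{1}{n}\log_2 P(X^n)$ of a random block concentrates around the entropy $H(p)$ as $n$ grows. (The function names and the choice $p=0.3$ are just for illustration.)

```python
import math
import random

def entropy(p):
    # binary entropy H(p) in bits
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def normalized_neg_log_prob(n, p, rng):
    # Draw an i.i.d. Bernoulli(p) block of length n and return
    # -(1/n) log2 P(x^n), which the AEP says is close to H(p) for large n.
    ones = sum(rng.random() < p for _ in range(n))
    logp = ones * math.log2(p) + (n - ones) * math.log2(1 - p)
    return -logp / n

p = 0.3
rng = random.Random(0)
for n in (10, 1000, 100000):
    print(n, round(normalized_neg_log_prob(n, p, rng), 4), round(entropy(p), 4))
```

For small $n$ the normalized log-probability fluctuates noticeably; by $n=100000$ it sits very close to $H(0.3)\approx 0.881$ bits, which is exactly the concentration that block-coding arguments rely on.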
Now, since the approach was block coding, the first answer one comes up with may look like this: to communicate reliably over a noisy channel $P_{Y|X}$, it is sufficient to use a rate $R<\frac{1}{n}I(X^n;Y^n)$, where $I(\cdot;\cdot)$ is the mutual information and $X^n=(X_1,\ldots,X_n)$ and $Y^n=(Y_1,\ldots,Y_n)$ are the transmitted block and received block, respectively. The above formula is not a single-letter formula. The single-letter formula for channel coding is $R<\max_{P_X}I(X;Y)$, which does not reference any blocks - you only need the joint distribution of a single pair $(X,Y)$ to compute it. This is usually desirable because it makes it easy to check how far the practical schemes that we come up with for compression/communication are from optimal performance.
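As a concrete instance of evaluating the single-letter formula, here is a sketch (function names are illustrative) for the binary symmetric channel with crossover probability $\varepsilon$, where the maximization over input distributions can be done by a simple grid search and the known closed form is $C=1-H(\varepsilon)$:

```python
import math

def h2(p):
    # binary entropy in bits
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mutual_information_bsc(q, eps):
    # I(X;Y) for input X ~ Bernoulli(q) over a BSC with crossover eps:
    # I(X;Y) = H(Y) - H(Y|X) = H(q(1-eps) + (1-q)eps) - H(eps)
    py1 = q * (1 - eps) + (1 - q) * eps
    return h2(py1) - h2(eps)

eps = 0.1
# Single-letter formula: C = max over input distributions P_X of I(X;Y).
capacity = max(mutual_information_bsc(q / 1000, eps) for q in range(1001))
print(capacity, 1 - h2(eps))  # both ≈ 0.531 bits per channel use
```

The maximum is attained at the uniform input $q=1/2$, recovering $1-H(0.1)\approx 0.531$ bits per channel use. Note that the whole computation only needed the single-letter joint distribution of $(X,Y)$, never a block length $n$.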