Florent's course.
# Information Theory
## Coding theory
When we store or transmit data, no system is perfect: some bits of information are incorrectly stored, retrieved, or transmitted.
The purpose of this field is to come up with coding and decoding methods that allow us to detect and correct errors with high probability.
We shall provide an introduction with simple codes.
## Topics covered on the board
* Binary symmetric channel
* Coding and decoding one bit to obtain an arbitrarily small error probability
* Example for an error probability of $1/6$: repeating each bit 3 times, then 5 times (a worked computation follows this list)
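To make the last bullet concrete, here is the standard computation (a textbook calculation, not reproduced from the board) of the probability that majority decoding of the 3-repetition code fails when each transmitted bit is flipped with probability $p=1/6$:

$$P(\text{decoding error}) = \binom{3}{2}p^2(1-p) + \binom{3}{3}p^3 = 3\cdot\frac{1}{36}\cdot\frac{5}{6} + \frac{1}{216} = \frac{16}{216} = \frac{2}{27} \approx 0.074,$$

already well below the raw error rate $1/6 \approx 0.167$; repeating 5 times and taking a majority vote lowers it further.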
This is essentially a practical version of Shannon's noisy-channel coding theorem.
[Details here](https://en.wikipedia.org/wiki/Binary_symmetric_channel)
So, in a nutshell, this repetition trick transforms a binary symmetric channel into a binary symmetric channel with an arbitrarily low error rate.
It is not very practical, however, because the reliability is achieved at a very high cost: the amount of transmitted data is large compared with the actual information we wish to send. We shall therefore look for cheaper alternatives.
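As a minimal sketch of the repetition trick (not part of the course material; all names and the simulation size are illustrative), the following Python snippet simulates a binary symmetric channel with crossover probability $p=1/6$ and decodes the 3-repetition code by majority vote:

```python
import random

def bsc(bits, p):
    """Binary symmetric channel: flip each bit independently with probability p."""
    return [b ^ (random.random() < p) for b in bits]

def encode_repeat(bits, n=3):
    """Repeat each information bit n times."""
    return [b for b in bits for _ in range(n)]

def decode_majority(bits, n=3):
    """Decode each block of n channel bits by majority vote."""
    return [int(sum(bits[i:i + n]) > n // 2) for i in range(0, len(bits), n)]

random.seed(0)
p = 1 / 6
message = [random.randint(0, 1) for _ in range(100_000)]
decoded = decode_majority(bsc(encode_repeat(message), p))
errors = sum(m != d for m, d in zip(message, decoded))
# Theory predicts 3*p**2*(1-p) + p**3 = 2/27 ≈ 0.074, well below p ≈ 0.167.
print(f"empirical bit error rate: {errors / len(message):.4f}")
```

Note the cost: we send 3 channel bits per information bit, which is exactly the inefficiency mentioned above.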
## Detection
If we are ready to forget about correction and concentrate on detection there is a very simple trick.
We transmit some bits of information $b_1\ldots b_n$ together with *one additional bit*
$c$ that is computed from these bits by a very simple method, namely a certain function $f$ of $n$ arguments such that $f(b_1,\ldots,b_n)=c$.
Upon reception of a word $b'_1\ldots b'_n c'$ we check whether $f(b'_1,\ldots,b'_n)=c'$. If it does, we assume that there is no error (we might be wrong here); if it does not, we assume that there is an error and ask for retransmission of this message (we are correct here).
This is used for low-level transmission of information, in particular for ASCII characters (since we tend to use powers of 2 when transmitting and storing information, one spare bit is available alongside the 7 bits of the ASCII encoding).
[Details here](https://en.wikipedia.org/wiki/Parity_bit)
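As a minimal sketch of this detection scheme, assuming the common choice $f(b_1,\ldots,b_n)=b_1+\cdots+b_n \bmod 2$ (an even parity bit; the function names are illustrative):

```python
def parity(bits):
    """Even-parity bit: the sum of the information bits modulo 2."""
    return sum(bits) % 2

def encode(bits):
    """Append the parity bit c = f(b_1, ..., b_n) to the information bits."""
    return bits + [parity(bits)]

def check(word):
    """Return True when the received word passes the parity check."""
    *info, c = word
    return parity(info) == c

codeword = encode([1, 0, 1, 1, 0, 0, 1])  # 7 ASCII-style bits + 1 parity bit
assert check(codeword)                    # no error: the check passes
codeword[3] ^= 1                          # flip a single bit in transit
assert not check(codeword)                # one error is always detected
```

Note that two flipped bits cancel out in the sum modulo 2, which is why this code detects 1 error but not 2.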
## Correction
We shall consider a last example, which allows us to detect and correct 1 error; the algorithm to do this is quite simple.
Assume that we have 9 bits of information. We write these bits in the form of a $3\times 3$ matrix. We add 7 redundant bits: one at the end of each line, one at the bottom of each column, and one in the bottom-right corner. Each redundant bit is the sum modulo 2 of the corresponding line/column.
With this scheme we can detect 2 errors, but not correct them, as there might be up to three codewords that are nearest to a received message with 2 errors.
We can always correct 1 error by recomputing the redundant data and comparing it with the received data.
In particular, if the error is within the information part, the bit to correct lies at the intersection of the line and column whose redundant bits differ from the recomputed ones.
Below we informally call this the matrix code.
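Here is a minimal Python sketch of the matrix code just described (assuming, as in the text, a single error falling in the information part; all names are illustrative):

```python
def encode(info):
    """Encode a 3x3 bit matrix into a 4x4 matrix with line, column and corner parities."""
    rows = [line + [sum(line) % 2] for line in info]
    cols = [sum(line[j] for line in info) % 2 for j in range(3)]
    corner = sum(cols) % 2  # parity of all 9 information bits
    return rows + [cols + [corner]]

def correct(word):
    """Recompute the parities of a received 4x4 matrix and fix a single information-bit error."""
    bad_rows = [i for i in range(3) if sum(word[i][:3]) % 2 != word[i][3]]
    bad_cols = [j for j in range(3) if sum(word[i][j] for i in range(3)) % 2 != word[3][j]]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        # The erroneous bit sits at the intersection of the failing line and column.
        word[bad_rows[0]][bad_cols[0]] ^= 1
    return [line[:3] for line in word[:3]]

message = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
received = encode(message)
received[1][2] ^= 1                  # flip one information bit in transit
assert correct(received) == message  # the single error is located and fixed
```

An error in a redundant bit itself shows up as a single failing line or column parity with no matching partner, so the decoder above recognizes it and leaves the information bits untouched.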
## Hamming distance, minimal distance of a code.
Drawing on the board.
Example with the 3-times repetition code.
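Since the board drawing is not reproduced here, a small sketch of the two notions (illustrative names):

```python
from itertools import combinations

def hamming(u, v):
    """Number of positions in which two words of equal length differ."""
    return sum(a != b for a, b in zip(u, v))

def minimal_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    return min(hamming(u, v) for u, v in combinations(code, 2))

assert hamming("010", "011") == 1
# The 3-repetition code has codewords 000 and 111, hence minimal distance 3.
assert minimal_distance(["000", "111"]) == 3
```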
## Comparing codes.
Examples with:
* Repetition code (1 bit of information, 2 redundant bits): detects 2 errors, corrects 1.
* Parity code (7 bits of information, 1 parity bit): detects 1 error, corrects none.
* Matrix code (9 bits of information, 7 redundant bits): detects 2 errors, corrects 1.
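These figures follow from the standard relation between the minimal distance $d$ of a code and its power (a textbook fact, stated here for completeness): a code of minimal distance $d$ can detect up to $d-1$ errors and correct up to $\lfloor (d-1)/2 \rfloor$ errors. The repetition code above has $d=3$ and the parity code $d=2$. For the matrix code, flipping one information bit also flips its line parity, its column parity, and the corner bit, so $d=4$; this is what lets it correct 1 error while still detecting 2.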
## Conclusion.
Very quick discussion of linear codes.
Importance of encoding, detection, and correction schemes that are efficient.