Florent's course.

# Information Theory

## A quick introduction to regular expressions

Regular expressions (regexps for short) provide an effective tool for defining languages.
The correspondence with finite automata means that a regular expression can be compiled efficiently into an automaton that recognises the corresponding language.

On Linux, this is precisely what the grep command does.

We will explain how an automaton can be generated from a regexp, and see how to use the grep command to solve riddles.

### Ingredients of classical regexp

```
the letters of the alphabet

+ means or
    used as L1 + L2 where L1 and L2 are two languages
    means a word of L1 or a word of L2
    denotes the union of the languages

. means concatenation
    used as L1.L2
    means a word of L1 followed by a word of L2

* means repetition 0 or more times
    used as L*
    means the empty word epsilon (0 repetitions) or one or more words of L one after another
    equivalent to epsilon + L + L.L + L.L.L + L.L.L.L + ...
```
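
The three operators can be tried out directly with grep. The following is a minimal sketch, assuming the extended syntax selected by grep -E, in which the union is written `|` instead of `+`, concatenation is written by juxtaposition instead of `.`, and `*` keeps its meaning. The helper name `matches` and the sample words are made up for this illustration; the option -x asks for the whole line to match, so that each line is tested as one word.

```shell
#!/bin/sh
# matches REGEXP WORD...: print the words that belong to the language of REGEXP.
# Hypothetical helper for this illustration only.
#   -E selects extended regular expressions
#   -x requires the whole line (one word per line) to match
#   -e introduces the pattern
matches() {
    regexp=$1
    shift
    printf '%s\n' "$@" | grep -E -x -e "$regexp"
}

# Union: a + b is written a|b
matches 'a|b' a b c ab            # prints a and b

# Concatenation: (a + b).c is written (a|b)c
matches '(a|b)c' ac bc c abc      # prints ac and bc

# Star: (a.b)* is written (ab)*; the empty word is the empty line
matches '(ab)*' '' ab abab aba    # prints an empty line, ab and abab
```

Note that, in the grep syntax, `.` and `+` are still available but mean something else: `.` matches any single character and `+` means one or more repetitions.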

### Construction: from regexp to automaton

We first allow automata with transitions labelled with epsilon.
Then we show how to do without them.

Details on the board.

Note that JFLAP offers an activity for this construction.

There is also an inverse transformation, from automaton to regexp, also available in JFLAP.
This shows that the languages defined by a regexp and the languages recognized by a finite automaton form the same class of languages, commonly known as the regular languages.

### grep

We shall in fact use the extended regular expressions of grep.
Use the command grep -E (or its older equivalent egrep).

See the grep manual (man grep) for the syntax.
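
To give a flavour of the extended syntax before reading the manual, here is a small sketch. The word list is made up for the illustration, and only a few common constructs are shown (anchors, any-character, bounded repetition, character sets, union, and the -v and -c options).

```shell
#!/bin/sh
# A small made-up word list, one word per line (illustrative data only).
words=$(printf '%s\n' apple angle crane bread cat robot level)

# ^ and $ anchor the match to the start and end of the line;
# . matches any single character, so this keeps the five-letter words.
printf '%s\n' "$words" | grep -E '^.....$'

# {n} repeats the previous item n times: same selection as above.
printf '%s\n' "$words" | grep -E '^.{5}$'

# [aeiou] matches one character of the set: words starting with a vowel.
printf '%s\n' "$words" | grep -E '^[aeiou]'

# | is the union and + means "one or more": words ending in e or in t.
printf '%s\n' "$words" | grep -E '^[a-z]+(e|t)$'

# -v inverts the selection and -c counts the selected lines:
# number of words that do not contain the letter a.
printf '%s\n' "$words" | grep -c -v 'a'
```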

### Some additional commands

* tr to replace one character with another
* grep to search for a regular expression line by line
* wc to count lines, words or characters
* sort to sort the lines of a file
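
These commands are usually chained with pipes. The sketch below shows one possible combination on a made-up sentence: tr puts one word per line and lowercases it, grep filters, sort orders the lines and wc -l counts them. The helper name `one_word_per_line` is hypothetical.

```shell
#!/bin/sh
# Made-up sample text (illustrative data only).
text='The crane and the heron. The CRANE flew away'

# one_word_per_line: hypothetical helper for this sketch.
# tr -cs replaces every run of non-letters by a single newline,
# then a second tr converts upper case to lower case.
one_word_per_line() {
    tr -cs 'A-Za-z' '\n' | tr 'A-Z' 'a-z'
}

# All distinct words in alphabetical order (sort -u removes duplicates).
printf '%s\n' "$text" | one_word_per_line | sort -u

# Number of five-letter words, repetitions included (wc -l counts lines).
printf '%s\n' "$text" | one_word_per_line | grep -E '^.{5}$' | wc -l
```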

### Exercise

Wordle.

Demo.
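
One possible way to approach the Wordle exercise with grep is sketched below; it is an illustration under stated assumptions, not necessarily the solution shown in the demo. The candidate list and the clues are made up: a is green in position 3, r is yellow (present, but not in position 2), and s, t and e are grey (absent). The function name `wordle_filter` is hypothetical.

```shell
#!/bin/sh
# Made-up candidate list (illustrative data only); in practice the words
# could come from a dictionary file with one word per line.
candidates=$(printf '%s\n' crane brain chair roast apple hoard train)

# Hypothetical clues after one guess:
#   green : a is the third letter
#   yellow: r occurs in the word, but not as the second letter
#   grey  : s, t and e do not occur
# Each grep keeps only the words compatible with one clue.
wordle_filter() {
    grep -E '^..a..$' |      # green: five letters, a in position 3
    grep -E 'r' |            # yellow: r is present...
    grep -E -v '^.r' |       # ...but not in position 2
    grep -E -v '[ste]'       # grey: none of s, t, e
}

printf '%s\n' "$candidates" | wordle_filter    # prints chair and hoard
```

With a real word list, the same filter could be followed by wc -l to count how many candidates remain after each guess.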