Florent's course.

# Information Theory
## A quick introduction to regular expressions.

Regular expressions (regexps for short) provide an effective tool for defining languages.
Their correspondence with finite automata means that a regular expression can be compiled efficiently into an automaton that recognises the corresponding language.

On Linux, this is precisely what the grep command does.

We will explain how an automaton can be generated from a regexp and see how to use the grep command to solve riddles.

### Ingredients of classical regexp.

```
the letters of the alphabet

+ means or
used as L1 + L2 where L1 and L2 are two languages
means a word of L1 or a word of L2.
denotes the union of the languages

. means concatenation
used as L1.L2
means a word of L1 followed by a word of L2

* means repetition 0 or more times
used as L*
means the empty word epsilon (0 repetitions) or one or more words of L one after another
equivalent to epsilon + L + L.L + L.L.L + L.L.L.L + ...
```

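
To make the three operations concrete, here is a minimal Python sketch that treats a language as a finite set of words. The function names `union`, `concat` and `star`, and the bound `max_reps`, are assumptions of this sketch: L* is infinite as soon as L contains a non-empty word, so only a truncated version can be listed.

```python
def union(l1, l2):
    """L1 + L2: a word of L1 or a word of L2."""
    return l1 | l2


def concat(l1, l2):
    """L1.L2: a word of L1 followed by a word of L2."""
    return {u + v for u in l1 for v in l2}


def star(l, max_reps):
    """Truncated L*: epsilon + L + L.L + ... with at most max_reps factors."""
    result = {""}          # the empty word epsilon (0 repetitions)
    layer = {""}
    for _ in range(max_reps):
        layer = concat(layer, l)   # one more word of L at the end
        result |= layer
    return result


L1 = {"a", "ab"}
L2 = {"b"}

print(sorted(union(L1, L2)))    # ['a', 'ab', 'b']
print(sorted(concat(L1, L2)))   # ['ab', 'abb']
print(sorted(star({"a"}, 3)))   # ['', 'a', 'aa', 'aaa']
```
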
### Construction: from regexp to automaton

We first allow automata whose transitions may be labelled with epsilon.
Then we show how to do without them.

Details on the board.

Note that JFLAP offers an activity for this construction.

There is also an inverse transformation from automaton to regexp, also available in JFLAP.
This shows that the languages defined by a regexp and the languages recognised by a finite automaton form the same class of languages, commonly known as regular languages.

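
As an illustration of how epsilon-transitions can be removed, here is one possible sketch in Python; it is not necessarily the method shown on the board. The representation is an assumption of this sketch: the automaton is a dictionary mapping `(state, label)` to a set of states, and the empty string `EPS` stands for epsilon. Each state inherits the letter transitions of every state it can reach through epsilon-transitions, and becomes final if it can reach a final state that way.

```python
EPS = ""  # label standing for epsilon (a convention of this sketch)


def epsilon_closure(state, delta):
    """All states reachable from `state` using only epsilon-transitions."""
    closure, stack = {state}, [state]
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def remove_epsilon(states, delta, finals):
    """Return (new_delta, new_finals) of an equivalent automaton without epsilon."""
    new_delta, new_finals = {}, set()
    for q in states:
        closure = epsilon_closure(q, delta)
        if closure & finals:
            new_finals.add(q)
        for (src, label), targets in delta.items():
            if src in closure and label != EPS:
                new_delta.setdefault((q, label), set()).update(targets)
    return new_delta, new_finals


def accepts(word, start, delta, finals):
    """Run an automaton without epsilon-transitions on `word`."""
    current = {start}
    for letter in word:
        current = {r for q in current for r in delta.get((q, letter), set())}
    return bool(current & finals)


# Hypothetical automaton for the single word "ab": 0 -a-> 1 -eps-> 2 -b-> 3
DELTA = {(0, "a"): {1}, (1, EPS): {2}, (2, "b"): {3}}
NEW_DELTA, NEW_FINALS = remove_epsilon({0, 1, 2, 3}, DELTA, {3})
print(accepts("ab", 0, NEW_DELTA, NEW_FINALS))  # True
print(accepts("a", 0, NEW_DELTA, NEW_FINALS))   # False
```

The start state is unchanged; states that were only reachable through epsilon-transitions may become useless but do no harm.
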
#### The ingredients.

Constructions that need to be detailed.
1. how to build an automaton for the language a
1. how to build an automaton for L1.L2, given an automaton A1 for L1 and an automaton A2 for L2
1. how to build an automaton for L1+L2, given an automaton A1 for L1 and an automaton A2 for L2
1. how to build an automaton for L*, given an automaton A for L
1. how to describe unambiguously the order of operations in a regular expression
(write the expression as a tree)
1. how to combine all these ideas.

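
The steps listed above can be sketched in Python as follows. This is one standard way to do it (a Thompson-style construction), not necessarily the exact construction given in the course. The tree encoding with the tags `"sym"`, `"cat"`, `"alt"` and `"star"`, the function names, and the use of the empty string for epsilon are all assumptions of this sketch.

```python
from itertools import count

EPS = ""  # label standing for epsilon (a convention of this sketch)


def build(tree, delta, fresh):
    """Add an epsilon-automaton for `tree` to `delta`; return (start, final)."""
    def link(src, label, dst):
        delta.setdefault((src, label), set()).add(dst)

    kind = tree[0]
    if kind == "sym":                       # step 1: a single letter
        s, f = next(fresh), next(fresh)
        link(s, tree[1], f)
        return s, f
    if kind == "cat":                       # step 2: L1.L2
        s1, f1 = build(tree[1], delta, fresh)
        s2, f2 = build(tree[2], delta, fresh)
        link(f1, EPS, s2)
        return s1, f2
    if kind == "alt":                       # step 3: L1+L2
        s1, f1 = build(tree[1], delta, fresh)
        s2, f2 = build(tree[2], delta, fresh)
        s, f = next(fresh), next(fresh)
        link(s, EPS, s1)
        link(s, EPS, s2)
        link(f1, EPS, f)
        link(f2, EPS, f)
        return s, f
    if kind == "star":                      # step 4: L*
        s1, f1 = build(tree[1], delta, fresh)
        s, f = next(fresh), next(fresh)
        link(s, EPS, s1)    # enter the loop
        link(s, EPS, f)     # or skip it (0 repetitions)
        link(f1, EPS, s1)   # repeat
        link(f1, EPS, f)    # or leave
        return s, f
    raise ValueError(f"unknown node: {kind!r}")


def closure(states, delta):
    """States reachable from `states` using only epsilon-transitions."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(tree, word):
    """Step 6: build the automaton from the tree, then run it on `word`."""
    delta = {}
    start, final = build(tree, delta, count())
    current = closure({start}, delta)
    for letter in word:
        moved = {r for q in current for r in delta.get((q, letter), ())}
        current = closure(moved, delta)
    return final in current


# Step 5: the tree of (a+b)*.a, the words over {a, b} that end with a
TREE = ("cat", ("star", ("alt", ("sym", "a"), ("sym", "b"))), ("sym", "a"))
print(accepts(TREE, "abba"))  # True
print(accepts(TREE, "ab"))    # False
```

Writing the expression as a tree removes any ambiguity about the order of operations, and the recursion in `build` simply follows the shape of that tree.
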
### grep

We shall in fact use the extended regular expressions of grep.
Use the command egrep or grep -E.

See the manual of grep for the syntax.

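
The extended syntax differs from the classical notation used above: union is written | instead of +, concatenation is written by simply juxtaposing the two expressions, and . matches any single character. The sketch below imitates the line-by-line behaviour of grep in Python. Note that Python's `re` module only stands in for grep here; its syntax agrees with extended regular expressions on this small example but differs in some details. The function name `egrep_like` and the word list are assumptions of this sketch.

```python
import re


def egrep_like(pattern, lines):
    """Keep the lines that contain a match for `pattern`."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


WORDS = ["abba", "baab", "bbb", "a"]

# The classical expression (a+b)*.a becomes (a|b)*a in the extended syntax.
# The anchors ^ and $ force the whole line to match, not just a part of it.
print(egrep_like(r"^(a|b)*a$", WORDS))  # ['abba', 'a']
```
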
### Some additional commands

* tr to replace one character with another
* grep to search for a regular expression line by line
* wc to count words (or lines, or characters)
* sort to sort the lines of a file

### Exercise

Wordle.

Demo.
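
As a rough idea of how regular expressions help with Wordle, here is a small Python sketch that narrows down a word list by chaining filters, much as one would chain grep commands with pipes. The word list, the guess and its feedback are hypothetical, and the helper `keep` is an assumption of this sketch; it is not the demo from the course.

```python
import re


def keep(words, pattern, invert=False):
    """Filter like grep -E (or like grep -v -E when invert is True)."""
    regex = re.compile(pattern)
    return [w for w in words if bool(regex.search(w)) != invert]


# Hypothetical word list, chosen only for illustration.
WORDS = ["crane", "stone", "heart", "grail", "sharp", "quart", "flair"]

# Hypothetical feedback for the guess "crane":
# 'a' is green, 'r' is yellow, and 'c', 'n', 'e' are grey.
candidates = keep(WORDS, r"^..a..$")                   # 'a' is the third letter
candidates = keep(candidates, r"r")                    # the word contains 'r'
candidates = keep(candidates, r"^.r", invert=True)     # but not in second position
candidates = keep(candidates, r"[cne]", invert=True)   # no 'c', 'n' or 'e'
print(candidates)  # ['sharp', 'quart', 'flair']
```
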