The first,
and most familiar,
is the union operator,
which we shall denote |.
The rule for the union is that if
R and S are two regular expressions,
then
R | S
denotes the union of the languages that R and S denote.
That is,
L(R|S) = L(R) ∪ L(S)
Examples:
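As an illustrative sketch only, the union rule can be tried out on small finite languages by modelling each language as a Python set of strings. The function name `union` and the sample languages are assumptions made for this sketch; they are not taken from the original notes.

```python
# Finite languages modelled as Python sets of strings. This is an
# assumption for illustration: languages in general may be infinite.
def union(lang_r, lang_s):
    """Return L(R|S): every string that is in L(R) or in L(S)."""
    return lang_r | lang_s


L_R = {"a", "ab"}   # hypothetical L(R)
L_S = {"ab", "c"}   # hypothetical L(S)

# "ab" belongs to both languages but appears only once in the union.
print(sorted(union(L_R, L_S)))  # ['a', 'ab', 'c']
```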
Concatenation
The second operator for the algebra of regular expressions
is called concatenation.
It is represented by no operator symbol at all,
just as multiplication is sometimes written without an operator symbol.
Like union,
concatenation is an infix operator.
The rule for the concatenation is that if
R and S are two regular expressions,
then
RS
denotes the concatenation of the languages L(R) and L(S). L(RS), the language denoted by RS, is formed from the languages L(R) and L(S) as follows. For each string r in L(R) and each string s in L(S), the string rs, the concatenation of r and s, is in L(RS).
Examples:
Let R be the regular expression c and let S be the regular expression a. Then RS is the expression ca. To form L(RS), we need to take every string in L(R) and concatenate it with every string in L(S).
Thus L(RS) = { ca }.
Let T be the regular expression t. Then (RS)T is the expression (ca)t = cat. To form L(RST), we need to take every string in L(R), concatenate it with every string in L(S), and concatenate the result with every string in L(T).
Thus L(RST) = { cat }.
Let R be the regular expression
a | (ab).
The language L(R) is the union of L(a) and L(ab),
that is { a, ab }.
Let S be the regular expression
c | (bc).
Similarly L(S) = { c, bc }.
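Then RS = (a | (ab))(c | (bc)), and L(RS) = { ac, abc, abbc }.
The string abc arises in two ways (a followed by bc, and ab followed by c), but it appears only once in the language.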
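The concatenation rule can be checked mechanically for small finite languages. The following is a minimal sketch, assuming languages are modelled as Python sets of strings; the function name `concat` is hypothetical and introduced only for this illustration.

```python
def concat(lang_r, lang_s):
    """Return L(RS): each string of L(R) followed by each string of L(S)."""
    return {r + s for r in lang_r for s in lang_s}


# R = c, S = a, T = t, as in the first example above.
print(concat(concat({"c"}, {"a"}), {"t"}))  # {'cat'}

# R = a | (ab) and S = c | (bc), as in the last example above.
L_R = {"a", "ab"}
L_S = {"c", "bc"}

# "abc" is produced twice (a + bc and ab + c); the set keeps one copy.
print(sorted(concat(L_R, L_S)))  # ['abbc', 'abc', 'ac']
```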
Closure
The third operator is called closure. It is a unary postfix operator. Closure is denoted by a star, so R* is the closure of the regular expression R. The effect of the closure operator is to say, ``zero or more occurrences of strings in L(R)''. So L(R*) consists of the empty string together with every string formed by concatenating one or more strings from L(R).
We can informally write, with ε denoting the empty string:
R* = ε | R | RR | RRR | RRRR | ...
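Because L(R*) is infinite whenever L(R) contains a non-empty string, it cannot be listed in full, but a bounded portion of it can. The sketch below, assuming languages are modelled as Python sets of strings, collects the strings built from at most `max_reps` strings of L(R). The function name `closure_up_to` and the bound `max_reps` are hypothetical and exist only for this illustration.

```python
def closure_up_to(lang_r, max_reps):
    """Strings of L(R*) built from at most max_reps strings of L(R)."""
    result = {""}   # zero occurrences contributes the empty string
    layer = {""}    # strings built from exactly k strings of L(R)
    for _ in range(max_reps):
        # Append one more string of L(R) to every string in the layer.
        layer = {w + r for w in layer for r in lang_r}
        result |= layer
    return result


print(sorted(closure_up_to({"ab"}, 3)))  # ['', 'ab', 'abab', 'ababab']
```

Raising `max_reps` corresponds to taking more terms of the informal expansion above.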
Last modified: 27/July/98 (12:14)