
I'm taking a computer science class and right now we need to make a regular expression for strings consisting of the letters a, b and c with a maximum of 1 b and a maximum of 4 c's.

The only operators we can use are:

ab = a and b
a|b = a or b
a+ = at least 1 a
a* = any number of a's
ϵ = empty

I came up with an expression that seems to work, but we are supposed to make the expression as short as possible and of course this is extremely long...

((b | ϵ)a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*) | 
(a*(b | ϵ)(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*) | 
(a*(c | ϵ)(b | ϵ)a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*) | 
(a*(c | ϵ)a*(b | ϵ)(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*) | 
(a*(c | ϵ)a*(c | ϵ)(b | ϵ)a*(c | ϵ)a*(c | ϵ)a*) | 
(a*(c | ϵ)a*(c | ϵ)a*(b | ϵ)(c | ϵ)a*(c | ϵ)a*) | 
(a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)(b | ϵ)a*(c | ϵ)a*) | 
(a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*(b | ϵ)(c | ϵ)a*) | 
(a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)(b | ϵ)a*) | 
(a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*(c | ϵ)a*(b | ϵ))

Any tips on how to simplify this would be appreciated.

1 Answer

When you write $ab = a \text{ and } b$, I think you mean that $ab$ means $a$ followed by $b$.

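Your answer is along the right lines, but you haven't catered for strings like $abacccc$, where the $b$ has an $a$ on either side and is then followed by $4$ $c$s. Only your first alternative allows an $a$ between the $b$ and four later $c$s, and that alternative does not allow an $a$ before the $b$.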
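If you want to check this mechanically, here is a minimal sketch. It assumes Python's `re` module as a stand-in for the notation from your class, with `(x | ϵ)` written as `x?`. The names `alternatives`, `asker_pattern` and `asker_accepts` are my own, and the ten strings are my transcription of the ten lines in the question.

```python
import re

# Transcription of the ten alternatives from the question.
# Assumption: (x | ϵ) is written as x? in Python's regex syntax.
alternatives = [
    "b?a*c?a*c?a*c?a*c?a*",
    "a*b?c?a*c?a*c?a*c?a*",
    "a*c?b?a*c?a*c?a*c?a*",
    "a*c?a*b?c?a*c?a*c?a*",
    "a*c?a*c?b?a*c?a*c?a*",
    "a*c?a*c?a*b?c?a*c?a*",
    "a*c?a*c?a*c?b?a*c?a*",
    "a*c?a*c?a*c?a*b?c?a*",
    "a*c?a*c?a*c?a*c?b?a*",
    "a*c?a*c?a*c?a*c?a*b?",
]

# Join the alternatives with | and wrap each in a non-capturing group.
asker_pattern = re.compile("|".join(f"(?:{alt})" for alt in alternatives))


def asker_accepts(s: str) -> bool:
    """Return True if the whole string matches the expression from the question."""
    return asker_pattern.fullmatch(s) is not None


print(asker_accepts("bacccc"))   # True: the first alternative handles a leading b
print(asker_accepts("abacccc"))  # False: a valid string that is not matched
```

`fullmatch` is used rather than `match` so that the whole string has to be consumed, which is what a regular expression means in the theoretical sense.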

One way to approach the question is to think first about what would happen if you deleted all the $a$s from the strings you want to specify. Let me abbreviate the notation by writing $x^?$ for $x \mid \epsilon$, so $x^?$ is an optional $x$. After deleting the $a$s, the resulting possibilities for the $b$s and $c$s are described by the following regular expression, which describes any string of $b$s and $c$s with at most $1$ $b$ and at most $4$ $c$s.

$$ \begin{array}{cr} (b^?c^?c^?c^?c^?) &\mid \\ (c^?b^?c^?c^?c^?) &\mid \\ (c^?c^?b^?c^?c^?) &\mid \\ (c^?c^?c^?b^?c^?) &\mid \\ (c^?c^?c^?c^?b^?) \end{array} $$
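As a sanity check on this intermediate step, the sketch below compares the five alternatives against a direct count of $b$s and $c$s for every string over $\{b, c\}$ up to a small length. Python's `re` is assumed as a stand-in for the notation (with $x^?$ written `x?`), and `bc_pattern`, `within_limits` and `mismatches` are names I made up for illustration.

```python
import re
from itertools import product

# The five alternatives above, with x^? written as x? (Python syntax assumed).
bc_alternatives = [
    "b?c?c?c?c?",
    "c?b?c?c?c?",
    "c?c?b?c?c?",
    "c?c?c?b?c?",
    "c?c?c?c?b?",
]
bc_pattern = re.compile("|".join(f"(?:{alt})" for alt in bc_alternatives))


def within_limits(s: str) -> bool:
    """The specification stated directly: at most one b and at most four c's."""
    return s.count("b") <= 1 and s.count("c") <= 4


def mismatches(max_len: int = 7) -> list[str]:
    """List every string over {b, c} up to max_len where regex and spec disagree."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("bc", repeat=n):
            s = "".join(chars)
            if (bc_pattern.fullmatch(s) is not None) != within_limits(s):
                bad.append(s)
    return bad


print(mismatches())  # an empty list means the two descriptions agree up to max_len
```

A brute-force comparison like this is not a proof, but since no valid string here is longer than $5$ letters, checking up to length $7$ covers every valid string and a good number of invalid ones.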

However, we want to allow arbitrary strings of $a$s to be interleaved anywhere in the string. This can be specified by the following regular expression (where I've just put $a^*$ in every possible slot for a string of $a$s, including the beginning and the end).

$$ \begin{array}{cr} a^*\\ ((b^?a^*c^?a^*c^?a^*c^?a^*c^?) &\mid \\ (c^?a^*b^?a^*c^?a^*c^?a^*c^?) & \mid \\ (c^?a^*c^?a^*b^?a^*c^?a^*c^?) & \mid \\ (c^?a^*c^?a^*c^?a^*b^?a^*c^?) & \mid \\ (c^?a^*c^?a^*c^?a^*c^?a^*b^?)) \\ a^* \end{array} $$
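The construction can also be written as a short program, which makes the "slot" idea explicit: choose where the optional $b$ goes among the four optional $c$s, then join the slots with $a^*$. This is only a sketch under the assumption that Python's `re` syntax stands in for the class notation. `build_pattern`, `within_limits` and `mismatches` are hypothetical helper names, not anything from the question.

```python
import re
from itertools import product


def build_pattern(max_c: int = 4) -> re.Pattern:
    """Build a* ( ... | ... ) a* with one alternative per position of the optional b."""
    alternatives = []
    for b_pos in range(max_c + 1):
        slots = ["c?"] * max_c       # four optional c's
        slots.insert(b_pos, "b?")    # one optional b at the chosen position
        alternatives.append("a*".join(slots))  # a* between neighbouring slots
    return re.compile("a*(?:" + "|".join(alternatives) + ")a*")


def within_limits(s: str) -> bool:
    """The specification stated directly: at most one b and at most four c's."""
    return s.count("b") <= 1 and s.count("c") <= 4


def mismatches(max_len: int = 7) -> list[str]:
    """List every string over {a, b, c} up to max_len where regex and spec disagree."""
    pattern = build_pattern()
    bad = []
    for n in range(max_len + 1):
        for chars in product("abc", repeat=n):
            s = "".join(chars)
            if (pattern.fullmatch(s) is not None) != within_limits(s):
                bad.append(s)
    return bad


print(build_pattern().fullmatch("abacccc") is not None)  # True
print(mismatches())  # an empty list means regex and spec agree up to max_len
```

The length bound of $7$ is an arbitrary choice to keep the enumeration small (a few thousand strings). Raising it gives more confidence but still does not replace the argument above.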