0
$\begingroup$

This theorem is theorem 3.13 in Hopcroft, Motwani, and Ullman's "Introduction to Automata Theory, Languages, and Computation", 3rd international edition on page 111.

Theorem 3.13:

Let E be a regular expression with variables $L_1, L_2, \dots, L_m$. Form concrete regular expression C by replacing each occurrence of $L_i$ by the symbol $a_i$, for $i = 1,2,\dots,m$. Then for any languages $L_1, L_2, \dots, L_m$, every string $w$ in $L(E)$ can be written $w = w_1w_2...w_k$, where each $w_i$ is in one of the languages, say $L_{ji}$, and the string $a_{j1}a_{j2} \dots a_{jk}$ is in the language $L(C)$.

Less formally, we can construct $L(E)$ by starting with each string in $L(C)$, say $a_{j1}, a_{j2}, \dots a_{jk}$, and substituting for each of the $a_{ji}$'s any string from the corresponding language $L_{ji}$.

PROOF: The proof is a structural induction on the expression E.

BASIS: The basis cases are where E is epsilon, empty set, or a variable L. In the first two cases, there is nothing to prove, since the concrete expression $C$ is the same as $E$.

If $E$ is a variable $L$, then $L(E) = L$. The concrete expression $C$ is just $\mathbf{a}$ where $\mathbf{a}$ is the symbol corresponding to L. Thus, $L(C) = \{a\}$. If we substitute any string in $L$ for the symbol $a$ in this one string, we get the language $L$, which is also $L(E)$.

Questions: What if $L$ contains three elements, such as strings $b$, $bb$, and $bbb$? If we substitute any string in $L$ for the symbol $a$ in this one string, do we not get a subset of $L$? If $C = a$ and substitute the string b for $a$, then $C$ is a subset of $L$. (but not equal to L).

What is wrong with my reasoning?

  • 0
    What is your inductive step in your inductive proof?2017-02-09

1 Answers 1

1

The key passage is, "substituting for each of the $a_{j_i}$'s any string from the corresponding language $L_{j_i}$." You have to consider all strings in $L(C)$ and all possible substitutions. If you only consider one string of $L_{j_i}$, as your example shows, you are not guaranteed the full $L(E)$.

While that "any" is admittedly a bit ambiguous, taken in the context of the formal statement it's meant to clarify, it's reasonably clear. No matter how you pick a string from each $L_{j_i}$, you get a string of $L(E)$. The same string of $L(E)$ may be obtained in more than one way.

  • 0
    I see. Thanks for clarifying the theorem!2017-02-09