4
$\begingroup$

The most obvious one that I found was, $S \rightarrow SSS | A | B | C$ $A \rightarrow Aa | \epsilon$ $B \rightarrow Bb | \epsilon$ $C \rightarrow Cc | \epsilon$

However, I realize this CFG is ambious because we can generate different parse trees for the same string due to the rule $S \rightarrow SSS$.
So I revised my language to avoid ambiguity as follows:

$S \rightarrow SABC | ASBC | ABSC | ABCS $ $A \rightarrow Aa | \epsilon$ $B \rightarrow Bb | \epsilon$ $C \rightarrow Cc | \epsilon$

where $S$ is the starting variable in both grammars.

I wonder if this CFG is correct? What I wasn't sure is the first rule $S$, it looks a bit redundant but I don't know how to fix it so that it can generate all strings over the alphabet $\Sigma = \{a, b, c\}$. Any idea?

Update
The question is motivated from this problem:

Show that the complement of the language $L = \{a^nb^mc^k, k = n + m\}$ is context-free.

Since I already found the CFG for the language $L_1 = \{a^nb^mc^k, k \neq n + m\}$, what I'm trying to do is finding the CFG for its complement which includes two parts:

  1. $L_1$
  2. $L_2$ = Any string over alphabet $\Sigma = \{a, b, c\}$ that is not in the form $a^mb^nc^k$.

So if I can find the CFG for 2, the union of these two languages is also CFG by adding an extra rule: $S \rightarrow S_1 | S_2$ where $S_1$ is starting variable of $L_1$, and $S_2$ is starting variable for $L_2$.

3 Answers 3

6

Why don't you use the following easy grammar, where S is the starting variable?

S $\rightarrow$ BS | $\varepsilon$

B $\rightarrow$ a | b | c

Edit for your update:

You want to prove that $L_2$ is context free.
However, as Yuval said $L_3 = \{a^mb^nc^k \mid m,n,k \in \mathbb{N}\}$ is regular.
Now, since $\Sigma^*$ is regular and $\Sigma^*$\ $L_3$ is your wanted language, we know that $L_2$ is regular since regular languages are closed under difference, which implies that $L_2$ context free.

  • 0
    @Dimitri Surinx: Nice one! Thank you.2011-08-21
1

For the motivating problem,

Show that the complement of the language $L = \{a^nb^mc^k, k = n + m\}$ is context-free.

A string is in the complement of L if and only if it contains (at least) one of $ba$, $ca$ or $cb$ as a substring, or a letter other than $a,b$ or $c$ in case of a larger alphabet. This is easily expressed as a union of several languages, or as an automaton that scans the string in search of any of the forbidden substrings.

0

The answer by @zyx goes a long way, but if the string is in $a^* b^* c^*$ you have to ensure the condition on the number of symbols is violated. This condition can go wrong because (1) there are too many $c$, or (2) there are too few. Written in terms of a grammar, with $L$ for less $a$ and $b$ than $c$, and $G$ for more: $ \begin{align*} S &\rightarrow L \mid G \\ L &\rightarrow A c \\ A &\rightarrow A c \mid B \\ B &\rightarrow a B c \mid C \\ C &\rightarrow b C c \mid \epsilon \\ G &\rightarrow a D \mid E \\ D &\rightarrow a D \mid B \\ E &\rightarrow a E c \mid b F \\ F &\rightarrow b F \mid C \end{align*} $ The languages mentioned by zyx are regular, so context free, and context free languages are closed respect to union.

Another take is to obtain the language $\{ a^n b^m c^k \mid n + m \ne k \}$ by operations that preserve context freeness. First consider $L_1 = \{0^n 1^m \mid n \ne m \}$, generated by: $ \begin{align*} S &\rightarrow 0 A \mid B 1 \\ A &\rightarrow 0 A \mid E \\ B &\rightarrow B 1 \mid E \\ E &\rightarrow 0 E 1 \mid \epsilon \end{align*} $ The $A$ branch ensures too many 0, the $B$ branch too many 1; $E$ gives as many 0 as 1, and is context free by closure properties.

Take now the homomorphism defined by $h(a) = h(b) = 0$, $h(c) = 1$. Then $h^{-1}(L_1) \cap a^* b^* c^*$ is the language we are looking for.