2

For languages A and B, let the shuffle of A and B be the language
$\{w \mid w = a_1b_1 \cdots a_kb_k, \text{ where } a_1 \cdots a_k \in A \text{ and } b_1 \cdots b_k \in B, \text{ each } a_i, b_i \in \Sigma^*\}$.
Show that the class of regular languages is closed under shuffle.

Approach: If A and B are regular, then there exist NFAs R and T that recognize them. I was thinking I could run R and T in parallel: I would start running R on $a_1$ and then, using nondeterminism, jump to the NFA T to process $b_1$. In this process I make sure that neither NFA is executed the original way, because that would accept strings that are not in the language. So I try to disconnect the edges in each NFA in order to jump to the other NFA, and then disconnect the edges of that NFA to come back to the previous one. Is this process clear?

Here is the proof, but some lines are a bit unclear to me. [Image of the proof not reproduced here.]

I can't understand (d)i. I don't understand how the transition function works. How does it detect whether we are in the DFA for A or the DFA for B?

  • 0
    [Related](http://math.stackexchange.com/a/2134503/123852). 2017-02-17
  • 0
    @FabioSomenzi Can you explain (d)i? What does it mean that $N$ can make $D_A$ in $q_A$? 2017-02-17
  • 0
    http://www.cs.nthu.edu.tw/~wkhon/assignments/assign1ans.pdf 2017-02-17
  • 0
    [Also asked on cs exchange](https://cs.stackexchange.com/questions/70408/show-that-the-class-of-regular-languages-is-closed-under-shuffle) 2018-08-25

2 Answers

1

There is a problem in that solution because it says that $N$ should accept the empty string. However, the shuffle of two languages $A$ and $B$ contains the empty word if and only if both $A$ and $B$ contain it.

Aside from that, the solution is OK. The constructed automaton uses nondeterminism to guess how the input string $w$ is to be divided into substrings $a_1, a_2, \ldots, a_k$ and $b_1,b_2, \ldots, b_k$ such that $A$ contains $a_1 \cdots a_k$ and $B$ contains $b_1 \cdots b_k$.

Input letters are consumed by either the acceptor of $A$ or the acceptor of $B$. In its infinite wisdom, the nondeterministic automaton doles out individual letters to the right acceptor, if there is indeed a way to split the input word into two words, one in $A$ and the other in $B$.
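The letter-by-letter doling out can be simulated directly. Below is a minimal Python sketch, assuming each acceptor is given as a DFA encoded as a hypothetical triple `(delta, start, accepting)` with a partial transition dictionary; the function name `shuffle_accepts` and the two example DFAs are illustrative only. It tracks every pair of states that some sequence of guesses could have reached, with $(q_A,q_B)$ as the initial pair.

```python
def shuffle_accepts(dfa_a, dfa_b, word):
    """Simulate the product NFA on `word` by tracking all reachable state pairs.

    Each DFA is a triple (delta, start, accepting), where delta maps
    (state, letter) to the next state; a missing entry means "no move".
    """
    delta_a, start_a, final_a = dfa_a
    delta_b, start_b, final_b = dfa_b
    current = {(start_a, start_b)}
    for letter in word:
        nxt = set()
        for s, t in current:
            # Guess 1: the letter belongs to the word from A; B stands still.
            if (s, letter) in delta_a:
                nxt.add((delta_a[(s, letter)], t))
            # Guess 2: the letter belongs to the word from B; A stands still.
            if (t, letter) in delta_b:
                nxt.add((s, delta_b[(t, letter)]))
        current = nxt
    # Accept if some sequence of guesses leaves both components accepting.
    return any(s in final_a and t in final_b for s, t in current)


# Example DFAs (assumed for illustration): A = {"ab"}, B = {"cd"}, B2 = {"a"}.
dfa_ab = ({(0, "a"): 1, (1, "b"): 2}, 0, {2})
dfa_cd = ({(0, "c"): 1, (1, "d"): 2}, 0, {2})
dfa_a = ({(0, "a"): 1}, 0, {1})

print(shuffle_accepts(dfa_ab, dfa_cd, "acbd"))
print(shuffle_accepts(dfa_ab, dfa_a, "aba"))
```

Tracking the whole set of reachable pairs is the "try all ways in parallel" view of the nondeterministic guess; no component ever needs to know which machine is "active".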

The new initial state $q_0$ is added because they want to accept the empty string regardless. Otherwise, taking $(q_A,q_B)$ as initial state would work just fine.
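For reference, here is one way to write the automaton down explicitly when $(q_A,q_B)$ is used as the initial state. This is a sketch: it assumes $A$ and $B$ are recognized by DFAs $D_A = (Q_A, \Sigma, \delta_A, q_A, F_A)$ and $D_B = (Q_B, \Sigma, \delta_B, q_B, F_B)$, where the symbols $\delta_A, \delta_B, F_A, F_B$ and $\delta_N$ are notation chosen here and may differ from the linked solution. Take $N$ with state set $Q_A \times Q_B$, initial state $(q_A, q_B)$, accepting states $F_A \times F_B$, and

$$\delta_N\bigl((s,t),\sigma\bigr) = \bigl\{\,(\delta_A(s,\sigma),\,t),\ (s,\,\delta_B(t,\sigma))\,\bigr\} \quad \text{for } s \in Q_A,\ t \in Q_B,\ \sigma \in \Sigma \enspace.$$

The state always records where both DFAs currently are. The transition function does not detect which DFA is "active"; it offers both moves, and nondeterminism picks one of them.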


Let's prove that the construction is correct. Suppose $w = a_1b_1 \cdots a_kb_k$, with $a_1 \cdots a_k \in A$ and $b_1 \cdots b_k \in B$. When reading a letter from $a_i$, $N$ makes a transition that changes the first component of the state and leaves the second component unchanged. Likewise, when it reads a letter from $b_i$, it changes the second component and leaves the first unchanged. Nondeterminism allows $N$ to correctly break up $w$ before seeing it all. When the whole string has been read, the first component is accepting for $A$ and the second is accepting for $B$. Hence $N$ accepts.

Suppose $w$ is accepted by $N$. Then there is an accepting run of $N$ on $w$. This run can be divided into maximal segments that are uniform with respect to the type of transitions. That is, in each segment one component of the state does not change. A segment is an $a_i$ if the second component of the state is unchanged, and a $b_i$ otherwise.

Since $a$-segments and $b$-segments alternate, the imbalance between the number of $a$-segments and $b$-segments is at most one. We need to consider four cases.

  1. Even number of segments with an $a$-segment first. This is the easiest case: nothing needs to be done, besides separating the $a$'s from the $b$'s, to find the two words $a_1 \cdots a_k \in A$ and $b_1 \cdots b_k \in B$.

  2. Even number of segments with a $b$-segment first. In this case we sandwich the segments between two empty segments. The one at the beginning becomes $a_1$, and the one at the end becomes $b_k$. We then proceed as in Case 1.

  3. Odd number of segments with an $a$-segment first. We add an empty segment at the end to act as $b_k$.

  4. Odd number of segments with a $b$-segment first. We add an empty segment at the beginning to act as $a_1$.

In conclusion, if $w$ is accepted by $N$, it is in the shuffle of $A$ and $B$.
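The padding in the four cases can be sketched in Python. The helper name `to_alternating_blocks` and the `owners` string (one `"A"` or `"B"` per letter, recording which component of the state changed on that letter in the accepting run) are hypothetical and used only for illustration.

```python
from itertools import groupby


def to_alternating_blocks(word, owners):
    """Split `word` into blocks a_1, b_1, ..., a_k, b_k.

    owners[i] is "A" or "B": which acceptor consumed word[i] in the run.
    Returns (a_blocks, b_blocks), two lists of equal length k.
    """
    # Maximal runs of letters consumed by the same acceptor.
    segments = [
        (owner, "".join(ch for ch, _ in group))
        for owner, group in groupby(zip(word, owners), key=lambda pair: pair[1])
    ]
    # Cases 2 and 4 (and the empty run): pad with an empty a_1 in front.
    if not segments or segments[0][0] == "B":
        segments.insert(0, ("A", ""))
    # Cases 2 and 3: pad with an empty b_k at the end.
    if segments[-1][0] == "A":
        segments.append(("B", ""))
    a_blocks = [text for owner, text in segments if owner == "A"]
    b_blocks = [text for owner, text in segments if owner == "B"]
    return a_blocks, b_blocks


a_blocks, b_blocks = to_alternating_blocks("cabd", "BAAB")
print(a_blocks, b_blocks)
```

After padding, the segments strictly alternate starting with an $a$-segment and ending with a $b$-segment, so the two lists have the same length $k$ and interleaving them gives back $w$.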


Another, perhaps more intuitive way to see why the constructed $N$ works is this: Suppose $w = a_1b_1 \cdots a_kb_k$, with $a_1\cdots a_k \in A$ and $b_1\cdots b_k \in B$. Let

$$a_ib_i = a_{i1}a_{i2} \cdots a_{in_i} b_{i1} b_{i2} \cdots b_{im_i} \enspace,$$

where the $a_{ij}$ and $b_{ik}$ are strings of length 1; that is, they are letters in $\Sigma$. Then we can also write

$$ a_ib_i = a_{i1} \,\epsilon\, a_{i2} \,\epsilon\, \cdots \,\epsilon\, a_{in_i} b_{i1} \,\epsilon\, b_{i2} \,\epsilon\, \cdots \,\epsilon\, b_{im_i} \enspace. $$

That is, we can interleave empty strings and enforce the constraint that each $a_i$ and $b_i$ is of length less than or equal to 1. This interpretation allows us to see why the automaton effectively decides for each letter which of the two acceptors is going to consume it.
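The letter-level view also gives a direct, if exponential, way to list the shuffle of two individual words. The following is a small Python sketch; the function name `interleavings` is a hypothetical helper. At each step, the next letter of the result is taken from the front of one word or the other.

```python
def interleavings(u, v):
    """Return the set of all letter-by-letter interleavings of words u and v."""
    if not u:
        return {v}
    if not v:
        return {u}
    # The next letter comes either from u or from v.
    take_from_u = {u[0] + rest for rest in interleavings(u[1:], v)}
    take_from_v = {v[0] + rest for rest in interleavings(u, v[1:])}
    return take_from_u | take_from_v


print(sorted(interleavings("ab", "c")))
```

Under the definition in the question, the shuffle of $A$ and $B$ is the union of these sets over all $u \in A$ and $v \in B$, since every block $a_i$ or $b_i$ can be cut into single letters and empty strings.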

  • 0
    How could you prove this? 2017-02-17
  • 0
    The other thing that really confuses me is that they are referring to input as characters rather than strings. I know that strings are processed character by character, but they don't make any point on it. 2017-02-17
  • 0
    It's a shuffle, not a perfect shuffle. Are you considering the fact that the $a_i$ are strings and not characters? 2017-02-17
  • 0
    Oops! It's not a *perfect* shuffle. OK. Never mind. I got carried away. 2017-02-17
  • 0
    Oh no, you pointed out that the solution is wrong and I agree. I mean, how can you solve the original problem? 2017-02-17
  • 0
    For the (imperfect) shuffle, that solution is OK. I'll change my answer. 2017-02-17
  • 0
    Is $(q_A,q_B)$ the start state of $N$, assuming that the empty string is not allowed? 2017-02-17
  • 0
    Yes, $(q_A,q_B)$ works just fine if the empty string is not tacked on to the shuffle. The state $(q_A, q_B)$ is accepting if and only if both $q_A$ and $q_B$ are. 2017-02-17
  • 0
    I still don't follow xd. It looks like we are alternating all the time between the two DFAs. I feel like this is not what we are supposed to do. What we should do is alternate once every $a_i$ or $b_i$ is read, not once each character is read. 2017-02-17
  • 0
    @daniel See if you find the proof I added to my answer convincing. 2017-02-17
  • 0
    Ok, sorry I am taking too long. So I guess in the NFA there is an arrow that connects $q_0$ and $(q_A,q_B)$ with the empty string. Is that correct? If so, then it would imply that once the empty string is processed, it ends up in $(q_A,q_B)$, which is not an accepting state. 2017-02-17
  • 0
    @daniel No, $\epsilon$-transitions are *optional*. If the input is the empty string, $N$ can sit in $q_0$ and declare success. 2017-02-17
  • 0
    I think I got it, so basically the transition function will exhaust the number of letters in one word. During this process, one of the machines is stuck, hence it will be receiving input which can't be recognized, so it will stay in the same state without affecting the reading of the first word. It's so easy to see that once the first word is read it will get stuck, because it will start receiving input that is just recognized by the other machine. At that time, the second machine starts processing the second word, and so on. 2017-02-17
  • 0
    @daniel Yes. Keep in mind that the same $w$ may be split in many, many ways. One of the two machines may not be stuck, but simply not scheduled to consume input. It's the nondeterministic control that picks one specific way to parse $w$ when there's more than one. 2017-02-17
  • 0
    Ohhh ok, so what you mean by "but simply not scheduled to consume input" is that if, for example, we have two states connected by some letter r and the first state receives input that is not r, then the machine was not scheduled to consume input and will therefore stay in the same state. Right? 2017-02-17
  • 0
    What I mean is that in state $(s,t)$, both machines may be able to process input $\sigma$, but only one is allowed to. So, the next state may be $(s',t)$ or $(s,t')$, but never $(s',t')$. 2017-02-17
  • 0
    But it looks like we are running two functions at the same time, and just one is allowed to make a change. 2017-02-17
  • 0
    @daniel That's exactly it. It is as if there were a supervisor process that says, "this letter is for you, this one is still for you, but this other one is for the other machine. Stand still while it is being consumed." 2017-02-17
  • 0
    That's it. Thank you so much. 2017-02-17
  • 1
    I am afraid that the machine would accept strings that are not in the language, e.g. $a_1a_2b_1b_2$. 2017-02-24
  • 0
    I thought the languages had different alphabets, but it looks like they have the same one. I still can't see how this would work. How do we know we can switch functions to read a different word? 2017-02-24
  • 0
    @daniel As long as $a_1a_2 \in A$ and $b_1b_2 \in B$, it is also true that $a_1a_2b_1b_2$ is in the shuffle of $A$ and $B$. An NFA guesses the right way to dole out letters to the two automata for $A$ and $B$. Or, if you prefer, tries all ways in parallel, which is in fact what the DFA produced by the subset construction does. 2017-02-24
0

I do not think the solution works.

Given:

$ A = \{ a_1a_2, a_1a_2a_3a_4 \}$
$ B = \{ b_1b_2\}$

And this definition of shuffle:

For languages A and B, let the shuffle of A and B be the language
$\{w \mid w = a_1b_1 \cdots a_kb_k, \text{ where } a_1 \cdots a_k \in A \text{ and } b_1 \cdots b_k \in B, \text{ each } a_i, b_i \in \Sigma^*\}$.

$ Shuffle(A,B) = \{\epsilon, a_1b_1a_2b_2\} $

But the constructed NFA would accept $ \{\epsilon, a_1b_1a_2b_2, a_1b_1a_2b_2a_3a_4\} $

Arguably, the $\epsilon$ also shouldn't be there.

This homework from Roger Jestes does what I was thinking about. Add another item to the state that indicates which machine the last symbol was sent to. That determines which machine the next symbol will be sent to.

$Q = Q_A \times Q_B \times \{1, 2\}$, where 1 means $B$ consumed the last symbol and 2 means $A$ consumed the last symbol.
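A minimal Python sketch of this tagged-state idea follows. It rests on several assumptions: strict letter-by-letter alternation starting with $A$ (that is, the *perfect* shuffle, where every $a_i$ and $b_i$ is a single letter, rather than the shuffle with $a_i, b_i \in \Sigma^*$), DFAs encoded as hypothetical triples `(delta, start, accepting)`, and an illustrative function name `perfect_shuffle_accepts`. It is not taken from the homework mentioned above.

```python
def perfect_shuffle_accepts(dfa_a, dfa_b, word):
    """Run two DFAs in strict alternation: A reads letters 1, 3, 5, ...; B reads 2, 4, 6, ...

    The state is a triple (s, t, tag); tag 1 means B consumed the last
    symbol (so A reads next), tag 2 means A consumed the last symbol.
    """
    delta_a, start_a, final_a = dfa_a
    delta_b, start_b, final_b = dfa_b
    s, t, tag = start_a, start_b, 1  # Start as if B had just moved, so A goes first.
    for letter in word:
        if tag == 1:
            if (s, letter) not in delta_a:
                return False  # A has no move on this letter.
            s, tag = delta_a[(s, letter)], 2
        else:
            if (t, letter) not in delta_b:
                return False  # B has no move on this letter.
            t, tag = delta_b[(t, letter)], 1
    # Both machines must accept, and B must have read the last letter.
    return tag == 1 and s in final_a and t in final_b


# Example DFAs (assumed for illustration): A = {"ab"}, B = {"cd"}.
dfa_ab = ({(0, "a"): 1, (1, "b"): 2}, 0, {2})
dfa_cd = ({(0, "c"): 1, (1, "d"): 2}, 0, {2})

print(perfect_shuffle_accepts(dfa_ab, dfa_cd, "acbd"))
```

With the tag fixing whose turn it is, no guessing is needed, so the machine built from two DFAs is itself deterministic.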