
You are given a String S of length N.

Now, a special subsequence is one that can be represented in the form a^i b^j c^k, where i≥1, j≥1 and k≥1. For example, if i=2, j=1, k=3, it represents the string aabccc. In short, a special subsequence is a subsequence that first consists of i 'a' characters, followed by j 'b' characters, followed by k 'c' characters, where i≥1, j≥1 and k≥1.

Now, you need to find the number of special subsequences of String S.

Note: Two subsequences are considered different if the sets of array indexes picked for the two subsequences are different.

Sample Input:
abcabc

Sample Output:
7

Explanation:
Valid subsequences are (1-based indexing):

{1,2,3}

{1,2,6}

{1,5,6}

{4,5,6}

{1,2,5,6}

{1,4,5,6}

{1,2,3,6}

My Approach:
My approach is to recursively go through each and every index of the string and check whether the resulting subsequence meets the criteria or not. But, as you can see, the running time of this approach grows exponentially with the length of the string. Is there any better way to solve this problem?
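For reference, the exhaustive check described above could look roughly like the following Python sketch. It is only an illustration: the function name `count_special_bruteforce` is made up here, and it enumerates index sets with `itertools.combinations` rather than with explicit recursion. It is useful mainly for verifying faster solutions on short strings.

```python
from itertools import combinations
import re

def count_special_bruteforce(s: str) -> int:
    """Count index sets whose letters spell a^i b^j c^k with i, j, k >= 1."""
    pattern = re.compile(r"a+b+c+")
    total = 0
    # Each subsequence corresponds to exactly one increasing tuple of indexes,
    # so distinct tuples are distinct subsequences in the sense of the Note.
    for size in range(3, len(s) + 1):
        for idx in combinations(range(len(s)), size):
            if pattern.fullmatch("".join(s[i] for i in idx)):
                total += 1
    return total

print(count_special_bruteforce("abcabc"))  # 7, matching the sample output
```

The loop visits every index set of size 3 or more, so the work grows like 2^N.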

1 Answer


I might approach the problem in the following recursive way to count such subsequences without actually listing them:

Define 3 sets $A_a$, $A_b$ and $A_c$ of indexes for the locations of the letters "a", "b" and "c" respectively in the string $S$. Taken together, these sets may be seen to completely describe $S$.

Next define a 2-dimensional array of entries $s_{i,j}$ with the recurrence

$$s_{i,j}=\begin{cases} 0&i=1\, , j\not\in A_a\\ 0&i=2\, , j\not\in A_b\\ 0&i=3\, , j\not\in A_c\\ 1+\sum\limits_{k=1}^{j-1}s_{i,k}&i=1\, , j\in A_a\\ \sum\limits_{k=1}^{j-1}s_{i-1,k}+\sum\limits_{k=1}^{j-1}s_{i,k}&\text{otherwise} \end{cases}$$

In particular, $s_{1,j}=1$ for the least $j\in A_a$, since every earlier entry of the first row is then $0$.

Then the desired total number of "special subsequences" of a string $S$ of length $N$ is

$$\text{special subsequences}=\sum_{k=1}^{N}s_{3,k}$$
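The recurrence can be transcribed almost literally into code. The following is a minimal Python sketch rather than a definitive implementation: the names `special_table` and `count_special` are made up for illustration, and the code uses 0-based indexes where the formulas use 1-based ones. Each entry recomputes its row sums, so the sketch takes roughly $O(N^2)$ time.

```python
def special_table(s: str) -> list:
    """Build the 3 x N table of s_{i,j} values (0-based rows and columns)."""
    letters = "abc"  # row 0 tracks 'a', row 1 tracks 'b', row 2 tracks 'c'
    n = len(s)
    table = [[0] * n for _ in range(3)]
    for i in range(3):
        for j in range(n):
            if s[j] != letters[i]:
                continue  # first three cases of the recurrence: entry stays 0
            same_row = sum(table[i][:j])  # entries to the left in the same row
            if i == 0:
                table[i][j] = 1 + same_row
            else:
                # entries to the left in the row above, plus the same row
                table[i][j] = sum(table[i - 1][:j]) + same_row
    return table

def count_special(s: str) -> int:
    """Sum the last row, as in the formula for the total."""
    return sum(special_table(s)[2])

print(count_special("abcabc"))  # 7, matching the sample output
```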

Let's use your example of $S=\text{abcabc}$, so that $N=6$. Then

$$A_a=\{1,4\}\\ A_b=\{2,5\}\\A_c=\{3,6\}$$

Now display our array of $s_{i,j}$ beneath the sequence $S$:

$$\begin{array}{c|cccccc}\text{location index}&1&2&3&4&5&6\\\hline \text{string $S$}&\text{a}&\text{b}&\text{c}&\text{a}&\text{b}&\text{c}\\\hline s_{1,j}&s_{1,1}&0&0&s_{1,4}&0&0\\ s_{2,j}&0&s_{2,2}&0&0&s_{2,5}&0\\ s_{3,j}&0&0&s_{3,3}&0&0&s_{3,6} \end{array}$$

Now $s_{1,1}=1$, and the non-zero entries of our top row $s_{1,j}$ clearly always turn out to be powers of $2$. Our recurrence effectively fills in each non-zero entry $s_{i,j}$ with the sum of the values to its left in the same row and in the row above (plus $1$ in the top row), hence

$$\begin{array}{c|cccccc}\text{location index}&1&2&3&4&5&6\\\hline \text{string $S$}&\text{a}&\text{b}&\text{c}&\text{a}&\text{b}&\text{c}\\\hline s_{1,j}&1&0&0&2&0&0\\ s_{2,j}&0&1&0&0&4&0\\ s_{3,j}&0&0&1&0&0&6 \end{array}$$

and so

$$\text{special subsequences}=\sum_{k=1}^{6}s_{3,k}=1+6=7$$

In general this eliminates the need for checking cases. Essentially, $s_{i,j}$ counts the subsequences that begin with "a", have the required form up to the $i$-th of the three letters, and end at location index $j$.
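Since each non-zero entry only needs the sums of the entries to its left, one possible rearrangement (not spelled out in the answer itself) is to keep three running row totals instead of the whole table. The Python sketch below assumes that reading; the function and variable names are hypothetical. It makes a single pass over the string.

```python
def count_special_linear(s: str) -> int:
    """Count special subsequences using running totals of the three rows."""
    # a_total, ab_total, abc_total hold the sums of s_{1,k}, s_{2,k}, s_{3,k}
    # over the positions k processed so far.
    a_total = ab_total = abc_total = 0
    for ch in s:
        if ch == "a":
            a_total += 1 + a_total            # new entry: 1 + row-1 sum
        elif ch == "b":
            ab_total += a_total + ab_total    # new entry: row-1 sum + row-2 sum
        elif ch == "c":
            abc_total += ab_total + abc_total # new entry: row-2 sum + row-3 sum
    return abc_total

print(count_special_linear("abcabc"))  # 7, matching the worked example
```

Other characters are simply skipped, which corresponds to a zero column in the table.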

  • 1
    u used dynamic programming.. this is clearly a better approach.. thanks for the very nice explanation (2017-02-27)
  • 0
    Thanks @JerryGoyal. You're welcome! (2017-02-27)
  • 0
    u said top row $s_{1,j}$ clearly will always turn out to be powers of 2 but in the case of aabcabc it will be 1 2 0 0 3 0 0 (2017-02-27)
  • 0
    Check the $4^{\text{th}}$ row of the definition of $s_{i,j}$: we have $s_{1,1}=1$, $s_{1,2}=1+1=2$ and $s_{1,5}=1+1+2=4$. They will always be powers of 2 because each successive non-zero $s_{1,j}$ is a sum of powers of 2 (a geometric sequence with ratio 2, so the sum has the form $2^a-1$) plus 1. Thus, by induction, the first row is of the form $2^a$ where $a=0,1,2,\ldots$ left to right for the non-zero $s_{1,j}$. (2017-02-27)
  • 1
    oh right. get it (2017-02-28)