
My strings are invalid if they contain `aa` anywhere as a substring, that is, one specific character two or more times in a row.

What's the probability of a valid string given c possible characters and a string length of n, each string being equally likely?

  • If the string is `aba`, is it valid? How is the string composed? Of alphabets, or alphanumeric characters? (2017-02-14)
  • @GNUSupporter I think the answer makes it perfectly clear that `aba` is valid (since no character appears two or more times in a row). And there are `c` possible characters. (2017-02-14)
  • @5xum Thanks for your explanation. I misunderstood "in a row". (2017-02-14)

2 Answers

1

We seek $\mathsf P(n,c)$, the probability that a string of length $n$ randomly generated from $c$ characters will not contain a specified character adjacent to itself.

We can ignore the identities of the other $c-1$ characters and track only the specified character, call it a. Treat the $n$-character string as composed of characters a and b, where b stands for any of the other $c-1$ characters; each position is independently a with probability $1/c$.

The number of arrangements of $k$ a's and $(n-k)$ b's is $\binom n k$, and each has probability weight $(c-1)^{n-k}/c^n$.

However, the count for such arrangements where there is at least one b between every two consecutive a is: $\tbinom{\Box}{\Box}$.

Clearly $k$ cannot exceed $\lceil n/2\rceil$, the ceiling of half of $n$. The weighted sum is thus:

$$\therefore \mathsf P(n,c) ~=~\sum\limits_{k=0}^{\lceil n/2\rceil} \dbinom{\Box}{\Box} \dfrac{(c-1)^{n-k}}{c^n}$$

Which may have a closed form.

Fill in the boxes.
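A candidate for the boxes can be checked against exact values obtained by enumeration. The following is a minimal Python sketch, not part of the answer above; the function name `brute_force_probability` is hypothetical, and character `0` is assumed to stand for the specified character a.

```python
from fractions import Fraction
from itertools import product


def brute_force_probability(n: int, c: int) -> Fraction:
    """Exact P(n, c) by enumerating all c**n strings (small n and c only)."""
    valid = 0
    for s in product(range(c), repeat=n):
        # Character 0 plays the role of `a`; reject any string with 0 next to 0.
        if not any(s[i] == 0 and s[i + 1] == 0 for i in range(n - 1)):
            valid += 1
    return Fraction(valid, c ** n)


# Reference values for a three-character alphabet.
for n in range(1, 6):
    print(n, brute_force_probability(n, 3))
```

The enumeration costs $c^n$ steps, so it is only useful as a reference for small cases.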

0

Let us first answer this related question:
In how many ways can we place $m$ characters $a$ in $n$ positions, filling the other positions with any of the remaining $c-1$ characters, so that no two $a$'s are adjacent?

Consider the ordered pair $ax$, where $x$ is any one of the remaining $c-1$ characters and acts as a separator. However we place such pairs inside a word, we are sure that the $a$'s won't "touch" each other.

A word that meets the requirement therefore takes one of two forms: either it does not end in $a$ and contains $m$ pairs $(ax)$ among its $n$ characters, or it ends in $a$ and its first $n-1$ characters contain $m-1$ pairs $(ax)$: $$ \underbrace {\overbrace { \cdots x \cdots ,\left( {ax} \right),\; \cdots x \cdots ,\left( {ax} \right),\, \cdots }^{n\;\text{char}}}_{m\, \cdot \;\left( {ax} \right)}\; + \;\underbrace {\overbrace { \cdots x \cdots ,\left( {ax} \right),\; \cdots x \cdots ,\left( {ax} \right),\, \cdots }^{n - 1\;\text{char}}}_{\left( {m - 1} \right)\, \cdot \;\left( {ax} \right)}a $$
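The two cases of the scheme can be counted separately for small $n$. The sketch below is only an illustration; the name `count_patterns` is hypothetical, and `x` is treated as a single placeholder symbol rather than one of $c-1$ characters. It checks that patterns not ending in $a$ number $\binom{n-m}{m}$ and patterns ending in $a$ number $\binom{n-m}{m-1}$.

```python
from itertools import product
from math import comb


def count_patterns(n: int, m: int) -> tuple:
    """Count a/x patterns of length n with exactly m a's, no two adjacent.

    Returns (patterns not ending in a, patterns ending in a).
    """
    not_ending_in_a = 0
    ending_in_a = 0
    for p in product("ax", repeat=n):
        word = "".join(p)
        if word.count("a") != m or "aa" in word:
            continue
        if word.endswith("a"):
            ending_in_a += 1
        else:
            not_ending_in_a += 1
    return not_ending_in_a, ending_in_a


# Compare with the two binomial coefficients of the scheme (m >= 1).
for n in range(1, 8):
    for m in range(1, n + 1):
        assert count_patterns(n, m) == (comb(n - m, m), comb(n - m, m - 1))
```

For $n=4$, $m=2$ the first case contains only `axax`, and the second contains `axxa` and `xaxa`.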

The relevant number, for $0 \leqslant n,m$ and $1 \leqslant c$, is $$ \begin{aligned} N_{\text{nc}} (n,c,m) &= \binom{n - m}{m}\left( {c - 1} \right)^{\,n - m} + \binom{n - 1 - \left( {m - 1} \right)}{m - 1}\left( {c - 1} \right)^{\,n - 1 - \left( {m - 1} \right)} \\ &= \left( \binom{n - m}{m} + \binom{n - m}{m - 1} \right)\left( {c - 1} \right)^{\,n - m} \\ &= \binom{n - m + 1}{m}\left( {c - 1} \right)^{\,n - m} = \binom{n + 1 - m}{n + 1 - 2m}\left( {c - 1} \right)^{\,n - m} \end{aligned} $$
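As a sanity check, the final expression for $N_{\text{nc}}(n,c,m)$ can be compared with a direct count over a real alphabet. This is a sketch under assumptions: the names `n_nc` and `n_nc_brute` are hypothetical, character `0` stands for $a$, and the guard for $m > n$ is added here only to keep the exponent non-negative.

```python
from itertools import product
from math import comb


def n_nc(n: int, c: int, m: int) -> int:
    """Closed form from the derivation: C(n - m + 1, m) * (c - 1)**(n - m)."""
    if m > n:
        return 0  # more a's than positions
    return comb(n - m + 1, m) * (c - 1) ** (n - m)


def n_nc_brute(n: int, c: int, m: int) -> int:
    """Count strings over c characters with exactly m non-adjacent 0's."""
    count = 0
    for s in product(range(c), repeat=n):
        if s.count(0) != m:
            continue
        if any(s[i] == 0 and s[i + 1] == 0 for i in range(n - 1)):
            continue
        count += 1
    return count


# Agreement on all small cases.
for n in range(0, 6):
    for c in range(1, 4):
        for m in range(0, n + 1):
            assert n_nc(n, c, m) == n_nc_brute(n, c, m)
```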

The answer to your question is then the sum of $N_{\text{nc}}(n,c,m)$ over $m$ from $0$ to $n$. The terms vanish for $m > \left\lfloor {\left( {n + 1} \right)/2} \right\rfloor = \left\lceil {n/2} \right\rceil $, so the sum effectively stops there. Writing $N_{\text{nc}}(n,c)$ for that sum, $$ \bbox[lightyellow] { p(n,c) = \frac{{N_{\text{nc}} (n,c)}} {{c^{\,n} }} = \frac{1} {{c^{\,n} }}\sum\limits_{0\, \leqslant \,m\, \leqslant \,\left\lceil {n/2} \right\rceil } {\binom{n - m + 1}{m}\left( {c - 1} \right)^{\,n - m} } }$$
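The boxed sum is easy to evaluate exactly. The sketch below computes it with rational arithmetic and cross-checks it against a recurrence that is not part of this answer: if $V(n)$ is the number of valid strings, a valid string either ends in a non-$a$ character or ends in $a$ preceded by a non-$a$ character, so $V(n) = (c-1)\left(V(n-1) + V(n-2)\right)$ with $V(0)=1$, $V(1)=c$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def p_formula(n: int, c: int) -> Fraction:
    """p(n, c) from the boxed sum, with m running from 0 to ceil(n / 2)."""
    total = sum(
        comb(n - m + 1, m) * (c - 1) ** (n - m)
        for m in range((n + 1) // 2 + 1)
    )
    return Fraction(total, c ** n)


def valid_count_recurrence(n: int, c: int) -> int:
    """V(n) = (c - 1) * (V(n - 1) + V(n - 2)), V(0) = 1, V(1) = c."""
    prev, cur = 1, c
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, (c - 1) * (cur + prev)
    return cur


# The two computations agree on small cases.
for n in range(0, 12):
    for c in range(1, 6):
        assert p_formula(n, c) == Fraction(valid_count_recurrence(n, c), c ** n)

print(float(p_formula(10, 26)))
```

The recurrence needs only $O(n)$ multiplications, which makes it a convenient alternative when $n$ is large.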

An example for $n=4$ and $n=5$ is given in the figure.

(Figure: String_rep_char)