2
$\begingroup$

I have a word with 13 characters. Each character is random and alphanumerical (a-z, 0-9).
So for each character there are 36 possibilities: 10 numbers and 26 letters.

I am trying to find out the probabilities for different amounts of segments. I define a segment to be a series of numbers or letters without an element from the other group.

Example:

abcdefg123456 = 2 Segments
abcdefg123abc = 3 Segments
abcdefg123ab1 = 4 Segments, etc.

Now I am looking for the probabilities of having 1 segment, 2 segments, ..., 13 segments but I can't figure it out. Any help is greatly appreciated.

  • 0
    yes. each character is randomly drawn from the 36 possibilities.2012-09-24

2 Answers 2

2

Using Arthur's notation, it might be simpler to calculate $P_1(n,m)=\frac{26}{36}(P_1(n,m-1)+P_2(n-1,m-1))$ and $P_2(n,m)=\frac{10}{36}(P_1(n-1,m-1)+P_2(n,m-1))$ starting at $P_1(1,1)=\frac{26}{36}$ and $P_2(1,1)=\frac{10}{36}$ and with $P_1(n,1)=P_2(n,1)=0$ for other values of $n$.

Looking at $P(n,13)=P_1(n,13) + P_2(n,13)$, I get the approximate values

n   P(n,13)  1   0.01455 2   0.01818 3   0.10000 4   0.09943 5   0.22373 6   0.16575 7   0.19337 8   0.09825 9   0.06141 10  0.01884 11  0.00565 12  0.00078 13  0.00007 

Note the curiosity that this is not smoothly unimodal (6 segments is less likely than 5 and less likely than 7, even though 6 is the median) and there is roughly a 60% probability that the number of segments is odd. If this is not an error on my part, then it is because the most likely outcome is to start with letters and end with letters.

  • 0
    @Arthur: since $\left(\frac{26}{30}\right)^2+\left(\frac{10}{30}\right)^2 \approx 0.5987654321$ I have since convinced myself the strange pattern made sense2012-09-24
3

You could always brute force it, by which I mean this:

Let $P_1(n, m)$ be the probability of $n$ segments among $m$ symbols, where the first segment consists of letters (and is at least 1 long), and define $P_2(n, m)$ be the same but for digits. Then you have the total probability $P(n, m) = P_1(n, m) + P_2(n, m)$.

Since the segments are alternating between letters and digits, you also have the relationship $ P_i(n, m) = \sum_{k=1}^{m-n+1}\left(\frac{x}{36}\right)^kP_j(n-1, m-k) $ where $i\neq j$ and $x$ is either $26$ or $10$, depending on $i$. If you have electronics to help you, this shouldn't be impossible to compute all the way up to $m=13$ for any $n$. Of course, $P_1(1, m) = \left(\frac{26}{36}\right)^m$ and $P_2(1, m) = \left(\frac{10}{36}\right)^m$

Edit: after seing the comment of Jean-Sébastien, I have to say that I assume symbols can repeat independently of eachother.