1
$\begingroup$

I tried to solve this problem, and I used something similar to the probability of the Bernoulli distribution to find how many tosses you need to obtain a probability of almost one for one occurrence. Then I just divide the number of tosses in the problem statement by the number of tosses I get.

Is there a better way to do this?

Edit: What I tried is

m = $12$

N = $10^6$

Probability of having X = "m heads followed by m tails" is $\frac{1}{2^m}$

Probability of X at exactly at position i is $1/2^m*(1-1/2^m)^i$

Probability of X after p tries is $\sum_{i=0}^{p}1/2^m*(1-1/2^m)^i =1-(1-1/2^m)^{p+1}$

Now when p is big enough the probability is 1. And then I can just divide . But it depends what you define as big enough and lead to speculative results. There must be a solution based on statistics but It's been a while and I forgot quite a bit.

  • 5
    Use linearity of expectation.2012-02-18
  • 0
    Your explanation of what you tried is really hard to follow. Perhaps you should rewrite it, being very clear about exactly what you did and why you think it makes sense to do it.2012-02-18
  • 1
    To elaborate on Qiaochu's answer, what is the probability that the string of 12 coins starting with the 5467th flip is 6 heads followed by 6 tails?2012-02-19
  • 0
    You might find this article about the distribution of the length of the longest run in coin tossing of interest: http://mathdl.maa.org/images/upload_library/22/Polya/07468342.di020742.02p0021g.pdf2012-02-19

2 Answers 2

3

Well, your question suggest that you want to calculate the expected value of the number of substrings of the form $h^6t^6$ in your million-character string. However your calculations seem like trying to calculate the expected value of an indicator saying if there is a $h^6t^6$ substring (one or more!) in the million-character string.

If I were to answer your question, then I would repeat the hints given above: the probability of $h^6t^6$ at $i^{th}$ position is $1/2^m$, so the expected value of such an indicator is $1/2^m$ and the result is (by linearity) the sum of expected values of all the indicators ($10^6-m+1$ of them).

The problem you are trying to solve is a little bit harder. First, you shouldn't write things like:

Probability of X at exactly at position i is $1/2^m*(1-1/2^m)^i$

because the events "no $h^6t^6$ at $k$" and "$h^6t^6$ at $k+1$" aren't independent, so you can't multiply their probabilities to compute the probability of the intersection.

If the numbers were smaller, then we could use the inclusion-exclusion principle to get a precise answer. In this case I can think of a Markov chain with states $nothing, h, hh, hhh, hhhh, h^5, h^{6+}, h^6t, h^6t^2, h^6t^3, h^6t^4, h^6t^5, success$. You can prepare a 13 x 13 transition matrix $P$ (it is not hard since it is very regular and there are mainly zeros in it, some 1/2 and one 1) and use a computer to compute it's $10^{6th}$ power $R$, multiply $e_1^T \cdot R$ and then the last coordinate of the result vector is the answer. Maybe it is possible to avoid using a computer here because of the regularity of $P$, but I don't feel like checking it now :)

0

Let's take a simpler example to get the gist:

flip m times (m = 5), how many times can we get a 2H followed by 2T (n = 2 + 2 = 4). If you think about it, this can only happen in two circumstances, which are: HHTT_ or _HHTT (where _ could either be a H or a T and it would not matter). This can be deduced by m - n + 1. The Expected number of times that this could happen is the E(x) = m-n+1 * prob(HHTT) = 2/(2^4).

Similarly for m = 1000000 and n = 12 we have E(x) = (1000000-11)/(2^12)