$n$ vehicles are stopped at random. The probability that a driver who is stopped is a beginner is $p$, and the probability that a driver who is stopped is a professional is $q$. Some drivers are neither beginners nor professionals. $X$ is the random variable representing the number of beginners stopped, and $Y$ is the number of professional drivers stopped.

  • If $X_k$ is the random variable of the number of beginners stopped knowing that exactly $k$ professionals were also stopped, what is the conditional probability $\mathrm {Pr}(X=i\mid Y=k)$?
  • What is the probability $\mathrm {Pr}(X=i, Y=j)$?
  • If $Y_k$ is the variable defined by $\cfrac Y{X+Y}$ with $X+Y = k$, what is the probability distribution and expected value of $Y_k$?
  • 1
    It seems there isn't enough information -- how can this probability be known without knowing how many drivers of which kind arrived? 2012-11-04
  • 0
    $n$ vehicles arrived in total; we only want to know the distribution of $X$ when $Y$ is fixed at $k$, i.e. for $0\le i\le n-k$. 2012-11-04
  • 0
    I don't see how that helps, since you explicitly emphasize that there are also drivers that are neither beginners nor professionals. I don't see how we can derive any information from what's given about how many of the drivers that arrived or were stopped were in that category. 2012-11-05
  • 0
    @joriki We can: the probability that a driver is neither a beginner nor a professional is $1-p-q$. We have a total of $n$ drivers: $i$ beginners are stopped among the $n$, $j$ professional drivers among the remaining $n-i$, and the remaining $n-i-j$ are neither. It is a multinomial distribution with variable counts but fixed probabilities. It is as if there were $np$ beginners, $nq$ professionals and $n(1-p-q)$ others, and we chose $n$ drivers with replacement from these $n$ drivers of three different categories. 2012-11-05
  • 0
    That last comment is entirely at odds with what the question says. It says "the probability that a beginner driver is stopped is $p$", not "the probability that a driver who is stopped is a beginner is $p$". If what you had in mind is correctly described in the comment, you should rewrite the question accordingly. 2012-11-05
  • 0
    @joriki I don't get it; could you explain the difference between the two statements? The question was in French and I may have translated it wrongly, but I really do not see the difference between the two statements. 2012-11-05
  • 0
    In "the probability that a beginner driver is stopped", it is assumed that the driver is a beginner, and the probability is that of being stopped; in "the probability that a driver who is stopped is a beginner", it is assumed that the driver is stopped, and the probability is that of being a beginner. Expressed in terms of conditional probabilities, "the probability that a beginner driver is stopped" is $P(\text{stopped}\mid\text{beginner})$, whereas "the probability that a driver who is stopped is a beginner" is $P(\text{beginner}\mid\text{stopped})$. 2012-11-05
  • 0
    @joriki I see what you meant. Thank you! Is there anything else to fix in the question? 2012-11-05
  • 0
    "If $X_k$ is the random variable of the number of beginners stopped knowing that exactly k professionals were also stopped, what is the probability $Pr(X_k=i)$?" is wrong. There is no such random variable. What you mean is the conditional probability $Pr(X=i|Y=k)$. 2012-11-05
  • 0
    @Robert: That was also my first reaction, but if you Google "conditional random variable", you'll find that some authors introduce such a construction. Not that it seems like a good idea... 2012-11-05

1 Answer

If I understand correctly what you mean by a conditional random variable, your probability $\Pr(X_k=i)$ could equivalently be expressed as $\Pr(X=i\mid Y=k)$. If $k$ professionals were stopped, $n-k$ further people were stopped, and each has an independent probability $p/(1-q)$ of being a beginner and $(1-p-q)/(1-q)=1-p/(1-q)$ of being neither a beginner nor a professional. Thus we have a binomial distribution,

$$ \begin{align} \Pr(X=i\mid Y=k) &= \binom{n-k}i\left(\frac p{1-q}\right)^i\left(\frac{1-p-q}{1-q}\right)^{n-k-i} \\ &= (1-q)^{-(n-k)}\binom{n-k}ip^i(1-p-q)^{n-k-i}\;. \end{align} $$
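
For readers who prefer the formula route, the same result can be sketched from the definition of conditional probability. This assumes the three-category multinomial model with probabilities $p$, $q$, $1-p-q$, and uses the fact that $Y$ on its own is binomial with parameters $n$ and $q$:

$$ \begin{align} \Pr(X=i\mid Y=k) &= \frac{\Pr(X=i,Y=k)}{\Pr(Y=k)} \\ &= \frac{\dfrac{n!}{i!\,k!\,(n-i-k)!}\,p^iq^k(1-p-q)^{n-i-k}}{\dbinom nk q^k(1-q)^{n-k}} \\ &= \binom{n-k}i\,\frac{p^i(1-p-q)^{n-k-i}}{(1-q)^{n-k}}\;. \end{align} $$

The factor $q^k$ cancels, and $\frac{n!}{i!\,k!\,(n-i-k)!}\big/\binom nk=\binom{n-k}i$, which is how the professionals drop out of the result.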

For the second question, this is just the overall multinomial distribution,

$$ \Pr(X=i,Y=j)=\binom n{i,j,n-i-j}p^iq^j(1-p-q)^{n-i-j}\;. $$
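
As a sanity check, the multinomial formula can be evaluated numerically. The following is only a minimal sketch: the function names `joint_pmf` and `conditional_pmf` and the values `n = 8`, `p = 0.3`, `q = 0.2` are illustrative assumptions, not part of the question. It sums the joint probabilities over all admissible $(i,j)$ and compares the ratio $\Pr(X=i,Y=k)/\Pr(Y=k)$ with the binomial form given for the first question.

```python
from math import comb

def joint_pmf(i, j, n, p, q):
    """Pr(X=i, Y=j) for n stops with category probabilities p, q, 1-p-q."""
    if i < 0 or j < 0 or i + j > n:
        return 0.0
    # Multinomial coefficient n! / (i! j! (n-i-j)!) as a product of binomials.
    coeff = comb(n, i) * comb(n - i, j)
    return coeff * p**i * q**j * (1 - p - q)**(n - i - j)

def conditional_pmf(i, k, n, p, q):
    """Closed form Pr(X=i | Y=k): binomial with n-k trials, success prob p/(1-q)."""
    if i < 0 or i > n - k:
        return 0.0
    return comb(n - k, i) * (p / (1 - q))**i * ((1 - p - q) / (1 - q))**(n - k - i)

# Illustrative parameters (assumed for this example only).
n, p, q = 8, 0.3, 0.2

# The joint probabilities should sum to 1 over all i + j <= n.
total = sum(joint_pmf(i, j, n, p, q)
            for i in range(n + 1) for j in range(n + 1 - i))

# Largest gap between the ratio joint/marginal and the closed form.
max_gap = 0.0
for k in range(n + 1):
    marginal_y = sum(joint_pmf(i, k, n, p, q) for i in range(n - k + 1))
    for i in range(n - k + 1):
        ratio = joint_pmf(i, k, n, p, q) / marginal_y
        max_gap = max(max_gap, abs(ratio - conditional_pmf(i, k, n, p, q)))

print("sum of joint pmf:", total)
print("largest discrepancy:", max_gap)
```

Up to floating-point rounding, the sum should come out as 1 and the discrepancy as 0.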

The third question is again less than fully clear to me but from analogy with the first question, it appears that the intention is to condition on $X+Y=k$. As for the first question, this leads to a binomial distribution, this time with probabilities $p/(p+q)$ and $q/(p+q)$:

$$ \begin{align} \Pr(Y=i\mid X+Y=k) &= \binom ki\left(\frac q{p+q}\right)^i\left(\frac p{p+q}\right)^{k-i} \\ &= (p+q)^{-k}\binom kiq^ip^{k-i} \;, \end{align} $$

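and the expected value of the proportion $Y/(X+Y)$ is the binomial mean $kq/(p+q)$ divided by $k$, that is, simply the probability $q/(p+q)$. This is independent of $k$ (for $k\ge1$; the proportion is undefined when $X+Y=0$), so it is also the expected value of $Y/(X+Y)$ given only that $X+Y>0$.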
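
The claim about the expected proportion can also be checked numerically. The sketch below uses hypothetical helper names and arbitrary illustrative values `n = 8`, `p = 0.3`, `q = 0.2`. It computes $\Pr(Y=i\mid X+Y=k)$ directly from the joint multinomial distribution, compares it with the binomial form above, and evaluates the conditional expectation of $Y/(X+Y)$ for each $k\ge1$.

```python
from math import comb

def joint_pmf(i, j, n, p, q):
    """Pr(X=i, Y=j) under the three-category multinomial model."""
    if i < 0 or j < 0 or i + j > n:
        return 0.0
    return comb(n, i) * comb(n - i, j) * p**i * q**j * (1 - p - q)**(n - i - j)

def cond_y_given_total(i, k, n, p, q):
    """Pr(Y=i | X+Y=k), computed directly from the joint distribution."""
    total = sum(joint_pmf(k - j, j, n, p, q) for j in range(k + 1))
    return joint_pmf(k - i, i, n, p, q) / total

def binomial_form(i, k, p, q):
    """Closed form: binomial with k trials and success probability q/(p+q)."""
    return comb(k, i) * (q / (p + q))**i * (p / (p + q))**(k - i)

def expected_proportion(k, n, p, q):
    """E[Y/(X+Y) | X+Y=k] for k >= 1 (the ratio is undefined for k = 0)."""
    return sum((i / k) * cond_y_given_total(i, k, n, p, q) for i in range(k + 1))

# Illustrative parameters (assumed for this example only).
n, p, q = 8, 0.3, 0.2

for k in range(1, n + 1):
    print(k, expected_proportion(k, n, p, q), q / (p + q))
```

Each printed row should show the same value in the last two columns, up to rounding, whatever the value of $k$.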

  • 0
    I'm sorry about the confusion with the question. I don't understand how you got the independent probabilities for the first and last questions. And doesn't the fact that the $k$ professionals have to be chosen also count? I truly don't understand. 2012-11-05
  • 0
    @F'OlaYinka: You can derive all this by grinding through the formulas, but I think it would be more valuable to develop some intuition for it. It doesn't matter in which slots the $k$ professionals are; in any case they leave $n-k$ slots to be filled. The professionals become entirely irrelevant once you know that there are $k$ of them; you can then just consider the case of stopping $n-k$ beginners and others, and their probabilities must have the same ratio as they do overall. 2012-11-05
  • 0
    @F'OlaYinka: If you do want to do it with formulas, it will be convenient to make use of the fact that the multinomial coefficients factorize as $$ \binom n{i,j,n-i-j}=\binom n{i+j}\binom{i+j}i\;, $$ which allows you to perform the sum over the professionals; you'll see that the probabilities of the other two sorts of drivers remain in place, and you get a normalization factor $(1-q)^{-(n-k)}$ out of the sum that's exactly what's needed to normalize the remaining probabilities to $1$. 2012-11-05
  • 0
    You just gave me one more point in my exam. Thank you! 2012-11-05