
We know that conditional probability $P(A | B)$ is undefined when $P(B) = 0$. But this doesn't seem to be true to me always.

Consider the experiment of choosing a real number $r$ such that $0 \leq r \lt 1$, where any real number in $[0, 1)$ is equally likely to be chosen.

Thus, the sample space $S = [0, 1)$. Let $A = \{0.1\}$ and $B = \{0.1, 0.2\}$. Therefore, $P(A) = P(B) = 0$.

Now, speaking from intuition, if we are given that $r$ is either $0.1$ or $0.2$, one might conclude that the probability that $r = 0.1$ is $0.5$. Therefore, it seems like $P(A | B) = 0.5$. But this contradicts the fact that since $P(B) = 0$, $P(A|B)$ is undefined.

Where am I making a mistake?

  • In your example the probability that $X$ takes on some value in $[0,1)$ is $1$. Intuitively that means it could be $0.1$. But the probability measure you defined gives $X$ a uniform density and assigns the probability $b-a$ to an interval $[a,b]$ with $0 \le a \le b \lt 1$. – 2012-09-17

2 Answers


As you know, when $\mathrm P(B)\ne0$, $\mathrm P(A\mid B)=\mathrm P(A\cap B)/\mathrm P(B)$. The trouble in your context is that $\mathrm P(B)=0$, and even, as you say, that $\mathrm P(A)=0$. A way to define nevertheless some quantity $\mathrm P^*(A\mid B)$ akin to $\mathrm P(A\mid B)$ is to replace $A$ and $B$ by some sets $A_t$ and $B_t$ whose probabilities are positive for every positive $t$ and such that, in a sense, $A_t\to A$ and $B_t\to B$ when $t\to0$. Then one could compute $\mathrm P(A_t\mid B_t)$ for every positive $t$ in the usual way and see if this quantity has a limit when $t\to0$. If so, the limit could be chosen as $\mathrm P^*(A\mid B)$.

In the case $A=\{a\}$ and $B=\{a,b\}$ with $a$ and $b$ in $(0,1)$, one can consider $A_t=A+[-t,t]$ and $B_t=B+[-t,t]$, that is, $A_t=[a-t,a+t]$ and $B_t=[a-t,a+t]\cup[b-t,b+t]$.

Assume that $t$ is small enough. Then, $[a-t,a+t]\subset[0,1]$ hence $\mathrm P(A_t)=2t$, and $[a-t,a+t]\cup[b-t,b+t]\subset[0,1]$ with $[a-t,a+t]\cap[b-t,b+t]=\varnothing$, hence $\mathrm P(B_t)=4t$. Thus $\mathrm P(A_t\mid B_t)=\frac12$ for every $t$ small enough, which suggests indeed that $\mathrm P^*(A\mid B)=\frac12$ is a reasonable choice.
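As a quick numerical sanity check (not part of the argument itself), one can estimate $\mathrm P(A_t\mid B_t)$ by Monte Carlo for the uniform distribution on $[0,1)$, with $a=0.1$, $b=0.2$, and shrinking $t$:

```python
import random

# Monte Carlo estimate of P(A_t | B_t) under the uniform distribution
# on [0, 1), where A_t = [a-t, a+t] and B_t = A_t ∪ [b-t, b+t].
def conditional_prob(a, b, t, n=500_000, seed=0):
    rng = random.Random(seed)
    in_B = in_A = 0
    for _ in range(n):
        r = rng.random()                        # uniform on [0, 1)
        if abs(r - a) <= t or abs(r - b) <= t:  # r is in B_t
            in_B += 1
            if abs(r - a) <= t:                 # r is in A_t
                in_A += 1
    return in_A / in_B

for t in (0.05, 0.01, 0.001):
    print(t, conditional_prob(0.1, 0.2, t))  # each ratio is close to 0.5
```

The estimated ratio stays near $\frac12$ as $t \to 0$, matching $\mathrm P(A_t)/\mathrm P(B_t) = 2t/4t$.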

Note that this procedure is relatively robust since $[-t,t]$ could be replaced by any neighbourhood of $0$ shrinking to $\{0\}$ when $t\to0$, for example $[-2t,5t+t^4]$, without changing the final result.

More generally, assume that $A=\{a_1\}$ and $B=\{a_1,a_2,\ldots,a_n\}\subset(0,1)$, that $\mathrm P$ has density $f$ and that $f$ is continuous at $a_k$ for every $1\leqslant k\leqslant n$. One sees that the reasoning above suggests to choose $\mathrm P^*(A\mid B)=\frac{f(a_1)}{f(a_1)+\cdots+f(a_n)}$. In effect, this is equivalent to replacing the nonexistent conditional probability $\mathrm P(\ \mid B)$ by the discrete probability measure $\mathrm P^*(\ \mid B)=\mu_B$ defined by $\mu_B(\{a_k\})=\frac{f(a_k)}{f(a_1)+\cdots+f(a_n)}$ for every $k$ and $\mu_B(C)=0$ if $B\cap C=\varnothing$.
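To see the weighted formula at work with a non-uniform density, here is a hypothetical check (the density and points are chosen for illustration): take $f(x)=2x$ on $[0,1]$, so that $X=\sqrt{U}$ with $U$ uniform has density $f$ by inverse-CDF sampling, and $B=\{0.1, 0.2\}$. The formula predicts $\mathrm P^*(A\mid B)=\frac{f(0.1)}{f(0.1)+f(0.2)}=\frac{0.2}{0.6}=\frac13$:

```python
import math
import random

# Monte Carlo check of P*({a} | {a, b}) = f(a) / (f(a) + f(b))
# for the density f(x) = 2x on [0, 1], using shrunken intervals of radius t.
def conditional_weight(a, b, t, n=1_000_000, seed=1):
    rng = random.Random(seed)
    in_B = in_A = 0
    for _ in range(n):
        x = math.sqrt(rng.random())             # X = sqrt(U) has density 2x
        if abs(x - a) <= t or abs(x - b) <= t:  # x is in B_t
            in_B += 1
            if abs(x - a) <= t:                 # x is in A_t
                in_A += 1
    return in_A / in_B

print(conditional_weight(0.1, 0.2, 0.005))  # close to 1/3, not 1/2
```

Because $f(0.2)=2f(0.1)$, the point $0.2$ receives twice the conditional mass of $0.1$, which is exactly what the estimate shows.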


First of all, the conditional probability $ P_B(A) = P(A | B) $ is supposed to be a probability. If $B$ is a finite set with $P(B) = 0$, it is very easy to simply IMPOSE that the conditional probability $P_B$ is uniformly distributed, since your original $P$ has a uniform distribution in some sense (what sense, exactly?).

But what if $B = \mathbb{Q} \cap S$? What would $P_B$ be? Notice that $B = \{b_1, b_2, \dotsc\}$ is countable. So, if it is uniformly distributed, $ P_B(b_j) = P_B(b_1) $ for every $j$, and $ P_B(B) = \sum_j P_B(b_j). $ This implies that either $P(S | B) = 0$ or $P(S | B) = \infty$, which is NOT a probability at all.
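The obstruction is pure arithmetic: a constant mass $p$ on countably many points sums to $0$ if $p = 0$ and diverges if $p > 0$. A throwaway sketch (the values of $p$ and $N$ are chosen only for illustration):

```python
# Total mass assigned to the first N points of a countable set
# when every point gets the same mass p: it is just N * p.
def total_mass(p, N):
    return N * p

print(total_mass(0.0, 10**9))   # 0.0 -- with p = 0 the total never reaches 1
print(total_mass(1e-6, 10**9))  # about 1000 -- with p > 0 it blows past 1
```

No constant $p$ can make the total mass equal $1$, so no uniform probability exists on a countably infinite set.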

If all you want is that $P(A \cap B) = P(A | B) P(B)$, then you can define $P(A | B)$ the way you think suits best --- or simply leave it undefined --- whenever $P(B) = 0$.

However, imagine that $S = [0,1] \times [0,1]$, and consider the Lebesgue measure $\lambda \times \lambda$, where $\lambda$ is the Lebesgue measure over $[0,1]$. Then it could make sense to define conditional probabilities for the "slice" $B_x = \{x\} \times [0,1]$, where $P(A | B_x)$ would be the length of $A \cap B_x$. The same way you just imposed a "uniform distribution" on the set $\{0.1, 0.2\}$, I am imposing a "uniform distribution" on $B_x$. One nice thing here is that $ P(A) = \int P(A | B_x) dx. $
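This disintegration identity can be checked numerically on a hypothetical event (my choice, not from the question): take $A = \{(x,y) : y < x\}$, so the slice conditional is $P(A \mid B_x) = x$ and $\int_0^1 x\,dx = \frac12 = P(A)$:

```python
import random

# Check P(A) = ∫ P(A | B_x) dx for A = {(x, y) : y < x} on [0,1]^2.
def estimate(n=1_000_000, seed=2):
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if y < x:           # (x, y) lies in A
            hits += 1
    p_direct = hits / n     # Monte Carlo estimate of P(A)
    # Midpoint rule for the integral of the slice conditionals P(A | B_x) = x.
    m = 10_000
    p_sliced = sum((k + 0.5) / m for k in range(m)) / m
    return p_direct, p_sliced

print(estimate())  # both values are close to 0.5
```

Both routes, sampling $A$ directly and integrating the slice conditionals, give the same answer, which is what makes the slicing definition attractive.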