
Suppose we have the following Bayesian net (or a probabilistic graphical model):

$L \rightarrow X \leftarrow F$, i.e. $P(L,X,F) = P(X|L,F)P(L)P(F)$ and all of these probabilities are known.

Let $\delta(x)$ be a decision rule: given an $x$, it outputs the value of $L$ with the highest posterior, $\delta(x) = \rm{argmax}_\ell P(L=\ell|x)$.
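For concreteness, here is a minimal numerical sketch of this MAP rule in Python. The toy distributions (binary $L$ and $F$, ternary $X$) are made-up assumptions for illustration, not part of the question:

```python
import numpy as np

# Hypothetical toy model (all numbers are assumptions): L, F in {0,1}; X in {0,1,2}.
P_L = np.array([0.3, 0.7])                 # P(L)
P_F = np.array([0.5, 0.5])                 # P(F)
P_X_given_LF = np.array([                  # P(X=x | L=l, F=f), indexed [l, f, x]
    [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]],
    [[0.1, 0.3, 0.6], [0.2, 0.2, 0.6]],
])

def delta(x):
    """MAP decision rule: argmax_l P(L=l | X=x).

    P(L=l | x) is proportional to P(l) * sum_f P(x | l, f) P(f),
    i.e. F is marginalized out and the prior P(L) enters explicitly.
    """
    post = [P_L[l] * np.dot(P_F, P_X_given_LF[l, :, x]) for l in range(2)]
    return int(np.argmax(post))
```

With these numbers, $\delta$ maps $x=0$ to $L=0$ and $x=1,2$ to $L=1$; the point is only that $\delta$ is a fixed deterministic function of $x$ once the model is given.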

I want to compute the value of $F$ which maximizes the probability of giving a correct decision:

$\rm{argmax}_f \;\;P(\delta(X)=L | F=f) =?$

I'm confused about how to proceed from here; in particular, the fact that the decision rule uses the prior probability of $L$ confuses me. So,

$ P(\delta(X)=L | F=f) = \sum_{i=1}^N P(\delta(X)=i | F=f, L=i) P(L=i|F=f)$

In the first factor on the right-hand side, the event $L=i$ is given, so how can $\delta(X)$ function correctly in that case?

I'm either formulating the objective wrongly, or there are some problems with my notation. Any leads, hints will be highly appreciated.

  • @EmreA It is just what I saw from your formulation in the second line. You wrote it without mentioning that $F$ and $L$ were independent. Let me have a look at your answer. (2012-08-31)

1 Answer


So, here is my answer. Please check for correctness.

I added another variable, $D$, representing the decision. Strictly speaking, $D$ is not a random variable but a deterministic function of $X$. As @SeyhmusGungoren suggests in his comment above, once $P(L)$ is given, the decision function (and hence $D$) is fixed.

The model looks like this now: $L \rightarrow X \leftarrow F, \;\;X\rightarrow D$. The joint is:

$P(L,X,F,D) = P(D|X)P(X|F,L)P(F)P(L)$.

The probability of being correct can be written as:

$P(D=L|F=f) = \sum_i P(D=i, L=i | F=f)$.

$\begin{eqnarray} P(D=i, L=i | F=f) & \propto & \sum_{x} P(D=i,L=i,F=f,X=x) \\ & = & \sum_x P(D=i|X=x) P(X=x|L=i,F=f)P(L=i)P(F=f) \\ & = & P(L=i)P(F=f) \sum_x P(D=i|X=x)P(X=x|L=i, F=f). \end{eqnarray}$
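This derivation can be checked numerically. The following sketch uses a made-up toy model (binary $L$ and $F$, ternary $X$; all numbers are illustrative assumptions). Since $L$ and $F$ are independent here and $P(F=f)$ cancels on normalization, the probability of a correct decision reduces to $P(D=L\mid F=f)=\sum_i P(L=i)\sum_x \mathbf{1}[\delta(x)=i]\,P(X=x\mid L=i,F=f)$:

```python
import numpy as np

# Made-up toy distributions (illustrative only): L, F in {0,1}; X in {0,1,2}.
P_L = np.array([0.3, 0.7])
P_F = np.array([0.5, 0.5])
P_X_given_LF = np.array([                  # indexed [l, f, x]
    [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]],
    [[0.1, 0.3, 0.6], [0.2, 0.2, 0.6]],
])

def delta(x):
    # MAP rule: argmax_l P(L=l | X=x), with F marginalized out.
    post = [P_L[l] * np.dot(P_F, P_X_given_LF[l, :, x]) for l in range(2)]
    return int(np.argmax(post))

def prob_correct(f):
    # P(D=L | F=f) = sum_i P(L=i) * sum_{x : delta(x)=i} P(X=x | L=i, F=f).
    # Here P(D=i | X=x) is the deterministic indicator 1[delta(x) = i].
    return sum(
        P_L[i] * sum(P_X_given_LF[i, f, x] for x in range(3) if delta(x) == i)
        for i in range(2)
    )

best_f = max(range(len(P_F)), key=prob_correct)
```

Note that $\delta$ is computed once from the full model (marginalizing $F$), and only then is its accuracy evaluated conditionally on each $f$; this resolves the apparent circularity in the question.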

Does this sound right?

  • I think it sounds OK. The first line is correct, but in the last line the probability $P(D=i|X=x)$ is a bit tricky: given $x$, $D$ is already known. What we have as a probability instead is $P(D=i|L=i)$. Meanwhile, in my experience not too many people here (on Math SE) are involved in detection theory. (2012-08-31)