
Let $\mathcal{P}_i$ be the set of probability density functions to which $f_i$ belongs $(i=0,1)$. Furthermore, assume that $L(y)=\frac{f_1(y)}{f_0(y)}$ is an increasing function for any chosen $f_0$ and $f_1$. Let the support of the densities be a compact subset of the reals, denoted by $\mathbb{K}$.

For a given threshold $\tau\in\mathbb{K}$, one can calculate the probability of false alarm and the probability of missed detection as follows:

$P_F(\tau)=\int_\tau^{\infty}f_0(y)dy$

$P_M(\tau)=\int_{-\infty}^\tau f_1(y)dy$

ROC:=$(P_F(\tau),P_M(\tau))$ forms a curve in $[0,1]^2$ which is convex.

(ROC, for those who don't know, stands for Receiver Operating Characteristic.)

Here is an example:

$f_0(y)=\frac{1}{\sqrt{2\pi\sigma_0^2}}e^{\frac{-\left(y-\mu_0\right)^2}{2\sigma_0^2}}$

$f_1(y)=\frac{1}{\sqrt{2\pi\sigma_1^2}}e^{\frac{-\left(y-\mu_1\right)^2}{2\sigma_1^2}}$

with $\sigma_0=\sigma_1=1$ and $\mu_0=0$ and $\mu_1=1$. Then we have the following figure for $(P_F(\tau),P_M(\tau))$ when $\tau$ is changed from $-\infty$ to $\infty$, ($\mathbb{K}=\mathbb{R}$).

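[Figure: the ROC curve $(P_F(\tau),P_M(\tau))$ in blue, with the red "butterfly" lines, the black line through $\theta$, and the green sector.]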

As is well known, and as can be seen from the figure, the blue curve is convex.
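The convexity claim for this Gaussian example can be checked numerically. The following is a minimal sketch, not part of the original post: the function names, the threshold grid, and the tolerance are illustrative assumptions. It evaluates $P_F$ and $P_M$ in closed form through the complementary error function and tests whether the chord slopes of $P_M$ as a function of $P_F$ are non-decreasing.

```python
import math


def q_function(x):
    """Upper tail of the standard normal: P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def p_false_alarm(tau, mu0=0.0):
    """P_F(tau): integral of f_0 over [tau, inf), with sigma_0 = 1."""
    return q_function(tau - mu0)


def p_miss(tau, mu1=1.0):
    """P_M(tau): integral of f_1 over (-inf, tau], with sigma_1 = 1."""
    return q_function(-(tau - mu1))


def roc_points(tau_min=-4.0, tau_max=5.0, n=901):
    """Sample (P_F, P_M) on an evenly spaced threshold grid (assumed range)."""
    step = (tau_max - tau_min) / (n - 1)
    return [(p_false_alarm(tau_min + i * step), p_miss(tau_min + i * step))
            for i in range(n)]


def is_convex(points, tol=1e-9):
    """True if chord slopes are non-decreasing along increasing P_F."""
    pts = sorted(points)
    slopes = [(y1 - y0) / (x1 - x0)
              for (x0, y0), (x1, y1) in zip(pts, pts[1:])
              if x1 > x0]
    return all(s1 >= s0 - tol for s0, s1 in zip(slopes, slopes[1:]))


print(is_convex(roc_points()))
```

A grid check like this only supports the claim on the sampled thresholds; the analytic reason is that the slope of the curve is $-L(\tau)$, which is monotone because $L$ is increasing.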

For any chosen pair of densities $(f_0,f_1)\in \mathcal{P}_0\times \mathcal{P}_1$, the ROC curve (the blue one) $(P_F(\tau),P_M(\tau))$, $\tau\in (-\infty,\infty)$, will lie in the butterfly drawn with red lines in the figure, assuming that the point $\theta=P_F=P_M$ is common to all densities in $\mathcal{P}_0\times\mathcal{P}_1$ (in the figure, $\theta \approx 0.3$).
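For the Gaussian example, the point $\theta$ where $P_F=P_M$ can be located by bisection on the threshold. This is a rough sketch under the example's parameters ($\mu_0=0$, $\mu_1=1$, unit variances); the helper names and the search interval are assumptions made for illustration.

```python
import math


def q_function(x):
    """Upper tail of the standard normal: P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gap(tau, mu0=0.0, mu1=1.0):
    """P_F(tau) - P_M(tau) for unit-variance Gaussians; decreasing in tau."""
    p_f = q_function(tau - mu0)
    p_m = 1.0 - q_function(tau - mu1)
    return p_f - p_m


def equal_error_threshold(lo=-10.0, hi=10.0, iters=200):
    """Bisection for the threshold where gap(tau) changes sign."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


tau_star = equal_error_threshold()
theta = q_function(tau_star)  # common value of P_F and P_M at tau_star
print(tau_star, theta)
```

By symmetry the threshold is $\tau=1/2$, so $\theta = 1-\Phi(1/2)\approx 0.3085$, consistent with the $\theta\approx 0.3$ read off the figure.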

Question:

Assume that all pairs of densities $(f_0,f_1)\in\mathcal{P}_0\times\mathcal{P}_1$ are known to have a particular $\theta$ in their ROC. In other words, let $\mathcal{P}_0\times\mathcal{P}_1$ contain only the pairs of densities that have $\theta$ in their ROC, and furthermore let one choose any pair of densities from $\mathcal{P}_0\times\mathcal{P}_1$ with equal probability.

What is the probability that a single point of the ROC that we obtain by this selection will lie in the green sector?

Once again, the green sector is the intersection of the butterfly with the area under the line that passes through $\theta$, and $f_1/f_0$ is increasing as defined before. One can assume any $\mathbb{K}$, for example $\mathbb{K}=[0,1]$ or $\mathbb{K}=\mathbb{R}$.
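To make the compact-support case concrete, here is a small sketch on $\mathbb{K}=[0,1]$. The densities $f_0(y)=2(1-y)$ and $f_1(y)=2y$ are hypothetical, chosen only because $f_1/f_0=y/(1-y)$ is increasing; they do not come from the question. The integrals defining $P_F$ and $P_M$ are approximated with a midpoint rule, and for this pair the closed forms are $P_F(\tau)=(1-\tau)^2$ and $P_M(\tau)=\tau^2$.

```python
def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Hypothetical densities on K = [0, 1]; f1/f0 = y / (1 - y) is increasing.
def f0(y):
    return 2.0 * (1.0 - y)


def f1(y):
    return 2.0 * y


def p_false_alarm(tau):
    """P_F(tau): mass of f0 above the threshold, restricted to K."""
    return integrate(f0, tau, 1.0)


def p_miss(tau):
    """P_M(tau): mass of f1 below the threshold, restricted to K."""
    return integrate(f1, 0.0, tau)


for tau in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(tau, p_false_alarm(tau), p_miss(tau))
```

For this hypothetical pair the two error probabilities coincide at $\tau=1/2$, giving $\theta=1/4$.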

  • 0
    @EdGorcenski ok ok, I'll add the tag as you wish. I don't want to underestimate any group, as I am from signal processing and have already discussed the problem with some friends. (2012-11-05)

1 Answer

If the black line is tangent, and the blue curve is convex, then there is only a single point of the blue curve contained in the green area. This is because the green area is defined by the tangent line, and convexity guarantees that the blue curve will not intersect the black line, and hence the green region, at any other point.
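The tangency argument can be illustrated on the question's Gaussian example ($\mu_0=0$, $\mu_1=1$, unit variances). The sketch below assumes the black line is the tangent at the point where $P_F=P_M$; that choice, the names, and the threshold grid are assumptions for illustration. There the threshold is $\tau=1/2$ and the slope of the curve is $-f_1/f_0=-e^{\tau-1/2}=-1$.

```python
import math


def q_function(x):
    """Upper tail of the standard normal: P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def roc_point(tau):
    """(P_F, P_M) for mu_0 = 0, mu_1 = 1, sigma_0 = sigma_1 = 1."""
    return q_function(tau), 1.0 - q_function(tau - 1.0)


TAU_STAR = 0.5                 # threshold where P_F = P_M in this example
THETA = q_function(TAU_STAR)   # common value of P_F and P_M
SLOPE = -1.0                   # -f_1/f_0 at TAU_STAR (assumed tangent line)


def height_above_tangent(tau):
    """Vertical distance from the tangent line up to the ROC point."""
    p_f, p_m = roc_point(tau)
    return p_m - (THETA + SLOPE * (p_f - THETA))


taus = [-4.0 + 0.01 * i for i in range(901)]
heights = [height_above_tangent(t) for t in taus]
print(min(heights))
```

On this grid the height stays non-negative up to rounding and vanishes only at the tangency point, which is the single-contact behaviour described above.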

If you're looking for the probability that a single realization will land in this region, simply compute the area of the region. The ROC curve defines the "dividing line" of classification; however, the unit square is still your global probability space.

If you want to know the probability that a different ROC curve intersects this green region, then you can employ a few different conditions. First, assume that any other ROC curve is convex and continuous. Then, the curve defined by $ R = \left\{ \left(P_F(\tau),P_M(\tau)\right) \right\}$ is continuous and monotonic and maps from $[0,1]$ to $[0,1]$.

Therefore, this curve has a fixed point in $[0,1]$, namely the point where

$P_F(\tau) = P_M(\tau).$

These fixed points lie on the line $y=x$. Obviously, any monotonic curve whose fixed point is $x' > \theta$ will not intersect the green region.

Conversely, any curve with a fixed point $x' \le \theta$ will pass through the green triangle.

Therefore, your probability is $P(x' \le \theta)$. If the fixed point $x'$ is taken to be uniformly distributed on $[0,1]$, this probability is exactly $\theta$, which agrees with my previous assessment.

  • 0
    @SeyhmusGüngören This analysis is with respect to some pre-determined value of $\theta$; if you wish to let $\theta$ vary randomly as well, then you can choose to compute the joint probability with any arbitrary distribution on $\theta$. However, the probability of a randomly chosen ROC falling within the butterfly region **with respect to some pre-determined ROC** can be taken to be uniform. The problem then reduces to computing the joint distribution of fixed points. The specific value is unanswerable, as you must provide further information to determine that value. (2012-11-05)