The probability of a sequence of $n$ independent Bernoulli trials can be easily expressed as $$p(x_1,\ldots,x_n|p_1,\ldots,p_n)=\prod_{i=1}^np_i^{x_i}(1-p_i)^{1-x_i}$$ but what if the trials are not independent? How would one express the probability so as to capture the dependence?
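As a baseline, the independent-trials product above can be computed directly. This is a minimal sketch (the function name `iid_bernoulli_prob` is illustrative, not from the question):

```python
import numpy as np

def iid_bernoulli_prob(x, p):
    """Probability of outcome vector x under independent Bernoulli(p_i) trials:
    prod_i p_i^{x_i} (1 - p_i)^{1 - x_i}."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.prod(p**x * (1 - p)**(1 - x)))

# Example: three trials with different success probabilities.
print(iid_bernoulli_prob([1, 0, 1], [0.5, 0.2, 0.9]))  # 0.5 * 0.8 * 0.9 = 0.36
```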
Dependent Bernoulli trials
Tags: probability, combinatorics, statistics, probability-distributions, machine-learning
-
You need to completely specify the dependence structure in order to be able to write the probability. What form does it take? – 2012-12-27
-
Hm, good question :). I suppose all of the $x_i$ are related. So let's say we have a random vector $X=(X_1,\ldots,X_n)$ of correlated random variables, and for each pair of them we specify a correlation coefficient $c_{i,j}$. However, I'm not sure how to determine that coefficient. – 2012-12-27
-
Related: http://math.stackexchange.com/questions/52137/multivariate-normal-and-multivariate-bernoulli/52321#52321 – 2012-12-27
1 Answer
The most flexible structure is the one that assigns to each of the $2^n$ possible binary vectors $\left( x_1, \ldots, x_n \right)$ a probability $$P \left[ x_1 = i_1, \ldots, x_n = i_n \right] = p_{i_1, \ldots, i_n}$$ such that $$ \sum_{i_1 = 0}^1 \cdots \sum_{i_n = 0}^1 p_{i_1, \ldots, i_n} = 1$$ Thus, you have to specify $2^n - 1$ parameters (one entry is fixed by the sum-to-one constraint). This is much more complicated than your independent case, where you specify $n$ parameters $p_1, \ldots, p_n$.
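The full joint table can be written down explicitly for small $n$. A minimal sketch (the random weights here are arbitrary placeholders, not a model fitted to anything):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 3

# One probability per binary vector: 2^n entries in total,
# of which 2^n - 1 are free (the last is fixed by the sum-to-one constraint).
outcomes = list(itertools.product([0, 1], repeat=n))
weights = rng.random(2**n)
joint = dict(zip(outcomes, weights / weights.sum()))

print(joint[(1, 0, 1)])     # P[x1 = 1, x2 = 0, x3 = 1]
print(sum(joint.values()))  # 1.0 up to rounding
```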
There are many ways to simplify the problem.
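One standard simplification (my example, not spelled out in the answer) is Markov dependence: each trial depends only on the immediately preceding outcome, which cuts the parameter count from $2^n - 1$ to three in the homogeneous case:

```python
def markov_bernoulli_prob(x, p1, q0, q1):
    """Probability of binary sequence x under a two-state Markov chain:
    P(X_1 = 1) = p1,
    P(X_i = 1 | X_{i-1} = 0) = q0,
    P(X_i = 1 | X_{i-1} = 1) = q1."""
    prob = p1 if x[0] == 1 else 1 - p1
    for prev, cur in zip(x, x[1:]):
        q = q1 if prev == 1 else q0
        prob *= q if cur == 1 else 1 - q
    return prob

# With q0 == q1 == p1 the chain reduces to i.i.d. Bernoulli(p1) trials.
print(markov_bernoulli_prob([1, 1, 0], p1=0.5, q0=0.3, q1=0.7))  # 0.5 * 0.7 * 0.3
```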
-
Thank you. That is indeed quite general. I am basically using a mixture of such Bernoullis and then EM to estimate the parameters. This approach would result in a substantial increase in the time complexity... – 2012-12-27
-
A random vector of dependent Bernoulli RVs, each with its own parameter $P(X_i=1|X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n)=p_i$ – 2012-12-27