
What are the ways of constructing a distribution over the values that a discrete random variable can take on, given its mean?

For example, say a variable $x$ takes on an integer value from $1$ to $5$, and we are given that the mean/expected value of $x$ over a population is $3.3$. What are the ways of constructing a distribution over $x$ that yields the given mean, and what assumptions do we need to make for each method? Thanks.

Edit (more context):

Say a population of people is asked to each choose an integer from $1$ to $5$, and one of them (call them $i$) chooses $4$. In addition, $i$ estimates that the population mean is $3.3$. Now, given the information provided by $i$, what can we say about $i$'s estimated distribution of the population's choices? In other words, under some reasonable assumption or principle (e.g. maximum entropy), can we construct $i$'s estimated distribution?

  • Simul-posted to MathOverflow, where it is well on its way to closure: http://mathoverflow.net/questions/78384/construct-a-distribution-for-discrete-random-variable-from-its-mean – 2011-10-17

3 Answers


If the discrete random variable takes on $n$ values $a_1,a_2,\ldots,a_n$ with probabilities $p_1,p_2,\ldots,p_n$, and you want it to have a given mean $a$, then you have $\displaystyle \sum_{k=1}^{n} p_k a_k = a$ along with the fact that the probabilities must add up to one, i.e. $\displaystyle \sum_{k=1}^{n} p_k = 1.$ Hence, in general you have $n-2$ degrees of freedom. Which distribution you end up with depends on what other constraints you want to enforce, or on your objective in constructing the distribution. For instance, you might minimize the variance subject to the constraint that you want a certain mean, or maximize the entropy subject to that same constraint.
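As a concrete illustration of those leftover degrees of freedom, here is one particularly simple feasible solution (a sketch of my own, not part of the answer; the function name is hypothetical): put all the mass on the two integers that bracket the desired mean. For an integer-valued support this is also the variance-minimizing choice, since moving mass further from the mean can only increase the spread.

```python
import math

def two_point_distribution(values, mean):
    """One simple distribution on `values` with the requested mean:
    all mass on the two integers bracketing the mean.
    (Hypothetical helper; assumes floor(mean) and floor(mean)+1
    both lie in `values`.)"""
    lo = math.floor(mean)
    hi = lo + 1
    p_hi = mean - lo                  # weight on the upper point
    probs = {v: 0.0 for v in values}
    probs[lo] = 1.0 - p_hi
    probs[hi] = p_hi
    return probs

# For values 1..5 and mean 3.3 this puts 0.7 on 3 and 0.3 on 4.
```

Any other feasible distribution can be reached from this one by moving mass around while preserving the two linear constraints, which is exactly the $n-2$ degrees of freedom mentioned above.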

  • thanks Sivaram, please see my updated question; I would like to hear your suggestion. – 2011-10-18

There are infinitely many different distributions which have the same range and mean.

But suppose you wanted to restrict the probabilities to rational numbers with a given denominator (or some factor of it; it should also be possible to write the mean as a rational number with this as the denominator). Then you certainly could count the possible distributions and create them, and it becomes similar to counting and creating restricted partitions.

Take your example, with the possible integer values 1 to 5 and mean 3.3, and suppose the denominator you were interested in was 10. There are 49 different possible distributions (which is the same as the number of partitions of 33 into 10 positive parts, each at most 5). Here are two of them:

Value  Prob1  Prob2
  1     0.0    0.2
  2     0.1    0.1
  3     0.5    0.2
  4     0.4    0.2
  5     0.0    0.3
       -----  -----
         1      1

Change the common denominator and you will get a different answer. For example with 4,200 as the denominator, there would be 1,825,356,471 possible distributions (including the earlier 49).
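The counts in this answer can be checked with a small dynamic program (a sketch of my own; the function name and interface are assumptions). It counts the ways of assigning probability mass, in units of one over the denominator, to each value so that the units sum to the denominator and the weighted units sum to mean × denominator:

```python
from functools import lru_cache

def count_distributions(values, denom, mean_units):
    """Count distributions over `values` whose probabilities are all
    multiples of 1/denom and whose mean equals mean_units/denom.
    Assumes all values are positive (the early `break` relies on it)."""
    vals = tuple(values)

    @lru_cache(maxsize=None)
    def count(i, slots, weight):
        # slots:  remaining probability units (out of denom)
        # weight: remaining weighted sum, in the same units
        if i == len(vals):
            return 1 if slots == 0 and weight == 0 else 0
        total = 0
        for c in range(slots + 1):        # units given to vals[i]
            rest = weight - c * vals[i]
            if rest < 0:
                break
            total += count(i + 1, slots - c, rest)
        return total

    return count(0, denom, mean_units)

# Values 1..5, denominator 10, mean 3.3 -> 33 units:
# count_distributions((1, 2, 3, 4, 5), 10, 33) returns 49.
```

The denominator-4,200 case would need a more memory-frugal DP than this memoized recursion, but the count of 49 for denominator 10 is instant.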

  • thanks Henry. I realized that more context should be provided for my original question; please see the update above. – 2011-10-18

To expand on Sivaram's answer, the maximum entropy approach, which is often appealed to when one wants to identify a pdf under minimal information, yields the following Lagrangian:

$$\mathcal{L}[p] \equiv \sum_n \left( p_n \log p_n + \lambda_1\, n\, p_n + \lambda_2\, p_n \right)$$

The first term is the negative entropy (so minimizing $\mathcal{L}$ maximizes the entropy), the $\lambda_1$ term enforces the expectation constraint, and the $\lambda_2$ term enforces the normalization constraint.

Taking the partial derivative of $\mathcal{L}$ with respect to $p_n$ and setting it to zero gives $\log p_n + 1 + \lambda_1 n + \lambda_2 = 0$, i.e. $p_n \propto e^{-\lambda_1 n}$, with $\lambda_1$ and $\lambda_2$ fixed by the two constraints; enforcing them recovers the answer given here.
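Carrying this out numerically: the stationarity condition gives an exponential-family (Gibbs) solution $p_n \propto e^{t n}$ for a tilt parameter $t$ (here $t = -\lambda_1$), and since the resulting mean is increasing in $t$, the mean constraint can be solved by bisection. A minimal sketch, with a function name of my own choosing:

```python
import math

def maxent_distribution(values, target_mean, tol=1e-12):
    """Maximum-entropy distribution over `values` with a given mean.
    The solution has the exponential-family form p_k ∝ exp(t * v_k);
    the tilt t is found by bisection, as the mean increases with t."""
    def mean_for(t):
        w = [math.exp(t * v) for v in values]
        z = sum(w)
        return sum(v * wi for v, wi in zip(values, w)) / z

    lo, hi = -50.0, 50.0                  # bracket for the tilt t
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2.0
    w = [math.exp(t * v) for v in values]
    z = sum(w)
    return [wi / z for wi in w]
```

For values $1$ to $5$ and target mean $3.3$, this returns a smooth, strictly positive distribution whose mean matches the constraint to within the bisection tolerance.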

  • thanks Emre. The maximum entropy principle is certainly something I am considering; could you comment on this more, given the additional information I have added to my question? – 2011-10-18