3

I have a few doubts about the interpretation of this problem, which I read in a book of interview questions.

Here is the text:

A mythical city contains N=100,000 married couples but no children. Each family wishes to continue the male line, but they do not wish to overpopulate. So, each family has one baby per annum until the arrival of the first boy. Assume that each child is equally likely to be born male or female (independently of the others). Let $p(n)$ be the percentage of children that are male at the end of year $n$. How is this percentage expected to evolve through time?

This is the problem, and the solution says that the percentage is expected to remain constant at $\frac{1}{2}$.

Thanks in advance; if something is not clear, just ask.

  • 0
    "Assume that all the children are equally likely to be born male and female (and independent)" forces the expected fraction of male children to be 0.5. 2012-03-05
  • 0
    Thanks for the quick comment. How did you calculate your 0.5? 2012-03-05
  • 0
    "equally likely" between two possibilities suggests $\frac{1}{2}$. 2012-03-05
  • 0
    @Henry: The situation is not symmetric since each family has exactly one boy but may have any number of girls. 2012-03-05
  • 0
    @Didier: some children are born in the city, and each child (given that they are born) has an equal probability of being a boy or a girl independent of anything else that happened before their birth. So long as you do not have sex-selective abortion, no family planning of any kind will change this. 2012-03-05
  • 0
    @Henry: Yes I know, thank you. My point is that your previous explanations dismiss the (in the end, irrelevant) asymmetry of the problem a tad too quickly for the inexperienced reader. 2012-03-05

2 Answers 2

4

During a given year, each family either (1) has exactly one child or (2) has no child. Those who previously had a boy decide to have no further child, hence these are all in case (2); some others may be in case (2) as well, for reasons of their own, but this does not matter.

What matters is that each family in case (1) is as likely to have a boy as a girl. By the law of large numbers, if the number $M$ of families who do procreate during this given year is large and if each procreates independently of the others, $\frac12M+r_M$ boys are born and $\frac12M-r_M$ girls are born, where $r_M$ is random and $|r_M|\ll M$. The proportion of boys amongst the children born this year is $\frac{M/2+r_M}M=\frac12+\varepsilon_M$ with $\varepsilon_M=\frac{r_M}M$, hence $|\varepsilon_M|\ll1$. In other words, roughly one half of all the children born this year are boys.

Thus the hypothesis that the global population is large is important, but the details of the strategy (in the present case, stop after one boy) are simply not relevant, since every adapted strategy (in the sense that the decision in a given year depends only on what happened in the previous years) would yield the same result.
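
The large-population claim can be illustrated with a small simulation. This is only a sketch under the assumptions of the problem (independent births, each a boy with probability 1/2); the function name `simulate`, the seed and the number of years are arbitrary choices, not part of the original argument.

```python
import random

def simulate(n_families=100_000, years=10, seed=0):
    """Cumulative fraction of boys after each year under 'stop after the first boy'."""
    rng = random.Random(seed)
    still_trying = n_families  # families that have not had a boy yet
    boys = girls = 0
    fractions = []
    for _ in range(years):
        # Each family still trying has one child, a boy with probability 1/2.
        new_boys = sum(rng.random() < 0.5 for _ in range(still_trying))
        new_girls = still_trying - new_boys
        boys += new_boys
        girls += new_girls
        still_trying = new_girls  # only families that just had a girl continue
        fractions.append(boys / (boys + girls))
    return fractions

fractions = simulate()
for year, p in enumerate(fractions, start=1):
    print(year, round(p, 4))
```

With $N=100{,}000$ families, the printed fractions should all stay close to $\frac12$, with fluctuations of the order of $1/\sqrt{N}$.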


Edit (This is to answer a comment by the OP.)

The preceding paragraphs describe the almost sure behaviour in the limit of large initial populations. Turning to the behaviour in the mean for finite initial populations, note that the distribution of $r_M$ is symmetric, since $r_M$ is the sum of a random number $M$ of i.i.d. centered $\pm1/2$ Bernoulli random variables. Hence $\mathrm E\left(\frac{b_k}{g_k+b_k}\right)=\frac12$ exactly, where $b_k$ and $g_k$ denote the numbers of boys and girls born in generation $k$.

This does not imply that the total numbers $B_k=b_1+\cdots+b_k$ and $G_k=g_1+\cdots+g_k$ of boys and girls born until generation $k$ fulfill the same property.

Consider for example the second generation. Then the distribution of $b_1$ is binomial $(N,\frac12)$, $g_1=N-b_1$, the conditional distribution of $b_2$ conditionally on $b_1$ or $g_1$ is binomial $(g_1,\frac12)$, and $g_2=g_1-b_2$. In particular, $\mathrm E(b_1)=\frac12N$ and $\mathrm E(b_2\mid b_1)=\frac12(N-b_1)$.

Consider the successive ratios $R_k=\frac{B_k}{B_k+G_k}$. Then $R_1=\frac{b_1}N$, hence $\mathrm E(R_1)=\frac12$. On the other hand, $R_2=\frac{b_1+b_2}{N+g_1}=\frac{b_1+b_2}{2N-b_1}$, hence $\mathrm E(R_2\mid b_1)=\frac{b_1+(g_1/2)}{2N-b_1}=\frac12\frac{N+b_1}{2N-b_1}$. Since $b\mapsto\frac{N+b}{2N-b}$ is strictly convex on $[0,N]$ and $b_1$ is not almost surely constant, Jensen's inequality yields $$ \mathrm E(R_2)\gt\frac12\frac{N+\mathrm E(b_1)}{2N-\mathrm E(b_1)}=\frac12\frac{N+(N/2)}{2N-(N/2)}=\frac12, $$ hence $\mathrm E(R_2)\ne\frac12$.
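
The strict inequality can be checked exactly for small $N$ by averaging the conditional mean $\frac12\frac{N+b_1}{2N-b_1}$ over the binomial distribution of $b_1$. A minimal sketch in exact rational arithmetic; the helper name `expected_R2` is hypothetical and the values of $N$ are arbitrary.

```python
from fractions import Fraction
from math import comb

def expected_R2(N):
    """Exact E(R_2), obtained by averaging E(R_2 | b_1) over b_1 ~ Binomial(N, 1/2)."""
    total = Fraction(0)
    for b1 in range(N + 1):
        prob = Fraction(comb(N, b1), 2 ** N)            # P(b_1 = b1)
        cond_mean = Fraction(N + b1, 2 * (2 * N - b1))  # (1/2)(N + b1)/(2N - b1)
        total += prob * cond_mean
    return total

for N in (1, 2, 10, 100):
    print(N, float(expected_R2(N)))
```

For $N=1$ this gives $\frac58$, and every value is strictly above $\frac12$, with the excess shrinking as $N$ grows.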


Second edit

Counting the children family by family instead of generation by generation, one sees readily that $R_k\to R_\infty$ almost surely when $k\to\infty$, where $R_\infty=\frac{N}{N+\sigma_N}$ and $\sigma_N$ is the sum of $N$ i.i.d. geometric random variables $\tau_i$ of parameter $\frac12$ (the number of girls in family $i$), such that $\mathrm P(\tau_i=n)=2^{-(n+1)}$ for every $n\geqslant0$. Further computations then show that $$ \mathrm E(R_\infty)=N\int_0^1\frac{u^{N-1}}{(2-u)^N}\mathrm du=\frac12+\frac14\frac1N+o\left(\frac1N\right). $$ In particular, $\mathrm E(R_k)\ne\frac12$ for every $k$ large enough (and probably for every $k\geqslant2$).
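
The integral formula can be evaluated numerically to see how fast $\mathrm E(R_\infty)$ approaches $\frac12$. This is a rough sketch: the midpoint rule and the step count are arbitrary choices, and `expected_R_inf` is a hypothetical helper name. The last printed column is the scaled excess $N\left(\mathrm E(R_\infty)-\frac12\right)$, to be compared with the coefficient of $\frac1N$ in the expansion.

```python
import math

def expected_R_inf(N, steps=200_000):
    """Midpoint-rule estimate of N * integral over (0,1) of u^(N-1) / (2-u)^N."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        # Same integrand written as N * (u/(2-u))^N / u, so large N cannot overflow.
        total += N * (u / (2.0 - u)) ** N / u
    return total * h

for N in (1, 10, 100, 1000):
    value = expected_R_inf(N)
    print(N, value, N * (value - 0.5))
```

For $N=1$ the integral is $\log 2$, the value quoted in the comments.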

  • 0
    Thanks for the answer, Didier. I do not fully understand your point yet. I am reading it carefully so I can get it. I'll let you know. Thanks. 2012-03-05
  • 0
    Right. Please ask if some points need to be expanded. 2012-03-05
  • 0
    So, in a given generation, $r$ is a (kind of) distance between the numbers of new males and new females, and for a large population it is near 0. Sounds right. But we cannot say that the expectation of the ratio over time is constant at 0.5, as in the next answer; am I right? In the last line, do you mean that if each family decides to stop at a stopping time, the situation does not change? Thanks again. 2012-03-05
  • 0
    My answer describes the almost sure behaviour. Turning to the behaviour in the mean, note that the distribution of $r_M$ is symmetric, since $r_M$ is the sum of a random number $M$ of i.i.d. centered $\pm1/2$ Bernoulli random variables. Hence yes, $E(b/(g+b))=1/2$ exactly, where $b$ and $g$ are the numbers of boys and girls born in a given generation. But beware, $E(g/b)\ne1$ (in fact, $g/b$ is not even defined since $b=0$ with positive probability). 2012-03-05
  • 0
    Yes, that is right, but the question seems to ask for the proportion in the whole population (of children) and not only in one generation $n$ (and given $\mathcal{F}_n$). Suppose that $M_1$ is the number of boys in the first generation (it is binomial). Then $p(1)=\frac{M_1}{N}$ and its mean is 0.5. In the second year the number of boys is $M_2$, where $M_2\mid M_1 \sim \mathrm{Bin}(N-M_1,0.5)$. Then $p(2)=\frac{M_1+M_2}{N+(N-M_1)}$ and its expectation is not exactly 0.5. 2012-03-05
  • 0
    Excellent point, see Edit. 2012-03-05
  • 0
    Thanks for the answer. Just one thing: I don't think $R_k$ converges to 0.5 as you wrote in the last line. As you said, count by family and consider the stopping time $T_i$, equal to the number of girls before the first boy plus one (so $T_i$ is geometric, starting from 1). Then $\lim_k R_k=\frac{N}{\sum T_i}$, where $\sum T_i$ is negative binomial. For example, for N=1 the expected value is $\log 2$. 2012-03-06
  • 0
    Same comment as before. :-) 2012-03-06
  • 0
    I just don't get how you express the expectation as the integral; it should be easy but I don't see it now. Could you write a quick line on that? Thanks. 2012-03-06
  • 0
    Write $R_\infty$ as the integral of the function $u\mapsto Nu^{N-1}u^{\sigma_N}$ on $(0,1)$ and take expectations, using $\mathrm E(u^\tau)=1/(2-u)$. 2012-03-06
  • 0
    Ahahaha. Very smart! Thanks. Bye. 2012-03-06
8

Since the probability of having a child of either particular gender is 1/2 and is independent, the percentage must be expected to remain constant at 1/2 boys and 1/2 girls. (Anytime anyone has a child we expect it to be a boy with probability 1/2; when they stop having children does not matter.)

Maybe it is illuminating to consider the first couple of years: At the end of the first year, we expect 1/2 of the couples to have had boys and 1/2 to have had girls, so 1/2 of the children are boys and 1/2 are girls (or 50,000 boys and 50,000 girls using the N = 100,000 couples of the problem).

At the end of the second year, all of the couples with girls will have another child, 1/2 of them expected to be boys, and 1/2 expected to be girls. Hence, the overall population of children will still be 1/2 boys and 1/2 girls. (So, in particular with N = 100,000, we expect 50,000 of these to have already had a boy, and 50,000 to have another child. Of these 50,000 we expect 25,000 to have boys and 25,000 to have girls. Hence, the population of children at the end of the second year is 75,000 boys and 75,000 girls.)

This will continue until every couple has a boy... The child population from the previous year will be 1/2 boys and 1/2 girls, and the new children born in a particular year will be 1/2 boys and 1/2 girls, leaving the ratio of boys to girls unchanged.
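
The year-by-year bookkeeping above can be reproduced in a few lines. This is a sketch of the expected counts only, so it computes the ratio of expected numbers rather than the expectation of the ratio (the distinction raised in the comments); the function name `expected_counts` and the choice of five years are illustrative.

```python
from fractions import Fraction

def expected_counts(n_families=100_000, years=5):
    """Expected cumulative numbers of boys and girls at the end of each year."""
    trying = Fraction(n_families)  # expected number of couples still without a boy
    boys = girls = Fraction(0)
    rows = []
    for year in range(1, years + 1):
        boys += trying / 2    # half of this year's births are expected to be boys
        girls += trying / 2   # and half girls
        trying = trying / 2   # only couples who just had a girl try again
        rows.append((year, boys, girls))
    return rows

rows = expected_counts()
for year, boys, girls in rows:
    print(year, float(boys), float(girls), float(boys / (boys + girls)))
```

The second row reproduces the 75,000 boys and 75,000 girls computed above, and the expected counts stay equal in every later year.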

  • 0
    Ok. This is the point of the book; I was wondering whether what is asked for is the expectation of the ratio. Consider N=1. At time 1, p(1) could be 0 or 1, each with probability $1/2$. At time 2, p(2) could be 0 if two females (prob 0.25); 0.5 if female and male (prob 0.25); 1 if male and no child in the second year (prob 0.5). You can see that the expected value of the ratio is greater than 0.5. 2012-03-05
  • 0
    Why? Are my calculations wrong? 2012-03-05
  • 0
    Even if N = 1, the expected value is 1/2 male, 1/2 female. (Obviously, the realized value is likely to be different.) But if N = 1 there is a 0.5 chance of 1 boy, no girls, a 0.25 chance of 1 boy, 1 girl, a 0.125 chance of 1 boy, 2 girls, a 0.0625 chance of 1 boy, 3 girls, etc. So while there is 0.5 probability of getting a boy in the first year, there is a probability of getting many more girls than boys before the couple stops having children. 2012-03-05
  • 0
    Let $T$ be geometric ($P(T=k)=2^{-k}$ for $k>0$); then the ratio once the boy has arrived is $p(T)=\frac{1}{T}$ (we can think of this as $\lim_n p(n)$) and $E[p(T)]=\log 2$. In your answer you are calculating the ratio of the expected numbers, which is different from the expectation of the ratio. 2012-03-05
  • 0
    I'm not sure I understand your terminology. The expected number of boys is 1 and the expected number of girls is 1. So the expected ratio is 1:1, or probability 1/2 of each. (The expected 1 boy and 1 girl is the sum of the geometric series $(1/2)^n$.) 2012-03-05
  • 0
    That is what seems wrong to me: taking the expectation first and the ratio afterwards, instead of the opposite. What you have done is $\frac{1}{E[T]}$. This is why I wrote "interpretation" in the title: I want to understand what the question is asking for. 2012-03-05
  • 0
    Ok, I must have failed to understand your question. Sorry to have not been more helpful. 2012-03-05
  • 0
    No need to be sorry!!! I was very happy that you answered me, and if you want to add any comment or answer you are welcome. 2012-03-05