3
$\begingroup$

Let's consider a set of nodes $V$, and let some nodes be colored with one color choosen between two possible colors; denote the color $\alpha$ and $\beta$, with respectively $I>0$ and $K>0$ nodes colored with them. Note that we could have $K+I, and possibly $I\neq K$. The above coloring is fixed, and we're conditioning upon it.

We're going to generate the set of edges $E$ according to the Erdős–Rényi model (fixed a $p\in (0,1)$, each possible edge follows Bernoulli's law with parameter $p$); now, each uncolored node is going to take the most-viewed-by-him color: in other words, he counts how many nodes in his neighborhood have the first color, and how many have the second color, and eventually become colored with the most-frequent color he sees (if the two color have the same number of occurrences, the node keep uncolored).

Before observing $E$, the probability that a node $v$ take color $\alpha$ is $$\sum_{i=1}^{I} \Big\{ {{K}\choose{i}}p^i (1-p)^{K-i}\sum_{k=0}^{\min\{ i-1,K\}} {{I}\choose{k}}p^k (1-p)^{I-k}\Big\}$$ where $i$ counts the number of $\alpha$-colored nodes in $N(v)$ while $k$ counts the number of $\beta$-colored nodes in $N(v)$. This expression is not very handy, especially if we want to estimate the expected number of new $\alpha$-colored nodes.

I'm looking for useful inequality and estimation that can be used for nontrivial upper and lower bounds to the density above and to the expected value of the new $\alpha$-colored and $\beta$-colored nodes.

In particular, as a first step, I would investigate the case when $p=\frac{c}{n}$ for any $c>0$, $K+I<\frac{n}{\log n}$ and $I=\frac{K}{n^\epsilon}$ for any small $\epsilon>0$.

  • 0
    Your formula seems to be incorrect...2012-11-25
  • 0
    @Yury I can't see it. Why do you think so?2012-11-29
  • 0
    It should be: $\sum_{i=1}^{I} \sum_{k=0}^{\min(i-1,K)} {{I}\choose{i}}p^i (1-p)^{I-i} {{K}\choose{k}}p^k (1-p)^{K-k}$. Do you know whether $I+K$ is large compared with $1/p$? If it is, we can approximate binomial r.v. with Gaussian r.v.2012-11-29
  • 0
    @Yury oh, I get it, thanks. One of the cases I would investigate is having $I=\frac{K}{n^\epsilon}$ and $I+K<\frac{n}{\log n}$2012-11-29
  • 0
    I've fixed the formula in the text of the question.2012-11-29
  • 0
    What about $p$? Is it a constant? Is it $c/n$? (Since the formula doesn't involve $n$, it doesn't matter what $n$ is — it only matters how large $K+I$ is, compared with $1/p$.)2012-11-29
  • 0
    Yep. $p$ is $\frac{c}{n}$.2012-11-29
  • 0
    Then of course, most vertices will be uncolored, approximately $cI$ vertices will be colored with $\alpha$, and $cK$ vertices will be colored with $\beta$.2012-11-29
  • 0
    Sure. I would prove some concentration of measure about the number of new coloured nodes, applying Chernoff Bounds, being interested in particular in high probability (events whose complement has probability that is $O(n^{-\epsilon})$ with $\epsilon>0$). I've tried to use Le Cam theorem, but I didn't succeed.2012-11-29
  • 0
    @Yury, you say that we'll have (approximately) $cI$ and $cK$, but it seems to me that these values should turn out to be weighted as $\frac{c}{n^\epsilon}I$ and $c\frac{n^\epsilon-1}{n^\epsilon}K$. Shouldn't they?2012-11-29

2 Answers 2

1

Let us estimate the probability that we color a given vertex $u$ (which is not among $K+I$ colored vertices) with $\alpha$. Let us say that vertex $u$ has an $\alpha$-neighbor if at least one of its neighbors is colored with $\alpha$; similarly, we say vertex $u$ has a $\beta$-neighbor if at least one of its neighbors is colored with $\beta$. Below we will need the following estimate, for $T\in [1,1/p]$, $$(1-p)^T = e^{T\ln (1-p)} = e^{T(-p+O(p^2))}=1-Tp +O(p^2T^2).$$

First, we prove a lower bound. \begin{align} \Pr[u \text{ has color } \alpha] &\geq \Pr[u \text{ has an }\alpha\text{-neighbor and has no } \beta\text{-neighbor}]\\ & = (1-(1-p)^{I})\cdot (1-p)^K = (1+O(p^2(I+K)^2))\ p I \cdot (1-Kp) . \end{align}

On the other hand, \begin{align} \Pr[u \text{ has color } \alpha] &\leq \Pr[u \text{ has an }\alpha\text{-neighbor} ]\\ & = (1-(1-p)^{I}) = (1+O(p^2(I+K)^2))\ p I . \end{align} Let $\theta = (n-K-I)/n$. Let $n_\alpha$ be the number of vertices that we color in $\alpha$ (not counting those that were already colored with $\alpha$); similarly, $n_\beta$ be the number of vertices we color in $\beta$.

Then $$(1+O(p^2(I+K)^2))\ \theta\, n p I \cdot (1-Kp) \leq \mathbb{E}[n_\alpha] \leq (1+O(p^2(I+K)^2))\ \theta\, n p I.$$

Similarly, we get $$(1+O(p^2(I+K)^2))\ \theta\, n p K \cdot (1-Ip) \leq \mathbb{E}[n_\beta] \leq (1+O(p^2(I+K)^2))\ \theta\, n p K.$$

In the given range of parameters, for $p=c/n$, we have $$\mathbb{E}[n_\alpha] = (1+o(1)) \,c I,$$ and $$\mathbb{E}[n_\beta] = (1+o(1)) \,c K.$$

  • 0
    I've already analysed the case where a node gets a certain colour iff it is the only colour it sees, that is a ``stronger'' case you use to give bounds. I was interested in something finer (a better approximation of the given formula), however you're question is the only that gives a valid approximation and the bounty is expiring, than I thank you very much for your effort.2012-11-30
  • 0
    It seems to me that the problem could be stated within a simpler equivalent scenario: we toss $K$ red coins and $I$ blue coins that follow a Bernoulli$(\frac{c}{n})$ and ask what is the probability that the red heads are more than the blue ones. Then, if the two numbers are called $X$ and $Y$, I want the probability that $X-Y$ is positive. I've tried to get the density using characteristic functions but I'm not able to calculate the inverse Fourier Transform (maybe it's a silly approach 'cause give a worse problem than estimating the formula we start with). Do you have further suggestions?2012-11-30
  • 0
    (1) The bound I gave is pretty accurate in the range of parameters you specified. If you want a more accurate answer for this model you can add up the probabilities of events “$u$ has at least one $\alpha$-neighbor and no $\beta$-neighbors” and “$u$ has at least two $\alpha$-neighbors and exactly one $\beta$-neighbor.” (2) I don't understand the model you describe in your second comment.2012-11-30
1

The sum is $s=\mathbb P(U\lt V)$, where $U$ and $V$ are independent and binomial $(n-1,p)$. By symmetry, $s=\mathbb P(U\gt V)$ hence $2s=1-r$ with $r=\mathbb P(U=V)$.

Note that $r=\mathbb P(W=0)$ where $W=X_1+\cdots+X_{n-1}$ is the sum of $n-1$ i.i.d. steps $X_k$ taking values $\pm1$ or $0$ with respective probabilities $p(1-p)$, $p(1-p)$ and $p^2+(1-p)^2$.

Let $a=2p(1-p)$, then $\mathbb E(\mathrm e^{\mathrm itX_k})=a\cos(t)+1-a$ and $\mathbb E(\mathrm e^{\mathrm itW})=\mathbb E(\mathrm e^{\mathrm itX_k})^{n-1}$ by independence, hence $$ 2\pi r=\int_{-\pi}^{\pi}\mathbb E(\mathrm e^{\mathrm itW})\mathrm dt=\frac1{\sqrt{n}}\int_{-\pi\sqrt{n}}^{\pi\sqrt{n}}\left(a\cos(t/\sqrt{n})+1-a\right)^{n-1}\mathrm dt. $$ Since $\left(a\cos(t/\sqrt{n})+1-a\right)^{n-1}=\left(1-\frac{at^2}{2n}\right)^{n-1}\to\mathrm e^{-at^2/2}$, $$ 2\pi r\sim\frac1{\sqrt{n}}\int_{-\infty}^{+\infty}\mathrm e^{-at^2/2}\mathrm dt=\frac1{\sqrt{n}}\sqrt{\frac{2\pi}a}. $$ Finally, $$ s=\frac{1-r}2=\frac12-\frac{1+o(1)}{c(p)\sqrt{n}},\qquad c(p)=4\sqrt{\pi p(1-p)}. $$

  • 0
    First of all, I made a mistake in writing the sum, as could be seen from the rest of the text (I've written "let SOME nodes be colored with one color choosen between two possible colors", then there could be no simmetry). As far as I correctly understand your formalization of the problem, It seems that I haven't properly explained that the color are already assigned, we're conditioning on them, and what is stochastic are the set of edges.2012-11-17
  • 0
    I apologize for being inaccurate, I've edited the text of the problem.2012-11-17