
Let $X_i$ and $Y_i$, $i=1,\ldots,n$, be continuous i.i.d. random variables, uniformly distributed over $(0,1)$. Say we sample from these random variables and retain only the values complying with $|X_i-Y_i|>\delta$, where $\delta$ is some given small positive constant.

I would like to prove that (EDIT: revised claim) $P\left(\frac{1}{n} \sum _{i=1}^n (Y_i-X_i)(1-2X_i)>\epsilon\right)\to1$ as $n\to\infty$, where $\epsilon$ is a small positive constant which depends on $\delta$ but not on $n$.
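For concreteness, the sampling-and-retention step can be sketched as plain rejection sampling (the helper name `sample_pairs` is mine, not part of the question):

```python
import random

def sample_pairs(n, delta, seed=None):
    """Rejection-sample n pairs (X, Y), uniform on (0,1)^2,
    keeping only pairs with |X - Y| > delta."""
    rng = random.Random(seed)
    pairs = []
    while len(pairs) < n:
        x, y = rng.random(), rng.random()
        if abs(x - y) > delta:
            pairs.append((x, y))
    return pairs

pairs = sample_pairs(1000, 0.1, seed=0)
assert all(abs(x - y) > 0.1 for x, y in pairs)
```

The retained pairs are i.i.d. with the law of $(X,Y)$ conditioned on $\{|X-Y|>\delta\}$, which is the conditioning used in the answers below.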

3 Answers


It is most probable that, for every $n$, $ \mathrm P\left(\sum_{i=1}^n(X_i-Y_i)(1-2X_i)\gt0\right)\lt\frac12, $ and, for every $\delta$, $ \mathrm P\left(\sum_{i=1}^n(X_i-Y_i)(1-2X_i)\gt0\ \Bigg\vert\ \forall i,\ |X_i-Y_i|\gt\delta\right)\lt\frac12, $ hence no $\epsilon\gt0$ will do. (Indeed, $\mathrm E[(X_i-Y_i)(1-2X_i)]=-\frac16\lt0$, the opposite sign of the mean computed below.)


Regarding the revised version, let $Z_i=(Y_i-X_i)(1-2X_i)$. Then $(Z_i)_i$ is i.i.d. with mean $\mathrm E(Z_i)=\frac16$, hence, for every $\epsilon\lt\frac16$, the (weak) law of large numbers shows that $ \mathrm P\left(\frac1n\sum_{i=1}^nZ_i\gt\epsilon\right)\to1. $

Likewise, let $(X,Y,Z)$ be distributed like $(X_1,Y_1,Z_1)$, let $A_\delta=[|X-Y|\gt\delta]$, let $U^\delta$ be any random variable distributed like $Z$ conditional on $A_\delta$, and let $u_\delta=\mathrm E(U^\delta)=\mathrm E(Z\mid A_\delta)$. Then, for every $\epsilon\lt u_\delta$, the (weak) law of large numbers applied to an i.i.d. sequence $(U^\delta_i)_i$ distributed like $U^\delta$ shows that $ \mathrm P\left(\frac1n\sum_{i=1}^nZ_i\gt\epsilon\ \Bigg\vert\ \forall i,\ |X_i-Y_i|\gt\delta\right)=\mathrm P\left(\frac1n\sum_{i=1}^nU^\delta_i\gt\epsilon\right)\to1. $

To complete the proof, it remains to estimate $u_\delta$. First note that, by invariance of $A_\delta$ under the symmetry $(X,Y)\to(Y,X)$, $\mathrm E(Y-X\,;\,A_\delta)=0$. Hence, $u_\delta=2\mathrm E((X-Y)X\mid A_\delta)$. Since $\mathrm E((X-Y)X\mid A_\delta)=\mathrm E((Y-X)Y\mid A_\delta)$ (by the same symmetry), $u_\delta=\mathrm E((X-Y)^2\mid A_\delta)$. This proves the claim that $u_\delta\gt0$.

One can go further since the density of $|X-Y|$ is $2(1-x)\,[0\leqslant x\leqslant 1]$ and, consequently, the density of $|X-Y|$ conditionally on $A_\delta$ is $f_\delta(x)=2(1-\delta)^{-2}(1-x)\,[\delta\leqslant x\leqslant 1]$.

Thus, $u_\delta=\int\limits_\delta^1x^2f_\delta(x)\,\mathrm dx=\frac16(1+2\delta+3\delta^2)$. In particular, any $\epsilon\lt\frac16$ works for every $0\leqslant\delta\lt1$, and $\epsilon=\frac16$ works for every $0\lt\delta\lt1$.
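A quick Monte Carlo check of this closed form (a sketch; the helper name `u_delta_mc` and the sample sizes are my choices):

```python
import random

def u_delta_mc(delta, n=200_000, seed=0):
    """Monte Carlo estimate of u_delta = E[(Y-X)(1-2X) | |X-Y| > delta],
    by rejection sampling from the conditional law."""
    rng = random.Random(seed)
    total, kept = 0.0, 0
    while kept < n:
        x, y = rng.random(), rng.random()
        if abs(x - y) > delta:
            total += (y - x) * (1 - 2 * x)
            kept += 1
    return total / kept

for delta in (0.0, 0.2, 0.5):
    closed_form = (1 + 2 * delta + 3 * delta**2) / 6
    # each estimate should be close to the closed form
    print(delta, u_delta_mc(delta), closed_form)
```

The estimates track $\frac16(1+2\delta+3\delta^2)$, including the value $\frac16$ at $\delta=0$ and the increase toward $1$ as $\delta\to1$.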

  • Of course, I meant $2(E(X^2)-E(X)^2)=2\mathrm{Var}(X)$ (2012-07-15)

Edit: here is a new version, with lighter computations and, hopefully, no errors. My thanks to Did for pointing out some problems in my former proof. Of course, my answer is now essentially the same as Did's...

Let $(X,Y)$ be a pair of random variables uniformly distributed in $[0,1]^2$. Let $\delta \in [0,1)$. The event $\{|X-Y|>\delta\}$ has probability:

$\int_{[0,1]^2} 1_{|x-y|> \delta} \ dx \ dy = 2 \int_{[0,1]} \int_{[0,1]} 1_{x > \delta + y} dx \ dy = (1-\delta)^2.$

Let $(\tilde{X}, \tilde{Y})$ be the random variables $(X,Y)$ conditioned on the event $\{|X-Y|>\delta\}$. This pair has the following density with respect to the Lebesgue measure on $[0,1]^2$:

$\frac{1_{x > y + \delta} + 1_{x < y - \delta}}{(1-\delta)^2}.$

Hence:

$\mathbb{E} ((\tilde{X}-\tilde{Y})(1-2\tilde{X})) = \frac{1}{(1-\delta)^2} \int_{[0,1]^2} (x-y)(1-2x) (1_{x > y + \delta} + 1_{x < y - \delta}) \ dx \ dy$

$\cdots = \frac{1}{(1-\delta)^2} \int_\delta^1 \int_0^{x-\delta} (x-y)(1-2x) \ dy \ dx + \frac{1}{(1-\delta)^2} \int_0^{1-\delta} \int_{x+\delta}^1 (x-y)(1-2x) \ dy \ dx $

The change of variables $(u,v) = (1-x,1-y)$ shows that these two integrals are the same, so we only need to compute one of them. Thanks to Wolfram,

$\int_\delta^1 \int_0^{x-\delta} (x-y)(1-2x) \ dy \ dx = -\frac{1}{2} \int_\delta^1 (2x-1) (x^2-\delta^2) \ dx = -\frac{(1-\delta)^2 (1+2\delta+3\delta^2)}{12}.$
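Rather than trusting Wolfram blindly, the double integral can be checked symbolically, for instance with SymPy (assuming SymPy is installed; the variable names are mine):

```python
import sympy as sp

x, y, d = sp.symbols('x y delta', positive=True)

# Inner integral over y, then outer integral over x, as in the text.
inner = sp.integrate((x - y) * (1 - 2 * x), (y, 0, x - d))
I = sp.integrate(inner, (x, d, 1))

# Closed form claimed above.
expected = -(1 - d)**2 * (1 + 2 * d + 3 * d**2) / 12
assert sp.simplify(I - expected) == 0
```

Both sides are polynomials in $\delta$, so the identity is exact, not numerical.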

Thus:

$\mathbb{E} ((\tilde{X}-\tilde{Y})(1-2\tilde{X})) = -\frac{1+2\delta+3\delta^2}{6}.$

This formula gives the correct limits as $\delta$ goes to $0$ or to $1$, and it is negative for every $\delta$ (even for $\delta = 0$). By the law of large numbers, the average of the $(\tilde{X}_i-\tilde{Y}_i)(1-2\tilde{X}_i)$ converges to this negative value, so you should expect that sum to be negative for large $n$; equivalently, the sum $\sum_i (\tilde{Y}_i-\tilde{X}_i)(1-2\tilde{X}_i)$ considered in the question is positive with probability tending to $1$.

  • @did: This computation is too complicated, it's not surprising that there are errors. I'm re-writing it in a simpler way. (2012-07-15)

[EDIT: the problem with this answer is that $I_A$ and $(Y_i-X_i)(1-2X_i)$ are dependent.]

Let $Z_i$ denote the summand multiplied by the indicator $I_A$ of $|X_i - Y_i| > \delta$, that is, $Z_i=(Y_i-X_i)(1-2X_i)I_A$. We need to prove that $P\left(\bar{Z}_n>\epsilon\right)\to1$. Since the $Z_i$ are i.i.d. and bounded, the weak law of large numbers gives that $\bar{Z}_n$ converges in probability to $E(Z_i)$. It is therefore sufficient to show that $E(Z_i) > 0$. We proceed to evaluate $E(Z_i)$ analytically: $E(Z_i)=E(I_A)\left(E(Y_i)-2E(Y_iX_i)-E(X_i)+2E(X_i^2)\right)=2E(I_A)\left(E(X_i^2)-E(X_iY_i)\right)=2E(I_A)\mathrm{Var}(X_i)>0$, since $E(I_A)>0$ by definition, $X_i$ and $Y_i$ are independent with the same mean, and $\mathrm{Var}(X_i)>0$.
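The dependence pointed out in the EDIT is easy to see numerically: the estimates of $E(W\,I_A)$ and of $E(W)\,E(I_A)$, for $W=(Y-X)(1-2X)$, differ markedly (a sketch; the function name `compare` and the parameter choices are mine):

```python
import random

def compare(delta=0.3, n=500_000, seed=1):
    """Estimate E[W * I_A] and E[W] * E[I_A] from the same sample,
    where W = (Y-X)(1-2X) and A = {|X-Y| > delta}."""
    rng = random.Random(seed)
    sum_wi = sum_w = 0.0
    count_a = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        w = (y - x) * (1 - 2 * x)
        in_a = abs(x - y) > delta
        sum_w += w
        count_a += in_a
        if in_a:
            sum_wi += w
    return sum_wi / n, (sum_w / n) * (count_a / n)

e_wi, product = compare()
print(e_wi, product)  # the two estimates differ: W and I_A are dependent
```

This is exactly why the factorization $E(Z_i)=E(I_A)\,E\left((Y_i-X_i)(1-2X_i)\right)$ used above fails.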

  • See the version of my answer which addresses your new question (but **please** stop modifying your question like that). (2012-07-15)