
For a sequence $X_1,X_2,X_3,\ldots$ of random variables, what it means to say $X_1$ is correlated with $X_2$ is unambiguous. It may be that the bigger $X_1$ is, the bigger $X_2$ is likely to be. If, as in conventional Kolmogorovian probability theory, we regard all these random variables as functions on a probability space, it makes sense to speak of different realizations $X_1(\omega_1), X_2(\omega_1), X_3(\omega_1),\ldots$ and $X_1(\omega_2), X_2(\omega_2), X_3(\omega_2),\ldots$. That is: run the random process once and get one set of values of $X_1,X_2,X_3,\ldots$; run it again and get another; and so on.

Intuitively, it would also make sense to say that the size and shape of one Voronoi cell in something like a Poisson process in the plane are correlated with those of its neighbor. But if we run the process once and get neighboring Voronoi cells $X_1$ and $X_2$, and then run the process again, which cells in the second realization correspond to $X_1$ and $X_2$ in the first one? There seems to be no reasonable answer. A Voronoi cell has no immortal soul; a coin toss does. The first thing I think of is to let $X_1$ be the Voronoi cell that contains the origin. But then the expected value of its size is not the same as the expected size of an arbitrary Voronoi cell (it's bigger!), and which of its neighbors could be $X_2$? The very number of its neighbors varies from one $\omega$ to the next.

Is there some good way to rescue a concept of correlation of size and shape of neighboring Voronoi cells?

1 Answer


Empirical characteristics of a typical cell $\widehat C$ are usually defined by the ergodic limits $$ E(\varphi(\widehat C))=\lim_{R\to\infty}\frac1{|\mathcal C_R|}\sum\varphi(C)\cdot[C\in \mathcal C_R], $$ for every suitable function $\varphi$, where $\mathcal C_R$ is the almost surely finite collection of cells $C$ such that $C\subseteq B_R$ (or such that $C\cap B_R\ne\varnothing$, or any similar notion), and where $B_R$ is the ball of radius $R$ centered at $0$, or a similar domain increasing to the whole space as $R\to\infty$.
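As a concrete sanity check on this definition, here is a minimal simulation sketch (mine, not part of the original answer), assuming a homogeneous Poisson process and using SciPy's `Voronoi` and `ConvexHull`; the names `intensity`, `R`, `L`, and the choice "cells whose generating point lies in $B_R$" are illustrative variants of the "similar notions" above. It approximates $E(\varphi(\widehat C))$ for $\varphi=$ area from one large realization:

```python
# Minimal sketch: approximate E(phi(typical cell)) for phi = area by averaging
# over the cells whose generating point lies in B_R, for one realization of a
# homogeneous Poisson process on a larger window (to limit edge effects).
# Assumes numpy/scipy; intensity, R and L are illustrative choices.
import numpy as np
from scipy.spatial import Voronoi, ConvexHull

rng = np.random.default_rng(0)
intensity, R, L = 1.0, 20.0, 25.0   # points per unit area, ball radius, window half-width

n = rng.poisson(intensity * (2 * L) ** 2)   # Poisson number of points in [-L, L]^2
points = rng.uniform(-L, L, size=(n, 2))
vor = Voronoi(points)

areas = []
for i, p in enumerate(points):
    if np.hypot(*p) > R:                         # keep only cells generated inside B_R
        continue
    region = vor.regions[vor.point_region[i]]
    if -1 in region or len(region) == 0:         # skip unbounded cells
        continue
    areas.append(ConvexHull(vor.vertices[region]).volume)   # in 2D, .volume is the area

print("empirical mean area of the typical cell:", np.mean(areas))
```

For a Poisson process of intensity $\lambda$, this empirical mean area should come out close to $1/\lambda$, since the cells partition the plane with one generating point per cell.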

Likewise, empirical characteristics of a typical pair of neighbouring cells can be defined through the limits $$ E(\varphi(\widehat{(C,C')}))=\lim_{R\to\infty}\frac1{|\mathcal C^{(2)}_R|}\sum\varphi(C,C')\cdot[(C,C')\in \mathcal C^{(2)}_R], $$ where $\mathcal C^{(2)}_R$ is the almost surely finite collection of pairs of neighbouring cells $(C,C')$ such that $C\cup C'\subseteq B_R$ (or such that $(C\cup C')\cap B_R\ne\varnothing$, or any similar notion).

These limits define the empirical distribution of a typical cell $\widehat C$ and the empirical distribution of a typical pair of neighbouring cells $\widehat{(C,C')}$. Note that the marginal distributions of the latter are not the former, because cells with many edges belong to more neighbouring pairs and are therefore over-represented.
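In the same illustrative spirit (again my sketch, not the answer's own computation), this pair average can be estimated by summing over neighbouring pairs, i.e. over pairs of points that share a Voronoi ridge, which is one way to put a number on the correlation the question asks about. The helper `cell_area` and the parameters are made up for the example:

```python
# Sketch: estimate the correlation of the areas of neighbouring cells by
# summing over pairs of generating points that share a Voronoi ridge.
# Assumes numpy/scipy; intensity, R and L are illustrative choices.
import numpy as np
from scipy.spatial import Voronoi, ConvexHull

rng = np.random.default_rng(0)
intensity, R, L = 1.0, 20.0, 25.0

n = rng.poisson(intensity * (2 * L) ** 2)
points = rng.uniform(-L, L, size=(n, 2))
vor = Voronoi(points)

def cell_area(i):
    """Area of the Voronoi cell of point i, or None if the cell is unbounded."""
    region = vor.regions[vor.point_region[i]]
    if -1 in region or len(region) == 0:
        return None
    return ConvexHull(vor.vertices[region]).volume

a1, a2 = [], []
for i, j in vor.ridge_points:                  # each ridge = one neighbouring pair
    if np.hypot(*points[i]) > R or np.hypot(*points[j]) > R:
        continue
    ai, aj = cell_area(i), cell_area(j)
    if ai is None or aj is None:
        continue
    a1 += [ai, aj]                             # count each unordered pair in
    a2 += [aj, ai]                             # both orders, for symmetry

print("empirical correlation of neighbouring cell areas:",
      np.corrcoef(a1, a2)[0, 1])
```

Note that this frequency-weighted average counts a cell once per neighbour, which is exactly the bias mentioned above: the marginals of the pair distribution favour cells with many edges.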

  • Well, I knew there was this possibility. But there are "laws of large numbers" that say something equals something, and in this case it seems that only one side of the "$=$" is defined. Is it impossible to define the other side? Maybe it is. (2012-01-15)
  • It would be somewhat naive to believe that nice expressions are available for every empirical characteristic of (C,C'), since this is already false for C alone. See [this](http://www.univ-rouen.fr/LMRS/Persopage/Calka/NumbSides.pdf), for example. (2012-01-15)
  • I didn't expect "nice" expressions. (2012-01-15)
  • That paper you linked to looks interesting, since I didn't know that anyone knew any explicit expressions for these. Last I'd heard (so I'm not up to date on this!), these probabilities were known only numerically (although it's easy to show the expected value is exactly $6$). (2012-01-15)
  • @Did I would like to get something clarified about this answer, please. Can't we talk about the expectation of some characteristic of a cell without always having to average over a finite collection of cells? For instance, suppose we fix an indexing order: index 1 is the cell whose generating point is closest to the origin, index 2 is the next closest, and if several points are at the same distance we number them clockwise. Now the cells have a "soul", so can't we talk about the expectation of each of these cells separately, without averaging? (2013-12-21)
  • @triomphe Then these sorted cells will have characteristics that differ from one another and from the so-called empirical distribution. For example, the size of the cell around $0$ is size-biased when compared to the distribution of the empirical cell; see the sketch after these comments. (This well-known phenomenon is already observed in renewal models on the line.) (2013-12-21)
  • @Did Thank you, and I think I understood your point. Could you also please explain two things? First, if what we want are the properties of, say, the first 5 cells nearest to zero, then we need the labeling, right? I mean, ergodic properties won't give us the result we want in that case, and is there some literature that does that? Secondly, if we need the ergodic properties, why don't they average over a specific labeling like the one I gave? I mean, why use a frequency-based empirical average as you have given in (1)? (2013-12-21)
  • "First 5 cells near to zero": I don't think this has been studied, apart from the cell containing $0$ (perhaps because even the definition of "first cells near to zero" is not clear). // "Ergodic properties": these correspond to averaging over the cells in $B(0,R)$ and taking the limit $R\to\infty$ (which seems the "physical" thing to do); a specific labeling would either be irrelevant or mess everything up. // Literature: check the papers on the website I linked to. (2013-12-21)