4

I have some preliminary findings that are a little strange.

I have a couple of related data sets, each a small sample (~150 values) of positive integers. Because they're small, I don't take their histograms seriously, so I've performed Gaussian kernel density estimation on both of them. (While I haven't been that careful about bandwidth selection, I've at least taken it into account, and I doubt it is a significant issue in the present context, but I could be wrong.)

The weird thing is this: in both cases, I get a remarkably good-looking fit to a Cauchy distribution centered at a positive number (i.e., the KDE basically looks like this). I've fitted it by hand, with no MLE in this instance*, but I've corroborated the fits by looking at normal and log-log plots. But as I've said, the samples are (and must be) positive integers, so the support of the underlying distribution must be the positive integers as well, whereas a Cauchy distribution is supported on the whole real line. On the other hand, the Cauchy distribution fits very well on the positive region.

The only thing I can imagine producing something like this, besides a mistake on my part, is that these samples might admit some kind of interpretation as sums of i.i.d. random variables, which would lead to the apparent Cauchy-ness via the generalized CLT.

What might cause this? Explanations for mistakes are particularly welcome. However, please note my doubts about the kernel bandwidth being an issue.

*However, I've tried MLE fits to some other common distributions (e.g., gamma) and these are terrible, especially in comparison.

  • 0
    In what sense is the Cauchy a good fit? Just eyeballing it, or something more formal? 2011-10-26

1 Answer

2

A simple way to produce heavy-tailed distributions is to consider (thin-tailed) distributions with a random parameter. Assume for instance that $N$ has a geometric distribution with parameter $A$ and that $A$ has a Beta distribution, that is,

$$ \mathrm P(A\in\mathrm dx)=\frac1{\mathrm B(a,b)}x^{a-1}(1-x)^{b-1}\cdot[0\leqslant x\leqslant 1]\cdot\mathrm dx, $$

and, for every nonnegative integer $n$,

$$ \mathrm P(N\geqslant n\mid A)=A^n. $$

Then,

$$ \mathrm P(N\geqslant n)=\frac1{\mathrm B(a,b)}\int\limits_0^1x^{n+a-1}(1-x)^{b-1}\,\mathrm dx=\frac{\mathrm B(a+n,b)}{\mathrm B(a,b)}=\frac{\Gamma(a+b)\Gamma(a+n)}{\Gamma(a)\Gamma(a+b+n)}. $$

Since $\Gamma(a+n)/\Gamma(a+b+n)\sim n^{-b}$ when $n\to\infty$, $\mathrm P(N\geqslant n)$ is equivalent to a multiple of $n^{-b}$, namely to $\frac{\Gamma(a+b)}{\Gamma(a)}n^{-b}$.

To get a tail equivalent to a multiple of $n^{-1}$ as in a (discrete) Cauchy distribution, choose $b=1$, that is,

$$ \mathrm P(A\in\mathrm dx)=ax^{a-1}\cdot[0\leqslant x\leqslant 1]\cdot\mathrm dx, $$

in other words, assume that $A=U^{1/a}$ where $U$ is uniformly distributed on $(0,1)$. In this case the formula above simplifies to $\mathrm P(N\geqslant n)=\frac{a}{a+n}$.
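
The construction can be checked numerically. The following is a minimal sketch rather than a definitive implementation: the function names `exact_tail`, `sample_n` and `empirical_tail`, the parameter values $a=2$, $b=1$, and the sample size are illustrative assumptions, not part of the argument above. It draws $A$ from a Beta distribution, then $N$ by inversion so that $\mathrm P(N\geqslant n\mid A)=A^n$, and compares the empirical tail with $\mathrm B(a+n,b)/\mathrm B(a,b)$.

```python
import math
import random


def exact_tail(n, a, b):
    """P(N >= n) = B(a + n, b) / B(a, b), computed via log-gamma for stability."""
    log_p = (math.lgamma(a + b) + math.lgamma(a + n)
             - math.lgamma(a) - math.lgamma(a + b + n))
    return math.exp(log_p)


def sample_n(a, b, rng):
    """Draw A ~ Beta(a, b), then N with P(N >= n | A) = A**n for n = 0, 1, 2, ..."""
    p = rng.betavariate(a, b)
    while not 0.0 < p < 1.0:  # guard against floating-point endpoints
        p = rng.betavariate(a, b)
    v = 1.0 - rng.random()  # uniform on (0, 1], so log(v) is finite
    # Inversion: N = floor(log V / log A) satisfies P(N >= n | A) = A**n.
    return math.floor(math.log(v) / math.log(p))


def empirical_tail(samples, n):
    """Fraction of samples that are >= n."""
    return sum(1 for s in samples if s >= n) / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)      # fixed seed, chosen arbitrarily
    a, b = 2.0, 1.0             # illustrative values; b = 1 gives the n**-1 tail
    samples = [sample_n(a, b, rng) for _ in range(20000)]
    for n in (1, 10, 100):
        print(n, empirical_tail(samples, n), exact_tail(n, a, b))
```

Note that $N$ here takes values in the nonnegative integers; for data that must be positive integers, one would presumably work with $N+1$ instead, which does not change the tail exponent.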

  • 0
    Ah, you caught my mistake, thanks. I should have said Sibuya: this is implicit in my MO question. 2011-11-01