
I'm sure "jaggedness" isn't the right term to use here, so please correct me. I'm trying to quantify how jagged a distribution is. For example, this is moderately jagged:

distribution #1

This is really jagged:

distribution #2

Essentially, I want to quantify the number of peaks and the disparity between peaks and valleys in a meaningful way, normalized for the size of the data. In the example above, distribution #2 contains more data than distribution #1, but I don't want that to affect the quantification. How would I do this?

2 Answers


Try calculating the $L_2$ norm. It's going to be much bigger for distribution #2. If the histogram is $h_i$, normalized so that the $L_1$ norm is $1$, i.e. $\sum_i h_i = 1$, then the $L_2$ norm (squared) is $L_2^2 = \sum_i h_i^2$. One problem you're going to encounter is that the noise is proportional to $\sqrt{n}$, so you may need to normalize (divide) the $L_2^2$ norm by $n$ (I think). To check that you have the correct normalization, subsample your input and compare the normalized values: they should come out about the same whether you use all of the points, half of them, or even just a tenth of them.
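
As a minimal sketch of the idea in Python/NumPy (the bin count, the synthetic data, and the `l2_jaggedness` name are my own choices, and I only implement the $L_1$-normalized $\sum_i h_i^2$ plus the subsampling check, leaving any extra correction by $n$ aside):

```python
import numpy as np

def l2_jaggedness(samples, bins=50):
    """Squared L2 norm, sum_i h_i^2, of the empirical histogram with sum_i h_i = 1."""
    counts, _ = np.histogram(samples, bins=bins)
    h = counts / counts.sum()          # L1-normalize so the bins sum to 1
    return float(np.sum(h ** 2))

rng = np.random.default_rng(0)
smooth = rng.normal(size=10_000)                      # one broad bump
jagged = np.concatenate([rng.normal(loc=m, scale=0.05, size=2_000)
                         for m in range(5)])          # five sharp spikes

print("smooth:", l2_jaggedness(smooth))   # smaller value
print("jagged:", l2_jaggedness(jagged))   # larger value

# Subsampling check from the answer: the statistic should be roughly stable
# whether it is computed on all of the points, half of them, or a tenth.
for frac in (1.0, 0.5, 0.1):
    sub = rng.choice(jagged, size=int(frac * len(jagged)), replace=False)
    print(f"jagged on {frac:.0%} of the data:", l2_jaggedness(sub))
```

Note that the value also depends on the bin width, so keep the binning fixed when comparing two distributions.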

  • Compare instead $[1/4, 1/4, 1/4, 1/4]$ and $[0, 1/2, 0, 1/2]$. (2011-02-08)

You can even extend Yuval's proposal and compute:

$R(\alpha) = \frac{1}{1-\alpha}\ln \sum_i h_i^{\alpha}$

where the case $\alpha = 1$ is to be understood as a limit; in that limit $R(\alpha)$ reduces to the Shannon entropy $-\sum_i h_i \ln h_i$.

These are called Rényi indices or entropies, and they are a standard tool of multifractal analysis, which is the study of "jagged" geometrical (or other) objects.
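
To get a feel for how these behave, here is a small sketch (it assumes an already-binned, non-negative histogram as input; the function name and the use of the two histograms from the comment above are my own framing):

```python
import numpy as np

def renyi_entropy(h, alpha):
    """R(alpha) = ln(sum_i h_i^alpha) / (1 - alpha) for an L1-normalized histogram h."""
    h = np.asarray(h, dtype=float)
    h = h[h > 0]            # empty bins contribute nothing
    h = h / h.sum()         # enforce the normalization sum_i h_i = 1
    if np.isclose(alpha, 1.0):
        return float(-np.sum(h * np.log(h)))             # alpha -> 1 limit: Shannon entropy
    return float(np.log(np.sum(h ** alpha)) / (1.0 - alpha))

flat  = [1/4, 1/4, 1/4, 1/4]     # the two histograms from the comment above
spiky = [0, 1/2, 0, 1/2]
for a in (0.5, 1.0, 2.0):
    print(f"alpha={a}: flat={renyi_entropy(flat, a):.3f}, spiky={renyi_entropy(spiky, a):.3f}")
```

The flatter histogram comes out with the larger entropy at every $\alpha$; for $\alpha = 2$ the index is just $-\ln L_2^2$, i.e. a monotone transform of the statistic in Yuval's answer.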

It's hard to find good references; I'd need to spend more time searching, and in any case that's why I started my own question here.

I found this paper, which explains Rényi entropies and their connection to fractal dimensions. There's some discussion of dynamical systems as well, but I think you can skip over most of it.