
I would like to construct a measure to calculate the sparseness of a vector of length $k$.

Let $X = [x_i]$ be a vector of length $k$ such that there exists an $x_i \neq 0$. Assume $x_i \geq 0$ for all $i$.

One such measure I came across is defined as $\frac{\sqrt{k} - \frac{\|X\|_1}{\|X\|_2}}{\sqrt{k} - 1}\;,$ where $\|X\|_1$ is the $L_1$ norm and $\|X\|_2$ is the $L_2$ norm.

Here, $\operatorname{Sparseness}(X) = 0$ whenever the vector is dense (all components are equal and non-zero) and $\operatorname{Sparseness}(X) = 1$ whenever the vector is sparse (only one component is non-zero).
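For concreteness, here is a minimal numpy sketch of this measure (the function name `sparseness` is mine):

```python
import numpy as np

def sparseness(x):
    """Sparseness in [0, 1] for a non-negative, nonzero vector x of length k > 1."""
    x = np.asarray(x, dtype=float)
    k = x.size
    l1 = np.sum(np.abs(x))       # ||X||_1
    l2 = np.sqrt(np.sum(x**2))   # ||X||_2
    return (np.sqrt(k) - l1 / l2) / (np.sqrt(k) - 1)

print(sparseness([1, 1, 1, 1]))  # 0.0: all components equal (dense)
print(sparseness([0, 0, 3, 0]))  # 1.0: only one non-zero component (sparse)
```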

This post only explains when $0$ and $1$ are achieved by the above-mentioned measure.

Is there any other function that defines the sparseness of a vector?


4 Answers


You could of course generalize your current measure to

\begin{align} S(X) = \frac{\frac{k^{1/m}}{k^{1/n}} - \frac{\|X\|_m}{\|X\|_n}}{\frac{k^{1/m}}{k^{1/n}} - 1} \end{align}

while preserving the properties you specified.
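A quick numpy sketch of this generalization (the name `sparseness_mn` is mine; I assume $1 \le m < n$ so that `np.linalg.norm` computes the usual $p$-norms):

```python
import numpy as np

def sparseness_mn(x, m=1, n=2):
    """Generalized measure S(X) with norm parameters m < n (a sketch)."""
    x = np.asarray(x, dtype=float)
    k = x.size
    ratio_dense = k**(1/m) / k**(1/n)  # value of ||X||_m / ||X||_n for a constant vector
    ratio = np.linalg.norm(x, m) / np.linalg.norm(x, n)
    return (ratio_dense - ratio) / (ratio_dense - 1)

print(sparseness_mn([0., 0., 3., 0.], m=1, n=2))  # 1.0, matches the original measure
print(sparseness_mn([1., 1., 1., 1.], m=1, n=2))  # 0.0
```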

An interesting special case is $m = 1$, $n \to \infty$, in which case the expression simplifies to

\begin{equation} S(X) = \frac{k - \frac{\|X\|_1}{\|X\|_\infty}}{k - 1}. \end{equation}
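In code the limiting case can be evaluated directly, since `np.linalg.norm` accepts `ord=np.inf`; a minimal sketch (function name mine):

```python
import numpy as np

def sparseness_1_inf(x):
    """Special case m = 1, n -> infinity: (k - ||X||_1 / ||X||_inf) / (k - 1)."""
    x = np.asarray(x, dtype=float)
    k = x.size
    ratio = np.linalg.norm(x, 1) / np.linalg.norm(x, np.inf)
    return (k - ratio) / (k - 1)

print(sparseness_1_inf([0, 0, 3, 0]))  # 1.0
print(sparseness_1_inf([2, 2, 2]))     # 0.0
```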


There is a definition of sparsity that is used (among other places) in the compressed sensing literature, see e.g. here.

A vector $x\in \mathbb{C}^k$ is called $s$-sparse if $\|x\|_0 = |\operatorname{supp}(x)| \leq s$, that is, if it has at most $s$ non-zero entries. Denote by $\Sigma_s$ the set of all such vectors. Then the $s$-term approximation error of a vector $x\in \mathbb{C}^k$ is defined as $ \sigma_s(x)_p = \min_{y\in\Sigma_s} \|x-y\|_p. $

Now this quantity equals $0$ if your vector $x$ is $s$-sparse, and is greater than $0$ otherwise. Note that you now have two parameters, $s$ and $p$, to tune this "measure". Clearly, you recover your notion of sparsity if you set $s=1$.
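A small sketch of $\sigma_s(x)_p$ (for $p$-norms the minimum over $\Sigma_s$ is attained by keeping the $s$ largest-magnitude entries, so no actual optimization is needed; the function name is mine):

```python
import numpy as np

def s_term_error(x, s, p=2):
    """sigma_s(x)_p: the minimizer over Sigma_s keeps the s largest-magnitude
    entries, so the error is the p-norm of the remaining entries."""
    x = np.asarray(x)
    idx = np.argsort(np.abs(x))         # indices, ascending by magnitude
    tail = x[idx[:-s]] if s > 0 else x  # entries forced to zero
    return np.linalg.norm(tail, p)

print(s_term_error([0, 5, 0, 1], s=1))  # 1.0: not 1-sparse
print(s_term_error([0, 5, 0, 1], s=2))  # 0.0: it is 2-sparse
```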


Disclaimer: this post considers the case in which you do not mind some computational effort to get a nice sparseness value. For something new, please skip to Part 2.

Part 1

I agree with Mikael that this kind of generalization is nice; what's more, with Mikael's formula it is intuitive where it comes from: the most basic notion of vector sparseness would be $\frac{\text{number of indices $i$ such that }X_i = 0}{\text{total number of indices}}.$

However, by this definition $\langle 0, 0, \ldots, 0\rangle$ is sparse but the vector $\langle c, c, \ldots, c \rangle$ is not. Still, this is easy to fix: $\frac{\text{number of indices on which }X_i - c = 0}{\text{total number of indices}}\,,$ where $c$ is some representative value of $X$, e.g. $c = \|X\|_\infty$. The problem with this measure is that counting the indices exactly is inconvenient. To alleviate that, we can approximate the number of non-zero indices by $\frac{\|X\|_1}{\|X\|_\infty}$. Naturally we need some normalization, and with that we arrive at Mikael's special case: $\frac{k-\frac{\|X\|_1}{\|X\|_\infty}}{k-1}.$
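A rough sketch of both the exact count and its norm-ratio surrogate (the helper names are mine):

```python
import numpy as np

def zero_fraction(x, tol=1e-12):
    """Exact version: fraction of (near-)zero components."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x) <= tol)

def approx_nonzero_count(x):
    """||X||_1 / ||X||_inf as a smooth surrogate for the number of non-zeros."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x, 1) / np.linalg.norm(x, np.inf)

x = np.array([0.0, 0.0, 3.0, 0.0])
print(zero_fraction(x))                                    # 0.75
print((x.size - approx_nonzero_count(x)) / (x.size - 1))   # 1.0, Mikael's special case
```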

But the value we took as an example, $c = \|X\|_\infty$, isn't the only choice. Similarly, we could approximate the number of non-zero indices in different fashions: $\frac{\|X\|_m}{\|X\|_n}$ would do for any $m < n$, and the normalization is just $\frac{R_\max-\frac{\|X\|_m}{\|X\|_n}}{R_\max-R_\min}, \quad R_\max = \frac{\|C\|_m}{\|C\|_n}, \quad R_\min = \frac{\|D\|_m}{\|D\|_n},$ where $C = \langle c, c, \ldots, c \rangle$ and $D = \langle c, 0, 0, \ldots, 0 \rangle$ for any $c \neq 0$.
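For what it's worth, both constants have closed forms: $\|C\|_m = |c|\,k^{1/m}$, and every norm of $D$ equals $|c|$, so

$$R_\max = \frac{|c|\,k^{1/m}}{|c|\,k^{1/n}} = k^{1/m - 1/n}, \qquad R_\min = 1,$$

which recovers exactly the denominator in Mikael's generalized formula.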

Part 2

Still, this measure is rather weird, because intuitively it depends more on the sizes of the values than on how many different values there are. I am not sure this is a property we would want. There is a different measure that can take this into account, namely one based on entropy. Interpreting the entries of $X$ as samples, one can calculate $ -\sum_i P(X = i) \log_k P(X = i) .$

To soften this a bit, just pick any distribution you want (ideally specific to your application), e.g. $F_\mu = N(\mu, \sigma^{2})$, set $F = \frac{1}{k}\sum_i F_{X_i},$ and then calculate the differential entropy ($f$ is the density function of $F$): $-\int_\mathbb{R} f(x) \ln f(x) \, dx,$ or better yet the relative entropy if you have some reference measure (e.g. the very same $F_\mu$, adjusted a bit, might do the trick). Of course, all this has to be scaled to $[0,1]$, which makes the formulas even nastier; however, in my opinion, it captures the notion of sparseness pretty well. Finally, you can combine the two approaches in infinitely many ways to get even more sparseness models!
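A rough numerical sketch of this mixture-entropy idea (the function name, the Gaussian kernel choice, and the grid-based integration are all my own assumptions):

```python
import numpy as np

def mixture_entropy(x, sigma=0.1, grid_pts=20001, span=5.0):
    """Differential entropy of F = (1/k) * sum_i N(X_i, sigma^2),
    estimated on a uniform grid (a rough numerical sketch)."""
    x = np.asarray(x, dtype=float)
    k = x.size
    t = np.linspace(x.min() - span * sigma, x.max() + span * sigma, grid_pts)
    # mixture density f(t) = (1/k) * sum_i phi((t - x_i) / sigma) / sigma
    z = (t[:, None] - x[None, :]) / sigma
    f = np.exp(-0.5 * z**2).sum(axis=1) / (k * sigma * np.sqrt(2.0 * np.pi))
    integrand = -f * np.log(f + 1e-300)      # guard against log(0) in the tails
    return integrand.sum() * (t[1] - t[0])   # Riemann-sum approximation

# repeated values concentrate the mixture, giving lower entropy
print(mixture_entropy(np.array([0., 0., 0., 5.])))  # lower
print(mixture_entropy(np.array([1., 2., 3., 4.])))  # higher
```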


The paper "Comparing Measures of Sparsity" by Hurley and Rickard considers a wide range of sparsity measures, along with an analysis of how well they conform to a set of properties one would consider desirable for such measures.

The measure you mention is listed there under the name Hoyer measure, and it satisfies most of these properties (except one, D4, which only makes sense when comparing vectors of different lengths).