
I have a picture consisting of a two-dimensional array of ordered triples (red, green, blue) of real numbers from 0 to 1. I'm looking for something like a norm on pictures which expresses the range of colors used.

The idea is that a grayscale image should have norm 0 and an image with pure red, green, and blue should have norm 1. Here's the key part: a colorized image (black and red, for example, instead of black and white) should have norm 0 just like a grayscale image. So if all colors are linear combinations of two colors, the norm is 0, and the norm measures the extent to which a third basis color is needed to represent the colors used (even if only for one pixel).

Any ideas on how to formalize this?

My first instinct is to change basis to (hue, saturation, lightness) and look at the maximum difference of hues (mod 1). But then two pixels with colors $(\varepsilon, 0, 0)$ and $(0, \varepsilon/2, \varepsilon/2)$ would seem very distant when the colors they represent are actually very close (both near-black).
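For concreteness, here's that counterexample checked numerically (a quick sketch using Python's standard `colorsys` module, which returns hue in $[0,1)$; the particular $\varepsilon$ is arbitrary):

```python
import colorsys

eps = 1e-3
c1 = (eps, 0.0, 0.0)          # very dark red
c2 = (0.0, eps / 2, eps / 2)  # very dark cyan

h1 = colorsys.rgb_to_hls(*c1)[0]                # hue 0.0
h2 = colorsys.rgb_to_hls(*c2)[0]                # hue 0.5
hue_dist = min(abs(h1 - h2), 1 - abs(h1 - h2))  # 0.5, the maximum possible

rgb_dist = sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5  # ~1.22 * eps, tiny
print(hue_dist, rgb_dist)
```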

It seems natural at this point to transform the problem into one of geometry and look at the color bicone, but what is the most natural metric to use here? Euclidean? $L^1$? Something else?

Of course other approaches would be welcome.


1 Answer


Sounds like what you're looking for is how far the colours in the image, seen as points in the RGB cube, deviate from a one-dimensional manifold.

In a grayscale image, all the colours lie on the line segment $(t,t,t)$ for $t \in [0,1]$. In a black-and-red image, they lie on $(t,0,0)$, and in a red-and-white image, on $(1,t,t)$. A sepia-toned image contains colours on a curve that passes through black, brown, cream, and white; whether you consider this one-dimensional depends on whether you're looking only for "flat" affine subspaces or for general curved submanifolds of the RGB cube.

Discovering curved manifolds in data is pretty hard, but the affine case is easy. If all the points lie close to a line, their covariance matrix will have only one large eigenvalue. The best-fitting line is the one that passes through the mean of the points and is parallel to the corresponding eigenvector. Since the eigenvalues of the covariance matrix are the variances along the principal axes, the sum of the two remaining eigenvalues is the residual variance, which tells you how much the points deviate from this line. The square root of this quantity is something you might call the "residual standard deviation", and it behaves very much like the "norm" you're looking for.
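As a concrete sketch, here is what that computation might look like in Python with NumPy (the function name and the $H \times W \times 3$ image layout with values in $[0,1]$ are my assumptions, not part of the answer):

```python
import numpy as np

def residual_color_deviation(image):
    """Residual standard deviation of an image's colours about their
    best-fitting line in RGB space.  `image` is an H x W x 3 array
    with values in [0, 1]."""
    pixels = image.reshape(-1, 3)        # N x 3 point cloud in the RGB cube
    cov = np.cov(pixels, rowvar=False)   # 3 x 3 covariance matrix
    eigvals = np.linalg.eigvalsh(cov)    # ascending; eigvals[2] is the line's variance
    # Variance unexplained by the best-fitting line (guard against
    # tiny negative eigenvalues from round-off):
    return np.sqrt(max(eigvals[0] + eigvals[1], 0.0))

# Sanity checks: a black-and-red gradient, with every colour on (t, 0, 0),
# should score ~0, while colours filling the whole cube should not.
t = np.random.rand(64, 64)
zeros = np.zeros_like(t)
print(residual_color_deviation(np.stack([t, zeros, zeros], axis=-1)))  # ~0
print(residual_color_deviation(np.random.rand(64, 64, 3)))             # ~0.41
```

Note that, as written, this is an absolute deviation in RGB units rather than the normalized 0-to-1 quantity asked for; how to rescale it (say, against the score of an image containing pure red, green, and blue) is a separate design choice.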

(See my previous answer for more on the relationship between the shape and dimensionality of a data set and the eigenvectors of its covariance matrix.)

  • It would be nice to be able to detect sepia-toned pictures, but I think the question is hard enough already. :) – 2012-07-31