
Basically, I need to store the histograms of images in a database and then quickly compare the histograms of two images.

Let's say I have an image whose green histogram is:

Here I have 4 buckets: 15 green pixels have a value from 0 to 63, and so on.

How could I condense this graph into a numeric value so that I can easily compare it to a new image?

The simplest solution is to create a string like this:

    15.12.22.9
    ^  ^
    |  +-- number of pixels from 64 to 127
    +----- number of pixels from 0 to 63
    (and so on)

The problem with this solution is that I can't easily compare 2 images with similar values. Let's say I have another image whose green histogram is:

    14.12.22.9
    ^
    +-- 14 pixels instead of 15

This new image is basically the same, with just 1 pixel fewer in the 0-63 range. But I can't easily compare these 2 strings.

  • And consider, for example, that we choose 50 buckets for each channel (RGB): that would mean 150 fields in the table. Not good. (2012-06-01)

1 Answer

Some options:

  1. Store the full information, e.g. as an array of numbers (if your DB supports some ARRAY datatype), as a hexadecimal string, as a JSON string, or in whatever representation suits you (but not as "a numeric value", which does not make much sense).
  2. Depending on what you consider to be a good measure of similarity, you could store, say, three numbers per channel (which would correspond to a decimated 3-bucket histogram).

The advantage of option 1 is that you retain the full information and can do a good comparison. The disadvantage is that it might be difficult to compute the difference, especially inside the database, and especially for a varying number of buckets.
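As a rough illustration of both options, here is a minimal Python sketch. It assumes the histogram is stored as a JSON string in a single text column and compared in application code rather than inside the database. The helper names (`l1_distance`, `decimate`) are hypothetical, and the sum of absolute bucket differences is just one possible similarity measure, not necessarily the right one for your images.

```python
import json


def l1_distance(a, b):
    """Sum of absolute per-bucket differences (one possible measure)."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of buckets")
    return sum(abs(x - y) for x, y in zip(a, b))


def decimate(hist, n_buckets):
    """Merge adjacent buckets so that only n_buckets remain."""
    if len(hist) % n_buckets != 0:
        raise ValueError("bucket count must be divisible by n_buckets")
    step = len(hist) // n_buckets
    return [sum(hist[i:i + step]) for i in range(0, len(hist), step)]


# The two green histograms from the question.
green_a = [15, 12, 22, 9]
green_b = [14, 12, 22, 9]

# Option 1: serialize the full histogram into one text field.
stored = json.dumps(green_a)
restored = json.loads(stored)

# The two images differ by a single pixel in the first bucket.
distance = l1_distance(restored, green_b)

# Option 2: keep a coarser histogram (2 buckets here, since 4 is not divisible by 3).
coarse = decimate(green_a, 2)
```

With this layout, 50 buckets per channel still occupy one field per channel (or one field in total) instead of 150 columns, at the cost of doing the comparison outside the database.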