I have the following data:

data = [9 12 8 11 10 10 11 12 11 11 12 8 11 9 9 12 8 11 10 12 8 11 12 8 11 10 18 20 10 18 20 24 28 30 31 32 33 33 34 33 32 34 35 32 33 31 30 37 38 39 40 39 40 38 37 40 41 38 37 36 33 32 34 35 32 33 31 30 32 34 31 30 28 26 25 23 20 18 15 12 11 10 10 11 12 11 11 9 12 8 11 10 12 8 9 11 7 9 10 7 3 4 2 1 5 4 3 7 8 9 10 5 4 6 11 12 8 11 12 12 13 14 15 12 13 15 15 16 17 18 19 17 16 15 19 20 21 20 18 17 16 19 21 18 20 24 28 30 31 32 33 30 31 35 36 38 40 42 41 41 42 46 48 47 45 42 40 41 43 40 39 39 38 36 32 34 31 30 28 26 25 23 20 18 15 12 11 10 12 8 9 11 7 9 10 12 11 10 11 10 11 12 11 11 9 12 8 11 10 10 11 12 11 11 9 12 8 11 10 14 15 18 17 16 15 18 19 14 15 17 18 19 20 21 19 18 17 16 15 14 12 15 15 15 16 17 18 18 19 20 20 20 21 19 17 16 17 18 19 22 24 26 29 30 35 37 40 42 44 44 46 48 48 49 50 50 51 52 53 51 52 55 58 57 51 56 57 52 50 49 52 52 51 49 48 46 46 42 43 45 41 42 40 41 43 40 38 37 34 32 28 25 19 15 14 13 12 8 5 3 7 8 9 11 12 10 7 9 14 13 11 10 8 5 4 3 5 7 8 9 10 12 11 6 8 10 7 6 8];

and I need to find 3 peaks in this data vector. For example:

IMAGE

What mathematical function can I use to do so?

Note 1: I always need to find 3 peaks (this value doesn't change), but the X and Y axis range may change.

Note 2: The last peak/curve will always be higher than the other two.

Note 3: I'm not a mathematician. :)

Note 4: I am using MATLAB for this.

1 Answer

If you can assume thresholds for the peaks (for instance a lower one $Y<20$ and a higher one $Y>30$ from your picture), then it is quite easy to compute the peaks.

Set the initial maximum to zero, then compare each successive value in the array with the current maximum (i.e. the largest value found since the last reset). Each time a value falls below the lower threshold, reset the maximum. Just before the reset, the recorded maximum counts as a true peak if it is at least as large as the higher threshold.


Update on 02/02:

Here is some C code implementing what I described above:

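#define low_threshold 20
#define high_threshold 30

/* data[] holds the samples and n is the number of elements in data */
int i, max = 0, imax = 0;

for(i=0; i<n; i++)
{
    /* track the largest value seen since the last reset */
    if(data[i] > max) { imax=i; max=data[i]; }

    /* falling below the low threshold closes the current region */
    if(data[i] < low_threshold)
    {
        if(max >= high_threshold) printf("peak : data[%d]=%d\n", imax, max);

        max = 0;
        imax = 0;
    }
}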

And you get the expected result:

peak : data[56]=41
peak : data[162]=48
peak : data[282]=58

This works well because the data has well-separated peaks of approximately the same height, so we were able to set the thresholds manually.

Assuming your data always looks like this, you can compute the absolute max and set, for instance, the thresholds to $\max/2 \pm \max/10$.
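
A minimal sketch of that adaptive variant, assuming non-negative integer samples. The names `Peak` and `find_peaks_auto` are made up for illustration, and the thresholds are computed as $\max/2 - \max/10$ and $\max/2 + \max/10$:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: position and value of one detected peak. */
typedef struct { size_t index; int value; } Peak;

/*
 * Threshold method with thresholds derived from the data itself:
 * low = max/2 - max/10, high = max/2 + max/10.
 * Assumes non-negative samples. Stores at most `cap` peaks in `out`
 * and returns the number of peaks stored.
 */
size_t find_peaks_auto(const int *data, size_t n, Peak *out, size_t cap)
{
    assert(data != NULL && out != NULL);

    /* Pass 1: absolute maximum of the whole series. */
    int top = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] > top) top = data[i];

    double low  = top / 2.0 - top / 10.0;
    double high = top / 2.0 + top / 10.0;

    /* Pass 2: same scan as the fixed-threshold version. */
    size_t count = 0, imax = 0;
    int max = 0;
    for (size_t i = 0; i < n; i++) {
        /* largest value seen since the last reset */
        if (data[i] > max) { imax = i; max = data[i]; }

        /* falling below the low threshold closes the current region */
        if (data[i] < low) {
            if (max >= high && count < cap) {
                out[count].index = imax;
                out[count].value = max;
                count++;
            }
            max = 0;
            imax = 0;
        }
    }
    return count;
}
```

On the data from the question the maximum is 58, which gives thresholds of 23.2 and 34.8. Only the three humps exceed 34.8, so the same three peaks should be reported. As with the fixed thresholds, a peak is only reported once the series drops below the low threshold again, so a hump that is still high when the data ends would be missed.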


But there is no miracle recipe for general data with a lot of oscillation or many local peaks. Take for instance the following example: $$2x+10\sin(x)+2\cos(10x)$$

Because the value reached at each peak grows, as does the minimum in each valley, the threshold method no longer works. In this case you would have to smooth the data and do some data analysis to detect reasonably lengthy ascending and descending runs, and declare a local peak when the difference between the max and the min is reasonably large. And again, you'll have to define what "reasonable" means for your data.
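
One possible reading of this, as a rough sketch rather than a definitive recipe: smooth with a centered moving average, then accept a local maximum only if the series rose by at least some amount to reach it and fell by at least the same amount afterwards. The function names `smooth` and `find_swing_peaks` and the parameters `half` and `min_swing` are assumptions introduced here; `min_swing` is where "reasonable" gets defined.

```c
#include <assert.h>
#include <stddef.h>

/* Centered moving average over a window of 2*half+1 samples
   (the window is truncated at both ends of the array). */
void smooth(const double *in, double *out, size_t n, size_t half)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t lo = (i >= half) ? i - half : 0;
        size_t hi = (i + half < n) ? i + half : n - 1;
        double sum = 0.0;
        for (size_t j = lo; j <= hi; j++) sum += in[j];
        out[i] = sum / (double)(hi - lo + 1);
    }
}

/* A maximum is tracked only after a rise of at least `min_swing` above
   the preceding minimum, and reported once the series has fallen at
   least `min_swing` below it. Stores at most `cap` peak indices in
   `idx` and returns how many were stored. */
size_t find_swing_peaks(const double *y, size_t n, double min_swing,
                        size_t *idx, size_t cap)
{
    if (n == 0) return 0;
    assert(y != NULL && idx != NULL);

    size_t count = 0, imax = 0;
    double max = y[0], min = y[0];
    int rising = 0;              /* 0: tracking a minimum, 1: a maximum */

    for (size_t i = 1; i < n; i++) {
        if (rising) {
            if (y[i] > max) { max = y[i]; imax = i; }
            else if (max - y[i] >= min_swing) {
                /* large enough descent: the tracked maximum is a peak */
                if (count < cap) idx[count++] = imax;
                min = y[i];
                rising = 0;
            }
        } else {
            if (y[i] < min) min = y[i];
            else if (y[i] - min >= min_swing) {
                /* large enough ascent: start tracking a maximum */
                max = y[i]; imax = i;
                rising = 1;
            }
        }
    }
    return count;
}
```

For the example function, the fast $2\cos(10x)$ ripple is about 4 from top to bottom, while the slow oscillation of $2x+10\sin(x)$ drops by roughly 14 between a local maximum and the next minimum, so a `min_swing` somewhere between those two values would be a plausible starting point.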


So as you can see, you'll have to make assumptions about your data anyway. From the picture you provided, the threshold method seems well suited.

  • Thanks, @zwim. Do you mean something like [this](https://i.stack.imgur.com/DSzsN.png)? Then I need to search for the max only inside the "region of interest"? I think that is entirely possible, but how can I define the lower and upper thresholds? Using the mean or median? 2017-02-01
  • Thanks a lot, @zwim. Actually, in my real data, the value reached at each peak grows, as does the minimum in the valleys, but I am using linear regression to "put it down", because I don't need to know the exact Y-axis value. As you said, I think the threshold method should work. 2017-02-02