1

I need some advice from the mathematicians here.

Once per second I recorded the number of instructions executed by a processor and the number of loads/stores it performed, and stored the two arrays of values in 2 files.

I also simultaneously collected data of the energy consumed every second by the entire computer and stored it in a file.

For example, the data looks like this:

    Instructions/sec   Loads and stores/sec   Energy consumed by computer/sec
    12344              2134124                234
    345345             2342123                234
    324525             4443214                345
    345324             1432132                234
    .                  .                      132
    .                  .                      321
    .                  .

Assume that

[A] is the array of the number of instructions executed per second,

[B] is the array of loads and stores per second, and

[C] is the array of energy values per second.

How can I find an equation connecting A, B and C?

Should I use curve fitting? What approach should I employ? Eigenvectors? I'm a bit out of touch with mathematics since it's been quite a long time.

The equation can be of any form, not just [A]x + [B]y = [C].

  • 2
    If you allow equations of "any form", there are infinitely many equations that will exactly fit the data. So you shouldn't be asking "How can I find *any* equation that connects these variables?" but rather, "Given this data, how should I try to find what the actual relationship is between these variables?" This is a problem in regression analysis, and it seems to me that not wanting to specify the form of the equation could introduce many subtle complications. You would probably get a better discussion at the statistical analysis site, http://stats.stackexchange.com/ 2011-05-01
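
To make the comment's point concrete, here is a minimal Python sketch (assuming NumPy is available). It takes the first four energy values from the question, indexed by a made-up sample number `t`, and shows that a whole family of polynomials passes through them exactly. The names `t`, `cubic`, `vanishing` and `exact_fit` are illustrative choices, not anything from the original post.

```python
import numpy as np

# Toy data: a hypothetical sample index and the first four energy values from the question.
t = np.array([0.0, 1.0, 2.0, 3.0])
energy = np.array([234.0, 234.0, 345.0, 234.0])

# A cubic has four coefficients, so it can pass through four points exactly.
cubic = np.polyfit(t, energy, deg=3)

# Coefficients of (x - 0)(x - 1)(x - 2)(x - 3), which is zero at every sample point.
vanishing = np.poly(t)

def exact_fit(k):
    """Return another polynomial that still hits every data point, for any k."""
    return np.polyadd(cubic, k * vanishing)

for k in (0.0, 1.0, -2.5):
    residual = np.polyval(exact_fit(k), t) - energy
    print(f"k = {k}: largest miss at the data points = {np.max(np.abs(residual)):.2e}")
```

Every value of `k` gives a different curve between the samples, yet all of them match the data, which is why an exact fit by itself says nothing about the real relationship.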

1 Answer

2

As a case of Mom's Corollary, I'm going to speak on behalf of a few statistical packages. I'm assuming that your final goal is just to find a relationship between the instructions, loads, and energy per second rather than learning how to do the math by hand to find such a relationship. The problem with doing these things by hand is two-fold: there seems to be too much data for you to reasonably manipulate by hand, and regression analysis is not necessarily an easy task.

If you suspect that the relationship is linear, then I point you to Wikipedia's page on linear regression. In particular, there is a method called "Ordinary Least Squares" (OLS) that is very useful. I also point you to this site that happens to do OLS for you. All you have to do is input the numbers, and it calculates the fit for you.
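
If you would rather run the linear case yourself, here is a minimal sketch in Python, assuming NumPy is installed. It fits C ≈ x·A + y·B + z by least squares. The helper name `fit_linear` and the added constant term `z` are my own choices for illustration, and only the four complete rows from the question are used, so the resulting numbers mean very little.

```python
import numpy as np

def fit_linear(a, b, c):
    """Least-squares fit of c ~ x*a + y*b + z; returns the array [x, y, z]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    # One column per predictor, plus a column of ones for the constant term.
    design = np.column_stack([a, b, np.ones_like(a)])
    coeffs, *_ = np.linalg.lstsq(design, c, rcond=None)
    return coeffs

# The four complete rows from the question.
A = [12344, 345345, 324525, 345324]
B = [2134124, 2342123, 4443214, 1432132]
C = [234, 234, 345, 234]

x, y, z = fit_linear(A, B, C)
print(f"C ~ {x:.3g} * A + {y:.3g} * B + {z:.3g}")
```

With the full per-second logs loaded into `A`, `B` and `C`, the same call applies unchanged; it is worth plotting the residuals afterwards to see whether a straight-line model is reasonable at all.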

On the other hand, if you want to do more, then I suggest you look into some statistical software. There are some excellent statistical software packages out there: MATLAB happens to be truly excellent at regression (fitting a curve to data) and has a function that will test literally thousands of different curves against the data to find the best fit. If you would prefer a free alternative, there are also some excellent open-source options: I recommend Octave and R.
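
As a much smaller version of the "try several curves" idea, here is a hedged Python sketch (assuming NumPy) that compares three candidate model forms by their error on held-out samples. It is not how any particular package does it; the model names, the helpers `design` and `holdout_rmse`, and the synthetic data are all assumptions made up for illustration.

```python
import numpy as np

def design(a, b, kind):
    """Build the design matrix for one of a few hypothetical candidate model forms."""
    ones = np.ones_like(a)
    if kind == "linear":
        cols = [a, b, ones]
    elif kind == "interaction":
        cols = [a, b, a * b, ones]
    elif kind == "quadratic":
        cols = [a, b, a * b, a**2, b**2, ones]
    else:
        raise ValueError(f"unknown model kind: {kind}")
    return np.column_stack(cols)

def holdout_rmse(a, b, c, kind, n_train):
    """Fit on the first n_train samples and return the RMSE on the remaining ones."""
    train = design(a[:n_train], b[:n_train], kind)
    coeffs, *_ = np.linalg.lstsq(train, c[:n_train], rcond=None)
    pred = design(a[n_train:], b[n_train:], kind) @ coeffs
    return float(np.sqrt(np.mean((pred - c[n_train:]) ** 2)))

# Synthetic stand-in data (not the asker's measurements), scaled to order 1.
rng = np.random.default_rng(0)
a = rng.uniform(0.0, 1.0, 200)
b = rng.uniform(0.0, 1.0, 200)
c = 100.0 + 40.0 * a + 25.0 * b + rng.normal(0.0, 1.0, 200)

for kind in ("linear", "interaction", "quadratic"):
    print(kind, holdout_rmse(a, b, c, kind, n_train=150))
```

Scoring on samples that were not used for fitting is the important design choice here: a more flexible form always fits the training data at least as well, so only held-out error tells you whether the extra terms capture something real.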

As I happen to be very partial to R for some reason, I will also direct you to their wiki and their website.

As a brief disclaimer: In no way am I affiliated with R personally, and I note that even though I link to their page I by no means am trying to give them any revenue (it's open source, there's not really revenue to be had) nor garner any advantage.

  • 0
    @Sharat: you can, it just represents a line drawn through a higher-dimensional space. As you add more dimensions, i.e. add more variables, more complex relationships become possible. 2011-05-02