
I want to do a multiple regression on an experimental result, shown as a 3D plot and a heatmap in the following images. Sorry, as a new user I am not allowed to post them directly, so they are just links to imgur!

3D plot: (imgur link)

Heatmap: (imgur link)

If I just look at the images, there is obviously a connection between the variables and the output.

At the moment I get a very bad coefficient of determination ($R^2$), even though I have already tried several combinations.

Let's say $x_1$ is the first variable and $x_2$ is the second. $Y$ is the result depending on $x_1$ and $x_2$, and $x_1x_2$ is the product of $x_1$ and $x_2$. I also define the reciprocal of every value as $x_{1_{re}}$, $x_{2_{re}}$ and $\left(x_1 x_2\right)_{re}$ (for example $x_{1_{re}} = \frac{1}{x_1}$).

The best $R^2$ I can obtain is $0.49$, with the following formula: $y \approx x_{1_{re}} + x_{2_{re}} + \left(x_1 x_2\right)_{re}$.

If I use a simple model like $y \approx x_1 + x_2 + x_1x_2$, it gets even worse, down to $0.32$.
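To make the two candidate models concrete, here is a minimal sketch of how they can be fitted and compared (Python with statsmodels assumed; the toy data frame and its generating function are only placeholders, not my experimental values):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Placeholder data standing in for the experimental (x1, x2, y) values.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x1": rng.uniform(1.0, 10.0, 200),
    "x2": rng.uniform(1.0, 10.0, 200),
})
df["y"] = 1 / df["x1"] + 1 / df["x2"] + rng.normal(0, 0.05, len(df))

# Reciprocal model: y ~ 1/x1 + 1/x2 + 1/(x1*x2)
recip = smf.ols("y ~ I(1/x1) + I(1/x2) + I(1/(x1*x2))", data=df).fit()

# Simple linear model with an interaction term: y ~ x1 + x2 + x1*x2
linear = smf.ols("y ~ x1 + x2 + x1:x2", data=df).fit()

print("reciprocal model R^2:", recip.rsquared)
print("linear model R^2:    ", linear.rsquared)
```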

Can somebody help me out and point me in the right direction?

Is the regression really that bad? Is there perhaps another formula I should try?

  • 1
    This might be better received over at stats.stackexchange.com. That said, we don't call it Multiple Linear Regression for nothing. Your data seems to oscillate significantly, at least based on the 3D plot. If you could post a view of the data as a point cloud, it would be easier (for me at least) to visualize. Given how dispersed the data seems, we wouldn't expect a linear model, even with an interaction term, to have a particularly high $R^2$. Where is this data drawn from? (2012-06-08)

1 Answer


Thanks for your help. This data is drawn from an experiment using a combination of Artificial Neural Networks and Simulated Annealing. In short:

1. Generate a specified number of data sets from a given function (X, Y, Z).
2. Train a Multi-Layer Perceptron (MLP) on them.
3. Run Simulated Annealing, presenting new data to the MLP, to find an optimum.

This data is the combination of:

X = number of data sets
Y = hidden neurons of the MLP
Z = difference between the optimum obtained from SA and the real optimum
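A rough sketch of that pipeline, assuming Python with scikit-learn and SciPy (the test function f, its bounds, and the sizes are placeholders; SciPy's dual_annealing stands in for the simulated annealing step):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from scipy.optimize import dual_annealing

def f(x):
    # Placeholder objective with a known optimum at (0, 0).
    return x[0] ** 2 + x[1] ** 2

n_datasets, n_hidden = 200, 10            # X and Y of the experiment
bounds = [(-5.0, 5.0), (-5.0, 5.0)]

# 1. Generate the specified number of samples from the given function.
rng = np.random.default_rng(0)
samples = rng.uniform(-5.0, 5.0, size=(n_datasets, 2))
targets = np.array([f(s) for s in samples])

# 2. Train a multi-layer perceptron on those samples.
mlp = MLPRegressor(hidden_layer_sizes=(n_hidden,), max_iter=5000).fit(samples, targets)

# 3. Run simulated annealing against the trained MLP as the objective.
result = dual_annealing(lambda x: float(mlp.predict(x.reshape(1, -1))[0]), bounds)

# Z: difference between the SA optimum found on the MLP and the real optimum.
z = abs(result.fun - f(np.array([0.0, 0.0])))
print("Z =", z)
```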

I just drew a scatterplot for visualization, and I think you are right: the data oscillates too much for a good $R^2$. Are there any ideas left for finding an approximation?

http://i.imgur.com/vIO78.png
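For reference, a 3D scatterplot like the linked one can be produced with something along these lines (Python with matplotlib assumed; the arrays are only placeholders for the experimental columns):

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder columns standing in for (x1, x2, y).
rng = np.random.default_rng(0)
x1 = rng.uniform(1.0, 10.0, 200)
x2 = rng.uniform(1.0, 10.0, 200)
y = 1 / x1 + 1 / x2 + rng.normal(0, 0.2, 200)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")   # 3D axes for the point cloud
ax.scatter(x1, x2, y)
ax.set_xlabel("x1")
ax.set_ylabel("x2")
ax.set_zlabel("y")
plt.show()
```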

  • 0
    What is black? What is red? (2012-11-23)