
I'm trying to get a linear regression slope and intercept for a large set of huge numbers.

I'm doing this on a computer, but I keep getting overflow errors (attempting to calculate a number too large for a standard data type). I figured I'd ask this here, since it's primarily a math question.

How can I "normalize" the input set, so that I don't have an overflow? Or perhaps, there is another method for calculating the slope and intercept that wouldn't result in multiplication of all the X values, summation of X*Y, etc.

(y, x)
2103.00 @ 1233687329.20
2104.00 @ 1233687329.50
2103.00 @ 1233687329.20
2104.00 @ 1233687329.50
2105.00 @ 1233687329.80
2106.00 @ 1233687330.10
2107.00 @ 1233687330.40
2108.00 @ 1233687330.70
2109.00 @ 1233687331.00
2110.00 @ 1233687331.30
2111.00 @ 1233687331.60
2112.00 @ 1233687331.90
2113.00 @ 1233687332.20
2114.00 @ 1233687332.50
2115.00 @ 1233687332.80
2116.00 @ 1233687333.10
2117.00 @ 1233687333.40
2118.00 @ 1233687333.70
2119.00 @ 1233687334.00

For example, trying to get the slope / intercept for this data set in Excel or Numbers will just result in an error.

Is there a way to normalize the set prior to doing the regression (and after to get the right answer), or perhaps a less intensive way of getting the regression?


Update: Normalizing by subtracting a constant from x (here 5) doesn't work directly.

x   (x-5)   y
1   -4      1
2   -3      2
3   -2      4

slope works fine: 1.5

intercept non-"normalized": -0.66666

intercept "normalized": 6.83333 <-- problem, can't just add 5 to intercept to get the value

  • 0
    To get the non-normalised intercept from the normalised one, you need to subtract 5*slope = 7.5, which gives 6.83333 - 7.5 = -0.66666. (2011-09-05)
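
The correction in the comment above can be checked numerically on the three-point example from the question. This is only a minimal sketch: `fit_line` is a hypothetical helper name (not from the post) that computes an ordinary least-squares fit from mean-centered sums.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept, computed from mean-centered sums.

    Hypothetical helper for illustration; not part of the original post.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Toy data from the question's update.
xs = [1, 2, 3]
ys = [1, 2, 4]
c = 5  # the constant subtracted from x

m, b = fit_line(xs, ys)                                # fit on the raw x values
m_shift, b_shift = fit_line([x - c for x in xs], ys)   # fit on x - 5

# The slope is unchanged by the shift; the intercept is recovered by
# subtracting c * slope, not by adding c.
recovered = b_shift - m_shift * c

print(m, b)              # slope and intercept, raw x
print(m_shift, b_shift)  # slope and intercept, shifted x
print(recovered)         # should match the raw intercept
```

The shift changes only the intercept, which is why adding 5 back to it cannot work: the intercept has units of y, while the shift is in units of x.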

1 Answer

Subtract 1233687329 from each of your x values (in other words, do the change of variables $t = x - 1233687329$). Then you can change back to $x$ if you wish, although for most purposes $t$ would be a more sensible variable to use.

  • 2
    If the equation in the $(t,y)$ variables is $y = m t + b$, and $t = x - c$, then $y = m (x - c) + b = m x + (b - m c)$. So the $y$ intercept when using the $(x,y)$ variables is $b - m c$. (2011-09-05)
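
The change of variables can be sketched in Python on the data from the question. This is an illustration under stated assumptions, not a definitive implementation: `fit_line` is a hypothetical helper, the shift `c = 1233687329` is the one suggested in the answer, and the values are stored as ordinary double-precision floats.

```python
def fit_line(ts, ys):
    """Least-squares slope and intercept from mean-centered sums.

    Hypothetical helper for illustration; not part of the original post.
    """
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sty = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    stt = sum((t - t_mean) ** 2 for t in ts)
    slope = sty / stt
    return slope, y_mean - slope * t_mean


# (y, x) pairs exactly as listed in the question, repeats included.
data = [
    (2103.00, 1233687329.20), (2104.00, 1233687329.50),
    (2103.00, 1233687329.20), (2104.00, 1233687329.50),
    (2105.00, 1233687329.80), (2106.00, 1233687330.10),
    (2107.00, 1233687330.40), (2108.00, 1233687330.70),
    (2109.00, 1233687331.00), (2110.00, 1233687331.30),
    (2111.00, 1233687331.60), (2112.00, 1233687331.90),
    (2113.00, 1233687332.20), (2114.00, 1233687332.50),
    (2115.00, 1233687332.80), (2116.00, 1233687333.10),
    (2117.00, 1233687333.40), (2118.00, 1233687333.70),
    (2119.00, 1233687334.00),
]
ys = [y for y, _ in data]
xs = [x for _, x in data]

c = 1233687329            # shift suggested in the answer
ts = [x - c for x in xs]  # t = x - c, now small numbers near 0..5

m, b_t = fit_line(ts, ys)  # fit y = m*t + b_t in the shifted variable
b_x = b_t - m * c          # intercept in the original x variable

print(f"slope            m   = {m:.6f}")
print(f"intercept (t,y)  b_t = {b_t:.6f}")
print(f"intercept (x,y)  b_x = {b_x:.6f}")
```

For most uses it is probably better to keep `m` and `b_t` and evaluate `m * (x - c) + b_t`. The intercept `b_x` is a very large number, so the sum `m * x + b_x` cancels two huge terms and loses precision.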