I'm trying to get a linear regression slope and intercept for a large set of huge numbers.
I'm doing this on a computer, but I keep getting overflow errors (attempting to calculate a number too large for a standard data type). I figured I'd ask this here, since it's primarily a math question.
How can I "normalize" the input set so that I don't get an overflow? Or perhaps there is another method for calculating the slope and intercept that doesn't involve multiplying the X values together, summing X*Y, etc.
(y, x)
2103.00 @ 1233687329.20
2104.00 @ 1233687329.50
2103.00 @ 1233687329.20
2104.00 @ 1233687329.50
2105.00 @ 1233687329.80
2106.00 @ 1233687330.10
2107.00 @ 1233687330.40
2108.00 @ 1233687330.70
2109.00 @ 1233687331.00
2110.00 @ 1233687331.30
2111.00 @ 1233687331.60
2112.00 @ 1233687331.90
2113.00 @ 1233687332.20
2114.00 @ 1233687332.50
2115.00 @ 1233687332.80
2116.00 @ 1233687333.10
2117.00 @ 1233687333.40
2118.00 @ 1233687333.70
2119.00 @ 1233687334.00
For example, trying to get the slope / intercept for this data set in Excel or Numbers will just result in an error.
Is there a way to normalize the set prior to doing the regression (and after to get the right answer), or perhaps a less intensive way of getting the regression?
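To make the question concrete, here is a minimal sketch of one common way to "normalize": subtract the mean of X and the mean of Y before forming any products, so that only small differences get multiplied. The function name fit_line_centered is made up for illustration, and this is only one possible approach, written under the assumption that the values fit in ordinary double-precision floats.

```python
def fit_line_centered(xs, ys):
    """Least-squares slope and intercept using mean-centered sums.

    Hypothetical helper for illustration: it never forms x*x or x*y
    on the raw values, only on deviations from the means.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of small deviations instead of huge raw values.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # Convert back to an intercept on the original X axis.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The (y, x) pairs from the data set in the question.
ys = [2103.0, 2104.0, 2103.0, 2104.0, 2105.0, 2106.0, 2107.0, 2108.0,
      2109.0, 2110.0, 2111.0, 2112.0, 2113.0, 2114.0, 2115.0, 2116.0,
      2117.0, 2118.0, 2119.0]
xs = [1233687329.20, 1233687329.50, 1233687329.20, 1233687329.50,
      1233687329.80, 1233687330.10, 1233687330.40, 1233687330.70,
      1233687331.00, 1233687331.30, 1233687331.60, 1233687331.90,
      1233687332.20, 1233687332.50, 1233687332.80, 1233687333.10,
      1233687333.40, 1233687333.70, 1233687334.00]

slope, intercept = fit_line_centered(xs, ys)
print(slope, intercept)
```

Since Y rises by 1 for every 0.3 in X in this data, the slope should come out close to 10/3. The intercept is a very large negative number because the line is being extended all the way back to X = 0.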
Update: Normalizing by subtracting a constant from X doesn't work.
x   (x-5)   y
1   -4      1
2   -3      2
3   -2      4
slope works fine: 1.5
intercept non-"normalized": -0.66666
intercept "normalized": 6.83333 <-- problem, can't just add 5 to intercept to get the value
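For what it's worth, the two intercepts above do appear to be related, just not by adding 5. If x' = x - 5 and the shifted fit is y = m*x' + b', then y = m*x + (b' - 5*m), so the original intercept would be b' - 5*m. A small sketch to check this on the three points above (fit_line and shift are names chosen for illustration, not from any library):

```python
def fit_line(xs, ys):
    """Plain least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


shift = 5  # the constant subtracted from x in the table above
xs = [1, 2, 3]
ys = [1, 2, 4]

slope, intercept = fit_line(xs, ys)
slope_shifted, intercept_shifted = fit_line([x - shift for x in xs], ys)

# Undo the shift: multiply it by the slope and subtract from the intercept.
recovered = intercept_shifted - slope_shifted * shift

print(slope, intercept)                  # fit on the raw x values
print(slope_shifted, intercept_shifted)  # fit on the shifted x values
print(recovered)                         # should match the raw intercept
```

The slope is unchanged by the shift, and the recovered value should agree with the non-"normalized" intercept up to rounding.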