I have a set of points in 3D (x, y, z). I ordered these points from lowest to highest. I want to use linear regression to fit a line through these ordered points and then find the break point where the greatest residual occurs.
How do I fit a model with piecewise linear regression?
-
@anh: please do not use answers to make comments. – 2011-04-26
1 Answer
I assume that you know how to do linear regression (if not, you can Google it). To find the optimal break point, you have to iterate over all possible breakpoints. If you calculate all the sums that you need from scratch for each breakpoint, the number of required operations is quadratic in the number of points. You can do this more efficiently, with the number of operations linear in the number of points, as follows:
Start out with the breakpoint at one end (so all points are on one side and none on the other), and calculate the sums you need for the regression ($\sum x_iy_i$ etc.). These will all be $0$ on the empty side and will include all the data points on the other side. Then in each step move the breakpoint by one, and instead of recalculating all the sums from scratch, just add the appropriate term (e.g. $x_iy_i$ if you're moving the breakpoint past data point $i$) to the one sum (the one that started out empty) and subtract it from the other. That only requires a constant number of operations for each potential position of the breakpoint.
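Here is a minimal sketch of that single-pass scan in Python, assuming the 2D case where $y$ is regressed on $x$ and the points are already sorted; for 3D points you would keep the analogous sums for each coordinate (e.g. regress $y$ and $z$ on $x$ separately) and add the residual sums of squares. The names `segment_sse` and `best_breakpoint`, and the choice of picking the breakpoint that minimizes the combined residual sum of squares of the two segments, are my own illustration, not something fixed by the question.

```python
import numpy as np

def segment_sse(n, sx, sy, sxx, sxy, syy):
    """Residual sum of squares of the least-squares line through one segment,
    computed from its running sums; segments with fewer than 2 points get 0."""
    if n < 2:
        return 0.0
    denom = n * sxx - sx * sx
    if denom == 0.0:               # all x identical: fit the mean of y instead
        return syy - sy * sy / n
    b = (n * sxy - sx * sy) / denom          # slope
    a = (sy - b * sx) / n                    # intercept
    # standard OLS identity: SSE = Syy - a*Sy - b*Sxy once a, b solve the normal equations
    return syy - a * sy - b * sxy

def best_breakpoint(x, y):
    """Scan every possible breakpoint in one pass, updating the sums incrementally."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    # left segment starts empty, right segment starts with all the points
    left  = dict(n=0, sx=0.0, sy=0.0, sxx=0.0, sxy=0.0, syy=0.0)
    right = dict(n=n, sx=x.sum(), sy=y.sum(), sxx=(x * x).sum(),
                 sxy=(x * y).sum(), syy=(y * y).sum())
    best_k, best_sse = None, np.inf
    for k in range(1, n):          # break between point k-1 and point k
        xi, yi = x[k - 1], y[k - 1]
        # move point k-1 from the right segment to the left one: O(1) work
        for key, term in (("n", 1), ("sx", xi), ("sy", yi),
                          ("sxx", xi * xi), ("sxy", xi * yi), ("syy", yi * yi)):
            left[key] += term
            right[key] -= term
        sse = segment_sse(**left) + segment_sse(**right)
        if sse < best_sse:
            best_k, best_sse = k, sse
    return best_k, best_sse
```

Note that, as the comment below points out, a segment containing only one or two points gets a trivially small residual, so in practice you may want to restrict the scan to breakpoints that leave some minimum number of points on each side.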
-
In the case with 2 lines fitted to the data: if you aim for large correlation values instead of similar correlation values, you can easily get a solution where one line is fitted to only 2 data points and therefore seems to have an excellent fit, but the regression depth is bad (few data points per fitted line). [Regression depth](http://www2.parc.com/csl/members/bern/regression.html) – 2011-04-26