
I am trying to figure out the most efficient way to implement a cyclically updating regression-line algorithm, using least squares, for streaming time-series data. In other words, given $LS(s)$, I am looking for a way to compute the new least-squares fit after adding an element to $s$ and removing the $k$th oldest element from it.

Given a set $s$ of $(x,y)$ samples, I was thinking it might be possible to compute $LS(s\cup\{a\})$ (for some new sample $a$) and $LS(s\setminus\{a\})$ (for some sample $a$ in $s$, specifically the oldest sample) given $LS(s)$. Is there a known solution to this?

  • 0
    It is not clear what you are trying to do. When you are varying the sample, what are you dropping? Observations or variables? If variables, maybe you are looking at the problem of how to calculate one OLS coefficient without calculating the other. See whuber's answer here: http://stats.stackexchange.com/questions/46151/how-to-derive-the-least-square-estimator-for-multiple-linear-regression (2017-02-17)
  • 0
    In every iteration I want to add a new sample and remove the $k$th oldest one. An example of such a running algorithm is the online algorithm that calculates variance here: https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance I want the same, only for least squares instead of variance, and with the ability to remove the $k$th oldest element from the calculation. (2017-02-17)

1 Answer

1

Assuming that I understand the question properly:

If you use the normal equations for the regression $y=a+bx$, then $a$ and $b$ are given by solving $$S_y=n a+b S_x$$ $$S_{xy}=a S_x+b S_{xx}$$ Now, you want to add point $n+1$ and remove point $1$ from the regression set. So the equations become $$S'_y=n a'+b' S'_x$$ $$S'_{xy}=a' S'_x+b' S'_{xx}$$ Computing the terms, we have $$S'_y=S_y+y_{n+1}-y_1$$ $$S'_x=S_x+x_{n+1}-x_1$$ $$S'_{xy}=S_{xy}+x_{n+1}y_{n+1}-x_1y_1$$ $$S'_{xx}=S_{xx}+x_{n+1}^2-x_1^2$$ which makes the updating process quite simple.

Of course, when this is done, do not forget to make the update $S=S'$ and to shift the indices ($k \to k-1$) to be ready for the next step.
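For concreteness, here is a minimal Python sketch of this sliding-window update, maintaining the running sums $S_x$, $S_y$, $S_{xy}$, $S_{xx}$ from the equations above (the class and method names are illustrative, not standard):

```python
from collections import deque

class SlidingWindowRegression:
    """Sliding-window least squares y = a + b*x with O(1) add/remove updates."""

    def __init__(self, window_size):
        self.n = window_size
        self.points = deque()              # the last n (x, y) pairs
        self.sx = self.sy = self.sxy = self.sxx = 0.0

    def update(self, x, y):
        """Add the new sample (x, y); evict the oldest once the window is full."""
        if len(self.points) == self.n:
            x1, y1 = self.points.popleft() # drop the oldest point
            self.sx -= x1
            self.sy -= y1
            self.sxy -= x1 * y1
            self.sxx -= x1 * x1
        self.points.append((x, y))
        self.sx += x
        self.sy += y
        self.sxy += x * y
        self.sxx += x * x

    def coefficients(self):
        """Solve the normal equations for (a, b) over the current window."""
        m = len(self.points)
        det = m * self.sxx - self.sx * self.sx
        b = (m * self.sxy - self.sx * self.sy) / det
        a = (self.sy - b * self.sx) / m
        return a, b
```

Each `update` is constant time regardless of the window size, since only the evicted and inserted points touch the sums; `coefficients` then just solves the $2\times 2$ system.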

  • 0
    This seems to work! Thank you! (2017-02-18)
  • 0
    @OfekRon. You are very welcome! This was a sketch of the idea. You can make it more general, preserving all $(x_i,y_i)$, considering that at a given time you have $M$ data points and that you perform the regression using the last $n$ of them. (2017-02-19)
  • 0
    What if $x$ gets bigger while $x_i-x_{i-1}$ is constant ($1$ in my case)? Since I'm only interested in the regression line for the last $n$ items, is it possible to reuse $x_0$ to $x_n$ ($0$ to $n$ in my case) as the $x$'s? If so, how would you change your iteration step? (2017-02-20)
  • 0
    How do you apply weightings to this? (2017-02-20)
  • 0
    @OfekRon. Same idea, provided the weights are independent of each other. Just write the normal equations again with weights and proceed the same way. (2017-02-21)
  • 0
    Can you edit the answer to include the formulas with weights? You can also answer me here: http://math.stackexchange.com/questions/2153699/how-to-apply-weightings-to-least-squares-slope-formula (2017-02-21)