I have implemented in MATLAB the recursive least-squares (RLS) algorithm given in, for instance, Hayes's excellent *Statistical Digital Signal Processing and Modeling* (p. 541ff).
However, when I run the algorithm on real data with a forgetting factor $\lambda < 1$ (e.g. $\lambda = 0.9$), after about 2000 updates I always reach a point where my filter coefficients explode in size and obviously overflow. Stepping through the algorithm with a debugger reveals that the entries of $P$, the inverse correlation matrix, grow exponentially, which I believe is due to the division by $\lambda$ at each time step.
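In case it helps, here is a minimal real-valued sketch of the update I am performing each sample (the variable names are mine, not Hayes's exact notation):

```matlab
% One update of exponentially weighted RLS (minimal real-valued sketch):
%   x : current input vector (column), d : desired sample,
%   w : filter coefficients,           P : inverse correlation matrix
k = (P * x) / (lambda + x' * P * x);  % gain vector
e = d - w' * x;                       % a priori estimation error
w = w + k * e;                        % coefficient update
P = (P - k * (x' * P)) / lambda;     % here is the division by lambda
```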
I've done a literature search but found very little useful information. I found a paper by Bouchard that recommends forcing $P$ to remain symmetric, but that didn't help.
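Concretely, the symmetrization I tried was essentially this, applied after each update of $P$:

```matlab
P = (P + P') / 2;  % force P back to symmetry after each update
```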
Is there a published version of the RLS algorithm that "fixes" these numerical instability problems? Or does anyone know what else I should try?