4
$\begingroup$

I have a fairly large array, a billion or so by 500,000 array. I need to calculate the singular value decomposition of this array. The problem is that my computer RAM will not be able to handle the whole matrix at once. I need an incremental approach of calculating the SVD. This would mean that I could take one or a couple or a couple hundred/thousand (not too much though) rows of data at one time, do what I need to do with those numbers, and then throw them away so that I can address memory toward getting the rest of the data.

People have posted a couple of papers on similar issues such as http://www.bradblock.com/Incremental_singular_value_decomposition_of_uncertain_data_with_missing_values.pdf and http://www.jofcis.com/publishedpapers/2012_8_8_3207_3214.pdf.

I am wondering if anyone has done any previous research or has any suggestions on how should go on approaching this? I really do need the FASTEST approach, without losing too much accuracy in the data.

  • 0
    Hmm..okay I am also just concerned about the accuracy of the final result, seeing how starting with an SVD based off of many vectors as opposed to a single vector might make a much more precise answer, but I will take a go at this method and start off with one vector to see if it'll still work appropriately.2012-08-09

2 Answers 2

2

You could compute the SVD of randomly chosen submatrices of your original matrix, as shown e.g. in the 2004 paper by Drineas, Frieze, Kannan, Vempala and Vinay, and scale the result to obtain an approximate SVD of the original matrix. There has been quite a bit of additional work on randomized matrix methods since then. The grandfather of all this is the Kaczmarz method of 1939 for solving the problem $Ax = b$, if only one row of $A$ at a time is accessible.

It might also be useful to check if maybe a few top singular values are sufficient for your purposes. If so, Lanczos methods (e.g.) will result in additional time savings.

  • 0
    I'm really liking Lanczos method so far. I think I might go with that! Thank you for your help!2012-08-16
0

So is there any other property that the matrix has? For example, is it sparse or is it symmetric, real/complex, etc... As there are optimized algorithms for various situations, SVD may not be the best option -- it would be helpful to know what you are trying to get from the SVD.

  • 0
    This is more of a comment than an answer...2012-08-15