
If I compute the eigenvalues and eigenvectors using numpy.linalg.eig (from Python), the eigenvalues returned seem to be all over the place. Using, for example, the Iris dataset, the normalized eigenvalues should be [2.9108 0.9212 0.1474 0.0206], but the ones I currently get are [9206.53059607 314.10307292 12.03601935 3.53031167].

The problem I'm facing is that I want to find out what percentage of the variance each component explains, but with the current eigenvalues I don't get the right values.

So, how can I transform my eigenvalues so that they can give me the correct proportion of variance?

Edit: Just in case it wasn't clear, I'm computing the eig of the covariance matrix (The process is called Principal Component Analysis).

  • Do you just not need to normalize your eigenvalues so that the sum is equal to one, or have I misunderstood? (2012-08-20)
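The normalization the comment suggests can be sketched in NumPy; the data here is made up (hypothetical values standing in for Iris), only the transformation matters:

```python
import numpy as np

# Made-up 150x4 dataset standing in for the Iris measurements (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4)) * np.array([3.0, 1.0, 0.4, 0.15])

# PCA via the covariance matrix, as in the question; eigvalsh suits symmetric matrices.
cov = np.cov(X, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]   # sorted, largest first

# Proportion of variance explained by each component: eigenvalue over the sum.
explained = eigvals / eigvals.sum()
print(explained)
```

The proportions sum to one regardless of the absolute scale of the eigenvalues, so this works even when the covariance matrix was built from unscaled data.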

2 Answers


The sum of the eigenvalues equals the trace of the matrix. For an $N \times N$ covariance matrix, this amounts to $N \cdot \mathrm{VAR}$, where $\mathrm{VAR}$ is the variance of each variable (assuming they are all equal; otherwise it would be the mean variance). Put another way, the mean of the eigenvalues equals the mean of the variances.
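This identity is easy to check numerically; a quick sketch on an arbitrary sample covariance matrix:

```python
import numpy as np

# Check: sum of eigenvalues == trace == total variance on the diagonal.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 100))        # 4 variables, 100 observations
cov = np.cov(A)                      # 4x4 sample covariance matrix

eigvals = np.linalg.eigvalsh(cov)
print(eigvals.sum(), np.trace(cov))  # these two agree
# Equivalently, the mean eigenvalue equals the mean variance:
print(eigvals.mean(), np.diag(cov).mean())
```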

And that's pretty much all that can be said. Perhaps you are computing the covariance matrix by just multiplying the data matrices? If so, you should divide by $N$ (or $N-1$ for the sample covariance).
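The missing division is the usual culprit; a minimal sketch of the point, comparing the bare product of centred data matrices with NumPy's own covariance:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
Xc = X - X.mean(axis=0)              # centre each column first

# Multiplying the centred data matrices alone leaves a factor of N-1:
raw = Xc.T @ Xc
cov = raw / (len(X) - 1)             # divide to get the sample covariance

print(np.allclose(cov, np.cov(X, rowvar=False)))
```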


I replicated the dataset in your question and got the normalized eigenvalues [2.9108 0.9212 0.1474 0.0206].

First, to get this answer you need to normalize the data: 1) subtract the mean of each column (each variable over time); 2) divide each demeaned column by its standard deviation (computed after demeaning); 3) then do PCA, which will give you the answer you want: [2.9108 0.9212 0.1474 0.0206].

Following is my MATLAB code for how I carry out PCA on your dataset:

```matlab
% data    - MxN matrix of input data (M dimensions, N trials)
% signals - MxN matrix of projected data
% PC      - each column is a PC
% V       - Mx1 matrix of variances
clear all;
filename = 'iris.xls'; sheet = 1; xlRange = 'A:D';
x = xlsread(filename, sheet, xlRange);

data = x';
[M, N] = size(data);

% subtract off the mean for each dimension
mn = mean(data, 2);
data = data - repmat(mn, 1, N);

% divide each dimension by its standard deviation
for kk = 1:M
    data(kk,:) = data(kk,:) / sqrt(var(data(kk,:)));
end

% calculate the covariance matrix
covariance = 1 / (N-1) * data * data';
% covariance = M/sum(diag(covariance)) * covariance;

% find the eigenvectors and eigenvalues
[PC, V] = eig(covariance);

% extract diagonal of matrix as vector
V = diag(V);

% sort the variances in decreasing order
[junk, rindices] = sort(-1*V);
V = V(rindices);
PC = PC(:, rindices);

% project the original data set
signals = PC' * data;

VV(:,1) = V;
VV(:,2) = V / M;
disp(VV);
```
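Since the question is in Python, here is a NumPy sketch of the same three steps (demean, divide by standard deviation, eigendecompose the covariance), using eigh since the covariance matrix is symmetric; the data below is random stand-in data, not the actual Iris values:

```python
import numpy as np

def pca_standardized(X):
    """PCA on standardized data: same steps as the MATLAB code above."""
    # 1) subtract the column means
    Xc = X - X.mean(axis=0)
    # 2) divide each column by its standard deviation (of the demeaned data)
    Xs = Xc / Xc.std(axis=0, ddof=1)
    # 3) eigendecomposition of the covariance (= correlation) matrix
    cov = np.cov(Xs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]        # decreasing variance
    return eigvals[order], eigvecs[:, order]

# With M standardized variables the eigenvalues sum to M (trace of the
# correlation matrix), which is why values like [2.9108 0.9212 0.1474 0.0206]
# sum to 4 for the four Iris variables.
rng = np.random.default_rng(3)
X = rng.normal(size=(150, 4))
vals, vecs = pca_standardized(X)
print(vals, vals.sum())
```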