6

I understand how to create random variables with a prespecified correlation structure using a Cholesky decomposition. But I would like to be able to solve the inverse problem: given random variables $X_1, X_2, \dots, X_n$, and two different linear sums of those variables, $V_1=a_{11}X_1+a_{12}X_2+\dots+a_{1n}X_n$ and $V_2=a_{21}X_1 + a_{22}X_2 +\dots+a_{2n}X_n$, I wish to calculate the correlation between $V_1$ and $V_2$.

I have searched for terms like "linear combination random variables correlation" and have found plenty of material discussing how the correlation affects the variance of the sum of random variables. Unfortunately I have found nothing that seems to relate to the problem described. I would appreciate any information at all, including either an appropriate book chapter or web page reference.

3 Answers

5

The covariance is bilinear, hence $ \mathrm{Corr}(V_1,V_2)=\frac{\mathrm{Cov}(V_1,V_2)}{\sqrt{\mathrm{Var}(V_1)\mathrm{Var}(V_2)}}=\frac{\sum\limits_{j=1}^n\sum\limits_{k=1}^na_{1j}a_{2k}\mathrm{Cov}(X_j,X_k)}{\sqrt{\mathrm{Var}(V_1)\mathrm{Var}(V_2)}}, $ where, for $i=1$ and $i=2$, $ \mathrm{Var}(V_i)=\sum\limits_{j=1}^n\sum\limits_{k=1}^na_{ij}a_{ik}\mathrm{Cov}(X_j,X_k). $
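
The double sums above translate almost line for line into code. The following is a minimal Python sketch, not a reference implementation: the function names, the coefficient vectors `a1` and `a2`, and the 3×3 covariance matrix `cov` are all made up for illustration, and the only input assumed known is $\mathrm{Cov}(X_j,X_k)$ for every pair.

```python
from math import sqrt

def covariance_of_sums(a, b, cov):
    """Cov(sum_j a[j]*X_j, sum_k b[k]*X_k), expanded by bilinearity."""
    n = len(a)
    return sum(a[j] * b[k] * cov[j][k] for j in range(n) for k in range(n))

def correlation_of_sums(a1, a2, cov):
    """Corr(V1, V2) for V1 = a1 . X and V2 = a2 . X, given cov[j][k] = Cov(X_j, X_k)."""
    cov12 = covariance_of_sums(a1, a2, cov)
    var1 = covariance_of_sums(a1, a1, cov)  # Var(V1) is Cov(V1, V1)
    var2 = covariance_of_sums(a2, a2, cov)  # Var(V2) is Cov(V2, V2)
    return cov12 / sqrt(var1 * var2)

# Hypothetical inputs: an assumed (positive definite) covariance matrix
# for three variables and two arbitrary coefficient vectors.
cov = [[4.0, 1.0, 0.0],
       [1.0, 9.0, 2.0],
       [0.0, 2.0, 1.0]]
a1 = [1.0, 2.0, 0.0]
a2 = [0.0, 1.0, 3.0]

rho = correlation_of_sums(a1, a2, cov)
print(rho)
```

For these inputs the three double sums come out to $\mathrm{Cov}(V_1,V_2)=31$, $\mathrm{Var}(V_1)=44$ and $\mathrm{Var}(V_2)=30$, so the result is $31/\sqrt{1320}$.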

  • 0
    Many thanks. I did not expect to be given the answer but it is very helpful. It also gives me some ideas about how to find a reference. 2012-03-29
1

As an example to follow Didier's answer, here is a short command sequence in the MatMate language, which reads almost like pseudocode:

1) ==========================================================

    ; Generate some random-data
    n=200  // number of measures in a variable
    vx=6   // number of X-variables
    vv=2   // number of V-variables

    X = randomn(vx,n)  // generate randomdata rowwise in X, normal distribution
    X = abwzl(X)       // make X rowwise centered, so the row-means are zero

    A = randomu(vv,vx) // generate random-coefficients a_{r,c}

2) =============================================================

    ; make data in V according to variable composition given in A
    V = A * X    // because X has centered data, V has also centered data, so
                 // row-means in V are also zero

    ; make matrix of variances/covariances of V the usual way
    covV = V * V' / n    //  ' is transpose-symbol

    ; since we have also
    ;   covV = V * V' /n
    ;        = (A * X ) * (A * X)' /n
    ;        = A * X  * X' * A' /n
    ;        = A * (X  * X' /n) * A'
    ;        = A * covX * A'
    ;
    ; we can compute covV also by covX and A
    covX = X * X' /n
    ;and
    covV = A * covX * A'

    ; make correlations using matrix-division by sqrt of diagonal
    D = diag(covV)
    D = D ^# -0.5     // binary operation-symbol followed
                      // by "#" means: do operation elementwise
    D = mkdiag(D)
    corV  = D * covV * D
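
For readers without MatMate, the same two steps can be sketched in plain Python. This is a rough translation rather than an exact port: the helper names (`transpose`, `matmul`, `center_rows`, `cov_rows`) are invented here, the random seed is arbitrary, and `random.gauss` and `random.random` stand in for what `randomn` and `randomu` are assumed to do (normal and uniform draws).

```python
import random
from math import sqrt

random.seed(0)  # arbitrary fixed seed so the run is repeatable

n, vx, vv = 200, 6, 2  # measures per variable, X-variables, V-variables

def transpose(P):
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    Qt = transpose(Q)
    return [[sum(p * q for p, q in zip(row, col)) for col in Qt] for row in P]

def center_rows(P):
    """Subtract each row's mean so that every row mean is zero."""
    out = []
    for row in P:
        mean = sum(row) / len(row)
        out.append([x - mean for x in row])
    return out

def cov_rows(P):
    """Covariance matrix of row-wise centered data: P * P' / n."""
    m = len(P[0])
    return [[s / m for s in row] for row in matmul(P, transpose(P))]

# Step 1: centered normal data (one variable per row) and random coefficients.
X = center_rows([[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(vx)])
A = [[random.random() for _ in range(vx)] for _ in range(vv)]

# Step 2: covariance of V computed directly and via A * covX * A'.
V = matmul(A, X)
covV_direct = cov_rows(V)
covV_from_X = matmul(matmul(A, cov_rows(X)), transpose(A))

# Scale by the inverse square roots of the diagonal to get correlations.
d = [1.0 / sqrt(covV_from_X[i][i]) for i in range(vv)]
corV = [[d[i] * covV_from_X[i][j] * d[j] for j in range(vv)] for i in range(vv)]
print(corV[0][1])
```

The two covariance matrices agree up to floating-point rounding, which is the point of the derivation in the comments: the correlation of the sums needs only `A` and the covariance matrix of `X`, not the raw data.
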
-1

To answer the question directly:

COR(aX+bY,cW+dZ) =

[ac*COR(X,W)+ad*COR(X,Z)+bc*COR(Y,W)+bd*COR(Y,Z)] / [(a^2+b^2)(c^2+d^2)]^.5

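(where a,b,c,d are constants and X,Y,W,Z are random variables. This form holds only if X,Y,W,Z all have unit variance, X is uncorrelated with Y, and W is uncorrelated with Z. Otherwise the denominator must use the full variances of aX+bY and cW+dZ, and the numerator must use covariances instead of correlations.)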
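
A quick numeric check shows when this shortcut agrees with the general bilinear expansion of the covariance. The sketch below is illustrative only: the coefficients, the cross-correlations, and the helper names (`corr_of_sums`, `shortcut`, `cov_matrix`) are assumptions made up for the example, with the variables ordered X, Y, W, Z and all variances set to 1.

```python
from math import sqrt

def corr_of_sums(u, v, cov):
    """Correlation of u . X and v . X via the full bilinear expansion."""
    n = len(u)
    def bilinear(p, q):
        return sum(p[j] * q[k] * cov[j][k] for j in range(n) for k in range(n))
    return bilinear(u, v) / sqrt(bilinear(u, u) * bilinear(v, v))

def shortcut(a, b, c, d, r_xw, r_xz, r_yw, r_yz):
    """The four-term formula with denominator sqrt((a^2+b^2)(c^2+d^2))."""
    num = a * c * r_xw + a * d * r_xz + b * c * r_yw + b * d * r_yz
    return num / sqrt((a * a + b * b) * (c * c + d * d))

# Hypothetical coefficients and cross-correlations.
a, b, c, d = 1.0, 2.0, 3.0, 1.0
r_xw, r_xz, r_yw, r_yz = 0.3, 0.1, 0.2, 0.4

def cov_matrix(r_xy, r_wz):
    """Unit-variance covariance matrix for (X, Y, W, Z)."""
    return [[1.0,  r_xy, r_xw, r_xz],
            [r_xy, 1.0,  r_yw, r_yz],
            [r_xw, r_yw, 1.0,  r_wz],
            [r_xz, r_yz, r_wz, 1.0]]

u = [a, b, 0.0, 0.0]  # aX + bY
v = [0.0, 0.0, c, d]  # cW + dZ

quick = shortcut(a, b, c, d, r_xw, r_xz, r_yw, r_yz)
exact_uncorrelated = corr_of_sums(u, v, cov_matrix(0.0, 0.0))
exact_correlated = corr_of_sums(u, v, cov_matrix(0.5, 0.0))
print(quick, exact_uncorrelated, exact_correlated)
```

With COR(X,Y) = COR(W,Z) = 0 the two computations coincide. Setting COR(X,Y) = 0.5 changes the variance of aX+bY from 5 to 7, so the exact correlation drops and the shortcut no longer applies.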