6

I understand how to create random variables with a prespecified correlation structure using a Cholesky decomposition. But I would like to solve the inverse problem: given random variables $X_1, X_2, \dots, X_n$ and two different linear combinations of those variables, $V_1=a_{11}X_1+a_{12}X_2+\dots+a_{1n}X_n$ and $V_2=a_{21}X_1+a_{22}X_2+\dots+a_{2n}X_n$, I wish to calculate the correlation between $V_1$ and $V_2$.

I have searched for terms like "linear combination random variables correlation" and have found plenty of material discussing how the correlation affects the variance of the sum of random variables. Unfortunately I have found nothing that seems to relate to the problem described. I would appreciate any information at all, including either an appropriate book chapter or web page reference.

3 Answers

5

The covariance is bilinear, hence $$ \mathrm{Corr}(V_1,V_2)=\frac{\mathrm{Cov}(V_1,V_2)}{\sqrt{\mathrm{Var}(V_1)\mathrm{Var}(V_2)}}=\frac{\sum\limits_{j=1}^n\sum\limits_{k=1}^na_{1j}a_{2k}\mathrm{Cov}(X_j,X_k)}{\sqrt{\mathrm{Var}(V_1)\mathrm{Var}(V_2)}}, $$ where, for $i=1$ and $i=2$, $$ \mathrm{Var}(V_i)=\sum\limits_{j=1}^n\sum\limits_{k=1}^na_{ij}a_{ik}\mathrm{Cov}(X_j,X_k). $$
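
The same formula can be written compactly in matrix notation. As a sketch, assume two symbols that do not appear in the answer above: $\Sigma$ for the covariance matrix with entries $\Sigma_{jk}=\mathrm{Cov}(X_j,X_k)$, and $a_i=(a_{i1},\dots,a_{in})^\top$ for the coefficient vectors. Then, provided both variances are nonzero, $$ \mathrm{Cov}(V_1,V_2)=a_1^\top\Sigma\,a_2,\qquad \mathrm{Var}(V_i)=a_i^\top\Sigma\,a_i,\qquad \mathrm{Corr}(V_1,V_2)=\frac{a_1^\top\Sigma\,a_2}{\sqrt{(a_1^\top\Sigma\,a_1)(a_2^\top\Sigma\,a_2)}}. $$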

  • 0
    Many thanks. I did not expect to be given the answer, but it is very helpful. It also gives me some ideas about how to find a reference. 2012-03-29
1

To give an example following Didier's answer, here is a short command sequence in the MatMate language, which reads almost like pseudo-code:

1) ==========================================================

; Generate some random-data
n=200  // number of measures in a variable
vx=6   // number of X-variables
vv=2   // number of V-variables

X = randomn(vx,n)  // generate randomdata rowwise in X, normal distribution
X = abwzl(X)       // make X rowwise centered, so the row-means are zero

A = randomu(vv,vx) // generate random-coefficients a_{r,c}

2) =============================================================

; make data in V according to variable composition given in A 
V = A * X    // because X has centered data, V has also centered data, so
             // row-means in V are also zero

; make matrix of variances/covariances of V the usual way
covV = V * V' / n    //  ' is transpose-symbol

; since we have also    
;   covV = V * V' /n 
;        = (A * X ) * (A * X)' /n
;        = A * X  * X' * A' /n
;        = A * (X  * X' /n) * A' 
;        = A * covX * A'
;
; we can compute covV also by covX and A 

covX = X * X' /n

;and     
covV = A * covX * A'

; make correlations by scaling with the inverse sqrt of the diagonal
D = diag(covV)
D = D ^# -0.5       // an operation symbol followed by "#"
                    // means: apply the operation elementwise
D = mkdiag(D)
corV = D * covV * D
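
For readers without MatMate, the same steps might look roughly like this in Python with NumPy. This is only a sketch: the variable names, the fixed seed, and the use of `numpy.random.default_rng` are my own choices, not part of the listing above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice for reproducibility

n = 200   # number of observations per variable
vx = 6    # number of X-variables
vv = 2    # number of V-variables

X = rng.standard_normal((vx, n))        # one variable per row, normal data
X = X - X.mean(axis=1, keepdims=True)   # center each row so row means are zero
A = rng.uniform(size=(vv, vx))          # random coefficients a_{r,c}

V = A @ X                               # each row of V is a linear combination of the X rows

cov_X = X @ X.T / n                     # covariance matrix of the X-variables
cov_V_direct = V @ V.T / n              # covariance of V computed from the data
cov_V = A @ cov_X @ A.T                 # the same matrix, computed from cov_X and A only

d = 1.0 / np.sqrt(np.diag(cov_V))       # inverse standard deviations of V_1, V_2
cor_V = cov_V * np.outer(d, d)          # rescale so the diagonal becomes 1

corr_V1_V2 = cor_V[0, 1]                # the requested correlation
```

The off-diagonal entry `cor_V[0, 1]` is the correlation between $V_1$ and $V_2$; it needs only `cov_X` and `A`, not the raw data.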
-1

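To directly answer the question for the two-variable case:

COR(aX+bY,cW+dZ) =

[ac*COV(X,W)+ad*COV(X,Z)+bc*COV(Y,W)+bd*COV(Y,Z)] / [VAR(aX+bY)*VAR(cW+dZ)]^.5

with VAR(aX+bY) = a^2*VAR(X)+b^2*VAR(Y)+2ab*COV(X,Y), and likewise for VAR(cW+dZ)

(where a,b,c,d are constants and X,Y,W,Z are random variables)

If X,Y,W,Z all have variance 1, X and Y are uncorrelated, and W and Z are uncorrelated, this simplifies to

[ac*COR(X,W)+ad*COR(X,Z)+bc*COR(Y,W)+bd*COR(Y,Z)] / [(a^2+b^2)(c^2+d^2)]^.5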
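
A small numerical check of the two-variable case, as a sketch: the helper `corr_two_sums` and both covariance matrices are hypothetical, chosen only for illustration. It computes the correlation from the full covariance matrix of (X, Y, W, Z) and compares it with the shortcut denominator [(a^2+b^2)(c^2+d^2)]^.5, which matches only when all variances are 1 and X, Y (and likewise W, Z) are uncorrelated.

```python
import numpy as np

def corr_two_sums(a, b, c, d, cov):
    """Correlation of aX+bY and cW+dZ, given the 4x4 covariance
    matrix of (X, Y, W, Z) in that order."""
    cov = np.asarray(cov, dtype=float)
    u = np.array([a, b, 0.0, 0.0])   # coefficients of aX + bY
    v = np.array([0.0, 0.0, c, d])   # coefficients of cW + dZ
    return (u @ cov @ v) / np.sqrt((u @ cov @ u) * (v @ cov @ v))

# Hypothetical case 1: unit variances, Cov(X,Y) = 0 and Cov(W,Z) = 0
cov = np.array([
    [1.0, 0.0, 0.3, 0.1],
    [0.0, 1.0, 0.2, 0.4],
    [0.3, 0.2, 1.0, 0.0],
    [0.1, 0.4, 0.0, 1.0],
])
a, b, c, d = 1.0, 2.0, 3.0, 4.0

general = corr_two_sums(a, b, c, d, cov)

# Shortcut with the simplified denominator
numerator = (a * c * cov[0, 2] + a * d * cov[0, 3]
             + b * c * cov[1, 2] + b * d * cov[1, 3])
shortcut = numerator / np.sqrt((a**2 + b**2) * (c**2 + d**2))

# Hypothetical case 2: same matrix, but now X and Y are correlated (0.5)
cov_xy = cov.copy()
cov_xy[0, 1] = cov_xy[1, 0] = 0.5
general_xy = corr_two_sums(a, b, c, d, cov_xy)
```

In case 1 `general` and `shortcut` agree. In case 2 they do not, because Var(aX+bY) picks up the extra term 2ab*COV(X,Y).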