
I'm especially interested in SL$(2,\mathbb C)$, i.e. $2\times2$ matrices with determinant one, in which case I'm looking for a transformation from $\begin{pmatrix}a&b\\c&d\end{pmatrix}$ to $\begin{pmatrix}\frac{a+d}2&x\\ y&\frac{a+d}2\end{pmatrix}$ (the trace is conserved). Does such a similarity transformation exist? What about general $n\times n$ matrices?

Bonus points for an analytical formula (even if only for the 2x2 case).

  • @Tobias: I had to improve my solution, which sometimes ran into local minima and did not arrive at the correct result. A simple correction of the code was seemingly enough; see my improved answer. (2013-02-26)

3 Answers


Writing the transformation as a left multiplication:

$ \begin{pmatrix} p & q \\ r & s \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \frac{a+d}{2} & x \\ y & \frac{a+d}{2} \end{pmatrix} $

Comparing entries gives

$2pa+2qc=a+d$

$pb+qd=x$

$ra+sc=y$

$2rb+2sd=a+d$

Multiplying the last equation by $a$ and the third one by $2b$:

$2rba+2sda=a^{2}+da$

$2rba+2sbc=2by$

Subtracting these eliminates $r$; back-substitution then yields the remaining coefficients:

$s=\frac{2by-a^{2}-da}{2bc-2da}$

$r=\frac{y-sc}{a}$

$p=\frac{ad+d^{2}-2cx}{2ad-2bc}$

$q=\frac{x-pb}{d}$
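For a quick numerical sanity check of the formulas above, here is a small Python sketch (the sample values of $a,b,c,d,x,y$ are my own; note this checks the one-sided product $SA$, as set up above):

    # Hedged check of the solved coefficients p, q, r, s (sample values are arbitrary)
    a, b, c, d = 3.0, 2.0, 1.0, -1.0
    x, y = 0.5, -0.7                  # free choices for the off-diagonal targets

    s = (2*b*y - a**2 - d*a) / (2*b*c - 2*d*a)
    r = (y - s*c) / a
    p = (a*d + d**2 - 2*c*x) / (2*a*d - 2*b*c)
    q = (x - p*b) / d

    print(p*a + q*c, p*b + q*d)      # expected: (a+d)/2 = 1.0 and x = 0.5
    print(r*a + s*c, r*b + s*d)      # expected: y = -0.7 and (a+d)/2 = 1.0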

  • @Tobias: maybe I have a sign error; I'll check my derivation again. (2011-10-13)

I will present the final algorithm first, with follow-up descriptions. Use the matrix $\pmatrix{d_0 & x \\ y & d_1}$. To obtain the complex $c$ and $s$ values, follow these steps (the text hopefully becomes clear with the later descriptions): \begin{align} y_s &= y - \bar{x} \quad&\text{ from the skew part}\\ \delta &= d_1 - d_0 \\ \delta_s &= \operatorname{Im}\{\delta \} \\ \Delta &= \delta_s^2 + y_s\bar{y}_s & \text{the discriminant in the formulation for the sine} \\ s_0 &= j(\delta_s + \sqrt{\Delta}) & \text{the scaled sine value to diagonalize the skew part} \\ x_0 &= \delta y_s s_0 + y_s^2 x - s_0^2 y & \text{the first rotation's (scaled) result}\\ k_2c_2 &= j|x_0|y_s + x_0 \bar{s}_0 & \text{ the final (scaled) rotation values} \\ k_2s_2 &= j|x_0|s_0 - x_0\bar{y}_s \\ \end{align}

$y_s$ and $\delta_s$ carry all the information needed from the skew-Hermitian portion of the matrix to diagonalize that component, i.e. to make the skew-Hermitian part of the new $x$ and $y$ zero. This means that $x=\bar{y}$ in the result (first step). $y_s$ is not divided by $2$, as it normally would be in the calculation of the skew portion, since that is an unnecessary scale factor which is removed in the final scaling for $c$ and $s$.

Solving a particular quadratic (see this previous question) gives $k_0c_0=y_s$ and $k_0s_0 = j(\delta_s + \sqrt{\Delta})$, as described. The $k_0$ never needs to be found; it is a consistent scale factor that is removed in the final result (and thus not included as a variable).

After $c_0$ and $s_0$ are applied to the matrix, the only information needed from the result is its $x$ value, named $x_0$ here; it is therefore the only intermediate similarity value that is calculated. From it, the complex phase is required, and $c_2$, $s_2$ are the final result. Find $k_2$ to scale these for unitary action, and the result gives equal diagonal values. See the Python code at the end for more mundane details regarding the cases where $c,s=0,0$ occur.


The algorithm tests well and works; here is a hopefully enlightening description.

First, for reference, the unitary (complex Givens) rotation on a 2x2 matrix gives:

\begin{align} & \pmatrix{c & s \\ -\bar{s} & \bar{c} \\ } \pmatrix{d_0 & x \\ y & d_1 \\ } \pmatrix{\bar{c} & -s \\ \bar{s} & c \\ } \\ =& \pmatrix{c & s \\ -\bar{s} & \bar{c} \\ } \pmatrix{d_0\bar{c}+x\bar{s} & -d_0 s + x c \\ y \bar{c} + d_1 \bar{s} & -y s + d_1 c \\ } \\ =& \pmatrix{d_0c\bar{c} + d_1s\bar{s} + c\bar{s}x + \bar{c}sy & (d_1 - d_0)cs + x c^2 - ys^2 \\ (d_1 -d_0)\bar{c}\bar{s} -x\bar{s}^2 + y\bar{c}^2 & d_0s\bar{s} + d_1c\bar{c} - c\bar{s}x - \bar{c}sy } \end{align}

To make the two diagonal entries of the result equal, we solve $d_0c\bar{c} + d_1s\bar{s} + c\bar{s}x + \bar{c}sy = d_0s\bar{s} + d_1c\bar{c} - c\bar{s}x - \bar{c}sy $, or $\overbrace{(d_1 - d_0)}^{\delta}(s\bar{s} - c\bar{c}) + c\bar{s}(2x) +\bar{c}s(2y) = 0 \tag{1}$ Without loss of generality, use real $c\in [0,1]$ and the fact that $c\bar{c} + s\bar{s} =1$ (from the unitary form of the similarity) to get $s\bar{s} - c\bar{c} = (1 - c\bar{c}) - c\bar{c} = 1 - 2c\bar{c}$ and $s = (1 - c^2)^{\frac{1}{2}}e^{j\beta}$.

Then (1) becomes $\delta(1-2c^2) + 2c(1 - c^2)^{\frac{1}{2}}\left[e^{-j\beta}x +e^{j\beta}y\right]=0 \tag{2}$

From here it seems best to perform an intermediate similarity to achieve $|x| = |y|$. When that is true, the ellipse term $e^{-j\beta}x +e^{j\beta}y$ is easier to deal with. To achieve it, diagonalize the skew portion of the matrix, giving $y = \bar{x}$. The ellipse then "collapses" to a line on the real axis, since $e^{-j\beta}x +e^{j\beta}\bar{x} = 2\operatorname{Re}\{xe^{-j\beta}\}$. The phase angle $\measuredangle (xe^{-j\beta}) = \pm\frac{\pi}{2}$ makes the ellipse term zero. So with $\beta = \frac{\pi}{2} + \measuredangle x$ we have the phase of $s$. For the magnitude, use the now reduced equation (2):

$\delta(1-2c^2)=0 \tag{3}$ We see here that $c=\frac{1}{\sqrt{2}}$ solves it, or using the scaled forms, $|c| = |s|$: \begin{align} c &= |x| \\ s &= j|x|e^{j\beta} = x\\ \end{align}

Here is the complete Python code:

    import math

    def hollow(d0, x, y, d1):
        '''Return c,s for a 2x2 unitary (complex Givens) whose similarity
        makes the two diagonal entries of the result equal.'''
        ys = y - x.conjugate()
        if ys == 0:                   # the matrix's skew portion is already diagonal
            c = complex(0, abs(x))
            s = x                     # s has the same phase as x and |c| = |s|
            if x == 0:                # added guard: diagonal input, any |c| = |s| works
                c, s = complex(1), complex(1)
        else:
            d = d1 - d0
            ds = d.imag
            ys2 = (ys*ys.conjugate()).real
            rad = ds*ds + ys2
            s0 = complex(0, ds + math.sqrt(rad))  # can do plus or minus square root here
            x0 = d*ys*s0 + ys*ys*x - s0*s0*y      # the x after diagonalizing the skew portion
            if x0 == 0:
                c, s = ys, s0
            else:
                ax0 = abs(x0)
                s1 = complex(0, ax0)
                c = ys*s1 + x0*s0.conjugate()
                s = s0*s1 - x0*ys.conjugate()
        n = abs(complex(abs(s), abs(c)))  # the square root of the sum of the norms squared of c and s
        c = c/n
        s = s/n
        return c, s

The function is called hollow since, if the trace is zero, the result is a matrix with zero diagonal, also known as a hollow matrix.
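As a quick smoke test, one can rebuild the unitary from the returned $c,s$ and check the similarity numerically (a hypothetical snippet of mine, assuming numpy and the hollow function above; the sample entries are arbitrary):

    import numpy as np

    d0, x, y, d1 = 1+2j, 3-1j, -2+0.5j, -4+1j  # arbitrary sample entries
    c, s = hollow(d0, x, y, d1)

    U = np.array([[c, s], [-s.conjugate(), c.conjugate()]])  # the Givens rotation
    A = np.array([[d0, x], [y, d1]])
    B = U @ A @ U.conj().T            # U is unitary, so U^{-1} = U^H

    print(B[0, 0], B[1, 1])           # both should be close to (d0 + d1)/2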

  • Nice algorithm. +1 (2013-02-26)

[update]: adapted the symbols $S$, $b$ and $c$ to the convention in the OP; sign error corrected [/update]

What I get for an orthogonal similarity transformation (using $r$ for $\small \cos(x) $ and $s$ for $\small \sin(x) $ with some rotation angle $x$, and writing $\small S^{-1} \cdot A \cdot S $ for the matrix multiplication) is:

$ \small \begin{array} {rr|rr} & & r & s \\ & & -s & r \\ \hline \\ a & b & -s b+ar & rb+as \\ c & d & -sd+cr & rd+cs \\ \hline \\ r & -s & -srb+s^2d+ar^2-csr & r^2b-srd+asr-cs^2 \\ s & r & -s^2 b-srd+cr^2+asr & srb+r^2d+csr+as^2 \end{array} $

Using the abbreviations $\small r_2 = r^2-s^2=\cos(x)^2-\sin(x)^2 = \cos(2 x)$ and $\small s_2 = 2 r s= 2 \cos(x) \sin(x) = \sin(2 x)$ for the angle doubling, I then get for the resulting matrix

$ \small S^{-1} \cdot A \cdot S = \frac{1}{2}\begin{pmatrix} (a+d)+(a-d)r_2-(b+c)s_2 & (b-c)+(a-d)s_2 + (b+c)r_2 \\ -(b-c) + (a-d)s_2 + (b+c)r_2 & (a+d) - (a-d)r_2 + (b+c)s_2 \end{pmatrix} $

Then to have the diagonal entries equal, the term $\small (b+c)s_2 -(a-d)r_2 $ must be zero, i.e. $\small \tan(2x) = \frac{a-d}{b+c}$.
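In code, the required angle can be read off as $\small 2x = \operatorname{atan2}(a-d,\,b+c)$; here is a short Python sketch of that condition (sample values and names are my own):

    import numpy as np

    a, b, c, d = 4.0, 1.0, -2.0, 0.5      # arbitrary sample entries
    phi = 0.5 * np.arctan2(a - d, b + c)  # solves (a-d)*cos(2x) = (b+c)*sin(2x)

    r, s = np.cos(phi), np.sin(phi)
    S = np.array([[r, s], [-s, r]])
    A = np.array([[a, b], [c, d]])
    B = S.T @ A @ S                       # orthogonal similarity, S^{-1} = S^T

    print(np.diag(B))                     # both entries equal (a+d)/2 = 2.25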

[update]


The generalization to higher $n$ seems obvious. Consider the diagonal elements $\small d_1,d_2,d_3 $; each similarity rotation on one pair of columns/rows modifies only two of those elements. If we denote a transformation between columns/rows $c_1,c_2$ by $\small T_{c_1,c_2} $, with $\small A_{c_1,c_2} = T_{c_1,c_2}^{-1} \cdot A \cdot T_{c_1,c_2} $, then the diagonal elements behave like this over the iteration of transformations $T$:

$\small \begin{array} {rrr} T_{1,2}: & (d_1+d_2)/2 &, (d_1+d_2)/2 &, d_3 \\ T_{1,3}: & (d_1+d_2)/4+d_3/2 &, (d_1+d_2)/2 &, (d_1+d_2)/4+d_3/2 \\ T_{2,3}: & (d_1+d_2)/4+d_3/2 &, (d_1+d_2)3/8+d_3/4 &,(d_1+d_2)3/8+ d_3/4 \\ \ldots \end{array} $
and I think it is not too difficult to show that iterations of this converge.
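For readers without MatMate, here is a rough NumPy sketch of the same iteration (the function name, tolerance, and iteration cap are my own; it already uses the dynamic smallest/largest pair selection introduced in Update 3 below):

    import numpy as np

    def equalize_diagonal(A, tol=1e-12, max_iter=10000):
        """Repeat plane rotations until all diagonal entries equal trace(A)/n.
        Each step picks the smallest and largest diagonal entries and replaces
        both by their average (the 2x2 sub-problem solved above)."""
        M = np.array(A, dtype=float)
        n = M.shape[0]
        T = np.eye(n)                  # accumulates rotations: T.T @ A @ T == M
        for _ in range(max_iter):
            dg = np.diag(M)
            i, j = int(np.argmin(dg)), int(np.argmax(dg))
            if dg[j] - dg[i] < tol:
                break
            i, j = min(i, j), max(i, j)
            a, b, c, d = M[i, i], M[i, j], M[j, i], M[j, j]
            phi = 0.5 * np.arctan2(a - d, b + c)
            r, s = np.cos(phi), np.sin(phi)
            G = np.eye(n)
            G[i, i], G[i, j], G[j, i], G[j, j] = r, s, -s, r
            M = G.T @ M @ G            # only rows/columns i and j change
            T = T @ G
        return M, T

    M, T = equalize_diagonal(np.random.uniform(-10, 10, (5, 5)))
    print(np.diag(M))                  # all entries close to trace/5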


I've done an example using my (somewhat primitive) MatMate program, but I think the code will be self-explanatory enough to translate into some other programming language.

[Update 3]: The macro had to be updated to overcome the local-minimum problem. We introduce a dynamic selection of the $x,y$ axes according to the smallest and greatest elements in the diagonal of the currently iterated matrix.

    // Macro definitions
    macrodef rotpair  // rotates matrix M in one plane (=x,y) using rotation-matrix t1
     m1 = t'*m*t                          // get a temporary working copy
                                          // get values a,b,c,d from submatrix
     a,b,c,d = v(m1[x,x]),v(m1[x,y]),v(m1[y,x]),v(m1[y,y])
     s_2,c_2 = a - d, b + c               // determine cos(2 phi) and sin(2 phi)
     phi = -arccs(c_2,s_2)/2              // determine required rotation-angle phi
     t1 = rotsp(einh(n),x,y,cos(phi),sin(phi)) // create rotation-matrix
                                          //    for one x/y plane-rotation
     m2 = t1' * m1 * t1                   // do similarity-rotation
     t = t*t1                             // append current rotation to accumulator
    macroend

    macrodef init
     set randomstart=41
     m = (randomu(n,n,-10,10))  // create some random matrix of size n x n
     t = einh(n)     // rotation-matrix, accumulates all rotations while iterating
     dg = diag(m) '  // get the diag of the initial matrix
     protocol = dg   // initialize some protocol for the documentation of the
                     // diagonal elements
    macroend

    macrodef run  // pairwise rotations over all pairs of coordinates
     x,y = v(iminzl(dg)), v(imaxzl(dg))  // store indexes of smallest and largest
                                         // diagonal-element into x and y-"coordinates"
     macroexec rotpair                   // do rotation
     dg = diag(m2) '
     protocol = {protocol, dg}           // append current diagonal to protocol
    macroend

    // commands in dialog:
    n=5              // use matrix-size n=5
    macroexec init
    macroexec run    // repeat this until convergence

    // commands in dialog:
    n=10             // use matrix-size n=10
    macroexec init
    macroexec run    // repeat this until convergence

Results

    // result: (n=5 size=5x5)
    -4.2684,  7.7193,  3.0452, -0.8330, -9.9136
    -4.2684, -1.0972,  3.0452, -0.8330, -1.0972
    -0.6116, -1.0972, -0.6116, -0.8330, -1.0972
    -0.6116, -0.8544, -0.8544, -0.8330, -1.0972
    -0.8544, -0.8544, -0.8544, -0.8330, -0.8544
    -0.8544, -0.8437, -0.8544, -0.8437, -0.8544
    -0.8544, -0.8490, -0.8544, -0.8437, -0.8490
    -0.8490, -0.8490, -0.8544, -0.8490, -0.8490
    -0.8490, -0.8490, -0.8517, -0.8490, -0.8517
    -0.8490, -0.8490, -0.8504, -0.8504, -0.8517
    -0.8504, -0.8490, -0.8504, -0.8504, -0.8504
    -0.8504, -0.8497, -0.8504, -0.8497, -0.8504
    -0.8504, -0.8497, -0.8504, -0.8500, -0.8500
    -0.8500, -0.8500, -0.8504, -0.8500, -0.8500
    -0.8500, -0.8500, -0.8502, -0.8500, -0.8502
    -0.8501, -0.8500, -0.8501, -0.8500, -0.8502
    -0.8501, -0.8501, -0.8501, -0.8500, -0.8501
    -0.8501, -0.8501, -0.8501, -0.8501, -0.8501

    // result: (n=10 size=10x10, 40 iterations)
    -4.2684,  8.0316,  7.0596, -6.1777,  3.5646, -6.5801,  9.0093,  5.8538,  2.9013, -9.6980
    -4.2684,  8.0316,  7.0596, -6.1777,  3.5646, -6.5801, -0.3444,  5.8538,  2.9013, -0.3444
    -4.2684,  0.7257,  7.0596, -6.1777,  3.5646,  0.7257, -0.3444,  5.8538,  2.9013, -0.3444
    -4.2684,  0.7257,  0.4410,  0.4410,  3.5646,  0.7257, -0.3444,  5.8538,  2.9013, -0.3444
    ...
    ...
     0.9696,  0.9696,  0.9695,  0.9696,  0.9695,  0.9698,  0.9695,  0.9696,  0.9696,  0.9696
     0.9696,  0.9696,  0.9696,  0.9696,  0.9695,  0.9696,  0.9695,  0.9696,  0.9696,  0.9696
     0.9696,  0.9696,  0.9696,  0.9696,  0.9695,  0.9696,  0.9695,  0.9696,  0.9696,  0.9695
     0.9696,  0.9696,  0.9696,  0.9696,  0.9695,  0.9695,  0.9695,  0.9696,  0.9696,  0.9695

[Update 2]: [obsolete, I found a better solution] I tried the same routine simply using n=10 instead of n=5, with the same initialization of the random-number generator, so the solution should be reproducible. Unfortunately the iteration seems to run into a local minimum, such that the process converges to a non-equal solution. Here is the result near the limit:

    // result
    ...
     1.6419,  1.6419,  1.6419,  1.6419,  1.6419, -6.5801,  9.0093,  5.8538,  2.9013, -9.6980
  • @Gottfried: My function hollow() may be extended: not to have the least norm squared in the diagonal (as it works now) but to have the most possible using a unitary similarity, thus finding the nearest normal matrix in the $2 \times 2$ case. Would this be the varimax, or maybe infomax, in two dimensions? I am more familiar with eigen-decomposition than with statistical analysis. (2013-02-26)