
I am trying to estimate the parameters of the following mathematical model.

$T = k_1\cdot aa^T + k_2\cdot a(1-a)^T+k_3 \cdot(1-a)a^T$

$T$ is an $N\times N$ matrix of measurements,

$a$ is a column vector of model parameters, $a=\{a_i\}$ with $0 < a_i < 1$,

$k_1, k_2, k_3 > 0$ are known constants (well, not exactly known; they come from modeling assumptions),

$1$ in $(1-a)$ stands for the column vector of all ones.

How can I find an approximate $a$?
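For concreteness, here is a small NumPy sketch of the forward model (the values of $N$, $k_1, k_2, k_3$ and $a$ below are made up for illustration; the real ones come from the measurements and assumptions):

```python
import numpy as np

# Illustrative forward model: T = k1*a a^T + k2*a(1-a)^T + k3*(1-a)a^T.
# N, k1, k2, k3 and a are placeholder values, not the real ones.
rng = np.random.default_rng(0)
N = 5
k1, k2, k3 = 2.0, 0.5, 0.3
a = rng.uniform(0.1, 0.9, size=(N, 1))  # column vector with entries in (0, 1)
one = np.ones((N, 1))                    # the all-ones vector "1"

T = k1 * (a @ a.T) + k2 * (a @ (one - a).T) + k3 * ((one - a) @ a.T)
print(T.shape)  # (5, 5); note T is not symmetric unless k2 == k3
```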

  • First of all, the equations are not linear. Second, have you tried writing everything out in the case $N=2$, to see what the equations wind up looking like? (2012-02-14)

1 Answer


Here are some ideas and observations, not a complete answer. It looks like a nonlinear regression problem to me, but I would be curious about the nature of the model and its constraints, whether any additional assumptions could be made, and whether someone better qualified would have a look at this.

Let $a=(a_1~\cdots~a_N)^T\in\mathbb{R}^{N\times1}$ and $1\equiv1_{N\times1}\in\mathbb{R}^{N\times1}$, and note that
$$\left( a(1-a)^T \right)^T = (1-a)a^T,$$
so that $k_2$ and $k_3$ are coefficients of transposed matrices, with
$$a(1-a)^T = a\cdot 1^T - aa^T \qquad\text{and}\qquad (1-a)a^T = 1\cdot a^T - aa^T.$$
Let $A=k_1-k_2-k_3$, $B=k_2$ and $C=k_3$, and define the residual error matrix
$$E = k_1\,aa^T + k_2\,a(1-a)^T + k_3\,(1-a)a^T - T = A\,aa^T + B\,a\cdot 1^T + C\,1\cdot a^T - T,$$
so that
$$E_{ij} = A\,a_ia_j + B\,a_i + C\,a_j - T_{ij}.$$
Now we want $E\approx0$. Does this look like a (nonlinear least squares) regression problem to anyone?
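As a sanity check, here is a short NumPy sketch (with illustrative constants and a random stand-in for $T$) confirming that the rearranged form of $E$ agrees with the original one:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
k1, k2, k3 = 2.0, 0.5, 0.3              # illustrative values only
A, B, C = k1 - k2 - k3, k2, k3
a = rng.uniform(0.1, 0.9, size=(N, 1))
one = np.ones((N, 1))
T = rng.normal(size=(N, N))              # random stand-in for the measurements

E1 = k1 * (a @ a.T) + k2 * (a @ (one - a).T) + k3 * ((one - a) @ a.T) - T
E2 = A * (a @ a.T) + B * (a @ one.T) + C * (one @ a.T) - T
assert np.allclose(E1, E2)               # the two expressions for E coincide
# Entrywise: E[i, j] = A*a[i]*a[j] + B*a[i] + C*a[j] - T[i, j]
```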

Notice that $A=C=0\ne B$ would model dependence of $T_{ij}$ on $i$ alone (constant rows, with $Ba_i=\frac1N\sum_jT_{ij}$), that $A=B=0\ne C$ would model dependence on $j$ alone (constant columns, with $Ca_j=\frac1N\sum_iT_{ij}$), and that a dominant $A$ models a (symmetric) covariance-like matrix. Another ideal case would be when we could write $Aa_ia_j+Ba_i+Ca_j=(B_1a_i+B_0)(C_1a_j+C_0)-B_0C_0$; matching coefficients requires $B_1C_1=A$, $B_1C_0=B$ and $B_0C_1=C$, hence $B_0C_0=BC/A$ whenever $A\ne0$. How could we exploit such a factorization, and what condition would $T$ need to satisfy (perhaps $NB_0C_0=\sum_{ij}T_{ij}$)?
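The first degenerate case is easy to test numerically; a minimal sketch (with a made-up $B$, noise level and $a$): when $A=C=0\ne B$, the least-squares estimate is just the vector of row means of $T$ divided by $B$:

```python
import numpy as np

# Degenerate case A = C = 0 != B: T_ij ~ B*a_i, so B*a_i = (1/N)*sum_j T_ij.
# B, the noise level and a_true are made-up values for illustration.
rng = np.random.default_rng(2)
N, B = 5, 0.7
a_true = rng.uniform(0.1, 0.9, size=N)
T = B * np.outer(a_true, np.ones(N)) + 0.01 * rng.normal(size=(N, N))

a_hat = T.mean(axis=1) / B               # row means recover a up to noise
print(np.max(np.abs(a_hat - a_true)))    # small, on the order of the noise
```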

Two (or three) possible target functions (to minimize) come to mind: $EE^T$, $E^TE$ and $\|E\|^2$. (The first two are matrices rather than scalars, and they are not identical in general; they agree exactly when $E$ is normal, e.g. symmetric. Unfortunately we don't know that $T$ is symmetric, and the whole point of having distinct $B$ and $C$ must be to allow for nonsymmetric $T$, so it would probably be of no use to take $\frac{T+T^T}{2}$ in place of $T$ above.)
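A quick random-matrix illustration of that parenthetical remark (purely for demonstration): the two matrix products differ in general, while their traces both equal $\|E\|^2$:

```python
import numpy as np

rng = np.random.default_rng(3)
E = rng.normal(size=(4, 4))              # generic (nonsymmetric) residual

# The traces of both products equal the squared Frobenius norm of E ...
assert np.isclose(np.trace(E @ E.T), np.sum(E**2))
assert np.isclose(np.trace(E.T @ E), np.sum(E**2))
# ... but as matrices they differ unless E is normal.
print(np.allclose(E @ E.T, E.T @ E))     # False for a generic E
```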

If we take $\|E\|^2=\sum_{ij}E_{ij}^2$ as our target function, then minimizing it boils down to solving the equations
$$0=\frac12\frac{\partial}{\partial a_k}\|E\|^2 =\frac12\frac{\partial}{\partial a_k}\sum_{ij} \left(A\,a_ia_j+B\,a_i+C\,a_j-T_{ij}\right)^2$$
$$=2A^2a_k^3 +3A(B+C)a_k^2 +\left[(B+C)^2-2AT_{kk}+\sum_{i\ne k}(Aa_i+B)^2+(Aa_i+C)^2\right]a_k$$
$$+\left[-(B+C)T_{kk} +\sum_{i\ne k} (Aa_i+B)(Ca_i-T_{ki})+ (Aa_i+C)(Ba_i-T_{ik}) \right]$$
(or something similar, if I have made an error), which involves cubic degree and interdependence, and so does not look very tractable. One could use the standard tricks involving average values of (rows/columns of) $T$ to get a little further, and use Gröbner bases to find a solution (if one exists). Or, one could avoid all this and instead use the standard iterative approach for nonlinear least squares regression.
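To sketch that last suggestion: this is what the iterative approach might look like with SciPy's `least_squares`, using synthetic data and illustrative constants, and imposing $0<a_i<1$ through box bounds:

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic setup: k1, k2, k3, the noise level and a_true are illustrative.
rng = np.random.default_rng(4)
N = 6
k1, k2, k3 = 2.0, 0.5, 0.3
one = np.ones(N)

def model(a):
    return (k1 * np.outer(a, a)
            + k2 * np.outer(a, one - a)
            + k3 * np.outer(one - a, a))

a_true = rng.uniform(0.1, 0.9, size=N)
T = model(a_true) + 0.01 * rng.normal(size=(N, N))   # noisy "measurements"

def residuals(a):
    return (model(a) - T).ravel()        # flatten E into a residual vector

fit = least_squares(residuals, x0=np.full(N, 0.5), bounds=(0.0, 1.0))
print(np.max(np.abs(fit.x - a_true)))    # close to zero, up to the noise
```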