3

I have a set of numbers $x_i$ and I know the sums of certain subsets, $y_k=\sum_{i\in\sigma_k} x_i$. All $x_i>0$, and I'm looking for a simple solution.

With some internet research I found that this might be related to problems in signal processing. So basically I am given a vector $\mathbf{y}$ and a matrix $\mathbf{A}$ with $y_i>0$ and $A_{ij}\in\{0,1\}$, and I'm looking for a vector $\mathbf{x}$ (with $x_i\geq 0$) satisfying

$\mathbf{A}\mathbf{x}=\mathbf{y}$

where this linear equation is underdetermined.

Apparently, to make this problem well-posed, one can minimize any of several norms of $\mathbf{x}$. For my particular task it's not clear whether I need the $L_0$, $L_1$, or $L_2$ norm, so any solution will do, as long as it's simple. Approximate solutions, such as iterative approaches, are also fine.

Can you suggest a way to solve this problem? I'm looking for a reference to an algorithm which I can understand as a non-mathematician. Even better would be an open source implementation that I can download. And it would be perfect if it were a Python solution.

  • 0
    So, you want to minimize some norm of x, subject to the constraints Ax = y (equality constraint) and x >= 0 (inequality constraint). Is that correct? I think that's called a "quadratic programming" problem. I don't know a Python implementation of QP, though. Maybe you can solve it using scipy.optimize. – 2012-12-17

3 Answers

-1

If you have MATLAB, the "\" (or "/") operator will do the trick. It solves the equation Ax = y in a least-squares sense when the system is not square, by minimizing |Ax-y|^2. If you don't have MATLAB, you can do it manually by defining your error E = (Ax-y)' * (Ax-y), calculating the partial derivatives dE/dx_i, and setting them to zero. This gives a system of linear equations whose size matches the number of unknowns, so it can be solved using a regular linear equation solver.
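Since the question asks for Python: NumPy's `np.linalg.lstsq` plays the same role as MATLAB's "\" here. A minimal sketch, with made-up example data (and note that, as the comment below points out, it does not enforce positivity):

```python
import numpy as np

# Minimal sketch: unconstrained least squares, the NumPy analogue of
# MATLAB's "\" operator. The example A and y are made up for illustration.
A = np.array([[1., 1., 0.],
              [0., 1., 1.]])
y = np.array([3., 5.])

# lstsq minimizes ||Ax - y||^2; for underdetermined systems it returns
# the minimum-norm solution. Note: it does NOT enforce x >= 0.
x, residuals, rank, sing_vals = np.linalg.lstsq(A, y, rcond=None)
print(x)
```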

  • 0
    It comes close, but I'm afraid it doesn't guarantee a positive solution. – 2012-12-16
4

You can solve the system in a least-squares sense:

$\mathbf{Ax}=\mathbf{y}$

$\mathbf{A^{T}Ax}=\mathbf{A^{T}y}$

$\mathbf{Jx}=\mathbf{r}$

$\mathbf{x}=\mathbf{J^{-1}r}$

where $\mathbf{J=A^{T}A}$ and $\mathbf{r=A^{T}y}$.

Note that $\mathbf{J}^{-1}\mathbf{r}=(\mathbf{A}^{T}\mathbf{A})^{-1}\mathbf{A}^{T}\mathbf{y}$, which is the application of the left pseudoinverse of $\mathbf{A}$. This yields the least-squares solution for $\mathbf{x}$ when $\mathbf{A}$ is overdetermined and has full column rank, but that may not be our case.

$\mathbf{J}$ is $n\times n$ and possibly rank-deficient (the system is underdetermined).

The SVD of $\mathbf{J}$ is then

$\mathbf{J}=USV^{T}$

where

$U$ is $n\times n$ orthogonal.

$V$ is $n\times n$ orthogonal.

$S$ is $n\times n$ diagonal, with diagonal elements $\sigma_{1} \geq \sigma_{2} \geq \cdots \geq \sigma_{n} \geq 0$.

The solution of your linear system is given by

$\mathbf{x}=\mathbf{J}^{-1}\mathbf{r}=\left(USV^{T}\right)^{-1}\mathbf{r}=VS^{-1}U^{T}\mathbf{r}$

or more specifically:

$\mathbf{x}=\sum_{i=1}^{n}\frac{u_{i}^{T}\mathbf{r}}{\sigma_{i}}v_{i}$

where $u_{i}\in \mathbb{R}^{n}$ and $v_{i}\in \mathbb{R}^{n}$ are the $i$-th columns of $U$ and $V$, respectively.

We can extend the above sum for rank-deficient cases:

$\mathbf{x}=\sum_{\sigma_{i}\neq 0}\frac{u_{i}^{T}\mathbf{r}}{\sigma_{i}}v_{i}+\sum_{\sigma_{i}=0}\tau_{i}v_{i}$

where $\tau_{i}$ are arbitrary coefficients (any choice of $\tau_{i}$ satisfies your linear system).

It should be noted that choosing $\tau_{i}=0$ yields the minimum-norm solution, which is usually the most desirable one in underdetermined and ill-conditioned systems (where some singular values are nearly zero).

Source: Nocedal, Wright: "Numerical Optimization, Second Edition", chapter 10.2 Linear Least-Squares Problems, p. 250
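For reference, a minimal NumPy sketch of the recipe above, taking $\tau_i=0$ and treating near-zero singular values as exactly zero; the example $A$ and $y$ are made up for illustration:

```python
import numpy as np

# Sketch of the minimum-norm solution via the SVD of J = A^T A,
# following the formulas above with tau_i = 0. Example data are made up.
A = np.array([[1., 1., 0.],
              [0., 1., 1.]])
y = np.array([3., 5.])

J = A.T @ A
r = A.T @ y
U, s, Vt = np.linalg.svd(J)

# Treat near-zero singular values as exactly zero (rank-deficient case).
tol = s.max() * max(J.shape) * np.finfo(float).eps
keep = s > tol

# x = sum over sigma_i != 0 of (u_i^T r / sigma_i) v_i
x = (Vt.T[:, keep] / s[keep]) @ (U[:, keep].T @ r)
print(x)  # same result as np.linalg.pinv(A) @ y
```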

  • 0
    Ah, I see. With adding the transpose it makes more sense :) Thanks. I need to look into it more closely, because I also need to incorporate the non-negativity conditions. – 2012-12-17
3

Time warp to 2018 in case it helps the next person...

It sounds like you were trying to solve the underdetermined system of equations $Ax=y$ subject to the constraint $x\geq0$. As you mentioned, this cannot be solved exactly, so you have to minimise the $p$-norm $||Ax-y||_p$ for some $p$. The Euclidean norm ($p=2$) is the simplest way to go. You can formulate your problem as a positive semidefinite quadratic programming problem subject to box constraints, and there are many specialised solvers out there that can help you; one is sketched right below, and the QP formulation follows.
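For what it's worth, SciPy ships exactly such a specialised solver: `scipy.optimize.nnls` implements Lawson-Hanson non-negative least squares, i.e. $\min ||Ax-y||_2$ s.t. $x\geq0$. A minimal sketch, with made-up example data:

```python
import numpy as np
from scipy.optimize import nnls

# Sketch: non-negative least squares, min ||Ax - y||_2 s.t. x >= 0,
# solved with the Lawson-Hanson algorithm. Example data are made up.
A = np.array([[1., 1., 0.],
              [0., 1., 1.]])
y = np.array([3., 5.])

x, residual_norm = nnls(A, y)
print(x, residual_norm)
```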

To formulate: $\min_x \; f(x)=||Ax-y||_2^2 \quad \text{s.t.} \quad x\geq0$

$\begin{align*} f(x)=||Ax-y||_2^2 &=x^TA^TAx-2y^TAx+\text{const.} \\ &=\frac{1}{2}x^T(2A^TA)x+(-2y^TA)x+\text{const.} \\ &=\frac{1}{2}x^TQx+c^Tx+\text{const.} \end{align*}$

with the obvious substitutions for $Q$ and $c$ (ignore the constant). Clearly, if this were unconstrained, we could solve it in the way others have mentioned:

$\nabla f(x^*)=0=2A^TAx^*-2A^Ty\implies A^TAx^*=A^Ty$

However, you have inequality constraints so you cannot proceed analytically. Instead, you should use a quadratic programming solver to find the optimal solution.

There is a potential pitfall here. You mentioned that your problem was underdetermined. That is, your matrix $A$ has more columns than rows. In that case, your matrix $Q=2A^TA$ will be positive semi-definite but not positive definite. Many solvers will require $Q$ to be positive definite (as they will assume that $Q$ can be factorised via Cholesky decomposition). In English, what this means is that your underdetermined system may have many solutions and a lot of solvers expect there to be a unique solution.

Therefore, you should find a solver that can handle positive semidefinite $Q$ matrices or employ the standard trick of slightly perturbing $Q$ such that it becomes positive definite. That is

$Q\rightarrow \tilde{Q}=Q+\epsilon \mathbb{1}$

where you add a very small amount of the identity matrix to $Q$ to make it positive definite.
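A minimal sketch of that trick, assuming a symmetric $Q$ and using the Cholesky factorisation as the positive-definiteness test (the same test mentioned in the comments below); the helper name and the default $\epsilon$ are illustrative:

```python
import numpy as np

def ensure_pd(Q, eps=1e-8):
    """Sketch of the perturbation trick: return Q unchanged if it is
    already positive definite (Cholesky succeeds), else Q + eps * I."""
    try:
        np.linalg.cholesky(Q)   # raises LinAlgError if Q is not PD
        return Q
    except np.linalg.LinAlgError:
        return Q + eps * np.eye(Q.shape[0])
```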


As you asked specifically about Python, there are some excellent solvers for Python out there if you look. I like the quadprog solver, which implements the Goldfarb-Idnani active-set algorithm. It requires $Q$ to be positive definite, so you will need the trick suggested above if you use quadprog; a sketch follows.
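A minimal sketch of that route, assuming the quadprog package is installed (`pip install quadprog`). quadprog solves $\min \frac{1}{2}x^TGx - a^Tx$ s.t. $C^Tx \geq b$, so we pass $G = A^TA + \epsilon I$ (the $Q/2$ from above, perturbed to be positive definite) and $a = A^Ty$; the example data are made up:

```python
import numpy as np
import quadprog  # pip install quadprog

# Sketch: min (1/2)||Ax - y||^2 s.t. x >= 0 as a QP for quadprog,
# which solves: min (1/2) x^T G x - a^T x  s.t.  C^T x >= b.
# Example data are made up; eps is the PD perturbation from above.
A = np.array([[1., 1., 0.],
              [0., 1., 1.]])
y = np.array([3., 5.])
eps = 1e-9

n = A.shape[1]
G = A.T @ A + eps * np.eye(n)   # Q/2, perturbed to be positive definite
a = A.T @ y                     # linear term
C = np.eye(n)                   # C^T x >= b encodes x >= 0
b = np.zeros(n)

x = quadprog.solve_qp(G, a, C, b)[0]
print(x)
```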

  • 0
    @Gerenuk in theory, $\epsilon$ can be arbitrarily small but, in practice, a very small perturbation could lead to numerical instabilities. A balance must be struck between stability and not changing the solution too much. Standard way to check PD? Check that the diagonal is all positive and, if so, check whether the Cholesky factorisation succeeds. If it succeeds, the matrix is PD. In the case above, however, the matrix $A$ will at best achieve full row rank, so $Q$ cannot possibly achieve full rank. Hence, it will be positive semidefinite. – 2018-04-15