5
$\begingroup$

I am wondering if $f(\mathbf{x})$ is convex on the input of a vector of $n$ positive reals $\mathbf{x}$:

$$f(\mathbf{x})=\operatorname{Tr}[(\mathbf{A}+\operatorname{diag}(\mathbf{x}))^{-1}]$$

where $\mathbf{A}$ is a positive-definite $n\times n$ matrix of reals, $\operatorname{diag}(\mathbf{x})$ returns a diagonal matrix with values from $\mathbf{x}$ placed on the diagonal, and $\operatorname{Tr}[\mathbf{M}]$ is the trace of matrix $\mathbf{M}$.

I understand that I can prove convexity via taking the second derivative, but I am not sure how to it in the matrix case. I also know that for positive-definite $\mathbf{A}$, $g(\mathbf{y})=\frac{1}{2}\mathbf{y}^T\mathbf{A}\mathbf{y}+\mathbf{y}^T\mathbf{b}$ is strictly convex, but I am not sure how that applies here (if it applies at all).

1 Answers 1

3

Consider $g(t) = {\bf u}^T (C + tB)^{-1} \bf u$ for any real vector $\bf u$, where $C$ and $B$ are real symmetric matrices and $C+tB$ is positive definite. Note that the first derivative of $(C + tB)^{-1}$ with respect to $t$ is $- (C + tB)^{-1} B (C + tB)^{-1}$, and the second derivative is $2 (C+tB)^{-1} B (C+tB)^{-1} B (C+tB)^{-1}$. So $$g''(t) = 2 {\bf u}^T (C+tB)^{-1} B (C+tB)^{-1} B (C+tB)^{-1} {\bf u} = 2 {\bf v}^T (C+tB)^{-1} \bf v$$ where ${\bf v} = B (C+tB)^{-1} \bf u$. Since $(C+tB)^{-1}$ is positive definite when $C+tB$ is positive definite, $g''(t) \ge 0$, i.e. $g$ is convex as long as $C+tB$ is positive definite. Now $Tr (C+tB)^{-1} = \sum_j {\bf u}_j^T (C+tB)^{-1} {\bf u}_j$ for an orthonormal basis ${\bf u}_j$, so this is also convex. And $f(t {\bf x} + (1-t) {\bf y})$ is of this form, with $C = A + \text{diag}({\bf y})$ and $B = \text{diag}({\bf x} - {\bf y})$, when $0 \le t \le 1$ (note that for $0 \le t \le 1$, $t {\bf x} + (1-t) {\bf y}$ has nonnegative entries, and the sum of a positive definite matrix and a nonnegative diagonal matrix is positive definite). That says $f$ is convex.

  • 0
    Thank you for your proof -- it's very neat! I'd like to make sure that I understand it correctly: in your proof $g(t)$ is convex with respect to $t$ and arbitrary symmetric $B$ (as long as $C+tB$ are positive definite. Then $f(\mathbf{x})$ is convex with respect to $\mathbf{x}$ because it can written as $g(t)$ given the conditions we have imposed on $\mathbf{x}$. Is this the idea?2012-06-01
  • 1
    $f({\bf x})$ itself isn't written as $g(t)$, but its restriction to any line segment in its domain can be written as a sum of functions of the form $g(t)$. A function on a convex domain is convex iff its restriction to any line segment in its domain is convex.2012-06-01
  • 0
    Interesting. Is the last sentence of your comment a known theorem/lemma? And is there a good reference for its proof and related results? I apologize for asking all these questions, it's just I am interested in not only learning the details particular proof but also *how* to construct proofs of similar nature... Again, thank you!2012-06-01
  • 0
    I am wondering if one can simplify the proof even further. As you stated, a function on a convex domain is convex iff its restriction to any line segment in its domain is convex. Take arbitrary positive real vector $\mathbf{x}$. Now $\mathbf{B}=(\mathbf{A}+\operatorname{diag}(\mathbf{x}))^{-1}$ is positive definite. Thus, $\mathbf{u}^T\mathbf{B}\mathbf{u}$ is convex for any $\mathbf{u}$ including orthonormal basis vector, and we are done since trace is the sum over those. This looks simpler, without need to take the derivative with respect to $t$. What do you think? Anything I did wrong?2012-06-02
  • 1
    @M.B.M.: the fact that a function on a convex domain is convex iff its restriction to any line segment in its domain is convex is immediate from the definition: $f$ is convex if $f(tx + (1-t) y) \le t f(x) + (1-t) f(y)$ for $0 \le t \le 1$, $x$ and $y$ in the domain. This is a statement about $f$ on the line segment from $x$ to $y$.2012-06-03
  • 0
    I don't understand your "simplified proof". How do you know $u^TB u$ is convex (as a function of $x$, not of $u$) without taking the derivative?2012-06-03
  • 0
    Thank you, I get it now. And yes, one can't get away without checking the derivative, so my "proof" doesn't work. I really appreciate you taking the time to respond to my queries and pointing out a way to prove convexity that I didn't know before. Again, thank you!2012-06-03