This is part 1 of a series of questions that ultimately concern my implementation of a coordinate descent algorithm. I have broken the problem into parts because I am trying to solve it by hand first, so that I can understand it better.
I am given a vector $\begin{bmatrix} x_{1} & x_{2} & x_{3} \end{bmatrix}$, and I am trying to minimize a function of this vector, given by:
$ H(x_{1}, x_{2}, \cdots, x_{N}) = \displaystyle\sum\limits_{i=1}^N \displaystyle\sum\limits_{j=1}^N \frac{\lvert x_{i} - x_{j} \rvert^{p}}{p} $
where, for simplicity, $N=3$ and I have chosen $p=2$. The goal is to minimize the sum, over all pairs of samples in the vector, of the absolute differences raised to the power $p$. I know the answer in the end is $x_{1} = x_{2} = x_{3} = c$, where $c$ is some constant. In other words, a flat line.
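For concreteness, with $N=3$ and $p=2$ the double sum can be written out explicitly. This is my own expansion (each unordered pair appears twice in the double sum, and the $i=j$ terms vanish), so treat it as a sketch to be checked rather than a given:
$ H(x_{1}, x_{2}, x_{3}) = (x_{1} - x_{2})^{2} + (x_{1} - x_{3})^{2} + (x_{2} - x_{3})^{2} $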
What I did was take the partial derivatives of $H$ with respect to $x_{1}$, $x_{2}$ and $x_{3}$, and set them all to 0. The final equations I came up with are:
$ \begin{aligned} 2x_{1} - x_{2} - x_{3} &= 0 \\ -x_{1} + 2x_{2} - x_{3} &= 0 \\ -x_{1} - x_{2} + 2x_{3} &= 0 \end{aligned} $
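As a rough sanity check of the three equations above, here is a minimal sketch in plain Python that compares a central finite-difference gradient of $H$ with the hand-derived expressions for $p=2$. The helper names (`H`, `numeric_grad`, `hand_grad`), the step size and the test points are my own choices, not part of the problem. As far as I can tell, the gradient of $H$ as defined carries an extra factor of 2 (each pair is counted twice in the double sum), which drops out once the expressions are set to 0.

```python
def H(x, p=2):
    """Sum over all ordered pairs (i, j) of |x_i - x_j|^p / p."""
    return sum(abs(xi - xj) ** p / p for xi in x for xj in x)


def numeric_grad(f, x, h=1e-5):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for k in range(len(x)):
        up = list(x)
        down = list(x)
        up[k] += h
        down[k] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def hand_grad(x):
    """Hand-derived gradient for N = 3, p = 2.

    Each entry is 2 times the left-hand side of the corresponding
    equation above (assumed factor, see the lead-in).
    """
    x1, x2, x3 = x
    return [
        2 * (2 * x1 - x2 - x3),
        2 * (-x1 + 2 * x2 - x3),
        2 * (-x1 - x2 + 2 * x3),
    ]


point = [0.3, -1.2, 2.5]  # arbitrary test point (assumption)
print("numeric:", numeric_grad(H, point))
print("by hand:", hand_grad(point))

flat = [1.7, 1.7, 1.7]  # constant vector, c = 1.7 chosen arbitrarily
print("H at flat line:", H(flat))
print("gradient at flat line:", hand_grad(flat))
```

If the two gradient printouts agree to within rounding, the hand calculation is at least consistent with the definition of $H$ at that point. It is only a numerical spot check, not a proof.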
My questions are as follows:
Is this correct?
- If so, how does one deduce from these equations that all the $x_{i}$ must be equal to each other?
What would the partial derivative equations be if $p = 1$?
One bonus question, about ease of implementation:
- Is there an easy way to show all of these results in Wolfram Alpha for this particular case? (I am new to it, but I need a quick way to check my hand calculations with it.)
Thanks in advance!