I have a question about least squares and about what happens if the function we minimize, $E(\mathbf{p})$, is not linear in its parameters $\mathbf{p}$.
Assume we want to minimize a function (the exact terms are not so important; these are just examples) $ E(\mathbf{p}) = \frac{1}{2}\| E_a(\mathbf{p}) + \alpha E_b(\mathbf{p}) \|^2$, where $ E_a(\mathbf{p}) = \sum_i\| x_i-p_i\|^2$, $ E_b(\mathbf{p}) = \sum_i\| y_i-p_i\|^2$, $\mathbf{p}\in\mathbb{R}^N$, the $x_i$ and $y_i$ are scalars, and $p_i$ denotes the $i$-th element of $\mathbf{p}$.
Now, on the way to setting up a (hopefully linear) equation system, computing the partial derivative of $E(\mathbf{p})$ with respect to $p_m$ gives, by the chain rule (note $E_a + \alpha E_b$ is a nonnegative scalar, so the norm is just the quantity itself), $ \frac{\partial E(\mathbf{p})}{\partial p_m} = \bigl( E_a(\mathbf{p}) + \alpha E_b(\mathbf{p}) \bigr) \Bigl( \frac{\partial E_a(\mathbf{p})}{\partial p_m} + \alpha\frac{\partial E_b(\mathbf{p})}{\partial p_m} \Bigr) $
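As a sanity check on this chain-rule expression, here is a small numerical sketch (the data $x$, $y$, the point $\mathbf{p}$, and $\alpha$ are arbitrary made-up values, $N=3$): it compares the analytic gradient against central finite differences.

```python
import numpy as np

# Hypothetical small example: N = 3, arbitrary data, point, and weight.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
y = rng.normal(size=3)
p = rng.normal(size=3)
alpha = 0.7

def E(p):
    # E(p) = 1/2 * (Ea + alpha*Eb)^2, with Ea, Eb sums of squared scalars.
    Ea = np.sum((x - p) ** 2)
    Eb = np.sum((y - p) ** 2)
    return 0.5 * (Ea + alpha * Eb) ** 2

def grad_E(p):
    # Chain rule: dE/dp_m = (Ea + alpha*Eb) * (dEa/dp_m + alpha*dEb/dp_m),
    # with dEa/dp_m = -2*(x_m - p_m) and dEb/dp_m = -2*(y_m - p_m).
    Ea = np.sum((x - p) ** 2)
    Eb = np.sum((y - p) ** 2)
    return (Ea + alpha * Eb) * (-2.0 * (x - p) - 2.0 * alpha * (y - p))

# Central finite-difference approximation of each partial derivative.
h = 1e-6
fd = np.array([(E(p + h * e) - E(p - h * e)) / (2 * h) for e in np.eye(3)])
print(np.allclose(grad_E(p), fd, atol=1e-4))
```

Note that each factor $\frac{\partial E_a}{\partial p_m} = -2(x_m - p_m)$ is linear in $\mathbf{p}$, but the leading factor $E_a + \alpha E_b$ is quadratic in $\mathbf{p}$, which is what the question below is getting at.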
My question: is $\frac{\partial E(\mathbf{p})}{\partial p_m}$ really linear in $\mathbf{p}$ (despite the additional $\|\cdot\|^2$)?