I have been doing some self study in multivariable calculus. I get the geometric idea behind looking at a derivative as a linear function, but analytically how does one prove this?
I mean if $f'(c)$ is the derivative of a function between two Euclidean spaces at a point $c$ in the domain... then is it possible to explicitly prove that $f'(c)[ah_1+ bh_2] = af'(c)[h_1] + bf'(c)[h_2]\ ?$ I tried but I guess I am missing something simple.
Also how does the expression $f(c+h)-f(c)=f'(c)h + o(\|h\|)$ translate into saying that $f'(c)$ is linear?
Fréchet derivative
-
So is there a proof based on that expression? That is what I need to know. – 2012-08-30
4 Answers
There are basically two popular approaches to differentiability in higher dimensions. Let us consider a map $f \colon \mathbb{R}^n \to \mathbb{R}$ and a point $c \in \mathbb{R}^n$. The case of a vector-valued function is slightly more involved.
- The function $f$ is differentiable at $c$ if the partial derivatives $\frac{\partial f}{\partial x_1}(c),\ldots,\frac{\partial f}{\partial x_n}(c)$ exist and moreover $f(c+h)=f(c)+\sum_{j=1}^n \frac{\partial f}{\partial x_j}(c)h_j + o(\|h\|)$ as $h \to 0$. In this case, the map $f'(c) \colon h \mapsto \left\langle \begin{pmatrix} \partial f(c)/\partial x_1 \\ \vdots \\ \partial f(c)/\partial x_n \end{pmatrix} \mid \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix}\right\rangle = \sum_{j=1}^n \frac{\partial f}{\partial x_j}(c)h_j$ is the derivative of $f$ at $c$.
- The function $f$ is differentiable at $c$ if there exists a (continuous) linear map $f'(c) \colon \mathbb{R}^n \to \mathbb{R}$ such that $f(c+h)=f(c)+f'(c)h + o(\|h\|)$ as $h \to 0$. In this case, the derivative is $f'(c)$.
As you can see, the linearity of $f'(c)$ is built into the second definition. In the first definition it is immediate to check that $f'(c)$ is linear, because it is the inner product of $h$ with a fixed vector.
The second definition also makes sense in infinite-dimensional normed spaces, where the continuity of linear maps is no longer automatic. The first definition is confined to finite-dimensional normed spaces.
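To make the first definition concrete, here is a minimal numerical sketch. The function `f`, the point `c`, and the test vectors are hypothetical choices of mine, not taken from the question. It checks that $h \mapsto \langle \nabla f(c) \mid h \rangle$ is linear and that the remainder divided by $\|h\|$ shrinks as $h \to 0$.

```python
import math

def f(x, y):
    # Hypothetical example function f : R^2 -> R (an assumption for illustration)
    return x * x * y + math.sin(y)

def grad_f(x, y):
    # Partial derivatives of f, computed by hand
    return (2 * x * y, x * x + math.cos(y))

c = (1.0, 2.0)
g = grad_f(*c)

def derivative(h):
    # f'(c) applied to h: the inner product <grad f(c), h>
    return g[0] * h[0] + g[1] * h[1]

# Linearity in h: f'(c)[a*h1 + b*h2] versus a*f'(c)[h1] + b*f'(c)[h2]
h1, h2 = (0.3, -0.7), (-1.1, 0.4)
a, b = 2.5, -4.0
combo = (a * h1[0] + b * h2[0], a * h1[1] + b * h2[1])
linear_gap = abs(derivative(combo) - (a * derivative(h1) + b * derivative(h2)))

# Remainder ratio |f(c+h) - f(c) - f'(c)h| / ||h|| along a shrinking h
ratios = []
for k in range(1, 6):
    h = (0.6 * 10.0 ** -k, 0.8 * 10.0 ** -k)
    r = f(c[0] + h[0], c[1] + h[1]) - f(*c) - derivative(h)
    ratios.append(abs(r) / math.hypot(*h))

print("linearity gap:", linear_gap)
print("remainder ratios:", ratios)
```

The linearity gap is only floating-point rounding, while the ratios decrease roughly in proportion to $\|h\|$, which is what the $o(\|h\|)$ condition asks for.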
-
Well, $f'(c)(h)$ would mean the linear derivative map of $f$ at $c$ acting on the vector $h$. If $f \colon \mathbb{R}^n \to \mathbb{R}^m$, then $h$ would be a vector in $\mathbb{R}^n$ chosen so that $c+h$ stays within a suitable neighbourhood of $c$. – 2012-08-30
You do not have to prove that the derivative is a linear operator, because the derivative is a linear operator by definition.
For example, in the case of a real function $f:\mathbb{R}\rightarrow\mathbb{R}$, if the derivative exists at some point, say $x_0$, then $f'(x_0)\in\mathbb{R}$ and it acts on $h$ by multiplication. And, just as for any real number, we have that $f'(x_0)(a_1h_1+a_2h_2)=a_1f'(x_0)h_1+a_2f'(x_0)h_2$.
The behaviour of the derivative for more general functions, say between Banach spaces, is identical.
To be more precise, I include the definition of the derivative:
Suppose we are given a function $f:X\rightarrow Y$ where $X$ and $Y$ are Banach spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$ respectively. Let $x_0\in X$. We say that the function $f$ is differentiable at the point $x_0$ iff there exists a continuous linear operator $L:X\rightarrow Y$ such that the following holds:
$f(x_0+h)-f(x_0)=L(h)+r(h)$
where the function $r:X\rightarrow Y$ satisfies
$\frac{||r(h)||_Y}{||h||_X}\rightarrow 0\ \ \ \mbox{as}\ \ \ h\rightarrow 0.$
We call the operator $L$ the derivative of the function $f$ at the point $x_0$ and denote it by $Df(x_0)=L$ or $f'(x_0)=L$.
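As a rough illustration of this definition in the finite-dimensional case $X=\mathbb{R}^2$, $Y=\mathbb{R}^2$, the sketch below compares two linear candidates for $L$. The map `F`, the point `x0`, and the deliberately perturbed candidate `L_wrong` are all hypothetical choices for illustration. Only the candidate built from the partial derivatives makes $\|r(h)\|_Y/\|h\|_X$ tend to $0$.

```python
import math

def F(x, y):
    # Hypothetical example map F : R^2 -> R^2 (an assumption for illustration)
    return (x * y, x + y * y)

def L_true(h, x0):
    # Linear candidate built from the partial derivatives of F at x0
    x, y = x0
    return (y * h[0] + x * h[1], h[0] + 2 * y * h[1])

def L_wrong(h, x0):
    # Another linear map, deliberately perturbed in one entry
    x, y = x0
    return (y * h[0] + x * h[1], h[0] + (2 * y + 0.5) * h[1])

def remainder_ratio(L, x0, h):
    # ||F(x0 + h) - F(x0) - L(h)|| / ||h||
    Fx = F(*x0)
    Fh = F(x0[0] + h[0], x0[1] + h[1])
    Lh = L(h, x0)
    r = (Fh[0] - Fx[0] - Lh[0], Fh[1] - Fx[1] - Lh[1])
    return math.hypot(*r) / math.hypot(*h)

x0 = (1.0, -2.0)
steps = [(0.6 * 10.0 ** -k, 0.8 * 10.0 ** -k) for k in range(1, 6)]
good = [remainder_ratio(L_true, x0, h) for h in steps]
bad = [remainder_ratio(L_wrong, x0, h) for h in steps]

print("ratios with L_true: ", good)
print("ratios with L_wrong:", bad)
```

Both candidates are linear, so linearity alone does not single out the derivative. What distinguishes $L$ is the remainder condition: the ratios for `L_true` go to zero, while those for `L_wrong` settle near a nonzero constant along this direction.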
-
(modulo translation by $f(x_0)$ and small error $r(.)$ of course) – 2012-08-30
It is not true that $f'(c_1+c_2)=f'(c_1)+f'(c_2)$ in general. However, the derivative $f'(c)$ is the matrix of the differential $df_c$, and the expression you wrote shows why $df_c(h) = f'(c)h$ is linear in the $h$-variable. It is simply matrix multiplication, and all matrix multiplication maps are linear. The subtle question is whether $f'(c)$ exists for a given $f$ and $c$.
Recall that $f: \mathbb{R} \rightarrow \mathbb{R}$ has a derivative $f'(a)$ at $x=a$ if
$f'(a) = \lim_{ h \rightarrow 0} \frac{f(a+h)-f(a)}{h}.$
Alternatively, we can express the condition above as
$\lim_{ h \rightarrow 0} \frac{f(a+h)-f(a)-f'(a)h}{h} =0.$
This gives an implicit definition for $f'(a)$. To generalize this to higher dimensions we have to replace the $h$ in the denominator with its length $\|h\|$, since there is no way to divide by a vector in general. Recall that $v \in \mathbb{R}^n$ has $\|v\| = \sqrt{v \cdot v}$.
Consider $F: U \subseteq \mathbb{R}^m \rightarrow \mathbb{R}^n$. If $dF_{a}: \mathbb{R}^m \rightarrow \mathbb{R}^n$ is a linear transformation such that
$\lim_{ h \rightarrow 0} \frac{F(a+h)-F(a)-dF_a(h)}{\|h\|} =0$
then we say that $F$ is differentiable at $a$ with differential $dF_a$. The matrix of the linear transformation $dF_{a}: \mathbb{R}^m \rightarrow \mathbb{R}^n$ is called the Jacobian matrix $F'(a) \in \mathbb{R}^{n \times m}$, or simply the derivative of $F$ at $a$. It follows that the entries of the Jacobian matrix are the partial derivatives of the component functions of $F = (F_1,F_2, \dots , F_n)$:
$J_F = \left[ \begin{array}{cccc} \partial_1 F_1 & \partial_2 F_1 & \cdots & \partial_m F_1 \\ \partial_1 F_2 & \partial_2 F_2 & \cdots & \partial_m F_2 \\ \vdots & \vdots & \ddots & \vdots \\ \partial_1 F_n & \partial_2 F_n & \cdots & \partial_m F_n \\ \end{array} \right] = \bigl[\partial_1F \ | \ \partial_2F \ | \cdots | \ \partial_mF\ \bigr] = \left[ \begin{array}{c} (\nabla F_1)^T \\ \hline (\nabla F_2)^T \\ \hline \vdots \\ \hline (\nabla F_n)^T \end{array} \right].$
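A small sketch of the matrix picture, under assumptions of my own: the map `F` from $\mathbb{R}^3$ to $\mathbb{R}^2$, the point `a`, and the central-difference helper `jacobian` are hypothetical and only meant for illustration. It approximates the Jacobian column by column, compares it with the hand-computed partial derivatives, and checks that $h \mapsto F'(a)h$ is linear.

```python
import math

def F(x):
    # Hypothetical example map F : R^3 -> R^2 (3 inputs, 2 outputs)
    return [x[0] * x[1] + x[2], math.sin(x[0]) * x[2] ** 2]

def jacobian(F, a, eps=1e-6):
    # Central differences: column j approximates the partial derivative of F in x_j
    n_out = len(F(a))
    n_in = len(a)
    J = [[0.0] * n_in for _ in range(n_out)]
    for j in range(n_in):
        up = list(a)
        dn = list(a)
        up[j] += eps
        dn[j] -= eps
        Fu, Fd = F(up), F(dn)
        for i in range(n_out):
            J[i][j] = (Fu[i] - Fd[i]) / (2 * eps)
    return J

def matvec(J, h):
    # Matrix-vector product J h
    return [sum(Jij * hj for Jij, hj in zip(row, h)) for row in J]

a = [0.5, -1.0, 2.0]
J = jacobian(F, a)

# Partial derivatives computed by hand, one row per component function
exact = [[a[1], a[0], 1.0],
         [math.cos(a[0]) * a[2] ** 2, 0.0, 2 * math.sin(a[0]) * a[2]]]

# Linearity of h -> J h
h1, h2 = [0.1, 0.2, -0.3], [-0.4, 0.5, 0.6]
s, t = 3.0, -2.0
combo = [s * u + t * v for u, v in zip(h1, h2)]
lhs = matvec(J, combo)
rhs = [s * u + t * v for u, v in zip(matvec(J, h1), matvec(J, h2))]
lin_gap = max(abs(p - q) for p, q in zip(lhs, rhs))

print("rows x columns:", len(J), "x", len(J[0]))
print("linearity gap:", lin_gap)
```

The matrix has one row per component of $F$ and one column per input variable, and the linearity gap is pure rounding error, as expected for any matrix multiplication.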
-
Correct. That is what I intended by the statement "linear in $h$". – 2012-08-30
Assume $f\colon\mathbb{R}^n\to\mathbb{R}^m$ has continuous partial derivatives of the first order. Then it is a standard result that $f$ is Fréchet differentiable, i.e., $f(x+h)=f(x)+f'(x)h+o(\lvert h\rvert)$ where $f'(x)$ is a linear map $\mathbb{R}^n\to\mathbb{R}^m$. Moreover, the matrix of this map is what you would expect, consisting of the partial derivatives of the component functions.
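To see why some hypothesis beyond the mere existence of partial derivatives is needed, here is a sketch using a standard counterexample, $f(x,y)=x^2y/(x^2+y^2)$ with $f(0,0)=0$. The function and the helper `directional` are my own additions, not part of the question. Every directional derivative at the origin exists, but the map sending a direction to its directional derivative is not linear, so no linear $f'(0)$ can satisfy the Fréchet condition there.

```python
def f(x, y):
    # Standard counterexample: both partial derivatives exist at the origin,
    # yet f is not differentiable there
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)

def directional(v, t=1e-6):
    # Difference quotient (f(0 + t v) - f(0)) / t along the direction v
    return (f(t * v[0], t * v[1]) - f(0.0, 0.0)) / t

d_x = directional((1.0, 0.0))    # partial derivative in x at the origin
d_y = directional((0.0, 1.0))    # partial derivative in y at the origin
d_sum = directional((1.0, 1.0))  # direction (1, 0) + (0, 1)

print("d_x =", d_x, " d_y =", d_y, " d_(1,1) =", d_sum)
```

If $f$ were differentiable at the origin, the directional derivative along $(1,1)$ would have to equal `d_x + d_y`, which is $0$. Instead it is $1/2$. Here the partial derivatives are not continuous at the origin, so the result quoted above does not apply.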
-
@Siminore: Oy! Thanks. ☺ – 2012-08-30