I have been doing some self-study in multivariable calculus. I get the geometric idea behind viewing a derivative as a linear map, but how does one prove this analytically?
I mean if $f'(c)$ is the derivative of a function between two Euclidean spaces at a point $c$ in the domain... then is it possible to explicitly prove that $$f'(c)[ah_1+ bh_2] = af'(c)[h_1] + bf'(c)[h_2]\ ?$$ I tried but I guess I am missing something simple.
Also how does the expression $f(c+h)-f(c)=f'(c)h + o(\|h\|)$ translate into saying that $f'(c)$ is linear?
Fréchet derivative
-
Already in dimension one the derivative can be seen as a linear map: $h \mapsto f'(c)h$. What is your definition of the derivative in higher dimensions? – 2012-08-30
-
Yes, in one dimension the derivative map is just multiplication of $h$ by a real number. But in general, are the linearity properties provable? – 2012-08-30
-
Again: what is your definition of derivative? The answer to your question is yes, but the "proof" depends on the definition you are working with. According to some authors, the derivative is a linear map *by definition*. – 2012-08-30
-
A useful exercise is to show that the limit-equals-zero definition I give in my post is really the same as the $o(\|h\|)$ expression you write. Also, if you want a nice book on these things, you might see Edwards' *Advanced Calculus* text; the price isn't bad and it has many interesting examples. – 2012-08-30
-
I am sorry if I left my question a bit vague. It is not assumed to be linear in the definition. The second equation is roughly how the derivative is defined, obtained from the limit-based definition, which I am sure you know of. – 2012-08-30
-
So what I need to know is whether there is a proof based on that expression. – 2012-08-30
4 Answers
There are basically two popular approaches to differentiability in higher dimensions. Let us consider a map $f \colon \mathbb{R}^n \to \mathbb{R}$ and a point $c \in \mathbb{R}^n$. The case of a vector-valued function is slightly more involved.
- The function $f$ is differentiable at $c$ if the partial derivatives $$\frac{\partial f}{\partial x_1}(c),\ldots,\frac{\partial f}{\partial x_n}(c)$$ exist and moreover $$f(c+h)=f(c)+\sum_{j=1}^n \frac{\partial f}{\partial x_j}(c)h_j + o(\|h\|)$$ as $h \to 0$. In this case, the map $$f'(c) \colon h \mapsto \left\langle \begin{pmatrix} \partial f(c)/\partial x_1 \\ \vdots \\ \partial f(c)/\partial x_n \end{pmatrix} \mid \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix}\right\rangle = \sum_{j=1}^n \frac{\partial f}{\partial x_j}(c)h_j$$ is the derivative of $f$ at $c$.
- The function $f$ is differentiable at $c$ if there exists a (continuous) linear map $f'(c) \colon \mathbb{R}^n \to \mathbb{R}$ such that $$f(c+h)=f(c)+f'(c)h + o(\|h\|)$$ as $h \to 0$. In this case, the derivative is $f'(c)$.
As you can see, the linearity of $f'(c)$ is built into definition 2. In case 1 it is immediate to check that $f'(c)$ is linear.
The second definition can also be given in infinite-dimensional normed spaces, where the continuity of linear maps does not come for free. Definition 1 is confined to the finite-dimensional setting.
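To make case 1 concrete, here is a short numerical sketch (my own illustration, not part of the answer; the function `f`, the base point, and the step sizes are arbitrary choices). It checks both that $h \mapsto \sum_j \frac{\partial f}{\partial x_j}(c)h_j$ is linear and that the remainder is $o(\|h\|)$:

```python
import numpy as np

def f(x):
    # example scalar function on R^2 (arbitrary choice)
    return x[0]**2 + 3.0 * x[0] * x[1]

def grad(f, c, eps=1e-6):
    # central-difference approximation of the partial derivatives of f at c
    c = np.asarray(c, dtype=float)
    g = np.zeros_like(c)
    for j in range(c.size):
        e = np.zeros_like(c)
        e[j] = eps
        g[j] = (f(c + e) - f(c - e)) / (2 * eps)
    return g

c = np.array([1.0, 2.0])
g = grad(f, c)                    # here g ~ [8, 3]
deriv = lambda h: g @ h           # the candidate linear map f'(c)

# linearity: f'(c)[a h1 + b h2] == a f'(c)[h1] + b f'(c)[h2]
h1, h2, a, b = np.array([1.0, 0.5]), np.array([-2.0, 1.0]), 3.0, -1.5
print(np.isclose(deriv(a * h1 + b * h2), a * deriv(h1) + b * deriv(h2)))  # True

# remainder condition: (f(c+h) - f(c) - f'(c)h) / ||h|| -> 0 as h -> 0
for t in [1e-1, 1e-2, 1e-3]:
    h = t * np.array([1.0, 1.0])
    r = f(c + h) - f(c) - deriv(h)
    print(r / np.linalg.norm(h))  # shrinks toward 0 as t -> 0
```

Linearity holds because `deriv` is an inner product with a fixed vector; the loop shows the remainder ratio decaying like $\|h\|$ for this quadratic example.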
-
Thank you. I get it now. I guess there's no point in trying to prove what I wanted from that expression in the old-fashioned way of proving linearity. – 2012-08-30
-
I think that you should definitely give a meaning to the expression $f'(c)h$. What is it? Once you have the answer, linearity will be an immediate consequence. – 2012-08-30
-
Well, $f'(c)(h)$ would mean the linear derivative map of $f$ at $c$ acting on the vector $h$. If $f \colon \mathbb{R}^n \to \mathbb{R}^m$, then $h$ would be a vector in $\mathbb{R}^n$ such that $c+h$ lies within a suitable neighbourhood of $c$. – 2012-08-30
You do not have to prove that the derivative is a linear operator, because the derivative is a linear operator by definition.
For example, in the case of a numerical function $f:\mathbb{R}\rightarrow\mathbb{R}$, if the derivative exists at some point, say $x_0$, then $f'(x_0)\in\mathbb{R}$. And, just as for all real numbers, we have $f'(x_0)(a_1h_1+a_2h_2)=a_1f'(x_0)h_1+a_2f'(x_0)h_2$.
The behaviour of the derivative for more general functions, say between Banach spaces, is identical.
To be more precise, I include the definition of the derivative:
Suppose we are given a function $f:X\rightarrow Y$, where $X$ and $Y$ are Banach spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$ respectively. Let $x_0\in X$. We say that the function $f$ is differentiable at the point $x_0$ iff there exists a continuous linear operator $L:X\rightarrow Y$ such that the following holds:
$$f(x_0+h)-f(x_0)=L(h)+r(h)$$
where the function $r:X\rightarrow Y$ satisfies
$$\frac{\|r(h)\|_Y}{\|h\|_X}\rightarrow 0\ \ \ \mbox{as}\ \ \ h\rightarrow 0.$$
We call the operator $L$ the derivative of the function $f$ at the point $x_0$ and write $Df(x_0)=L$ or $f'(x_0)=L$.
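To see this definition in action in the simplest case, here is a small worked example (my own addition, not from the original answer): take $X=Y=\mathbb{R}$ and $f(x)=x^2$. Then

```latex
f(x_0+h)-f(x_0) = (x_0+h)^2 - x_0^2
                = \underbrace{2x_0\,h}_{L(h)} + \underbrace{h^2}_{r(h)},
\qquad
\frac{\|r(h)\|_Y}{\|h\|_X} = \frac{h^2}{|h|} = |h| \to 0
\quad \text{as } h \to 0.
```

So $L(h)=2x_0 h$ is a continuous linear operator $\mathbb{R}\rightarrow\mathbb{R}$, and $f'(x_0)=2x_0$ recovers the classical derivative.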
-
Thank you, that's more of what I was looking for. At the risk of repeating myself, this means that we are modelling the linearity property already observed in 1-D for the higher-dimensional versions by building it into the definition. Is that right? – 2012-08-30
-
@Vishesh. I edited my answer to provide you with the definition of a derivative. Everything should be clear for you now. There is no need to separate the 1-dimensional case, the multidimensional case, or even the infinite-dimensional case. In the one-dimensional case you just use the fact that the space of all continuous linear functionals $\mathbb{R}\rightarrow\mathbb{R}$ is isomorphic to the reals $\mathbb{R}$. If a function is differentiable, it behaves around the considered point just in the same way as its differential at that point. – 2012-08-30
-
(modulo translation by $f(x_0)$ and the small error $r(\cdot)$, of course) – 2012-08-30
It is not true that $f'(c_1+c_2)=f'(c_1)+f'(c_2)$ in general. However, the derivative $f'(c)$ is the matrix of the differential $df_c$, and the expression you write shows why $df_c(h) = f'(c)h$ is linear in the $h$-variable: it is simply matrix multiplication, and all matrix-multiplication maps are linear. The subtle question is whether $f'(c)$ exists for a given $f$ and $c$.
Recall that $f: \mathbb{R} \rightarrow \mathbb{R}$ has a derivative $f'(a)$ at $x=a$ if $$ f'(a) = \lim_{ h \rightarrow 0} \frac{f(a+h)-f(a)}{h}. $$ Alternatively, we can express the condition above as $$ \lim_{ h \rightarrow 0} \frac{f(a+h)-f(a)-f'(a)h}{h} =0.$$ This gives an implicit definition of $f'(a)$. To generalize this to higher dimensions we have to replace $h$ with its length $\|h\|$, since there is no way to divide by a vector in general. Recall that $v \in \mathbb{R}^n$ has $\|v\| = \sqrt{v \cdot v}$. Consider $F: U \subseteq \mathbb{R}^m \rightarrow \mathbb{R}^n$: if $dF_{a}: \mathbb{R}^m \rightarrow \mathbb{R}^n$ is a linear transformation such that $$ \lim_{ h \rightarrow 0} \frac{F(a+h)-F(a)-dF_a(h)}{\|h\|} =0, $$ then we say that $F$ is differentiable at $a$ with differential $dF_a$. The matrix of the linear transformation $dF_{a}: \mathbb{R}^m \rightarrow \mathbb{R}^n$ is called the Jacobian matrix $F'(a) \in \mathbb{R}^{n \times m}$, or simply the derivative of $F$ at $a$. It follows that the entries of the Jacobian matrix are the partial derivatives of the component functions of $F = (F_1,F_2, \dots , F_n)$: $$ J_F = \left[ \begin{array}{cccc} \partial_1 F_1 & \partial_2 F_1 & \cdots & \partial_m F_1 \\ \partial_1 F_2 & \partial_2 F_2 & \cdots & \partial_m F_2 \\ \vdots & \vdots & & \vdots \\ \partial_1 F_n & \partial_2 F_n & \cdots & \partial_m F_n \\ \end{array} \right] = \bigl[\partial_1F \ | \ \partial_2F \ | \cdots | \ \partial_mF\ \bigr] = \left[ \begin{array}{c} (\nabla F_1)^T \\ \hline (\nabla F_2)^T \\ \hline \vdots \\ \hline (\nabla F_n)^T \end{array} \right]. $$
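For the vector-valued case, here is a brief numerical sketch (my own illustration; the map `F` and the test vectors are arbitrary choices) of the Jacobian's $n \times m$ shape, the linearity of $h \mapsto F'(a)h$, and the limit condition above:

```python
import numpy as np

# Example map F: R^2 -> R^3, so m = 2 and n = 3 (arbitrary choice)
def F(x):
    return np.array([x[0] * x[1], x[0]**2, np.sin(x[1])])

def jacobian(F, a, eps=1e-6):
    # central-difference Jacobian: column j approximates dF/dx_j at a
    a = np.asarray(a, dtype=float)
    cols = []
    for j in range(a.size):
        e = np.zeros_like(a)
        e[j] = eps
        cols.append((F(a + e) - F(a - e)) / (2 * eps))
    return np.column_stack(cols)      # shape n x m

a = np.array([1.0, 0.5])
J = jacobian(F, a)
print(J.shape)                        # (3, 2): n x m, as in the text

# dF_a(h) = J h is linear in h, since it is matrix multiplication
h1, h2 = np.array([0.3, -0.1]), np.array([0.0, 0.2])
print(np.allclose(J @ (2 * h1 + 5 * h2), 2 * (J @ h1) + 5 * (J @ h2)))  # True

# ||F(a+h) - F(a) - J h|| / ||h|| -> 0 as h -> 0
for t in [1e-1, 1e-2, 1e-3]:
    h = t * np.array([1.0, 1.0])
    print(np.linalg.norm(F(a + h) - F(a) - J @ h) / np.linalg.norm(h))
```

The subtle existence question from the answer shows up here too: the finite-difference `jacobian` always returns *a* matrix, but it is only the derivative when the limit condition actually holds.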
-
Thanks a lot, that was very detailed. I think you meant $f'(c)(h_1+h_2)=f'(c)(h_1)+f'(c)(h_2)$. – 2012-08-30
-
Correct. That is what I intended by the statement "linear in $h$". – 2012-08-30
Assume $f\colon\mathbb{R}^n\to\mathbb{R}^m$ has continuous partial derivatives of the first order. Then it is a standard result that $f$ is Fréchet differentiable, i.e., $$f(x+h)=f(x)+f'(x)h+o(\lvert h\rvert)$$ where $f'(x)$ is a linear map $\mathbb{R}^n\to\mathbb{R}^m$. Moreover, the matrix of this map is what you would expect, consisting of the partial derivatives of the component functions.
-
Typo: $f(x+h)=f(x)+f'(x)h+o(\|h\|)$. – 2012-08-30
-
@Siminore: Oy! Thanks. ☺ – 2012-08-30