
Why, if the function $f$ from $U$ into $\mathbb{R}^m$ is itself a linear map $L$, is the derivative of the function $f$ at the point $x$ equal to $L$, for all $x$ in $U$?

Moreover, why is the derivative of the inclusion map of $U$ into $\mathbb{R}^n$ at any point $x$ in $U$ the identity transformation of $\mathbb{R}^n$?

  • 0
    Can you expand a bit about why you think they might possibly _not_ be equal? Without further explanation, it can look like you're just waiting to get a homework answer you can copy in without understanding any of the material. 2017-01-06
  • 0
    Crudely speaking, the derivative of a function is the best local linear approximation of the function. So if the original function is linear, its derivative is identical to the function itself. 2017-01-06
  • 0
    @JorisBierkens: The derivative of a linear map is a constant, not "the function itself". 2017-01-06
  • 0
    @hardmath: this is a matter of definition. In general it is mathematically sound to think of a derivative as a linear map. For example, in a Banach space $X$, the Fréchet derivative of a function $f : X \rightarrow \mathbb R$ at $x$ is defined as the *linear map* $L : X \rightarrow \mathbb R$ satisfying $\lim_{h \rightarrow 0} \frac{|f(x+h) - f(x) - L h|}{\|h\|} = 0$. Also, in differential geometry, the (exterior) derivative $df$ of a mapping $f : M \rightarrow \mathbb R$ at $x$ is a linear mapping. This context is also implied here. 2017-01-06
  • 0
    @Henning Makholm I am having difficulties in proving why the derivative of a linear mapping is linear. Let $f$ be a linear map. This means that $f(x_1+x_2) = f(x_1)+f(x_2)$ for any two elements $x_1, x_2$ in $U$, and $f(ax)=a f(x)$. The derivatives of $f$ at the points $x_1$, $x_2$ in the direction $h$ are given by the formulas $df_{x_1}(h)=\frac{f(x_1+th)-f(x_1)}{t}=\frac{f(x_1)+tf(h)-f(x_1)}{t}=f(h)$ and $df_{x_2}(h)=\frac{f(x_2+th)-f(x_2)}{t}=\frac{f(x_2)+tf(h)-f(x_2)}{t}=f(h)$, while $df_{x_1+x_2}(h)=\frac{f(x_1+x_2+th)-f(x_1+x_2)}{t}=\frac{f(x_1)+f(x_2)+tf(h)-f(x_1)-f(x_2)}{t}=f(h)$. So $df_{x_1+x_2}(h) \neq df_{x_1} + df_{x_2}$. 2017-01-06
  • 0
    @Joris Bierkens Take the example $f(x,y)=5+3x-4y$. The derivative of the function $f$ is given by $dz=df=\begin{bmatrix}3&-4\end{bmatrix}\begin{bmatrix}dx\\dy\end{bmatrix}$. If we want to find the best linear approximation of this function at the point $(1,3)$, then $f(1,3)=-4$ and $z-(-4)=\begin{bmatrix}3&-4\end{bmatrix}\begin{bmatrix}x-1\\y-3\end{bmatrix}=3x-4y+9\neq 3x-4y+5$. 2017-01-06

1 Answer

First think about why it's true in the $n=1$ case. A linear function is $f(x)=ax$. The derivative at any point is $f'(x)=a$. You can think of $a$ as a $1\times 1$ matrix satisfying $\lim_{h\to 0}\frac{f(x+h)-f(x)-a\cdot h}{h}=0$.

The case for higher dimensions is analogous. Suppose $f(x)=Ax:\mathbb{R}^n\to\mathbb{R}^n$ is linear and represented by the matrix $A$. Fix $x_0\in\mathbb{R}^n$. Then to find the differential at $x_0$, we need to find a matrix $D$ satisfying

$$\lim_{h \rightarrow 0} \frac{\|A(x_0+h)-Ax_0 - Dh\|}{\|h\|} = 0$$

where $h$ is also a vector in $\mathbb{R}^n$. But choosing $D=A$, this simplifies to

$$\begin{aligned}\lim_{h \rightarrow 0} \frac{\|A(x_0+h)-Ax_0 - Ah\|}{\|h\|} &= \lim_{h \rightarrow 0} \frac{\|Ax_0+Ah-Ax_0 - Ah\|}{\|h\|} \\ &= \lim_{h \rightarrow 0} \frac{\|0\|}{\|h\|} \\ &= 0\end{aligned}$$

So $A$ is the differential at $x_0$. Since $x_0$ was arbitrary, $A$ must be the differential at every point.
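
The limit above can also be checked numerically. The following is only a sketch in plain Python: the matrix `A`, the base point `x0`, the direction of `h`, and the helper names `matvec`, `norm`, and `residual_ratio` are all illustrative assumptions, not part of the argument. It evaluates $\frac{\|f(x_0+h)-f(x_0)-Dh\|}{\|h\|}$ for shrinking $h$, once with $D=A$ and once with a deliberately wrong candidate.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

def residual_ratio(f, D, x0, h):
    """Return ||f(x0 + h) - f(x0) - D h|| / ||h||, the quantity inside the limit."""
    shifted = f([a + b for a, b in zip(x0, h)])
    base = f(x0)
    Dh = matvec(D, h)
    diff = [s - b - d for s, b, d in zip(shifted, base, Dh)]
    return norm(diff) / norm(h)

# Hypothetical data chosen only for illustration.
A = [[2.0, -1.0], [0.5, 3.0]]
x0 = [1.0, -2.0]
steps = (1.0, 1e-2, 1e-4)

def f(x):
    """The linear map x -> A x."""
    return matvec(A, x)

# With D = A the ratio is zero up to floating-point rounding for every h.
ratios = [residual_ratio(f, A, x0, [t, -2 * t]) for t in steps]

# With a different matrix the ratio does not shrink as h -> 0.
D_wrong = [[2.0, 0.0], [0.5, 3.0]]
wrong = [residual_ratio(f, D_wrong, x0, [t, -2 * t]) for t in steps]

print(ratios)
print(wrong)
```

For a linear map the ratio with $D=A$ vanishes for every $h$, not just in the limit, which is why the choice of step sizes barely matters here. For the wrong candidate the ratio along this direction stays at roughly $2/\sqrt{5}$.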


Another way to see this is to realize that the differential is just the matrix of partial derivatives. For simplicity I'll use the $2\times 2$ case. If the matrix of $f$ is

$$A=\begin{bmatrix}a&b\\c&d\end{bmatrix}$$

Then $f$ can be written

$$f\binom{x}{y} = \binom{f_1(x,y)}{f_2(x,y)} = \binom{ax+by}{cx+dy}$$

Computing the matrix of partial derivatives at any point gives

$$D=\begin{bmatrix}\frac{\partial f_1}{\partial x}&\frac{\partial f_1}{\partial y}\\\frac{\partial f_2}{\partial x}&\frac{\partial f_2}{\partial y}\end{bmatrix} = \begin{bmatrix}a&b\\c&d\end{bmatrix} = A$$

This again shows that $A$ is the differential at every point.
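
The same conclusion can be sanity-checked by approximating the partial derivatives with central differences. This is a rough sketch, not a proof: the entries `a, b, c, d`, the sample points, and the helper `jacobian_fd` are assumptions made up for the illustration.

```python
def jacobian_fd(f, x, eps=1e-6):
    """Approximate the matrix of partial derivatives of f at x by central differences."""
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

# Hypothetical entries of the matrix A = [[a, b], [c, d]].
a, b, c, d = 2.0, -1.0, 0.5, 3.0

def f(p):
    """The linear map (x, y) -> (a x + b y, c x + d y)."""
    x, y = p
    return [a * x + b * y, c * x + d * y]

points = ([0.0, 0.0], [1.0, -2.0], [10.0, 7.0])
jacobians = [jacobian_fd(f, p) for p in points]
for p, J in zip(points, jacobians):
    print(p, J)
```

Up to rounding, every printed matrix should agree with $A$ regardless of the point at which it is computed.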


If $g:U\to\mathbb{R}^n$ is the inclusion, then in coordinates the map is simply

$$g\begin{pmatrix}x_1\\\vdots\\x_n\end{pmatrix}=\begin{pmatrix}g_1(x_1,...,x_n)\\\vdots\\g_n(x_1,...,x_n)\end{pmatrix}=\begin{pmatrix}x_1\\\vdots\\x_n\end{pmatrix}$$

This is a linear function represented by the matrix $I_n$, so by the argument above the differential is also $I_n$. Alternatively, computing the partials at any point gives

$$D=\begin{pmatrix} \frac{\partial g_1}{\partial x_1} & \cdots & \frac{\partial g_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial g_n}{\partial x_1} & \cdots & \frac{\partial g_n}{\partial x_n} \\ \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix} = I_n$$


Part of the confusion arises from the fact that in the one-dimensional case $f:\mathbb{R}\to\mathbb{R}$, the derivative (or differential) plays two roles. At each point $x$ it is the linear transformation $d$ that satisfies $\lim_{h\to 0}\frac{f(x+h)-f(x)-d\cdot h}{h}=0$. Since here $d$ is just a $1\times 1$ matrix, i.e. a real number, the assignment $x\mapsto f'(x)$ is also a function $\mathbb{R}\to\mathbb{R}$. The analogous statement for $f:\mathbb{R}^n\to\mathbb{R}^n$ is that the differential is an $n\times n$ matrix at each point, so $f'$ is a function $\mathbb{R}^n\to\mathbb{R}^{n^2}$. For linear maps $f$, $f'$ is a constant function whose constant value is the matrix representing $f$.
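
As a worked illustration, take the function $f(x,y)=5+3x-4y$ raised in the comments. It is affine rather than linear (a linear part plus the constant $5$), but the same computation applies: its matrix of partial derivatives is $\begin{bmatrix}3&-4\end{bmatrix}$ at every point. A sketch of the tangent approximation at $(1,3)$, where $f(1,3)=-4$:

$$z = f(1,3) + \begin{bmatrix}3&-4\end{bmatrix}\begin{bmatrix}x-1\\y-3\end{bmatrix} = -4 + (3x-3-4y+12) = 3x-4y+5$$

So the approximation at $(1,3)$ reproduces $f$ exactly. The $9$ in $z-(-4)=3x-4y+9$ becomes $5$ once the $-4$ is moved to the right-hand side.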

  • 0
    Thank you very much for your great explanation. Now it is clear to me how to find the derivative of a linear map at any point in higher dimensions. 2017-01-06