27

In Leibniz notation, the 2nd derivative is written as $\dfrac{\mathrm d^2y}{\mathrm dx^2}$.

Why is the location of the $2$ in different places in the $\mathrm dy/\mathrm dx$ terms?

  • 0
    @Doug: rereading these comments, I find I was being unusually flippant. More seriously: of course the Leibniz notation is used by almost everyone at times, including me. What is "unfortunate" is the sort of yes/no answer that you get (or give!) when asked whether $\frac{dy}{dx}$ is actually a ratio of two quantities. 2011-08-21

5 Answers

25

Somewhat mundanely,

$ \frac{d}{dx}\left(\frac{d}{dx}(y)\right) = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d\,dy}{dx\,dx} = \frac{d^2 y}{dx^2} $
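
For a concrete instance (an illustrative choice of $y$, not part of the original answer), take $y = x^3$:

$ \frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{d}{dx}(x^3)\right) = \frac{d}{dx}\left(3x^2\right) = 6x $

The $2$ on the $d$ counts how many times the operator $\frac{d}{dx}$ was applied, while $dx^2$ is shorthand for $(dx)^2$, one $dx$ from each application.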

  • 2
    @WillR Not my answer; and the length of the comment is also related to the length of the misunderstanding in the question being asked, so... no. In fact, the notation is very useful and very flexible and indicative; for example, it immediately tells you the correct units to use for the $n$th derivative (units of $y$ divided by (units of $x$)${}^n$). So... no. The real answer is that it needs proper interpretation. Just like $\sin^2(x)$ is correctly interpreted as $(\sin(x))^2$, and not as a product of $s$, $i$, $n^2$, and $x$; or, for that matter, $\sin(n)$, where the two $n$s are different. 2017-12-07
22

Purely symbolically, if we accept that $dy = f'(x)\,dx$, and treat $dx$ as a constant, then $d^2y = d(dy) = d(f'(x)\,dx) = dx\,d(f'(x)) = dx\,f''(x)\,dx = f''(x)\,(dx)^2,$ so dividing yields: $\frac{d^2y}{(dx)^2} = \frac{d^2y}{dx^2} = f''(x).$
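
The same bookkeeping can be checked numerically. The following is a minimal sketch, assuming a small but finite step `dx` stands in for the infinitesimal; the function name `second_difference_quotient` and the choice of $f = \sin$ are illustrative and not from the answer. It forms $d(dy)$ as a difference of two consecutive differences with `dx` held constant, then divides by $(dx)^2$.

```python
import math

def second_difference_quotient(f, x, dx):
    """Approximate f''(x) as d(dy) / (dx)^2 using forward differences with a fixed step dx."""
    dy_here = f(x + dx) - f(x)            # dy evaluated at x
    dy_next = f(x + 2 * dx) - f(x + dx)   # dy evaluated at x + dx
    d2y = dy_next - dy_here               # d(dy): the change in the change in y
    return d2y / (dx * dx)                # divide by (dx)^2, not by d(x^2)

# Assumed test case: f = sin, whose second derivative is -sin.
approx = second_difference_quotient(math.sin, 1.0, 1e-4)
exact = -math.sin(1.0)
print(approx, exact)
```

With a step this small the two printed values should agree to roughly three or four decimal places; the residual error comes from `dx` being finite rather than infinitesimal.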

As to where this notation actually comes from, though: My guess is that it comes from a time when mathematicians primarily thought of $dx$ and $dy$ as "infinitesimal quantities." There are ways of doing so rigorously (via non-standard analysis), and perhaps there is a way of making this notation rigorous that way.


However, we can still give rigorous meaning to these calculations without appealing to non-standard analysis by using the language of bilinear forms.

If $f$ is differentiable, we can define a map \begin{align*} df\colon \mathbb{R} & \to L(\mathbb{R}; \mathbb{R}) \\ df(x)(dx) & = f'(x)\,dx. \end{align*} Here, $L(\mathbb{R};\mathbb{R})$ denotes the set of linear maps from $\mathbb{R} \to \mathbb{R}$, and $dx$ is simply a real number. Going one step further, we can consider the map $d^2f = d(df)\colon \mathbb{R} \to L(\mathbb{R};L(\mathbb{R};\mathbb{R})).$ By identifying $L(\mathbb{R}; L(\mathbb{R}; \mathbb{R}))$ with the set of bilinear maps $B(\mathbb{R} \times \mathbb{R};\mathbb{R})$, we have the bilinear map $d^2f(x)(dx^1, dx^2) = dx^1\, f''(x) \,dx^2$ whose associated quadratic form is $d^2f(x)(dx) = f''(x)\,(dx)^2.$ It is now perfectly legal to divide on both sides by $(dx)^2$, obtaining $\frac{d^2f}{dx^2} = f''(x).$

  • 1
    @JosephGarvin: I'm using $df$ as shorthand for $d(f)$, and using $d(df)$ for $d(d(f))$. The operator $d$ inputs functions $f \colon \mathbb{R} \to V$ (where $V$ is a real Banach space) and outputs a certain function called $d(f) \colon \mathbb{R} \to L(\mathbb{R}; V)$. Defining the operator $d$ means defining $d(f)$ for all $f$. Conversely, defining $d(f)$ for every $f$ defines the operator $d$. The definition when $V = \mathbb{R}$ is as in my post. For general $V$, see the [Wikipedia article](https://en.wikipedia.org/wiki/Fr%C3%A9chet_derivative#Higher_derivatives) I keep referencing. 2017-12-10
6

The $d$ is meant to represent the "change in". And the Leibniz notation is meant to remind you that you are computing the ratio between the change in $y$ and the change in $x$.

When you take the second derivative, you are computing how the derivative is changing as $x$ changes; that is, you are trying to compute $\frac{d(y')}{dx}.$ Now, $y'$ is itself a rate of change: it is the rate at which $y$ changes. So the "numerator" of the differential notation is telling you that you are trying to consider the change in the change in $y$, not the change in $y^2$ (which is what "$dy^2$" would represent).

So you are trying to describe the change in "the-change-in-$y$", relative to how $x$ is changing. $x$ is only changing "once", so you should have a single $d$ in the "denominator" (remember, not really a denominator). So why $x^2$? Because you are trying to figure out the change of blah as $x$ changes, and blah is a rate of change as $x$ changes as well. So you are taking $x$ twice, but considering only one change. Hence, single $d$, but $x$ squared.

  • 0
    @JosephGarvin: No, it doesn't; "blah" in this case is whatever it is that $y'$ is measuring. If $y$ is position, and $x$ is time, with $y'$ you are trying to figure out the rate of change of position over time; if you are trying to figure out the second derivative, you are trying to figure out the rate of change of the rate of change, over time squared (which is why if position is measured in miles and time in hours, velocity is measured in miles per hour, but acceleration, the rate of change of velocity, is measured in miles per hour **squared**). `blah` in that example is `velocity`. 2017-12-07
1

There is no possible way of understanding why Leibniz invented the notation he did unless you think about calculus the way Leibniz did, using infinitesimal numbers.

Take the velocity $dx/dt$. Leibniz would have described it as the ratio of two infinitesimals. (Nonstandard analysis shows that this idea can be made rigorous, but in any case the rigorous notion of a limit had not yet been developed in Leibniz's time.) The numerator is an infinitesimal number with units of meters. The denominator is an infinitesimal with units of seconds. You divide them, and it gives m/s.

In the acceleration, $d^2x/dt^2$, the numerator is written to suggest something with units of meters, and the denominator to suggest units of seconds squared, giving the correct units of m/s$^2$.
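
As a concrete check of the units (a hypothetical example, not from the answer above), take an object falling from rest, $x = \tfrac12 g t^2$, with $x$ in meters, $t$ in seconds, and $g$ in m/s$^2$:

$ \frac{dx}{dt} = g\,t \quad (\text{m/s}), \qquad \frac{d^2x}{dt^2} = \frac{d}{dt}(g\,t) = g \quad (\text{m/s}^2) $

Each application of $\frac{d}{dt}$ contributes one factor of seconds to the denominator, which is what the exponent on $dt$ records.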