24

In Leibniz notation, the second derivative is written as $$\dfrac{\mathrm d^2y}{\mathrm dx^2}.$$

Why does the $2$ appear in different places in the numerator and the denominator?

  • 13
    Mostly to confuse you. (I am somewhat serious.)2011-03-05
  • 9
    It's somewhat logical that squaring the operator $\frac{d}{dx}$ results in $\frac{d^2}{dx^2}$ though... now let it operate on $y$.2011-03-05
  • 0
    @Pete L. Clark: Really? I mean, there are ways of making the notation rigorous...2011-03-05
  • 5
    @Jesse: not *really* really. I agree that the notation is sensible (as evidenced by the answers below). But from a pedagogical perspective, I have found that explaining it is often more trouble than it's worth: it's tempting to switch to $y''$ instead. *Really* really the entire Leibniz notation, suggesting a ratio of differentials, is somewhat unfortunate. You could make it rigorous using (e.g.) nonstandard analysis, but the average student -- even if she is a future mathematician -- is not going to see or use that perspective.2011-03-05
  • 0
    @Jesse: the same remarks apply to your (nice) explanation: try it out on a calculus student, or even a typical undergrad math major, and see what kind of response you get.2011-03-05
  • 0
    @Pete: For calculus students, sure, I can imagine that explaining the notation probably is an irritation. And while the average mathematician may not ever use non-standard analysis, surely she would encounter Hessians and Frechet derivatives and such at some point.2011-03-05
  • 0
    @Pete: Ah, sorry, I type slowly and missed your response. Yes, I suppose you're right... I guess it's simply my hope that the Frechet derivative perspective is given its fair time at some point, at least in upper-division math courses.2011-03-05
  • 0
    @Pete I find your perspective of the notation "unfortunate" interesting considering that most of the scholars after Newton and Leibniz used Leibniz's notation much more often.2011-08-21
  • 0
    @Doug: rereading these comments, I find I was being unusually flippant. More seriously: of course the Leibniz notation is used by almost everyone at times, including me. What is "unfortunate" is the sort of yes/no answer that you get (or give!) when asked whether $\frac{dy}{dx}$ is actually a ratio of two quantities.2011-08-21

5 Answers

25

Somewhat mundanely,

$$ \frac{d}{dx}\left(\frac{d}{dx}(y)\right) = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d\,dy}{dx\,dx} = \frac{d^2 y}{dx^2} $$

  • 6
    Wait, I am confused with the denominator $dx*dx=dx^2$ how is this possible?2015-10-14
  • 2
    @KennyGuy maybe they think of $dx$ as *one* variable (just like $x$), which makes $dx*dx=dx^2$?2017-05-26
  • 3
    This answer doesn't explain anything because the last jump in equality doesn't follow the normal multiplication rules. I don't think anybody asking the original question would be helped by this. Even if I read other answers and accept dx as a constant (in which case at least putting dx in parentheses before squaring would help), why not dy? Why can you separately square just the d?2017-12-07
  • 0
    @JosephGarvin: Are you trying to understand, or just trying to be contrarian? I ask because your tone is particularly combative. The $d$ on top is not the same as the $d$ in the bottom. The $d$ on top is an operator, whereas the $d$ in the bottom is just half of the symbol "$dx$". So "$dx^2$" is interpreted as $(dx)(dx)$; just like $\sin^2$ does not mean $s$ times $i$ times $n$ squared, here $dx\,dx$ does not mean "$d$ times $x$ times $d$ times $x$"; it means `dee ex`, a single thing; and $dx^2$ does not mean "$d$ times $x$ times $x$", it means `object dee ex`, squared.2017-12-07
  • 1
    @ArturoMagidin: Two things: 1) if an explanation the length of your comment is needed to, let's say, "justify" the final equality, then that should be part of the answer; and 2) if the $d$ on the top is different to the $d$ on the bottom, then they shouldn't look the same, i.e., the real answer is that the notation is, let's say, "not optimal".2017-12-07
  • 2
    @WillR Not my answer; and the length of the comment is also related to the length of the misunderstanding in the question being asked, so... no. In fact, the notation is very useful and very flexible and indicative; for example, it immediately tells you the correct units to use for the $n$th derivative (units of $y$ divided by (units of $x$)${}^n$). So... no. The real answer is that it needs proper interpretation. Just like $\sin^2(x)$ is correctly interpreted as $(\sin(x))^2$, and not as a product of $s$, $i$, $n^2$, and $x$; or, for that matter, $\sin(n)$, where the two $n$s are different.2017-12-07
22

Purely symbolically, if we accept that $dy = f'(x)\,dx$, and treat $dx$ as a constant, then $$d^2y = d(dy) = d(f'(x)\,dx) = dx\,d(f'(x)) = dx\,f''(x)\,dx = f''(x)\,(dx)^2,$$ so dividing yields: $$\frac{d^2y}{(dx)^2} = \frac{d^2y}{dx^2} = f''(x).$$
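As a rough numerical sanity check of the symbolic computation above (a sketch with hypothetical names, not part of the original argument), one can replace $dx$ by a small fixed step and $d$ by "take the difference". The second difference of $y$, divided by the square of the step, then approximates $f''(x)$:

```python
def second_difference_quotient(f, x, dx):
    """Approximate f''(x) as (second difference of y) / (dx squared).

    dx is held constant, mirroring the "treat dx as a constant" assumption.
    """
    dy_here = f(x + dx) - f(x)           # dy, measured at x
    dy_next = f(x + 2 * dx) - f(x + dx)  # dy, measured one step later
    d2y = dy_next - dy_here              # d(dy): the change in the change in y
    return d2y / (dx * dx)               # divide by (dx)^2


def cube(x):
    # Illustrative choice of f; here f''(x) = 6x.
    return x ** 3


approx = second_difference_quotient(cube, 2.0, 1e-3)
print(approx)  # close to f''(2) = 12; the gap shrinks as dx shrinks
```

Note that the numerator applies the difference operation twice to $y$ alone, while the denominator is one number, the step, multiplied by itself. That is the asymmetry the notation $\frac{d^2y}{dx^2}$ records.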

As to where this notation actually comes from, though: My guess is that it comes from a time when mathematicians primarily thought of $dx$ and $dy$ as "infinitesimal quantities." There are ways of doing so rigorously (via non-standard analysis), and perhaps there is a way of making this notation rigorous that way.


However, we can still give rigorous meaning to these calculations without appealing to non-standard analysis by using the language of bilinear forms.

If $f$ is differentiable, we can define a map \begin{align*} df\colon \mathbb{R} & \to L(\mathbb{R}; \mathbb{R}) \\ df(x)(dx) & = f'(x)\,dx. \end{align*} Here, $L(\mathbb{R};\mathbb{R})$ denotes the set of linear maps from $\mathbb{R} \to \mathbb{R}$, and $dx$ is simply a real number. Going one step further, we can consider the map $$d^2f = d(df)\colon \mathbb{R} \to L(\mathbb{R};L(\mathbb{R};\mathbb{R})).$$ By identifying $L(\mathbb{R}; L(\mathbb{R}; \mathbb{R}))$ with the set of bilinear maps $B(\mathbb{R} \times \mathbb{R};\mathbb{R})$, we have the bilinear map $$d^2f(x)(dx^1, dx^2) = dx^1\, f''(x) \,dx^2$$ whose associated quadratic form is $$d^2f(x)(dx) = f''(x)\,(dx)^2.$$ It is now perfectly legal to divide on both sides by $(dx)^2$, obtaining $$\frac{d^2f}{dx^2} = f''(x).$$
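As a concrete instance (a worked example added for illustration, with $f$ chosen arbitrarily), take $f(x) = x^3$. Then $$df(x)(dx) = 3x^2\,dx, \qquad d^2f(x)(dx^1, dx^2) = dx^1\,(6x)\,dx^2,$$ so the associated quadratic form is $d^2f(x)(dx) = 6x\,(dx)^2$, and dividing by $(dx)^2$ recovers $$\frac{d^2f}{dx^2} = 6x = f''(x).$$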

  • 0
    If dx and dy referred to infinitesimal quantities I would expect both dx and dy to be treated as a unit. But only dx is, and I'm still left with no idea what it would mean to write $d$ or $d^2$ by itself. Why do the Ds in dy square but not the Ds in dx? If I accept your definition of a map named df, what is the definition of the function/map you're referring to as just $d$ within $d(df)$? Or are you defining a map whose whole name is $d(df)$?2017-12-07
  • 4
    $d$ on top is an operator; $dx$ on the bottom is a variable.2017-12-07
  • 0
    @JosephGarvin: Everything _before_ the "However, we can still..." is purely symbolic; I haven't made any rigorous claims in that top section. So let's talk instead about the stuff _after_ the "However...." which is rigorous. As Arturo says, I'm using $dx$ (and $dx^1$, $dx^2$) as an ordinary real number (so, a variable), whereas $d$ itself is an operator. See, for instance, the Wikipedia article on [Frechet derivatives](https://en.wikipedia.org/wiki/Fr%C3%A9chet_derivative#Higher_derivatives), but note that the article uses $D$ instead of $d$.2017-12-07
  • 0
    Is $dy$ an application of the $d$ operator to $y$? If so wouldn't it be more consistent (and sane) to write $d(d(y))$ rather than $d(dy)$? With this notation I'm rolling the dice every time trying to figure out what is a one/two letter real/operator.2017-12-09
  • 0
    I think the lower half has the same problem. You write a definition for an operator -- it's unclear if you are defining an operator called $df$ or if you are defining an operator called $d$ and naming the function it takes as an argument $f$. You write $d(df)$ -- if I assume the former (you were defining $d$) then it would be clearer to write $d(d(f))$ -- if I assume the latter, then you have defined $df$ without defining $d$.2017-12-10
  • 1
    @JosephGarvin: I'm using $df$ as shorthand for $d(f)$, and using $d(df)$ for $d(d(f))$. The operator $d$ inputs functions $f \colon \mathbb{R} \to V$ (where $V$ is a real Banach space) and outputs a certain function called $d(f) \colon \mathbb{R} \to L(\mathbb{R}; V)$. Defining the operator $d$ means defining $d(f)$ for all $f$. Conversely, defining $d(f)$ for every $f$ defines the operator $d$. The definition when $V = \mathbb{R}$ is as in my post. For general $V$, see the [Wikipedia article](https://en.wikipedia.org/wiki/Fr%C3%A9chet_derivative#Higher_derivatives) I keep referencing.2017-12-10
6

The $d$ is meant to represent the "change in". And the Leibniz notation is meant to remind you that you are computing the ratio between the change in $y$ and the change in $x$.

When you take the second derivative, you are computing how the derivative is changing as $x$ changes; that is, you are trying to compute $$\frac{d(y')}{dx}.$$ Now, $y'$ is itself a rate of change: it is the rate at which $y$ changes. So the "numerator" of the differential notation is telling you that you are trying to consider the change in the change in $y$, not the change in $y^2$ (which is what "$dy^2$" would represent).

So you are trying to describe the change in "the-change-in-$y$", relative to how $x$ is changing. $x$ is only changing "once", so you should have a single $d$ in the "denominator" (remember, not really a denominator). So why $x^2$? Because you are trying to figure out the change of blah as $x$ changes, and blah is a rate of change as $x$ changes as well. So you are taking $x$ twice, but considering only one change. Hence, single $d$, but $x$ squared.

  • 1
    Doesn't your explanation for $d^2y$ undermine your explanation for $dx^2$? If $dy/dx$ refers to the ratio of change in y to change in x, and I accept $d^2y$ as meaning the change in the change in y, one would still expect by your rationale for $d^2y/dx^2$ to refer to the ratio of the change in the change in y to the change in $x^2$. Also no idea what blah is meant to represent.2017-12-07
  • 0
    @JosephGarvin: No, it doesn't; "blah" in this case is whatever it is that $y'$ is measuring. If $y$ is position, and $x$ is time, with $y'$ you are trying to figure out the rate of change of position over time; if you are trying to figure out the second derivative, you are trying to figure out the rate of change of the rate of change, over time squared (which is why if position is measured in miles and time in hours, velocity is measured in miles per hour, but acceleration, the rate of change of velocity, is measured in miles per hour **squared**). `blah` in that example is `velocity`.2017-12-07
1

There is no possible way of understanding why Leibniz invented the notation he did unless you think about calculus the way Leibniz did, using infinitesimal numbers.

Take the velocity $dx/dt$. Leibniz would have described it as the ratio of two infinitesimals. (Nonstandard analysis shows that this idea can be made rigorous, but in any case limits didn't exist in Leibniz's time.) The numerator is an infinitesimal number with units of meters. The denominator is an infinitesimal with units of seconds. You divide them, and it gives m/s.

In the acceleration, $d^2x/dt^2$, the numerator is written to suggest something with units of meters, and the denominator to suggest units of seconds squared, giving the correct units of m/s$^2$.
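For example, assume (purely for illustration) that the position is $x(t) = \tfrac12 g t^2$, with $x$ in meters, $t$ in seconds, and $g$ a constant in m/s$^2$. Then $$\frac{dx}{dt} = g\,t \quad [\text{m/s}], \qquad \frac{d^2x}{dt^2} = g \quad [\text{m/s}^2].$$ The $d^2x$ on top is a difference of differences of positions, so it still carries meters, while the $dt^2$ on the bottom is an infinitesimal time multiplied by itself, so it carries seconds squared.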

1

Glossing over a few issues for clarity,

If I wanted you to differentiate, say $3x^4$ twice, I could ask the question in a variety of ways such as,

1) Find the second derivative of $3x^4$

2) If $f(x)=3x^4$, find $f''(x)$

3) If $y=3x^4$ find $\frac{d^2y}{dx^2}$

The last is an accepted lazy corruption of the more technically correct:

Find $$ \left(\frac{d}{dx}\right)^2 y$$ or find $$ \left(\frac{d}{dx}\right)^2 (3x^4).$$ So, essentially, you have spotted that mathematicians are quite lazy when they can get away with it!
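To make the "apply $\frac{d}{dx}$ twice" reading concrete, here is a minimal sketch in Python. The coefficient-list representation and the function name `differentiate` are choices made for this illustration only. The operator is written once and then applied two times to $3x^4$:

```python
def differentiate(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...],
    meaning c0 + c1*x + c2*x**2 + ...

    Each term c_k * x**k becomes k * c_k * x**(k-1); the constant term drops out.
    """
    return [k * c for k, c in enumerate(coeffs)][1:]


y = [0, 0, 0, 0, 3]        # 3x^4
dy = differentiate(y)      # first application of d/dx:  12x^3
d2y = differentiate(dy)    # second application of d/dx: 36x^2

print(dy)   # [0, 0, 0, 12]
print(d2y)  # [0, 0, 36]
```

The "squared" in $\left(\frac{d}{dx}\right)^2$ corresponds to calling `differentiate` twice, not to squaring any number.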

Although we write $\frac{d}{dx}$ this isn't a fraction in the sense that $\frac{2}{5}$ is. Perhaps best to park that thought for now, although maybe 'expanding the brackets' of $\big(\frac{d}{dx}\big)^2$ as $\frac{d^2}{dx^2}$ rather than $\frac{d^2}{(dx)^2}$ is a reminder of that.

All of this does make it harder for beginners to make sense of the notation. There are many, many more examples I could give of such 'lazy corruption'. I think of it as being like learning any foreign language, where there are always all sorts of quirks and customs that break a general rule.

Once you understand the 'lazy corruption' in the context of its surroundings, the meaning is, more often than not, actually, perfectly clear.

This answer was given when this question was asked again in March 2019, here: Why $x^2$ in $\frac{d^2y}{dx^2}$?