267

In algebra, all quadratic problems can be solved by using the quadratic formula. I read a couple of books, and they told me only HOW and WHEN to use this formula, but they don't tell me WHY I can use it. I have tried to figure it out by proving these two equations are equal, but I can't.

Why can I use $x = \dfrac{-b\pm \sqrt{b^{2} - 4 ac}}{2a}$ to solve all quadratic equations?

  • 1
    Because it is derived from the general form $ax^2 + bx + c =0$. Maybe try substituting this into the equation? 2016-12-29

22 Answers

506

I would like to prove the Quadratic Formula in a cleaner way. Perhaps if teachers see this approach they will be less reluctant to prove the Quadratic Formula.

Added: I have recently learned from the book Sources in the Development of Mathematics: Series and Products from the Fifteenth to the Twenty-first Century (Ranjan Roy) that the method described below was used by the ninth century mathematician Sridhara. (I highly recommend Roy's book, which is much broader in its coverage than the title would suggest.)

We want to solve the equation $ax^2+bx+c=0,$ where $a \ne 0$. The usual argument starts by dividing by $a$. That is a strategic error: division is ugly, and it produces formulas that are unpleasant to typeset.

Instead, multiply both sides by $4a$. We obtain the equivalent equation $4a^2x^2 +4abx+4ac=0.\tag{1}$ Note that $4a^2x^2+4abx$ is almost the square of $2ax+b$. More precisely, $4a^2x^2+4abx=(2ax+b)^2-b^2.$ So our equation can be rewritten as $(2ax+b)^2 -b^2+4ac=0 \tag{2}$ or equivalently $(2ax+b)^2=b^2-4ac. \tag{3}$ Now it's all over. We find that $2ax+b=\pm\sqrt{b^2-4ac} \tag{4}$ and therefore $x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}. \tag{5}$
No fractions until the very end!
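The steps (1) to (5) above can be traced numerically. The following is only a minimal sketch: the function name `solve_quadratic_4a` is hypothetical, and `cmath.sqrt` is used on the assumption that complex roots are acceptable when $b^2-4ac<0$.

```python
import cmath

def solve_quadratic_4a(a, b, c):
    """Solve a*x^2 + b*x + c = 0 by following steps (1)-(5): multiply by 4a first."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    # Step (3): (2ax + b)^2 = b^2 - 4ac
    rhs = b * b - 4 * a * c
    # Step (4): 2ax + b = +/- sqrt(b^2 - 4ac); cmath copes with a negative right-hand side
    root = cmath.sqrt(rhs)
    # Step (5): the only division happens here, at the very end
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

# 2x^2 - 7x + 3 = (2x - 1)(x - 3), so the roots should be 3 and 1/2
x1, x2 = solve_quadratic_4a(2, -7, 3)
print(x1, x2)
```

Plugging each returned value back into $ax^2+bx+c$ is a quick sanity check that the derivation lost no solutions.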

Added: I have tried to show that initial division by $a$, when followed by a completing the square procedure, is not the simplest strategy. One might remark additionally that if we first divide by $a$, we end up needing a couple of additional "algebra" steps to partly undo the division in order to give the solutions their traditional form.

Division by $a$ is definitely the right beginning if it is followed by an argument that develops the connection between the coefficients and the sum and product of the roots. Ideally, each type of proof should be presented, since each connects to an important family of ideas. And a twice proved theorem is twice as true.

73

Here is a slightly less ad-hoc approach to deriving the formula.

You look at the polynomial $ax^2+bx+c$ and you think of it as being composed of two kinds of indeterminates: coefficients $a$, $b$, $c$, and variable $x$. What you wish to do is this: if $ax^2+bx+c=a(x-r_1)(x-r_2)$, you want to find an expression for $r_1$ and $r_2$ in terms of $a,b,c$ involving only the operations $+,-,\times,\div$ and $\sqrt[n]{}$.

But how are $r_1$ and $r_2$ related to $a,b$ and $c$? If you look at the expression $ax^2+bx+c=a(x-r_1)(x-r_2)$, it is easy to compute that $b=-a(r_1+r_2)$ and $c=ar_1r_2$.

Intuitively, because you know that $(r_1+r_2)=-\frac ba$, determining $r_1$ and $r_2$ is the same as determining $(r_1-r_2)$. Let $E=(r_1-r_2)$ and note that $2r_1=(r_1+r_2)+(r_1-r_2)=-\frac ba+E$ and $2r_2=(r_1+r_2)-(r_1-r_2)=-\frac ba-E$, so we already have most of our quadratic formula: $r_1,r_2=\frac{-b}{2a}\pm\frac{E}2$

All we need to do, then, is express $E=(r_1-r_2)$ in terms of $a,b,c$ using $+,-,\times,\div,\sqrt[n]{}$. In order to do this, we need to take a small detour to see what expressions in $+,-,\times,\div$ and $a,b,c$ could possibly be.

Note that the coefficients $b=-a(r_1+r_2)$ and $c=ar_1r_2$ are symmetric functions in $r_1$ and $r_2$ in the sense that if you exchange $r_1$ and $r_2$, the values of $b$ and $c$ do not change. Furthermore, $b$ and $c$ are in fact scalar multiples of the so-called elementary symmetric functions, which have the property that any symmetric function (in $2$ variables) can be expressed uniquely as a polynomial (quotient of polynomials for our purposes) in them.

In particular, we can "symmetrize" the quantity $E=(r_1-r_2)$ to obtain the discriminant $D=(r_1-r_2)^2$ which is in some sense "the smallest" symmetric function of $r_1$ and $r_2$ that becomes 0 if $r_1=r_2$. Technically, though, the above is the discriminant only when $a=1$ because our coefficients $b$ and $c$ are elementary symmetric functions scaled by $a$, so we define the general discriminant to be $D=a^2(r_1-r_2)^2$. Because $D$ is symmetric and $b$ and $c$ are (up to a multiplicative factor) elementary symmetric, we should be able to express $D$ as a polynomial in $b$ and $c$.

We do so in a somewhat ad-hoc manner (though there are algorithms that will do this procedurally): $D=a^2(r_1-r_2)^2$ so $D=a^2(r_1^2-2r_1r_2+r_2^2)$ hence $D=a^2(r_1^2+2r_1r_2+r_2^2-4r_1r_2)$ and finally $D=a^2(r_1+r_2)^2-a^24r_1r_2$ giving us $D=b^2-4ac$

Evidently, now we have that $\sqrt{D}=a(r_1-r_2)=aE$ and so $E=\frac{\sqrt{D}}a$. This allows us to rewrite our formula so far to get from $r_1,r_2=\frac{-b}{2a}\pm\frac{E}{2}$ to $r_1,r_2=\frac{-b}{2a}\pm\frac{\sqrt{D}}{2a}$ and finally $r_1,r_2=\frac{-b\pm\sqrt{b^2-4ac}}{2a}$
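The relation $D=a^2(r_1-r_2)^2=b^2-4ac$ can be spot-checked with exact rational arithmetic. This is a sketch only: the helper `coefficients_from_roots` and the sample values are illustrative assumptions, not part of the argument above.

```python
from fractions import Fraction

def coefficients_from_roots(a, r1, r2):
    """Return (b, c) such that a*x^2 + b*x + c = a*(x - r1)*(x - r2)."""
    return -a * (r1 + r2), a * r1 * r2

a, r1, r2 = Fraction(3), Fraction(5, 2), Fraction(-1, 3)
b, c = coefficients_from_roots(a, r1, r2)

D_from_roots = a**2 * (r1 - r2) ** 2      # D = a^2 (r1 - r2)^2
D_from_coeffs = b**2 - 4 * a * c          # D = b^2 - 4ac
assert D_from_roots == D_from_coeffs

# Swapping the roots leaves b and c (and hence D) unchanged,
# while E = r1 - r2 changes sign: E is anti-symmetric, D is symmetric.
assert coefficients_from_roots(a, r2, r1) == (b, c)
assert (r2 - r1) == -(r1 - r2)
```

Using `Fraction` instead of floats keeps the comparison exact, so the equality is a genuine identity check for these inputs and not an approximation.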


The only strange question is: why did we only have to take one square root in order to get the formula, i.e. why did the quantity $E=(r_1-r_2)$ turn out to be a square root of a nice polynomial in $a,b,c$? That is where modern Galois theory comes in.

What's really happening is this: the first four operations ($+,-,\times,\div$) suggest that you think of the coefficients as living in the field $F$ (a set of expressions such that adding, subtracting, multiplying, or dividing any two of them gives another expression in the set) consisting of $\{\dfrac {p(a,b,c)}{q(a,b,c)}\}$ where $p$ and $q$ are polynomials in three variables (and rational coefficients). Then $r_1$ and $r_2$ will generate an extension field $E$ of $F$, that is, the smallest field $E$ that contains $F$ and also $r_1$ and $r_2$. Galois theory says that this extension field $E$ will be a $2!=2$-dimensional vector space over $F$ and hence a single square root will be sufficient to generate $E$. Thus we need an expression in the coefficients (symmetric expression in the roots) whose square root is an expression in the roots, but not symmetric, and a natural choice then is the most elementary anti-symmetric function known as the Vandermonde determinant which is precisely $(r_1-r_2)$ in this case (anti-symmetric means that swapping two variables flips the sign; obviously the square of an anti-symmetric function is a symmetric function).

For general polynomials, the extension field will be of higher dimension, and so you will need to take possibly several roots of different orders. Galois theory allows us to compute what these roots ought to be and in what order (giving us the cubic and quartic formulas in a way that is not ad-hoc at all), and also shows that the general degree $5$ and above polynomial does not have a formula involving only $+,-,\times,\div,\sqrt[n]{}$. (Some people feel frightened by this, because taking roots should invert the raising of powers, but this is not the case because the order of operations matters...) Now, if the coefficients of the higher degree polynomial satisfy some additional relations (i.e. are not completely independent from each other), then Galois theory also gives procedures for computing formulas for those cases and also for determining what such relations ought to be.

  • 3
    @VladimirSotirov I know who Galois is but unfortunately I've never learnt his theory yet. Maybe you're right, but still it's hard for me to understand. I'm sorry for that, I'm just an 8th grader. 2014-04-20
56

Probably the easiest way to understand where the quadratic formula comes from is by 'completing the square': solving equations of the form '$x^2$=whatever' is easy, so let's see if we can put our quadratic equation ($ax^2+bx+c=0$) in that form.

The first thing to do is divide by $a$; of course this doesn't work if $a=0$, but if that's the case our equation wasn't quadratic in the first place! This gives us $x^2+{b\over a}x+{c\over a}=0$. Now, that $b\over a$ term keeps us from having a clean square - but if we remember how to square a sum of two numbers - $(m+n)^2=m^2+2mn+n^2$ - then by substituting $x$ for $m$, we can see that our $n$ should be half of the linear coefficient: $(x+{b\over 2a})^2 = x^2+{b\over a}x + {b^2\over 4a^2}$. But now the constant term isn't right; we have to adjust it to make it $c\over a$. A correction of $({c\over a}-{b^2\over 4a^2})$ will do this; we get $(x+{b\over 2a})^2+({c\over a}-{b^2\over 4a^2}) = 0$.

But this is exactly what we wanted; we can move that second term over to the right and get $(x+{b\over 2a})^2 = {b^2\over 4a^2}-{c\over a}$. Getting the right-hand-side cleaned up a little bit makes it ${b^2-4ac\over 4a^2}$ - just multiply the numerator and denominator of $c\over a$ by $4a$ and combine terms. Now, we can go ahead and take the square root of both sides: $x+{b\over 2a} = \pm \sqrt{b^2-4ac\over 4a^2} = \pm {\sqrt{b^2-4ac}\over\sqrt{4a^2}} = {\pm\sqrt{b^2-4ac}\over 2a}$. The last step is to subtract $b\over 2a$ from both sides, finally giving the familiar: $x = {-b\pm\sqrt{b^2-4ac}\over 2a}$
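To see the completed square on concrete numbers, here is a small sketch using exact fractions. The function name `complete_the_square` and the sample coefficients are assumptions made for illustration.

```python
from fractions import Fraction

def complete_the_square(a, b, c):
    """Return (h, k) with x^2 + (b/a)x + c/a == (x + h)^2 - k, using exact fractions."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    h = b / (2 * a)                 # half of the linear coefficient b/a
    k = h * h - c / a               # equals (b^2 - 4ac) / (4a^2)
    return h, k

h, k = complete_the_square(2, -7, 3)
assert k == Fraction((-7) ** 2 - 4 * 2 * 3, 4 * 2 ** 2)

# Spot-check the identity x^2 + (b/a)x + c/a = (x + h)^2 - k at a few values of x.
for x in (Fraction(-2), Fraction(0), Fraction(1, 3), Fraction(5)):
    assert x * x + Fraction(-7, 2) * x + Fraction(3, 2) == (x + h) ** 2 - k
```

Once $h$ and $k$ are known, the equation reads $(x+h)^2=k$, which is the easy "square equals whatever" form the answer was aiming for.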

  • 3
    It is not necessary to think in terms of absolute values (or even in terms of real numbers). The point is that if $x^2 = y^2$, then $x = \pm y$. Therefore in algebra whenever you extract a square root you have precisely a $\pm$ ambiguity and should put a $\pm$ in front of it. But we already have a $\pm$ in front of the square root, so it is okay to take any one square root of $4a^2$, namely $2a$. Or, thinking in terms of real numbers and that $\sqrt{}$ always denotes the positive square root, $\pm |a| = \{a,-a\} = \pm a$. 2011-07-03
55

Proof without words.

completing the square

This one shows that $ax^2+bx+c=a\left(x+\dfrac b{2a}\right)^2+c-\dfrac{b^2}{4a}$ from which the quadratic formula can be easily derived.

Credits to LucasVB.

I hope this helps.
Best wishes, $\mathcal H$akim.

42

The other answers tell you where the formula "comes from" (namely, from completing the square). If you are just happy checking that the formula gives the correct solutions whatever $a$, $b$ and $c$, you may verify that the identity $ aX^2+bX+c=a\left(X-\frac{-b+\sqrt{b^2-4ac}}{2a}\right)\left(X-\frac{-b-\sqrt{b^2-4ac}}{2a}\right) $ holds for every $a$, $b$ and $c$.
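The identity can be spot-checked numerically by expanding the right-hand side and comparing coefficients. This is a sketch, not a proof: `expand_factored` is a hypothetical helper, the sample coefficients are arbitrary, and complex arithmetic via `cmath` is assumed so that cases with $4ac>b^2$ are covered as well.

```python
import cmath

def expand_factored(a, b, c):
    """Expand a*(X - r_plus)*(X - r_minus); return the coefficients of X^2, X, and 1."""
    s = cmath.sqrt(b * b - 4 * a * c)
    r_plus = (-b + s) / (2 * a)
    r_minus = (-b - s) / (2 * a)
    return a, -a * (r_plus + r_minus), a * r_plus * r_minus

# The second and fourth examples have 4ac > b^2, so their roots are complex.
for a, b, c in [(1, -3, 2), (2, 4, 5), (-1, 0, 7), (3, 1, 1)]:
    A, B, C = expand_factored(a, b, c)
    assert abs(A - a) < 1e-12 and abs(B - b) < 1e-12 and abs(C - c) < 1e-12
```

The comparison uses a small tolerance because the square root is computed in floating point.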

  • 0
    Yes. However, it is not immediately clear that this says something about the roots of the polynomial in the case $4ac>b^2$. 2016-12-10
39

Pre-note: Since the asker needs an insight, I'd present a non-rigorous proof/intuition.

Try working backwards!

Assuming that the quadratic formula holds true, $x = {-b \pm\sqrt{b^2 - 4ac} \over 2a}$. Now undo the isolation of $x$ step by step: $\begin{align}2ax &= -b \pm \sqrt{b^2 - 4ac} \\ 2ax + b &= \pm\sqrt{b^2 - 4ac} \\ (2ax + b)^2 &= b^2 - 4ac \\ 4a^2x^2 + 4abx + b^2 &= b^2 - 4ac \\ 4a^2x^2 + 4abx + 4ac &= 0 \\ ax^2 + bx + c &= 0 \end{align}$ Note that we divided both sides by $4a$ in the last step, assuming $a \ne 0$, which is what we have learnt all along: in the polynomial $a_nx^n + a_{n - 1}x^{n - 1} + \cdots + a_0x^0$, $a_n \ne 0$.

Also, if you see everything from bottom to top, you'd almost get André Nicolas' proof!

  • 0
    From a strict logic standpoint, if $p \implies q$ then, not necessarily, $q \implies p$ (except under certain conditions, which incidentally happen to be true in this case, but nonetheless must be explicitly stated). $p = $ "If $x = \alpha$ or $\beta$ then $ax^2+bx+c=0$" and $q=$ "If $ax^2+bx+c=0$ then $x=\alpha$ or $\beta$" 2015-09-18
29

First, let's examine an analogous simpler special case. Why does the difference of squares formula $\rm\: x^2 - a^2\ =\ (x-a)\ (x+a)\:$ always work? Well, let's consider the obvious proof. Expanding the RHS we obtain $\rm\ (x-a)\ (x+a)\ =\ x^2 - a\ x + x\ a - a^2\ $ which indeed equals $\rm x^2 - a^2\ $ as long as $\rm\ a\ x = x\ a\ $ for all $\rm\:x\:,\:$ i.e. as long as $\rm\:a\:$ commutes with all elements of the ring. Because the proof employed only the commutative law in addition to the standard ring axioms (most notably the distributive law) this difference of squares formula works in all commutative rings. However, generally it fails in noncommutative rings, e.g. rings involving difference or differential operators.

In fact, if we consider both $\rm\:x\:$ and $\rm\:a\:$ as indeterminates, then we can specialize the "generic" factorization $\rm\: x^2 - a^2 = (x-a)\ (x + a)\:$ in the ring $\rm\:\mathbb Z[x,a]\:$ to any commutative ring by using an evaluation homomorphism mapping $\rm\:x,a\:$ to specific values in the target ring (such an evaluation map always exists by the universal property of polynomial rings). So this formula is an identity of commutative rings, a formula universally true, i.e. true in every commutative ring. Many other well-known formulas and proofs are of this sort, e.g. the binomial theorem, resultant formulas which determine if polynomials have a common root, the Cayley-Hamilton theorem, etc.

A somewhat similar remark holds true for the well-known derivation of the quadratic formula (see e.g. André's answer here). However, it is not truly a universal formula for commutative rings because, in addition to the commutative ring axioms, we have invoked some special properties in its derivation. Namely, we have assumed that $\rm\,2a\,$ is invertible, and we have assumed that the discriminant has a square root in the ring. So the proof of the quadratic formula goes through in any commutative ring satisfying these two additional hypotheses. More technically we could carry out the proof generically in a ring where such elements exist, say $\rm\:e = 1/(2a),\ d^2 = b^2 - 4ac\:,\:$ so the proof works naturally in the ring $\rm\:\mathbb Z[a,b,c,d,e]/(2ae-1, d^2-b^2+4ac)\:.\:$ Therefore any invocation of the quadratic formula can be obtained simply by specializing the proof in this generic ring, just as we did above for the difference of squares formula.

Such "generic" or "universal" proofs can yield quite nontrivial results, e.g. one can "generically" algebraically cancel "apparent singularities" in one fell swoop, before evaluation - thus avoiding alternative dense topological arguments. For example, see this slick proof of Sylvester's determinant identity $\rm\ det\ (1+AB)=det\ (1+BA)\ $ that proceeds by universally cancelling $\rm\ det\ A\ $ from the $\rm\ det\ $ of $\rm\ \ (1+A\ B) A = A (1+B\ A)\,$ in $\,\rm\Bbb Z[a_{\,ij},b_{\,ij}]\,$ where the matrix entries are indeterminates $\rm\,a_{\,ij},b_{\,ij}.\,$ Such proofs exploit to the hilt the universal properties of formal polynomials (vs. less general polynomial functions - see here for much more on this distinction).

Remark $\ $ It's worth emphasizing that in general rings it is possible for quadratic equations to have more than $2$ roots, e.g. $\rm\:x^2 = 1\:$ has roots $\rm\:\pm1,\:\pm3\, \in\, \mathbb Z/8,\,$ the ring of integers modulo $8.\,$ Thus plugging one root of the discriminant into the quadratic formula doesn't necessarily yield all the roots of a quadratic. Such anomalies cannot occur in domains, i.e. rings without zero divisors, where $\rm\ xy = 0\ \Rightarrow\ x=0\ \ or\ \ y=0.\,$ Indeed, a polynomial $\rm\ f(x)\in D[x]\ $ has at most $\rm\ deg\ f\ $ roots in the ring $\rm\:D\ $ iff $\rm\ D\:$ is a domain. For a simple proof see this answer, where I illustrate it constructively in $\rm\ \mathbb Z/m\ $ by showing that, given any $\rm\:f(x)\:$ with more roots than its degree, we can quickly compute a nontrivial factor of $\rm\,m\,$ via a $\rm\:gcd.\,$ The quadratic case of this result is at the heart of many integer factorization algorithms, which attempt to factor $\rm\:m\:$ by searching for a nontrivial square root in $\rm\: \mathbb Z/m,\,$ e.g. a square root of $1$ that is not $\:\pm 1$.
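The remark about extra roots in rings with zero divisors is easy to check by brute force. The sketch below is illustrative only: `roots_mod` is a hypothetical helper, and the modulus $15$ in the second part is an arbitrary small example chosen to show the gcd trick.

```python
from math import gcd

def roots_mod(coeffs, m):
    """Brute-force all roots in Z/m of a polynomial given by its coefficients (highest degree first)."""
    def evaluate(x):
        value = 0
        for coefficient in coeffs:      # Horner's rule, reduced mod m at each step
            value = (value * x + coefficient) % m
        return value
    return [x for x in range(m) if evaluate(x) == 0]

# x^2 - 1 has four roots in Z/8, more than its degree.
print(roots_mod([1, 0, -1], 8))

# A square root of 1 other than +/-1 exposes a nontrivial factor of m via a gcd.
m = 15
for x in roots_mod([1, 0, -1], m):
    if x not in (1, m - 1):
        print(x, gcd(x - 1, m), gcd(x + 1, m))
```

In $\mathbb Z/8$ the roots found are $1,3,5,7$, that is $\pm1,\pm3$, matching the remark. For $m=15$ the nontrivial square roots $4$ and $11$ yield the factors $3$ and $5$.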

  • 12
    @Pete Tr$y$ as I may, I cannot even begin to imagine how you can seriously attempt to misconstrue said little mathematical pun as "denigrating the work of others". In any case, it is certainly your prerogative to withhold your upvote due to your misinterpretation of the pun. In the future could you please refrain from discussing such matters in comments to my answers. Comments are supposed to be about mathematics, not bi$z$arre misin$t$erpre$t$$a$tions of puns. 2011-07-04
18

The formula comes from the technique known as completing the square.

18

I translate from my 1968 Algebra book by Sebastião e Silva and Silva Paulo, because I really loved it, and still do.

Consider any quadratic equation

$ax^{2}+bx+c=0,\qquad (a\neq 0).\qquad (1)$

Multiplying both sides by $1/a$ we get the equivalent equation

$x^{2}+\frac{b}{a}x+\frac{c}{a}=0.$

We are going to show that it is possible to find $h$ and $\alpha $ such that

$x^{2}+\frac{b}{a}x+\frac{c}{a}=(x+h)^{2}-\alpha .$

Expanding the RHS gives

$x^{2}+\frac{b}{a}x+\frac{c}{a}=x^{2}+2hx+h^{2}-\alpha .$

This means, applying the method of undetermined coefficients, that

$\left\{ \begin{array}{l} \frac{b}{a}=2h \\ \frac{c}{a}=h^{2}-\alpha. \end{array} \right. $

Hence

$\left\{ \begin{array}{l}h=\frac{b}{2a} \\ \alpha=h^{2}-\frac{c}{a}=\frac{b^{2}}{4a^{2}}-\frac{c}{a}=\frac{b^{2}-4ac}{4a^{2}}.\end{array}\right. \qquad (2)$

In this way the given equation is reduced to the binomial equation in $x+h$

$\left( x+h\right) ^{2}-\alpha =0\qquad \text{equivalent to}\qquad \left( x+h\right) ^{2}=\alpha $

and thus it is satisfied when

$x+h=\sqrt{\alpha }\qquad \text{or}\qquad x+h=-\sqrt{\alpha },$

i.e. when

$x=-h+\sqrt{\alpha }\qquad \text{or}\qquad x=-h-\sqrt{\alpha }.$

Denoting the first value of $x$ by $x_{1}$ and the second by $x_{2}$ and replacing $h$ and $\alpha $ by their expressions given by $(2)$, yields

$x_{1}=\frac{-b+\sqrt{b^{2}-4ac}}{2a}\qquad \text{or}\qquad x_{2}=\frac{-b-\sqrt{b^{2}-4ac}}{2a}.\qquad (3)$

As can be seen, the method of completing the square was not mentioned anywhere. Rather, the method of undetermined coefficients was fully explained previously.

12

You could also prove it using Tschirnhaus Transformation.

Let $\,x=y-\frac{b}{2a}$.$\,$ Then $\,ax^2+bx+c=0\,$ becomes $\begin{align}ay^2-\frac{b^2}{4a}+c&=0\\ \iff y^2&=\frac{b^2-4ac}{4a^2}\\ \iff y&=\frac{\pm\sqrt{b^2-4ac}}{2a}\\ \iff x&=\frac{\pm\sqrt{b^2-4ac}}{2a}-\frac{b}{2a}\\ \iff x&=\frac{-b\pm\sqrt{b^2-4ac}}{2a}\end{align}$


In general, the substitution for $\, a_nx^n+a_{n-1}x^{n-1}+\cdots+a_1x+a_0\,$ is $\,x=y-\frac{a_{n-1}}{na_n}$,

which leaves the polynomial without a term of degree $n-1$.
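The general substitution can be checked mechanically by expanding each power of $y-\frac{a_{n-1}}{na_n}$ with the binomial theorem. The following is a sketch under stated assumptions: `depress` is a hypothetical name, coefficients are listed from highest degree to lowest, and exact `Fraction` arithmetic is used.

```python
from fractions import Fraction
from math import comb

def depress(coeffs):
    """Substitute x = y - a_{n-1}/(n*a_n); return the coefficients in y (highest degree first)."""
    n = len(coeffs) - 1
    shift = -Fraction(coeffs[1]) / (n * Fraction(coeffs[0]))
    result = [Fraction(0)] * (n + 1)
    for i, coefficient in enumerate(coeffs):
        degree = n - i
        # Expand coefficient * (y + shift)^degree with the binomial theorem.
        for j in range(degree + 1):
            result[n - j] += Fraction(coefficient) * comb(degree, j) * shift ** (degree - j)
    return result

# Quadratic 2x^2 - 7x + 3: the y-term vanishes and the constant is c - b^2/(4a).
print(depress([2, -7, 3]))
# Cubic x^3 + 6x^2 + 11x + 6: the y^2-term vanishes.
print(depress([1, 6, 11, 6]))
```

In both examples the second entry of the returned list, the coefficient of $y^{n-1}$, is zero, which is exactly what the substitution is designed to achieve.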

  • 3
    I am very confused about the downvotes. OP asked for a proof, and I gave a proof that no one else has given here yet. Andre Nicolas too simply proved it just by multiplying both sides by $4a$ and completing the square, and yet he got all the upvotes while I got downvoted to oblivion. 2015-02-19
11

Why? Because we let $a$, $b$, and $c$ be anything (usually real numbers).
So our result doesn't depend on the coefficients, only on the fact that we had a polynomial of degree two (i.e. Quadratic)

  • 9
    I thought that was implied by "degree two"! 2011-07-04
11

This question was raised over two years ago and already received a number of nice answers. Let me just mention a really elementary approach to the general quadratic equation (in $\mathbb{R}$ or $\mathbb{Q}$).

Step 1. The standard identity \begin{equation} (x + y)^2 = (x-y)^2 + 4xy \qquad \qquad (1) \end{equation}

Step 2. Given the area of a rectangle and the difference between its two sides, determine its length and width. The difference between the length $x$ and the width $y$ is $x - y$, and the area is $xy$. Now by (1), $x +y$ is the positive square root of $(x-y)^2 + 4xy$. Finally, $x = \frac{1}{2}\left((x+y) + (x-y)\right)$ and $y = \frac{1}{2}\left((x+y) - (x-y)\right)$.

Step 3. Consider the equation $ax^2 + bx + c = 0$ and suppose that $a \not= 0$. Dividing by $a$ yields $x^2 + \frac{b}{a}x +\frac{c}{a} = 0$ or $x(x + \frac{b}{a}) = -\frac{c}{a}$. Setting $y = x + \frac{b}{a}$, we get $xy= -\frac{c}{a}$ and $x - y = -\frac{b}{a}$. The analogy with Step 2 is clear and the solution is the same. Putting everything together leads to the quadratic formula.
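Steps 2 and 3 can be sketched in code as follows. The function names are hypothetical, and the sketch assumes real coefficients with $b^2-4ac\ge 0$ so that `math.sqrt` applies. In Step 3 both signs of the square root are used, since $x$ and $y$ are no longer side lengths.

```python
import math

def sides_from_difference_and_area(difference, area):
    """Step 2: recover x and y from x - y and x*y using (x + y)^2 = (x - y)^2 + 4xy."""
    total = math.sqrt(difference ** 2 + 4 * area)    # x + y, taken as the positive root
    return (total + difference) / 2, (total - difference) / 2

def roots_via_rectangle(a, b, c):
    """Step 3: with y = x + b/a we have x*y = -c/a and x - y = -b/a."""
    difference, product = -b / a, -c / a
    total = math.sqrt(difference ** 2 + 4 * product)  # |x + y|
    # x = ((x - y) + (x + y)) / 2, once for each sign of x + y
    return (difference + total) / 2, (difference - total) / 2

print(sides_from_difference_and_area(2, 15))   # a rectangle with sides differing by 2 and area 15
print(roots_via_rectangle(1, -3, 2))           # x^2 - 3x + 2 = (x - 1)(x - 2)
```

The rectangle example gives sides $5$ and $3$, and the quadratic example gives the roots $2$ and $1$.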

I found this approach as a child, although each step took me several years... I wonder whether it was ever used to teach the quadratic formula.

8

Most answers explain the method of completing the square. Although it's the preferred method, I'll take another approach.

Consider an equation

$ax^{2}+bx+c=0 \qquad (1)$

We let the roots be $\alpha$ and $\beta$. Now,

$x-\alpha = x-\beta = 0$

$k(x-\alpha)(x-\beta)=0 \qquad (2)$

Equating equations (1) and (2) ($k$ is a constant),

$ax^{2}+bx+c=k(x-\alpha)(x-\beta)$

$ax^{2}+bx+c=k(x^{2}-\alpha x-\beta x+\alpha \beta)$

$ax^{2}+bx+c=kx^{2}-k(\alpha+\beta )x+k\alpha \beta$

Comparing both sides, we get $a=k~;~b=-k(\alpha +\beta)~;~c=k\alpha \beta$. From this, we get

$\alpha + \beta = \frac{-b}{a}~~;~~~\alpha \beta = \frac{c}{a}$

Now, to get the value of $\alpha$, we follow this procedure: first we find the value of $\alpha - \beta$, so that we can eliminate one root and solve for the other.

$(\alpha-\beta)^{2} = \alpha ^{2}+ \beta ^{2} - 2 \alpha \beta$

Now we add $4 \alpha \beta $ to both sides:

$(\alpha-\beta)^{2} +4 \alpha \beta = \alpha ^{2}+ \beta ^{2} + 2 \alpha \beta$

$(\alpha-\beta)^{2} +4 \alpha \beta = (\alpha + \beta )^{2} $

$(\alpha-\beta)^{2} = (\alpha + \beta )^{2} -4 \alpha \beta $

$\alpha-\beta = \pm \sqrt{(\alpha + \beta )^{2} -4 \alpha \beta } $

Substituting the values of $\alpha + \beta$ and $\alpha \beta$, we get

$\alpha-\beta = \pm \sqrt{\left(\frac{-b}{a} \right)^{2} -\frac{4c}{a} } $

$\alpha-\beta = \pm \sqrt{\frac{b^{2}-4ac}{a^{2}} } $

or

$\alpha-\beta = \frac{\pm \sqrt{b^{2}-4ac}}{a} \qquad (3)$

Adding $\alpha + \beta = \frac{-b}{a}$ and equation (3), we get

$2 \alpha = \frac{-b \pm \sqrt{b^{2}-4ac}}{a}$

$\alpha = \frac{-b \pm \sqrt{b^{2}-4ac}}{2a}$

  • 0
    you could rather write it as $x_1-\alpha = x_2-\beta = 0$, just to differentiate between the roots. This proof might be boring but it's not at all complicated. 2014-11-25
7

I'll just let the algebra speak for itself.

For $ a \ \ne \ 0 $

If

$ax^2 = bx + c $

Then

$ x = \frac{b \ \pm \sqrt{b^2 \ + \ 4ac}}{2a} $

PROOF:

$ 4aax^2 = 4abx + 4ac $

$(2ax)^2 + b^2 = 4abx + 4ac + b^2 $

$(2ax)^2 - 4abx + b^2 = b^2 + 4ac $

$ (2ax - b)^2 = b^2 + 4ac $

$ 2ax - b = \pm \sqrt{b^2 + 4ac} $

$2ax = b \pm \sqrt{b^2 + 4ac} $

$ x = \frac{b \pm \sqrt{b^2 + 4ac}}{2a} $

Q.E.D.

SHORTCUT

Using the standard formula, replace $b$ with $-b$ and $c$ with $-c$ to get the formula I derived in this post.

For $ a \ne 0 $

If

$ ax^2 + bx + c = 0 $

Then

$ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} $

Make the replacements in the 'if'

$ ax^2 + (-b)x + (-c) = 0 $

$ ax^2 = bx + c $

Make the replacements in the 'then'

$x = \frac{-(-b) \pm \sqrt{(-b)^2 - 4a(-c)}}{2a} $

$ x = \frac{b \pm \sqrt{b^2 + 4ac}}{2a} $

Q.E.D.
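The shortcut can be confirmed numerically: the variant formula for $ax^2=bx+c$ should agree with the standard formula applied to $ax^2+(-b)x+(-c)=0$. A minimal sketch, with hypothetical function names and `cmath.sqrt` assumed so that negative radicands are handled:

```python
import cmath

def solve_variant(a, b, c):
    """Solve a*x^2 = b*x + c using x = (b +/- sqrt(b^2 + 4ac)) / (2a)."""
    root = cmath.sqrt(b * b + 4 * a * c)
    return (b + root) / (2 * a), (b - root) / (2 * a)

def solve_standard(a, b, c):
    """Solve a*x^2 + b*x + c = 0 using x = (-b +/- sqrt(b^2 - 4ac)) / (2a)."""
    root = cmath.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

# x^2 = x + 1 is the same equation as x^2 - x - 1 = 0.
assert solve_variant(1, 1, 1) == solve_standard(1, -1, -1)

# Each value returned by the variant satisfies a*x^2 = b*x + c up to rounding.
for x in solve_variant(3, -2, 5):
    assert abs(3 * x * x - (-2 * x + 5)) < 1e-12
```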

5

The proof of the quadratic formula takes advantage of completing the square.

$ax^2 + bx + c = 0, \ a \neq 0, \ a, \ b, \ c \ \in \mathbb R$

$ax^2 + bx = -c$

$x^2 + \dfrac{b}{a}x = -\dfrac{c}{a}$

$x^2 + \dfrac{b}{a}x + \left(\dfrac{b}{2a}\right)^2 = -\dfrac{c}{a} + \left(\dfrac{b}{2a}\right)^2$

$\left(x+\dfrac{b}{2a}\right)^2 = -\dfrac{c}{a} + \dfrac{b^2}{4a^2}$

$\left(x+\dfrac{b}{2a}\right)^2 = -\dfrac{4ac}{4a^2} + \dfrac{b^2}{4a^2}$

$\left(x+\dfrac{b}{2a}\right)^2 = \dfrac{-4ac+b^2}{4a^2}$

$\left(x+\dfrac{b}{2a}\right)^2 = \dfrac{b^2-4ac}{4a^2}$

$x+\dfrac{b}{2a} = \pm \sqrt{\dfrac{b^2-4ac}{4a^2}}$

$x+\dfrac{b}{2a} = \pm \dfrac{\sqrt{b^2-4ac}}{2a}$

$x=-\dfrac{b}{2a} \pm \dfrac{\sqrt{b^2-4ac}}{2a}$

$\boxed{x=\dfrac{-b \pm \sqrt{b^2-4ac}}{2a}}$

This concludes the proof.

5

The simplest quadratic equation is $x^2-a=0$, since it has the solutions $x=\pm \sqrt{a}$.

The good news is that every quadratic equation can be reduced to this simple quadratic equation: The equation $ax^2+bx+c=0$ means $x^2+\frac{b}{a} x + \frac{c}{a}=0$. Writing $p=\frac{b}{a}$, $q=\frac{c}{a}$, we only have to solve $x^2+px+q=0$. If we write $x=y-\frac{p}{2}$, then a little calculation shows $y^2 - \frac{p^2}{4}+q=0$.

$\Rightarrow ~ y=\pm \sqrt{\frac{p^2}{4}-q} ~ \Rightarrow ~ x = -\frac{p}{2} \pm \sqrt{\frac{p^2}{4}-q} = \frac{-b \pm \sqrt{b^2-4ac}}{2a}.$

Geometrically, $x=y-\frac{p}{2}$ moves the parabola horizontally so that it becomes symmetric about the vertical axis. After this shift the roots are easy to find.
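The reduction to $x^2+px+q=0$ translates directly into code. This is a sketch with hypothetical names `solve_pq` and `solve_general`; complex square roots via `cmath` are an assumption so that $\frac{p^2}{4}<q$ is also handled.

```python
import cmath

def solve_pq(p, q):
    """Roots of x^2 + p*x + q = 0 via x = -p/2 +/- sqrt(p^2/4 - q)."""
    y = cmath.sqrt(p * p / 4 - q)    # solves the shifted equation y^2 = p^2/4 - q
    return -p / 2 + y, -p / 2 - y    # undo the shift x = y - p/2

def solve_general(a, b, c):
    """Divide by a first, then apply the p-q form."""
    return solve_pq(b / a, c / a)

x1, x2 = solve_general(2, -7, 3)     # 2x^2 - 7x + 3 = (2x - 1)(x - 3)
print(x1, x2)
```

The two returned values sit symmetrically around $-\frac{p}{2}$, which is the horizontal shift described above.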


This method also works for polynomials of higher degrees: Tschirnhaus transformation.

  • 0
    it would be awesome if you give a way to see how that transformation arose (or how one can come up with it in the first place) 2016-03-21
4

I've seen several "proofs" of this formula, all of which proceed by a sequence of randomly-chosen algebraic operations and end at the famous formula. Of course, no normal human being would ever pluck these out of thin air like that.

For me, the realization was this: Consider the formula

$(x - \alpha)(x - \beta) = 0$

If $x=\alpha$, then $x-\alpha$ is obviously zero. Whatever $x-\beta$ happens to be, it gets multiplied by zero, so the result is still zero. So in other words, one solution to this equation is $x=\alpha$. By a similar line of reasoning, $x=\beta$ is also a solution. So here we have purposely constructed an equation that has exactly two solutions: $\alpha$ and $\beta$ (which can be whatever we choose them to be).

Now let's open the brackets:

$(x-\alpha)(x-\beta) = 0$ $x(x-\beta) - \alpha(x-\beta) = 0$ $x^2 - \beta x - \alpha x + \alpha \beta = 0$ $x^2 - (\alpha + \beta)x + (\alpha \beta) = 0$

Now suppose we have an equation that looks like

$x^2 + Ax + B = 0$

(Notice that this isn't the usual general quadratic equation. I've assumed that the $x^2$ has a coefficient of 1.)

If we line the two equations up, it becomes clear that

$A = -(\alpha + \beta)$ $B = (\alpha \beta)$

This is the direct relationship between the coefficients we can see, and the solutions we want to find. Now, how to figure out the solutions from the coefficients?

We know that $-A$ is the sum of the solutions, and $B$ is the product of the solutions. But what are the solutions themselves? Hmm.

Well, if $-A$ is the sum, then $-A/2$ is the arithmetic mean of the solutions. I.e., it's exactly half way between the two solutions. Great, if we can just figure out how far apart the solutions are, we could exactly compute them!

So we have the sum and the product, but we want the difference. Hmm. OK. But how do we do that?

We've got $\alpha + \beta$ and we've got $\alpha \beta$, but we want $\alpha - \beta$. Tricky…

The solution is Black Magic. Observe: The thing we have is $\alpha + \beta$, and

$(\alpha + \beta)^2 = \alpha^2 + 2 \alpha \beta + \beta^2$

The thing we want is $\alpha - \beta$, and

$(\alpha - \beta)^2 = \alpha^2 - 2 \alpha \beta + \beta^2$

The difference between these two expressions is exactly $4\alpha\beta$, which we can compute! If $B = \alpha\beta$, then $4B = 4\alpha\beta$. What I am saying is that

$(\alpha - \beta)^2 = (\alpha + \beta)^2 - 4\alpha\beta = A^2 - 4B$

It therefore follows that

$\alpha - \beta = \pm\sqrt{A^2 - 4B}$

Recalling that $-A$ is the sum of the solutions and so $-A/2$ is the average, we have

$\alpha = -\frac{A}{2} - \frac{\sqrt{A^2 - 4B}}{2}$ $\beta = -\frac{A}{2} + \frac{\sqrt{A^2 - 4B}}{2}$

In other words, half the sum minus half the difference gives the lower solution, and half the sum plus half the difference gives the upper solution.

$x = \frac{-A \pm \sqrt{A^2 - 4B}}{2}$

Notice that this doesn't look quite like the familiar formula, because we assumed that the $x^2$ coefficient equals exactly 1. But notice that if we divide all the coefficients by the leading one, we can get this form!

$ax^2 + bx + c = 0$ $x^2 + \frac{b}{a}x + \frac{c}{a} = 0$ $A = \frac{b}{a}$ $B = \frac{c}{a}$

Substitute these into our previous formula, and (trust me) the familiar quadratic solution formula pops out.
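For readers who would rather check than trust, the substitution $A=\frac ba$, $B=\frac ca$ can be compared numerically against the familiar formula. This is a sketch only: the three function names are hypothetical, and `cmath` is used on the assumption that complex roots are acceptable.

```python
import cmath

def monic_roots(A, B):
    """Roots of x^2 + A*x + B = 0 as (mean - half difference, mean + half difference)."""
    mean = -A / 2                                  # half the sum of the solutions
    half_difference = cmath.sqrt(A * A - 4 * B) / 2
    return mean - half_difference, mean + half_difference

def standard_roots(a, b, c):
    """Roots of a*x^2 + b*x + c = 0 from the familiar formula."""
    s = cmath.sqrt(b * b - 4 * a * c)
    return (-b - s) / (2 * a), (-b + s) / (2 * a)

def same_roots(first, second, tol=1e-12):
    """Compare two pairs of roots while ignoring their order."""
    (p, q), (r, s) = first, second
    return (abs(p - r) < tol and abs(q - s) < tol) or (abs(p - s) < tol and abs(q - r) < tol)

for a, b, c in [(2, -7, 3), (-1, 0, 4), (1, 2, 5)]:
    assert same_roots(monic_roots(b / a, c / a), standard_roots(a, b, c))
```

The order-insensitive comparison matters when $a<0$, because dividing by a negative $a$ swaps which sign of the square root gives the larger root.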


Notice that we can do this trick for a polynomial of any degree!

$(x - \alpha)(x - \beta)(x - \gamma) = 0$

This gives a cubic equation with known solutions. Opening the brackets,

$x^3 - (\alpha + \beta + \gamma)x^2 + (\alpha\beta + \alpha\gamma + \beta\gamma)x - (\alpha \beta \gamma) = 0$

Given the equation

$x^3 + Ax^2 + Bx + C = 0$

we get

$A = -(\alpha + \beta + \gamma)$ $B = (\alpha\beta + \alpha\gamma + \beta\gamma)$ $C = -(\alpha\beta\gamma)$

The trouble is, now we need to somehow compute $\alpha$, $\beta$ and $\gamma$ given only $A$, $B$ and $C$. And it turns out this is way harder than last time. (Indeed, I have no damn idea how to do it! Computing the average doesn't really help this time.)
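The easy direction, going from chosen solutions to coefficients, can at least be checked directly. A minimal sketch, where `cubic_coefficients` is a hypothetical helper and the solutions $1,2,3$ are an arbitrary example:

```python
def cubic_coefficients(alpha, beta, gamma):
    """Return (A, B, C) with (x - alpha)(x - beta)(x - gamma) = x^3 + A*x^2 + B*x + C."""
    A = -(alpha + beta + gamma)
    B = alpha * beta + alpha * gamma + beta * gamma
    C = -(alpha * beta * gamma)
    return A, B, C

A, B, C = cubic_coefficients(1, 2, 3)
print(A, B, C)

# Each chosen solution makes the expanded cubic vanish.
for x in (1, 2, 3):
    assert x**3 + A * x**2 + B * x + C == 0
```

Going the other way, from $A$, $B$, $C$ back to $\alpha$, $\beta$, $\gamma$, is the hard part that the cubic formula addresses.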

3

Graphical Approach

The following method requires some knowledge of the graph of a parabola and of calculus. It is just another way to view the formula. For $a>0$, $f^{\prime}(x)=0$ (that is, $x=-\frac{b}{2a}$) gives the minimum. Substitute this value into the expression to see whether it attains a negative value. If the minimum is positive, the function has no real roots, because its graph never reaches the $x$-axis. If the function has real roots, the arithmetic mean of the roots gives the axis of the parabola. The roots are symmetric about this line, so if we find the difference between the roots and add half of it to $-\frac{b}{2a}$ we get one root, and subtracting half of the difference gives the other root. We know the sum of the roots is $-\frac{b}{a}$ and the product of the roots is $\frac{c}{a}$. So \begin{equation} F+G=-\frac{b}{a},\,FG=\frac{c}{a} \end{equation} therefore $(F-G)^2 = \left(-\frac{b}{a}\right)^2 -4\frac{c}{a}$. Taking the root, $|F-G|=\left[\left(-\frac{b}{a}\right)^2 -4\frac{c}{a}\right]^{\frac{1}{2}}$, so the roots are $-\frac{b}{2a} + \frac{|F-G|}{2}$ and $-\frac{b}{2a} - \frac{|F-G|}{2}$. Substituting the values gives the formula. I think this graphical way is a more interesting way of looking at the formula, as the algebra here can be too abstract for people new to it.

  • 0
    It's interesting that because the "algebra here can be too abstract for people new to it", we introduce calculus instead. But it appears (graphically) that for $a>0$, the minimum of $ax^2+bx+c$ is at $x_0=-b/(2a)$ (with a factor of $2$ in the divisor, don't forget that!) and you can prove this; [see here for example](http://math.stackexchange.com/a/1636031), although that proof uses algebra again, so you might not like it. 2016-03-08
2

Our approach exploits the fact that any differentiable function $f:\mathbb{R} \rightarrow \mathbb{R}$ is uniquely specified by its derivative and a point on its graph. The quadratic polynomial $P(x) = ax^2 + bx + c$ where $a \neq 0$ satisfies $P(0) = c$ and $P'(x) = 2ax + b$.

We have $P'(x) = 2ax + b = a(2(x+\frac {b}{2a}))$. We can deduce from this that $g(x) = a\cdot(x+\frac{b}{2a})^2 + C$ has the same derivative as $P(x)$. To make it the same function, it must have the same value at a point on its graph, say at $x=0$.

In other words, we must make $g(0) = \frac{b^2}{4a} + C = c$ which can be done if we let $C = c - \frac{b^2}{4a}$. We have thus shown $ax^2 + bx + c = a\cdot(x+\frac{b}{2a})^2 + c - \frac{b^2}{4a}$

To derive the quadratic formula, set $a\cdot(x+\frac{b}{2a})^2 + c - \frac{b^2}{4a} = 0 $. This can be solved using elementary algebraic manipulations, from which we get the desired $x = \frac{-b\pm \sqrt{b^{2} - 4 ac}}{2a}$
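The two facts the argument relies on, equal derivatives and equal values at $x=0$, can be checked on a concrete example. This is a sketch with assumed names (`vertex_form`, `P`, `g`) and arbitrary sample coefficients, using exact fractions.

```python
from fractions import Fraction

def vertex_form(a, b, c):
    """Return (h, C) with a*x^2 + b*x + c == a*(x + h)^2 + C."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    h = b / (2 * a)
    C = c - b * b / (4 * a)          # chosen so that g(0) = P(0) = c
    return h, C

a, b, c = 2, -7, 3
h, C = vertex_form(a, b, c)

def P(x):
    return a * x * x + b * x + c

def g(x):
    return a * (x + h) ** 2 + C

assert g(0) == P(0) == c             # same value at one point
for x in (Fraction(-3), Fraction(1, 2), Fraction(4)):
    assert 2 * a * (x + h) == 2 * a * x + b    # g'(x) == P'(x)
    assert g(x) == P(x)                        # hence the same function
```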

1

If one is interested only in the roots, one can concentrate on the monic equations. Now any monic quadratic expression becomes the square of a linear expression by addition of a suitable constant (finding the suitable constant is called completing the square). The additive constant is uniformly the same expression in the coefficients of the given quadratic equation.

So the given equation has the same solutions as the modified equation of the form $(x+B)^2=A$, and the formulas for the roots are of the form $x=-B \pm \sqrt A$. That's the reason all quadratic equations have solutions expressed in a uniform manner.

0

There are essentially two questions being asked here: 1. Why does the quadratic formula apply to ALL quadratic equations? 2. Why is the quadratic formula correct / How can I prove this?

Most of the answers have focussed on the second version of this question, and more specifically, at simply coming up with different proofs, rather than instructing how to come up with one yourself. Fair enough, since these proofs have been incredibly instructive to many readers including me.

But it does leave a bit of a gap. I'll address the first part of that gap by answering the first question: why does the quadratic formula actually apply to all quadratic equations?

That's because the quadratic formula expresses its solution without assuming any information about the actual value of any of the constants a, b, and c. Therefore, it can be applied to a quadratic equation with any values for these three constants.

This is what mathematics is all about: finding general solutions to things so that you can use the same solution method for a wide array of problems.

The same could be said about why the number "3" can be used to describe any group of that size; a group of three sheep, three people, three bricks, whatever. The whole reason why that concept was invented and why it survived the passage of time is because it's so useful.

It's useful because it can be applied to a wide array of different things, rather than requiring a re-invention of the wheel for every different kind of object. This can, of course, only be done by presuming as little as possible to whatever the concept can be applied to.

And that's why concepts like the quadratic formula show up in your maths classes in the first place. They solve a very important problem because many real-life things can be modeled as polynomials, especially the lower ones like the linear and quadratic, so being able to solve such a thing with just one "wheel" is incredibly important and useful.

Hopefully that adds a little perspective on the general applicability and consequent utility of the quadratic formula, the former essentially being what the question was inquiring under its most literal interpretation.