
An $n$th-degree polynomial has at most $n$ distinct zeroes in the complex numbers. But it may have an uncountable set of zeroes in the quaternions. For example, $x^2+1$ has two zeroes in $\mathbb C$, but in $\mathbb H$, ${\bf i}\cos\theta + {\bf j}\sin\theta$ is a distinct zero of this polynomial for every $\theta$ in $[0, 2\pi)$, and there are many other zeroes besides.
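As a quick sanity check of this circle of zeroes, here is a minimal sketch using sympy's `Quaternion` class (any quaternion implementation would do; the variable names are mine):

```python
from sympy import symbols, cos, sin, trigsimp
from sympy.algebras.quaternion import Quaternion

theta = symbols('theta', real=True)
q = Quaternion(0, cos(theta), sin(theta), 0)  # i*cos(theta) + j*sin(theta)
sq = q * q                                    # Hamilton product
# The real part simplifies to -(cos^2 + sin^2) = -1; the i, j, k parts vanish.
print([trigsimp(comp) for comp in (sq.a, sq.b, sq.c, sq.d)])  # [-1, 0, 0, 0]
```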

What is it about $\mathbb H$ that makes its behavior in this regard so different from that of $\mathbb R$ and $\mathbb C$? Is it simply because $\mathbb H$ is four-dimensional rather than two-dimensional? Are there any theorems that say when a ring will behave like $\mathbb H$ and when it will behave like $\mathbb C$?

Do all polynomials behave like this in $\mathbb H$? Or is this one unusual?

6 Answers

109

When I was first learning abstract algebra, the professor gave the usual sequence of results for polynomials over a field: the Division Algorithm, the Remainder Theorem, and the Factor Theorem, followed by the Corollary that if $D$ is an integral domain, and $E$ is any integral domain that contains $D$, then a polynomial of degree $n$ with coefficients in $D$ has at most $n$ distinct roots in $E$.

He then challenged us, as homework, to go over the proof of the Factor Theorem and to point out exactly which of the field axioms are used in the proof, and where and how.

Every single one of us missed the fact that commutativity is used.

Here's the issue: the division algorithm (on either side) does hold in $\mathbb{H}[x]$ (in fact, over any ring, commutative or not, in which the leading coefficient of the divisor is a unit). So given a polynomial $p(x)$ with coefficients in $\mathbb{H}$, and a nonzero $a(x)\in\mathbb{H}[x]$, there exist unique $q(x)$ and $r(x)$ in $\mathbb{H}[x]$ such that $p(x) = q(x)a(x) + r(x)$, and $r(x)=0$ or $\deg(r)\lt\deg(a)$. (There also exist unique $q'(x)$ and $s(x)$ such that $p(x) = a(x)q'(x) + s(x)$ and $s(x)=0$ or $\deg(s)\lt\deg(a)$.)
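To see the division algorithm in action, here is a minimal sketch (my own, not part of the original answer) of right division by a monic $x - a$, with a polynomial represented as a Python list of quaternion coefficients, lowest degree first:

```python
from sympy.algebras.quaternion import Quaternion

one, zero = Quaternion(1, 0, 0, 0), Quaternion(0, 0, 0, 0)
i = Quaternion(0, 1, 0, 0)

def divide_right(p, a):
    """Divide p(x), a list of quaternion coefficients (lowest degree
    first), on the right by (x - a): returns (q, r) with
    p(x) = q(x)(x - a) + r."""
    q = [zero] * (len(p) - 1)
    carry = zero
    for m in range(len(p) - 1, 0, -1):   # Horner-style recursion
        carry = p[m] + carry * a
        q[m - 1] = carry
    r = p[0] + carry * a                 # r = sum of b_m * a^m
    return q, r

q, r = divide_right([one, zero, one], i)  # p(x) = x^2 + 1, a = i
print(q, r)                               # q = [i, 1], i.e. x + i; r = 0
```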

The usual argument runs as follows: given $a\in\mathbb{H}$ and $p(x)$, divide $p(x)$ by $x-a$ to get $p(x) = q(x)(x-a) + r$, with $r$ constant. Evaluating at $a$ we get $p(a) = q(a)(a-a)+r = r$, so $r=p(a)$. Hence $a$ is a root if and only if $(x-a)$ divides $p(x)$.

If $b$ is a root of $p(x)$ with $b\neq a$, then evaluating at $b$ we have $0=p(b) = q(b)(b-a)$; since $b-a\neq 0$ and $\mathbb{H}$ has no zero divisors, $q(b)=0$, so $b$ must be a root of $q$. Since $\deg(q)=\deg(p)-1$, the inductive hypothesis tells us that $q(x)$ has at most $\deg(p)-1$ distinct roots, so $p$ has at most $\deg(p)$ roots.

And that is where we are using commutativity: to go from $p(x) = q(x)(x-a)$ to $p(b) = q(b)(b-a)$.
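Here is that failure in miniature (a sketch with sympy quaternions; the setup is mine): take $p(x)=x^2+1=(x+i)(x-i)$, so $a=i$ and $q(x)=x+i$, and evaluate both sides at $b=j$:

```python
from sympy.algebras.quaternion import Quaternion

one = Quaternion(1, 0, 0, 0)
i, j = Quaternion(0, 1, 0, 0), Quaternion(0, 0, 1, 0)

# p(x) = x^2 + 1 = (x + i)(x - i) in H[x], but the two sides evaluate
# differently at b = j, because j does not commute with i:
print(j * j + one)        # p(j) = 0
print((j + i) * (j - i))  # q(j)(j - i) = 2k, not 0
```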

Let $R$ be a ring, and let $a\in R$. Then $a$ induces a set-theoretic map $\varepsilon_a\colon R[x]\to R$, "evaluation at $a$", given by $\varepsilon_a(b_0+b_1x+\cdots + b_nx^n) = b_0 + b_1a + \cdots + b_na^n$. This map is a group homomorphism, and if $a$ is central it is also a ring homomorphism; if $a$ is not central, then it is not a ring homomorphism: given $b\in R$ such that $ab\neq ba$, we have $bx = xb$ in $R[x]$, but $\varepsilon_a(x)\varepsilon_a(b) = ab\neq ba = \varepsilon_a(xb)$.
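A one-line check of this non-multiplicativity (a sketch, taking $a=i$ and $b=j$):

```python
from sympy.algebras.quaternion import Quaternion

i, j = Quaternion(0, 1, 0, 0), Quaternion(0, 0, 1, 0)

# In H[x] the polynomial x*j equals j*x, so evaluating it at i gives j*i;
# but the product of the separate evaluations is i*j:
print(j * i)  # -k = eps_i(x j), since x j = j x as a polynomial
print(i * j)  #  k = eps_i(x) * eps_i(j)
```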

The "evaluation" map also induces a set theoretic map from $R[x]$ to $R^R$, the ring of all $R$-valued functions in $R$, with the pointwise addition and multiplication ($(f+g)(a) = f(a)+g(a)$, $(fg)(a) = f(a)g(a)$); the map sends $p(x)$ to the function $\mathfrak{p}\colon R\to R$ given by $\mathfrak{p}(a) = \varepsilon_a(p(x))$. This map is a group homomorphism, but it is not a ring homomorphism unless $R$ is commutative.

This means that from $p(x) = q(x)(x-a) + r(x)$ we cannot in general conclude that $p(c) = q(c)(c-a) +r(c)$ unless $c$ commutes in $R$ with $a$. So the Remainder Theorem may fail to hold (if the coefficients involved do not commute with $a$ in $R$), which in turn means that the Factor Theorem may fail to hold. So one has to be careful in the statements (see Marc van Leeuwen's answer). And even when both of them hold for the particular $a$ in question, the inductive argument will fail if $b$ does not commute with $a$, because we cannot go from $p(x) = q(x)(x-a)$ to $p(b)=q(b)(b-a)$.

This is exactly what happens with, say, $p(x) = x^2+1$ in $\mathbb{H}[x]$. We are fine as far as showing that, say, $x-i$ is a factor of $p(x)$, because it so happens that when we divide by $x-i$, all coefficients involved centralize $i$ (we just get $(x+i)(x-i)$). But when we try to argue that any root different from $i$ must be a root of $x+i$, we run into the problem that we cannot guarantee that $b^2+1$ equals $(b+i)(b-i)$ unless we know that $b$ centralizes $i$. As it happens, the centralizer of $i$ in $\mathbb{H}$ is $\mathbb{R}[i]$, so we can only conclude that the only other complex root is $-i$. But this leaves open the possibility that there may be some roots of $x^2+1$ that do not centralize $i$, and that is exactly what occurs: $j$, and $k$, and all numbers of the form $ai+bj+ck$ with $a^2+b^2+c^2=1$ are roots, and if either $b$ or $c$ is nonzero, then such a root does not centralize $i$, so we cannot go from $x^2+1 = (x+i)(x-i)$ to "$(ai+bj+ck)^2+1 = (ai+bj+ck+i)(ai+bj+ck-i)$".
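One can verify the whole sphere of roots symbolically; the following sketch (my own, using sympy) computes $(ai+bj+ck)^2$ for real symbols $a,b,c$:

```python
from sympy import symbols, simplify
from sympy.algebras.quaternion import Quaternion

a, b, c = symbols('a b c', real=True)
q = Quaternion(0, a, b, c)
sq = q * q
# The real part is -(a^2 + b^2 + c^2) and the i, j, k parts vanish
# identically, so q^2 = -1 exactly when a^2 + b^2 + c^2 = 1.
print([simplify(comp) for comp in (sq.a, sq.b, sq.c, sq.d)])
# [-a**2 - b**2 - c**2, 0, 0, 0]
```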

And that is what goes wrong; that is where commutativity is hiding.

30

The finiteness of the number of roots of a polynomial $f(x)\in K[x]$ where $K$ is a field depends on two interlaced facts:

  • $K[x]$ is a Unique Factorization Domain: every polynomial $f(x)$ factors in an essentially unique way as a product of irreducibles;

  • if $f(\alpha)=0$ then $f(x)=(x-\alpha)g(x)$ where $\deg g(x)=(\deg f(x))-1$.

The combination of these two facts (the first one in particular) no longer holds if you think of the polynomial $f(x)$ as a polynomial with coefficients in the ring $\Bbb H$ of Hamilton quaternions. This is because the latter is not commutative.
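To see the first fact fail concretely: $x^2+1$ admits two genuinely different factorizations into monic linear factors in $\Bbb H[x]$. The sketch below (my own; `polymul` is a hypothetical helper, not a library function) multiplies out both:

```python
from sympy.algebras.quaternion import Quaternion

one = Quaternion(1, 0, 0, 0)
i, j = Quaternion(0, 1, 0, 0), Quaternion(0, 0, 1, 0)

def polymul(p, q):
    """Multiply polynomials given as lists of quaternion coefficients
    (lowest degree first); x is central in H[x], so only the
    coefficients multiply noncommutatively."""
    out = [Quaternion(0, 0, 0, 0)] * (len(p) + len(q) - 1)
    for m, s in enumerate(p):
        for n, t in enumerate(q):
            out[m + n] = out[m + n] + s * t
    return out

# (x + i)(x - i) and (x + j)(x - j) both expand to x^2 + 1:
print(polymul([i, one], [-i, one]))  # [1, 0, 1], i.e. x^2 + 1
print(polymul([j, one], [-j, one]))  # same polynomial, different factors
```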

You may also ponder this fact: in a commutative environment the transformation $a\mapsto\phi_h(a)=hah^{-1}$ (conjugation) is always trivial. Not so in $\Bbb H$, again as a side effect of non-commutativity. The point is that if an element $a$ satisfies a certain algebraic relation with real coefficients (such as $a^2=-1$), then so do all its conjugates $\phi_h(a)$.
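A small illustration of this conjugation principle (a sketch with sympy quaternions; the choice $h=1+j$ is arbitrary): starting from $i^2=-1$, the conjugate $hih^{-1}$ also squares to $-1$:

```python
from sympy.algebras.quaternion import Quaternion

one = Quaternion(1, 0, 0, 0)
i, j = Quaternion(0, 1, 0, 0), Quaternion(0, 0, 1, 0)

h = one + j                 # any invertible quaternion will do
conj = h * i * h.inverse()  # phi_h(i) = h i h^{-1}; here it equals -k
print(conj)
print(conj * conj)          # still squares to -1
```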

  • +1: the point about conjugation above is _extremely_ important. (2012-03-21)
25

I would like to emphasize a point which is made in Arturo Magidin's answer but perhaps in different words: if $D$ is a noncommutative division ring, then the ring $D[x]$ of polynomials over $D$ does not do what you want it to do.

If $F$ is a field, then one reason you might care about working with polynomials $F[x]$ is that they describe all the expressions you could potentially get from some unknown $x \in F$ (or perhaps $x \in \bar{F}$ or perhaps something even more general than this) via addition and multiplication.

Why does this break down when you replace $F$ with a noncommutative division ring $D$? The problem is that if you work with some unknown $x \in D$ (or in some ring containing $D$) then $x$, by assumption, doesn't necessarily commute with every element in $D$, so starting from $x$ and adding and multiplying you get not only expressions like $a_0 + a_1 x + a_2 x^2 + ...$

but more complicated expressions like $a_0 + a_{1,0} x + x a_{1,1} + a_{1, 2} x a_{1, 3} + a_{2,0} x^2 + x a_{2,1} x + x^2 a_{2,2} + a_{2, 3} x^2 a_{2,4} + a_{2, 5} x a_{2, 6} x a_{2,7} + ... $

The resulting algebraic structure is quite a bit more complicated than $D[x]$. Already you can't in general combine expressions of the form $axb$ and $cxd$, so even to describe the expressions you can get by using $x$ once I should actually have written $a_0 + a_{1,0} x a_{1,1} + a_{1,2} x a_{1,3} + a_{1,4} x a_{1,5} + \cdots$
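One can check that even the two-term expression $ixj + jxi$ is not of the form $exf$: evaluating at $x=1$ gives $ij+ji=0$, which in a division ring would force $e=0$ or $f=0$ and hence the expression to vanish identically, yet at $x=i$ it equals $-2j$. A quick sketch (names mine):

```python
from sympy.algebras.quaternion import Quaternion

one = Quaternion(1, 0, 0, 0)
i, j = Quaternion(0, 1, 0, 0), Quaternion(0, 0, 1, 0)

def f(x):        # the two-term expression i x j + j x i
    return i * x * j + j * x * i

print(f(one))    # 0: any single term e*x*f agreeing with f would need ef = 0
print(f(i))      # -2j: yet f is not identically zero
```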

  • Just one remark: whether polynomials in $R[x]$ do what you want when you use $x$ to stand for some unknown value (maybe outside the ring $R$) depends not so much on whether the elements of $R$ commute _among each other_ as on whether _elements of $R$ commute with what you want $x$ to stand for_ (because in $R[x]$ they commute by definition). So for instance it is fine to use polynomials in $\mathbf R[x]$ with $x$ standing for some unknown quaternion (no worse than standing for a matrix), but it is _not possible_ to do so for polynomials in $\mathbf C[x]$, even though $\mathbf C$ is commutative! (2013-03-15)
17

I'd like to give a complement to the answers already given, since some of them suggest a relation with more advanced arithmetic topics like Unique Factorization, while this is really based on elementary ring theory only. Notably one has the following

Theorem. Let $R$ be a commutative domain, and $P\in R[X]$ a nonzero polynomial of degree $d$. Then $P$ has at most $d$ roots in $R$.

Normally a commutative domain is called an integral domain (note the curious meaning of "integral"), but I've used "commutative" here to stress the two key properties assumed: commutativity and the absence of zero divisors. (I do assume rings to have an element $1$, by the way.) In the absence of commutativity, even introducing the notion of roots of $P$ is problematic—unless $P$ has its coefficients in the center of $R$ (as is the case in your quaternion example)—as Qiaochu Yuan points out. Indeed for evaluation of $P$ in $a\in R$ one must decide whether to write the powers of $a$ to the right or the left of the coefficients of $P$, giving rise to distinct notions of right- and left-evaluation, and hence of right- and left-roots (and neither form of evaluation is a ring morphism). But even for the case that $P$ has its coefficients in the center $Z(R)$ of $R$, so that right- and left-evaluation in $a$ coincide and define a ring morphism $Z(R)[X]\to R$, the conclusion of the theorem is not valid, as this question illustrates.
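To illustrate how right- and left-roots can genuinely differ, consider $P(X)=iX-k$ (my example, not from the answer): $j$ is a right-root but not a left-root:

```python
from sympy.algebras.quaternion import Quaternion

i, j, k = (Quaternion(0, 1, 0, 0), Quaternion(0, 0, 1, 0),
           Quaternion(0, 0, 0, 1))

# P(X) = i X - k. Right-evaluation puts the power of a to the right of
# the coefficient, left-evaluation puts it to the left:
a = j
print(i * a - k)  # right-evaluation: ij - k = 0, so j is a right-root
print(a * i - k)  # left-evaluation:  ji - k = -2k, so j is not a left-root
```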

The proof of the theorem is based on the following simple

Lemma. Let $R$ be a commutative domain, $P\in R[X]$, and $r\in R$ a root of $P$. Then there exists a (unique) $Q\in R[X]$ with $P=(X-r)Q$, and every root of $P$ other than $r$ is a root of $Q$.

The existence and uniqueness of $Q$ do not depend on $R$ being commutative or a domain: for any ring $R$ one has $P=(X-r)Q$ if and only if $Q$ is the quotient of $P$ by Euclidean left-division by $X-r$ and the remainder is $0$, and the latter happens if and only if $r$ is a left-root of $P$ (so, properly stated, the Factor Theorem does hold for general rings!). But the final part of the lemma does use both commutativity and the absence of zero divisors: one uses that evaluation of $(X-r)Q$ at some root $r'\neq r$ of $P$ can be done separately in the two factors (this requires commutativity: without it evaluation is not a ring morphism), and then one needs to conclude that one of the factors (necessarily the second) becomes $0$, which requires the absence of zero divisors. Note, for noncommutative $R$, that even if $P$ lies in $Z(R)[X]$, this first step fails, since evaluation is only a morphism $Z(R)[X]\to R$, and the factors $X-r$ and $Q$ need not lie in $Z(R)[X]$.

The lemma of course implies the theorem by a straightforward induction on $\deg P$.

One final remark unrelated to the question: since the morphism property of evaluation $Z(R)[X]\to R$ does not help us here, one might wonder what is the point of considering evaluation at all in the absence of commutativity. However note that in linear algebra we teach our students to fearlessly substitute a matrix into polynomials, and to (implicitly) use the morphism property of such evaluation maps $K[X]\to M_n(K)$ where $K$ is a (commutative!) field. This works precisely because $K$ can be identified with $Z(M_n(K))$ (the subring of homothecies).
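A small numerical sanity check of that morphism property (a sketch assuming numpy; `eval_poly_at_matrix` is my own Horner helper): evaluation at a fixed matrix is multiplicative on scalar-coefficient polynomials, precisely because scalar matrices are central in $M_n(K)$:

```python
import numpy as np

def eval_poly_at_matrix(coeffs, A):
    """Horner evaluation of a scalar-coefficient polynomial at a square
    matrix A; coeffs are highest degree first, as numpy.polymul expects."""
    out = np.zeros_like(A)
    for c in coeffs:
        out = out @ A + c * np.eye(A.shape[0])
    return out

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
p = np.array([1.0, -2.0, 3.0])      # X^2 - 2X + 3
q = np.array([2.0, 0.0, 1.0, 5.0])  # 2X^3 + X + 5
# (pq)(A) == p(A) q(A): evaluation at A is a ring morphism K[X] -> M_n(K).
print(np.allclose(eval_poly_at_matrix(np.polymul(p, q), A),
                  eval_poly_at_matrix(p, A) @ eval_poly_at_matrix(q, A)))
```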

3

This has to do with the fact that $\mathbb H$ is not a field (it is a noncommutative division ring): the number of zeroes of a nonzero polynomial over a field is always bounded by the degree of the polynomial.