34

We often learn in a standard linear algebra course that a determinant is a number associated with a square matrix. We can also define the determinant as the sum of all products formed by picking one element from each row and each column of the matrix, with each product multiplied by (-1) or (+1) according to the parity of the number of inversions.
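
In symbols, that permutation-sum definition is usually written as follows. The notation here is an assumption, not taken from the question: $a_{i,j}$ denotes the entries of an $n \times n$ matrix $A$, $S_n$ the set of permutations of $\{1, \ldots, n\}$, and $\operatorname{inv}(\sigma)$ the number of inversions of $\sigma$.

$$ \det(A) = \sum_{\sigma \in S_n} (-1)^{\operatorname{inv}(\sigma)} \prod_{i=1}^{n} a_{i,\sigma(i)}. $$

For $n = 2$ there are only two permutations, and the sum reduces to $a_{1,1}a_{2,2} - a_{1,2}a_{2,1}$.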

But how is this notion of a 'determinant' derived? What is a determinant, actually? I looked up the history of the determinant, and it looks like it predates matrices. How did the modern definition of a determinant come about? Why do we need to multiply some terms of the determinant sum by (-1) based on the number of inversions? I just can't understand the motivation that created determinants. We can define determinants and see their properties, but I want to understand how and why they were defined, to get a better idea of their importance and applications.

  • 2
    The determinant defines volume in $n$ dimensions. Munkres's text _Analysis on Manifolds_ has a nice discussion. I'm not well-versed in the history you seek, so I leave it to others! 2012-09-12
  • 2
    Related question: http://math.stackexchange.com/questions/668/whats-an-intuitive-way-to-think-about-the-determinant/. See the answers given there. 2012-09-12
  • 0
    Possible duplicate of http://math.stackexchange.com/questions/81521/development-of-the-idea-of-the-determinant. 2013-10-21

3 Answers

20

I normally have two ways of viewing determinants without appealing to higher-level math like multilinear forms.

The first is geometric, and I do think that most vector calculus classes nowadays should teach this interpretation. Given vectors $v_1, \ldots, v_n \in \mathbb{R}^n$ forming the sides of an $n$-dimensional parallelepiped, the volume of this parallelepiped is given by $|\det(A)|$, where $A = [v_1 \ldots v_n]$ is the matrix whose columns are those vectors. We can then view the determinant of a square matrix as measuring the volume-scaling property of the matrix as a linear map on $\mathbb{R}^n$. From here, it is clear why $\det(A) = 0$ is equivalent to $A$ not being invertible: if $A$ takes a set with positive volume and sends it to a set with zero volume, then $A$ has some direction along which it "flattens" points, and those directions are precisely the null space of $A$. Unfortunately, I'm under the impression that this interpretation is at least semi-modern, but I think this is one of the cases where the modern viewpoint might be better to teach new students than the old viewpoint.
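
As a small numerical illustration of the volume-scaling picture in the plane, the sketch below maps the unit square through a $2 \times 2$ matrix and measures the signed area of the image with the shoelace formula. The helper names (`apply`, `signed_area`, `det2`, `image_area`) and the example matrices are made up for this illustration. The shoelace formula is itself built from $2 \times 2$ determinants, so this is a consistency check rather than an independent proof.

```python
def apply(matrix, point):
    """Apply a 2x2 matrix (given as two rows) to a point (x, y)."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)


def signed_area(polygon):
    """Signed area of a polygon via the shoelace formula.

    Positive when the vertices run counterclockwise, negative when clockwise.
    """
    total = 0
    for i in range(len(polygon)):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % len(polygon)]
        total += x0 * y1 - x1 * y0
    return total / 2


def det2(matrix):
    """Determinant of a 2x2 matrix given as two rows."""
    (a, b), (c, d) = matrix
    return a * d - b * c


# Unit square, vertices listed counterclockwise; its area is 1.
UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]


def image_area(matrix):
    """Signed area of the image of the unit square under the matrix."""
    return signed_area([apply(matrix, p) for p in UNIT_SQUARE])


if __name__ == "__main__":
    examples = {
        "stretch and shear": [[2, 1], [1, 3]],
        "singular (flattens the square)": [[1, 2], [2, 4]],
        "reflection (reverses orientation)": [[0, 1], [1, 0]],
    }
    for name, m in examples.items():
        print(name, "area of image:", image_area(m), "det:", det2(m))
```

In each case the signed area of the image agrees with the determinant: the singular matrix collapses the square onto a line, and the reflection gives a negative value because it reverses the orientation of the vertices.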

The old viewpoint is that the determinant is simply the result of trying to solve the linear system $Ax = b$ when $A$ is square. This is most likely how the determinant was first discovered. To derive the determinant this way, write down the generic matrix and then proceed by Gaussian elimination. This means you have to choose nonzero leading entries in each row (the pivots) and use them to eliminate the entries below. Each time you eliminate, you have to multiply rows by a common denominator, so after you do this $n$ times, you end up with a sum of products of entries taken from different rows and columns, merely by virtue of having multiplied out to get common denominators. The $(-1)^k$ sign flip comes from the fact that at each stage in Gaussian elimination, you're subtracting. So on the first step you're subtracting, but on the second step you're subtracting a subtraction, and so forth. At the very end, Gaussian elimination gives you an echelon form (upper triangular), and if any of the diagonal entries is zero, then the system is not uniquely solvable; the last diagonal entry will be precisely the determinant times the product of the values of previously used pivots (up to a sign, perhaps). Since the pivots chosen are always nonzero, they do not affect whether the last entry is zero, and so you can divide them out.

EDIT: It isn't as simple as I thought, though it will work out if you keep track of what nonzero values you multiply your rows by in Gaussian elimination. My apologies if I misled anyone.
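
The bookkeeping mentioned in the edit can be sketched in code. This is a minimal illustration under my own assumptions, not a historical algorithm: `eliminate` and `det_from_elimination` are hypothetical names, the matrix entries are integers, and each elimination step replaces a row by `pivot * row - factor * pivot_row`, so that no fractions appear until the very end. Every such step multiplies the determinant by the pivot, and every row swap flips its sign, so both are recorded and undone at the end.

```python
from fractions import Fraction


def eliminate(rows):
    """Reduce a square integer matrix to upper triangular form without fractions.

    Returns (triangular, sign, scale) where
      sign  = +1 or -1, flipped once per row swap, and
      scale = product of all pivots that rows were multiplied by,
    so that det(triangular) == sign * scale * det(rows).
    """
    a = [list(r) for r in rows]
    n = len(a)
    sign, scale = 1, 1
    for k in range(n):
        # Look for a nonzero pivot in column k, at or below the diagonal.
        pivot_row = next((i for i in range(k, n) if a[i][k] != 0), None)
        if pivot_row is None:
            continue  # column already zero below the diagonal; a[k][k] stays 0
        if pivot_row != k:
            a[k], a[pivot_row] = a[pivot_row], a[k]
            sign = -sign
        pivot = a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k]
            # Multiply row i by the pivot, then subtract a multiple of row k.
            a[i] = [pivot * a[i][j] - factor * a[k][j] for j in range(n)]
            scale *= pivot
    return a, sign, scale


def det_from_elimination(rows):
    """Determinant recovered from the diagonal, the swaps, and the multipliers."""
    triangular, sign, scale = eliminate(rows)
    diagonal_product = 1
    for k in range(len(triangular)):
        diagonal_product *= triangular[k][k]
    return Fraction(sign * diagonal_product, scale)


if __name__ == "__main__":
    A = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]
    triangular, sign, scale = eliminate(A)
    print(triangular)               # last diagonal entry is A[0][0] * det(A)
    print(det_from_elimination(A))
```

For a $3 \times 3$ matrix with no row swaps, the multipliers are the first pivot twice and the second pivot once, so the last diagonal entry equals the first pivot times the determinant.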

  • 3
    Also, we should emphasize that the sign of $\det[v_1|v_2|\dots|v_n]$ reveals the handedness or orientation of the set $\{ v_1,v_2,\dots, v_n \}$. 2012-09-12
  • 3
    Did you ever try actually performing Gaussian elimination on a _generic_ matrix (with all entries independent unknowns)? Try it for a $3\times3$ matrix! It doesn't really work as you advertised, and you'll have a hard time actually making (just) the determinant appear in the computations. You can find something like this done to prove [Cramer's rule](http://en.wikipedia.org/wiki/Cramer%27s_rule#Proof), but it is _not_ usual Gaussian elimination, and it assumes the determinant is already known. 2012-09-12
  • 1
    @MarcvanLeeuwen, you're right, I forgot that the final diagonal entry will actually be the determinant multiplied by the value of the first pivot. But since WLOG the first pivot must be nonzero, it can be divided out. I don't think it's as hard to manipulate into the determinant form as one might think. 2012-09-12
  • 0
    BTW, I just tried it for the $3 \times 3$ case, and it ends up being $a_{11} \det(A)$ for the last diagonal entry, as expected. Note that if $a_{11} = 0$, then we just swap rows until WLOG $a_{11} \neq 0$. The nice thing about swapping rows in Gaussian elimination changing the determinant only by a sign is that it shows why, on some level, the determinant must be permutation-invariant up to sign. 2012-09-12
  • 0
    @ChristopherA.Wong In fact it is unclear to me what you mean by "first pivot". With all entries unknown, there isn't a single (non-constant) expression that is assured to be nonzero. So you need to multiply rows by factors that are not known to be nonzero, and these factors will remain in (the determinant of) your matrix. I can see how you get a factor $a_{1,1}$, but not how you avoid introducing even nastier factors in the sequel. I'm stuck with $\begin{pmatrix}a_1&a_2&a_3\\0&a_1b_2-b_1a_2&a_1b_3-b_1a_3\\0&a_1c_2-c_1a_2&a_1c_3-c_1a_3\end{pmatrix}$. 2012-09-12
  • 0
    @MarcvanLeeuwen You're actually right that it is not as simple as I thought it was before, and the final entry will be the determinant multiplied by many of the nonzero factors used as pivots. However, for the $3 \times 3$ case it works a little more easily; using your matrix, if you do Gaussian elimination again, you get $$\begin{bmatrix} a_1 & a_2 & a_3 \\ 0 & a_1 b_2 - b_1 a_2 & a_1 b_3 - b_1 a_3 \\ 0 & 0 & (a_1c_3 - c_1 a_3)(a_1 b_2 - b_1 a_2) - (a_1 b_3 - b_1a_3)(a_1 c_2 - c_1 a_2) \end{bmatrix}.$$ 2012-09-12
  • 0
    Expanding the last term (the two $a_2 a_3 b_1 c_1$ products cancel), we get $a_1^2 b_2 c_3 - a_1 a_2 b_1 c_3 - a_1 a_3 b_2 c_1 - a_1^2 b_3 c_2 + a_1 a_2 b_3 c_1 + a_1 a_3 b_1 c_2$, which is precisely equal to $a_1 \det(A)$. 2012-09-12
  • 0
    You're right; I think I just didn't have the courage to do that... ;-) 2012-09-12
9

I do not know the actual history of the determinant, but I think it is very well motivated. From the way I look at it, it's actually the properties of the determinant that make sense; you then derive the formula from them.

Let me start by trying to define the "signed volume" of a hyper-parallelepiped whose sides are $(u_1, u_2, \ldots, u_n)$. I'll call this function $\det$. (I have no idea why it is named "determinant". Wiki says Cauchy was the one who started using the term in the present sense.) Here are some observations regarding $\det$ that I consider quite natural:

  1. The unit hypercube whose sides are $(e_1, e_2, \ldots, e_n)$, where $e_i$ are standard basis vectors of $\mathbb R^n$, should have volume of $1$.
  2. If one of the sides is zero, the volume should be $0$.
  3. If you vary one side and keep all other sides fixed, how would the signed volume change? You may think about a 3D case where you have a flat parallelogram defined by vectors $u_1$ and $u_2$ as the base of a solid shape, then try to extend the "height" direction by the third vector $u_3$. What happens to the volume as you scale $u_3$? Also, consider what happens if you have two height vectors $u_3$ and $\hat u_3$. $\det(u_1, u_2, u_3 + \hat u_3)$ should be equal to $\det(u_1, u_2, u_3) + \det(u_1, u_2, \hat u_3)$. (This is where you need your volume function to be signed.)
  4. If I add a multiple of one side, say $u_i$, to another side $u_j$ and replace $u_j$ by $\hat u_j = u_j + c u_i$, the signed volume should not change because the addition to $u_j$ is in the direction of $u_i$. (Think about how a rectangle can be sheared into a parallelogram with equal area.)

With these four observations, you get familiar properties of $\det$:

  1. $\det(e_1, \ldots, e_n) = 1$.
  2. $\det(u_1, \ldots, u_n) = 0$ if $u_i = 0$ for some $i$.
  3. $\det(u_1, \ldots, u_i + c\hat u_i, \ldots, u_n) = \det(u_1, \ldots, u_i, \ldots, u_n) + c\det(u_1, \ldots, \hat u_i, \ldots, u_n)$.
  4. $\det(u_1, \ldots, u_i, \ldots, u_j, \ldots, u_n) = \det(u_1, \ldots, u_i, \ldots, u_j + cu_i, \ldots, u_n)$. (It may happen that $j < i$.)

You can then derive the formula for $\det$. You can use these properties to deduce further easier-to-use (in my opinion) properties:

  • Swapping two columns changes the sign of $\det$.
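
One way to see the sign change is sketched here for two columns $u$ and $v$, with the remaining columns held fixed and omitted from the notation. It uses property 4 three times, and then property 3 together with property 2 to pull out the factor $-1$:

$$ \begin{align*} \det(u, v) & = \det(u, v + u) \\ & = \det(u - (v + u), v + u) = \det(-v, v + u) \\ & = \det(-v, (v + u) - v) = \det(-v, u) \\ & = -\det(v, u). \end{align*} $$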

This should tell you why oddness and evenness of permutations matter. To actually (inefficiently) compute the determinant $\det(u_1, u_2, \ldots, u_n)$, write $u_i$ as $u_i = \sum_{j=1}^n u_{ij}e_j$, and expand by multilinearity. For example, in the 2D case,

$$ \begin{align*} \det(u, v) & = \det(u_1e_1 + u_2e_2, v_1e_1 + v_2e_2) \\ & = u_1v_1\underbrace{\det(e_1, e_1)}_0 + u_1v_2\underbrace{\det(e_1, e_2)}_1 + u_2v_1\underbrace{\det(e_2, e_1)}_{-1} + u_2v_2\underbrace{\det(e_2, e_2)}_0 \\ & = u_1v_2 - u_2v_1. \end{align*} $$

(If you are not familiar with multilinearity, just think of it as a product. Ignore the word $\det$ from the second line and you get a simple expansion of products. Then you evaluate "unusual product" between vectors $e_i$ by the definition of $\det$. Note, however, that the order is important, as $\det(u, v) = - \det(v, u)$.)
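
The same expansion in $n$ dimensions can be sketched in code. After expanding by multilinearity, only the terms in which the basis vectors are all different survive, which means one term per permutation, and $\det(e_{\sigma(1)}, \ldots, e_{\sigma(n)})$ is $+1$ or $-1$ according to the parity of the number of inversions of $\sigma$. The function names `inversions` and `leibniz_det` are my own, and the input is assumed to be a list of vectors $u_1, \ldots, u_n$ with `vectors[i][j]` playing the role of $u_{ij}$.

```python
from itertools import permutations


def inversions(perm):
    """Count pairs of positions (i, j) with i < j but perm[i] > perm[j]."""
    return sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )


def leibniz_det(vectors):
    """Determinant as a signed sum over permutations (n! terms, so inefficient).

    vectors[i][j] is the coefficient of e_j in u_i.
    """
    n = len(vectors)
    total = 0
    for perm in permutations(range(n)):
        # det(e_perm[0], ..., e_perm[n-1]) is +1 or -1 by parity of inversions.
        term = -1 if inversions(perm) % 2 else 1
        for i in range(n):
            term *= vectors[i][perm[i]]
        total += term
    return total


if __name__ == "__main__":
    u, v = [3, 1], [2, 4]
    print(leibniz_det([u, v]))  # u_1 v_2 - u_2 v_1
    print(leibniz_det([v, u]))  # swapping the two vectors flips the sign
```

Swapping two of the input vectors changes the parity of every permutation in the sum, which is the code-level counterpart of $\det(u, v) = -\det(v, u)$.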

  • 1
    The name originates from Gauss in Disquisitiones arithmeticae (1801) while discussing quadratic forms. Full article: http://www-groups.dcs.st-and.ac.uk/history/HistTopics/Matrices_and_determinants.html 2016-07-06
8

The determinant was originally 'discovered' by Cramer when solving the systems of linear equations needed to determine the coefficients of a polynomial curve passing through a given set of points. Cramer's rule, which gives the general solution of a system of linear equations, was a direct result of this.

This appears in Gabriel Cramer, "Introduction à l'analyse des lignes courbes algébriques" (Introduction to the analysis of algebraic curves), Genève, Chez les Frères Cramer & Cl. Philibert (1750). It is cited as a footnote on p. 60, which reads (from French):

"I think I have found [for solving these equations] a very simple and general rule, when the number of equations and unknowns do not pass the first degree [i.e., are linear]. One finds this in the Appendix No. 1." Appendix No. 1 appears on p. 657 of the same text. The text is available online, for those who can read French.

The history of the determinant appears in Thomas Muir, "The Theory of Determinants in the Historical Order of Development," Dover, NY (1923). This is also available online.