17

Given a matrix $A_{n \times n}$ whose elements are $a_{i,j} \sim \mathrm{unif} \left[a,b\right]$, what is the probability that $\det(A) = 0$? What if the $a_{i,j}$ have some other distribution?

Added: Consider an extension of the above problem: what is $\mathbb{P}(|\det(A)| < \epsilon)$ for $\epsilon > 0$?

Thanks!
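For a quick numerical check of the extended question, here is a Monte Carlo sketch (assuming i.i.d. $\mathrm{unif}[0,1]$ entries; the function name is my own):

```python
import numpy as np

def small_det_fraction(n, eps, trials=100_000, seed=0):
    """Estimate P(|det A| < eps) for an n-by-n matrix A
    with i.i.d. Uniform[0,1] entries."""
    rng = np.random.default_rng(seed)
    # Batched determinants over a stack of `trials` random matrices.
    dets = np.linalg.det(rng.random((trials, n, n)))
    return float(np.mean(np.abs(dets) < eps))

# The estimate shrinks as eps does, consistent with P(det A = 0) = 0.
print(small_det_fraction(3, 0.1))
print(small_det_fraction(3, 0.001))
```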

  • 2
    You need to specify the joint distribution of the $a_{i,j}$ or else taking them all equal gives an example where the probability is one. It is worth noting that asymptotics of the case where $a_{i,j}$ are $iid$ Bernoulli random variables is an open research question with work done by some big names like Bourgain and Vu.2012-10-31
  • 0
    This is a nice proof (for the Lebesgue measure, but would seem to generalize to measures that are absolutely continuous wrt the Lebesgue measure) http://www.uwindsor.ca/math/sites/uwindsor.ca.math/files/05-03.pdf However, it is inductive. It would be nice to see other proofs for a result that seems 'obvious'.2012-10-31
  • 0
    @Chris, can you be more specific in your comment "taking them all equal" and "probability is one". If the $a_{i,j}$ are $iid$ say uniform in $[0,1]$, then almost surely the determinant is nonzero. Did you mean if the $a_{i,j}$ all have the same value (ie, degenerate distribution)?2012-11-01
  • 0
    @alancalvitti: Yes, the point was to show that you need to specify the joint distribution because there's at least one case in which the probability is 1.2012-11-01

5 Answers

9

Here is another (inductive) proof inspired by Marvis' suggestion below.

It relies on the fact that $\det A \ne0$ iff the columns of $A$ are linearly independent (li.). Let $\Delta_k = \{ (x_1,...,x_k) | x_i \in \mathbb{R}^n, \ x_1,...,x_k \mbox{ are not li.} \}$. Then if we show that $\Delta_n$ has Lebesgue measure zero, it will follow that the set $\{A | \det A = 0 \}$ has Lebesgue measure zero.

In fact, we will show that $m_k \Delta_k = 0$, for $k=1,...,n$, where $m_k$ is the Lebesgue measure on $\mathbb{R}^n \times\cdots \times \mathbb{R}^n$ ($k$ copies).

First we must show that $\Delta_k$ is measurable. If we let $\phi(x_1,...,x_k) = \min_{\|\alpha\| = 1} \| \sum \alpha_i x_i \|$, then we see that $\phi(x_1,...,x_k) = 0$ iff $\{x_1,...,x_k\}$ are not li. Since $\phi$ is continuous, and $\Delta_k = \phi^{-1} \{0 \}$ we see that $\Delta_k$ is Borel measurable.

It is straightforward to see that $\Delta_1 = \{0\}$, hence $m_1 \Delta_1 = 0$. Now suppose this is true for $\Delta_k$, with $1 \leq k < n$. Let $N = \mathbb{R}^n \times \Delta_k$. By assumption $m_{k+1} N = 0$ (since $m_k \Delta_k = 0$). Also, $N \subset \Delta_{k+1}$.

Consider a point $(x, x_1,...,x_k) \in \Delta_{k+1} \setminus N$ (note the indices on the $x$s). Then $\{x_1,...,x_k\}$ are li., but $\{x, x_1,...,x_k\}$ are not li. This can be true iff $x \in \operatorname{span} \{x_1,...,x_k\}$, a $k$-dimensional subspace of $\mathbb{R}^n$ passing through $0$. Note that $m(\operatorname{span} \{x_1,...,x_k\}) = 0$, since $k < n$. Then using Fubini we have (with a slight abuse of notation) \begin{eqnarray} m_{k+1} (\Delta_{k+1} \setminus N) &=& \int 1_{\Delta_{k+1} \setminus N} \, d m_{k+1}\\ & = & \int 1_{\operatorname{span} \{x_1,...,x_k\}}(x) 1_{\Delta_k^C}((x_1,...,x_k)) \, d x \, d(x_1,...,x_k)\\ & = & \int \left( \int 1_{\operatorname{span} \{x_1,...,x_k\}}(x) \, dx \right) 1_{\Delta_k^C}((x_1,...,x_k)) \, d(x_1,...,x_k)\\ & = & \int m(\operatorname{span} \{x_1,...,x_k\}) 1_{\Delta_k^C}((x_1,...,x_k)) \, d(x_1,...,x_k) \\ & = & 0 \end{eqnarray} Since $m_{k+1} (\Delta_{k+1} \cap N) = m_{k+1} N = 0$, we have $m_{k+1} \Delta_{k+1} = 0$. It follows that $m_k \Delta_k = 0$ for $k=1,...,n$.

If $\mu$ is any measure on $\mathbb{R}^n \times \cdots \times \mathbb{R}^n$ that is absolutely continuous with respect to the Lebesgue measure ($m_n$ in this case), then it is clear that $\mu \Delta_n = 0$ as well.

In particular, if $\mu$ can be expressed in terms of a joint probability density function, then $\mu \Delta_n = 0$. I believe this includes the case you intended in the question.

I mentioned this in the comments above, but think it is worth repeating here: http://www1.uwindsor.ca/math/sites/uwindsor.ca.math/files/05-03.pdf.

  • 1
    Relatedly, one can show that the space of all singular matrices forms a lower-dimensional manifold.2012-11-01
  • 0
    @copper.hat Neat. I have always wanted to formalize my comment but never did due to lack of enthusiasm and lack of my ability to formalize my thought. I will use your answer as reference in future. Thanks.2012-11-02
  • 0
    @copper.hat, nice argument. Can you generalize the statement to the case when determinant is bounded by $\epsilon$? (I added to the main question...)2012-11-04
  • 0
    That is a considerably more difficult question, you might be better off asking it as a separate question and referring to this one.2012-11-05
6

A zero determinant is a linear constraint on the elements of the matrix: given all values except one, the determinant is a linear (affine) function of the remaining entry, so the constraint (generically) fixes that entry to the one value that makes the determinant $0$. We are thus asking a draw from a continuous distribution to hit one fixed value (for one or more of the entries), which is an event of measure $0$. In other words, the probability of a zero determinant is $0$ as long as the entries are drawn from continuous distributions.

  • 1
    This is a reasonable intuitive argument, but needs some rigour to be an answer.2012-10-31
  • 0
    @copper.hat Agree, was just pointing out the intuition behind what may be formalized as an argument...2012-10-31
  • 0
    Also heuristically, the determinant cuts out a hypersurface in the $n^2$-dimensional space of all matrices, and this must surely have empty interior.2012-10-31
  • 0
    The empty interior follows by looking at $t \mapsto \det ( A + t I)$, since it is zero for only a finite number of $t$'s.2012-10-31
  • 0
    "A zero determinant is a linear constraint" I think you mean "polynomial constraint".2012-10-31
  • 1
    @WillieWong It's linear in each $a_{ij}$2012-10-31
4

The probability is zero. We need only consider the 2-by-2 case.

WLOG, consider a uniform distribution on $[0,1]$. Then, generate the $A_{11}$, $A_{12}$, and $A_{21}$ entries independently.

The $A_{12}$ entry will then be some multiple $k$ of $A_{11}$: $$k = \frac{A_{12}}{A_{11}}.$$

In order for $A$ to be singular, we need $A_{22} = kA_{21}$ exactly. That is, $A$ is singular if and only if a uniformly distributed random variable assumes one specific value, namely $kA_{21}$. The probability of a continuous random variable assuming any particular value is zero.

Indeed, for some values of $A_{11}$, $A_{12}$, and $A_{21}$, the required value $kA_{21} = \frac{A_{12} A_{21}}{A_{11}} > 1$ is a possibility as well. For example, consider

$$ A =\begin{pmatrix} .25 & .75 \\ .5 & a \end{pmatrix}.$$

The only value of $a$ that will make this singular, namely $a = 1.5$, lies outside the unit interval.
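For what it's worth, the arithmetic in this example is easy to verify numerically (a small sketch; variable names are mine):

```python
import numpy as np

# Given A11, A12, A21, singularity forces A22 = A12*A21/A11.
A11, A12, A21 = 0.25, 0.75, 0.5
a = A12 * A21 / A11
print(a)  # 1.5, outside the unit interval

A = np.array([[A11, A12], [A21, a]])
print(abs(np.linalg.det(A)))  # 0 (up to rounding)
```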

Why may we consider the 2x2 case only? The idea is that for a singular $n$-by-$n$ matrix one expects to find two rows and two columns whose 2-by-2 matrix of entries is singular (as the comments below note, this step needs more care to make rigorous).


Alternatively, consider any $n$-by-$n$ random matrix whose elements are uniformly chosen from $[0,1]$ (or $(0,1)$ -- it matters not).

Then, for the $i$th row and $j$th column, a necessary condition for the singularity of $A$ is that $A_{ij} = A_{pj} \frac{A_{iq}}{A_{pq}}$ for at least one value of $p = 1,\ldots,n,\ p \neq i$ and $q = 1,\ldots,n,\ q \neq j$.

In other words, if you consider any randomly generated element in the matrix, then it must be a scalar multiple of some element in the same column, and that scalar multiple must be the ratio of the elements from the corresponding rows in a different column.

So, you have $$P(\det A = 0) = \sum_{p,q} P\left(A_{ij} = A_{pj}\frac{A_{iq}}{A_{pq}}\right).$$

The sum is finite, and each individual probability is the probability that $A_{ij}$ assumes a distinct value in a continuous distribution.

One may notice that the probability sum is missing the "and" terms -- the sum is basically the probability of one event or another event, many times, and inclusion-exclusion demands we subtract off the probability that both events happen. However, that probability is obviously zero: if $A_{ij} = \alpha$ and $A_{ij} = \beta$, then $\alpha = \beta$, and this equality again reduces to the determinant of a 2-by-2 matrix being zero.

  • 0
    Why did you assert that the $2\times 2$ case is all that needs to be considered?2012-10-31
  • 0
    Because it is the simplest case; for any other square matrix, you can make the same argument, except that you have to jointly consider the probability of multiple random variables assuming discrete values on a continuous distribution. It just introduces more cases needlessly, since you will always get $P(X = x) = 0$.2012-10-31
  • 0
    The set of values for which a 2x2 has determinant 0 is not a discrete or isolated set.2012-10-31
  • 0
    No, but the set of values for any single entry that sends the 2x2 determinant to zero is a discrete set given the other randomly-generated entries.2012-10-31
  • 0
    You need to formalize the induction...2012-10-31
  • 1
    Probably. I also need to eat lunch. I will do the latter first.2012-10-31
  • 0
    Lunch sounds good...2012-10-31
  • 0
    @copper.hat I added a hand-wavy argument before grabbing lunch.2012-10-31
  • 2
    As Ed Gorcenski has rightly pointed out, the probability is zero. Here is a non-rigorous but hopefully intuitive way to look at this. An $n \times n$ matrix will have zero determinant iff the columns/rows are linearly dependent, i.e., to put it the other way, the column vectors, though they lie in $\mathbb{R}^n$, strictly lie in a proper subspace of $\mathbb{R}^n$, say $\mathbb{R}^{n-1}$. Hence, the desired probability is nothing but $$\dfrac{n \text{ dimensional volume of a cube in } \mathbb{R}^{n-1}}{n \text{ dimensional volume of a cube in } \mathbb{R}^{n}} = 0$$2012-10-31
  • 0
    In the question comments I added a reference to a short proof that a polynomial function $\mathbb{R}^n\to \mathbb{R}$ is either identically zero or non-zero a.e. (Lebesgue).2012-10-31
  • 0
    @Marvis: I added an answer inspired by your suggestion.2012-11-01
1

Check this related work:
http://www.inf.kcl.ac.uk/staff/ccooper/papers/Li_Matrix_GFt.pdf

It considers a more general case....

1

Most of the above answers are based on the assumption that $A$ is in $\mathbb{R}^{n \times n}$. But what if $A$ is in $F^{n \times n}$, where $F$ is a finite field? I remember from a linear algebra class fifty years ago that linear algebra works over any field; is this still true? This leads to a whole new line of investigation with interesting applications in crypto, such as counting the possible keys in a Hill cipher.
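Over a finite field the question becomes a counting problem: the fraction of invertible $n \times n$ matrices over $\mathrm{GF}(p)$ is exactly $\prod_{k=1}^{n}(1 - p^{-k})$. A brute-force sketch for small $n$ and $p$ (function names are mine):

```python
from itertools import product

def det_mod(M, p):
    """Determinant of a square matrix over GF(p), by cofactor expansion."""
    if len(M) == 1:
        return M[0][0] % p
    s = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        s += (-1) ** j * M[0][j] * det_mod(minor, p)
    return s % p

def invertible_fraction(n, p):
    """Fraction of n-by-n matrices over GF(p) with nonzero determinant."""
    count = 0
    for entries in product(range(p), repeat=n * n):
        M = [list(entries[i * n:(i + 1) * n]) for i in range(n)]
        count += det_mod(M, p) != 0
    return count / p ** (n * n)

# Matches prod_{k=1}^{n} (1 - p^{-k}):
print(invertible_fraction(2, 2))  # 0.375 = (1 - 1/2) * (1 - 1/4)
```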

I guess it's time to try some Monte Carlo experiments on my HP Prime again.

Oops, forget I posted that! I don't want to violate national security or empower terrorists or Mafiosi.