13

I'm having a hard time truly understanding the meaning of $\dim\operatorname{Im} T + \dim\operatorname{Ker} T = \dim V$ where $V$ is the domain of a linear transformation $T:V\to W$. I've used this equation several times in many problems, and I've gone over the proof and I believe that I fully understand it, but I don't understand the intuitive reasoning behind it. I'd appreciate an intuitive explanation of it.

Just to be clear, I do understand the equation itself, I am able to use it, and I know how to prove it; my question is what is the meaning of this equation from a linear algebra perspective.

  • 0
    It's a combination of the first isomorphism theorem for groups and the fact that $\dim (U \oplus V) = \dim U + \dim V$. So I guess you should seek the meaning of the first isomorphism theorem in group theory. There the proof is pretty conceptual: you draw two short exact sequences and proceed to construct an isomorphism between them.2012-10-06
  • 0
    Geometrically you can understand the theorem by considering fibers over the points of the image.2012-10-06

4 Answers

14

I like to think of it as some form of conservation of dimension. If you have a linear mapping then it acts on each dimension of the domain (this is a consequence of linear mappings being completely determined by their action on any given basis of a space).

There are only two possibilities for each dimension: either it is preserved or it is compressed (i.e. taken to $\mathbf{0}$). The net dimension of the compressed portion of the domain is your nullity, i.e. the dimension of your kernel. The net dimension which is preserved is your rank, i.e. the dimension of your image space. This gives you an intuitive understanding of the rank-nullity theorem.

As a note, if you take a minute and think deeply then you'll realize this argument is essentially the same as the projections that trb456 mentioned.
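
If you want to see this bookkeeping concretely, here is a minimal Python sketch; the matrix `A` is just a made-up stand-in for $T$ (with bases fixed so that $T$ becomes a matrix). The rank counts the preserved dimensions, the nullity counts the compressed ones, and the two add up to the dimension of the domain:

```python
import numpy as np
from scipy.linalg import null_space

# A made-up matrix standing in for T : R^4 -> R^3 (bases fixed).
# Columns 2 and 3 are combinations of columns 0 and 1, so two
# input directions get "compressed" to zero.
A = np.array([[1.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, 1.0, 2.0, 3.0]])

rank = np.linalg.matrix_rank(A)    # dim Im T: the preserved dimensions
kernel = null_space(A)             # columns form a basis of Ker T
nullity = kernel.shape[1]          # dim Ker T: the compressed dimensions

print(rank, nullity, A.shape[1])   # 2 + 2 = 4 = dim V
```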

  • 0
    Wow, this is great! The basis of an image is the linearly independent vectors in the transformation matrix; its dimension is the number of such vectors, or the rank of the matrix! And the kernel is what's left! So the rank-nullity theorem actually corresponds with $\dim\operatorname{Im}T + \dim\operatorname{Ker}T = \dim V$! Because linear transformations are actually matrices :). Your second paragraph made that all very clear, and I thank you!!2012-10-06
  • 0
    And yes, I realize now that this is essentially what trb456 said, but you made it much more intuitive, which is just what I needed. Bringing the rank-nullity theorem into the mix was especially helpful, I really like it when the relationship between linear transformations and matrices makes itself clear :). I may be getting overexcited, but there's really nothing quite like truly understanding a topic in linear algebra, I mean beyond the "use it to solve a problem" way. Thank you very very much :).2012-10-06
  • 0
    @Daniel You're very welcome. I'm glad you found the argument intuitive.2012-10-06
  • 0
    @Daniel: I'd like to offer a word of warning about your statement that "linear transformations are actually matrices". A transformation $T:V\to W$ only "becomes" a matrix once you choose a basis for each of $V$ and $W$. If you choose different bases, then the matrix for $T$ changes.2012-10-06
  • 0
    Also, one remark about EuYu's answer: given a basis for $V$, it's not necessarily true that the number of basis vectors sent to $0$ by $T$ is equal to the nullity of $T$. But it *is* true that there *exists* a basis for $V$ that has this property - namely, take a basis for $\ker T$ and extend it to a basis for $V$.2012-10-06
  • 0
    @Brad My understanding was that matrices and linear maps are different ways of expressing the same concept. Linear operations (addition and multiplication) are the same for matrices and linear maps, and many (if not all) concepts are shared. Would it really be wrong to say they're the same thing? Even if the matrix is different, it can still represent the same linear transformation.2012-10-06
  • 0
    It depends on what one means by "the same thing", I guess! Here's what I meant by that remark. Let $L(V,W)$ be the set of linear maps from $V$ to $W$. By definition the elements of $L(V,W)$ are certain functions from $V$ to $W$. If we choose bases for $V$ and $W$ then we can write the matrix of each $T\in L(V,W)$ with respect to those bases, and this gives a one-to-one correspondence between $L(V,W)$ and the set of $\dim W\times \dim V$ matrices. In that sense, linear transformations "are" matrices *so long as we have fixed bases in mind*.2012-10-06
  • 0
    But if you are given a $\dim W\times \dim V$ matrix and you want to think of it as a linear map from $V$ to $W$, you need to *choose* bases to do so. It just happens to be the case that a lot of vector spaces one encounters, like $\mathbb{R}^n$, have commonly accepted "standard bases", and we often identify $L(\mathbb{R}^n,\mathbb{R}^m)$ with the set of real $m\times n$ matrices by using these "standard bases". But vector spaces aren't always equipped with a "standard basis". (I'm putting "standard basis" in quotation marks because it's more of a psychological term than a mathematical one.)2012-10-06
11

You can think about Rank-Nullity Theorem geometrically in terms of things called fibers over points.

Think about the case when your mapping $f: U \to V$ is surjective, and consider the mapping $f^{-1}: V \to 2^U$ that takes each point $p \in V$ to its preimage $f^{-1}(p)$ (called the fiber over $p$) in $U$. You can easily check that the fibers are affine subspaces of $U$, parallel to each other (each point of $U$ lies on exactly one fiber). Also, the fiber passing through $0 \in U$ is exactly $\ker f$.

You can thus picture $U$ as being separated into an infinite number of thin layers, like a sedimentary rock:

[Figure: $U$ sliced into parallel fibers, like layers in a sedimentary rock]

From this you can easily see that to uniquely specify a point in $U$ you can first specify a fiber (the set of fibers being parameterized by $V = \operatorname{im} f$) and then specify a point on that fiber (which is (non-uniquely) parameterized by $\ker f$). This gives you the Rank-Nullity Theorem: $$\dim \ker f + \dim \operatorname{im} f = \dim U.$$

For example, in the case of the mapping $f: \mathbb{R}^2 \to \mathbb{R},\; (x, y) \mapsto x + y$, the fibers satisfy equations of the form $y = a - x$ for some $a \in \mathbb{R}$. You can check that on such a fiber, $y + x = (a - x) + x = a$ indeed does not depend on the choice of point.

Now, how many independent variables do you need to specify a point in $\mathbb{R}^2$? You need one variable ($a$) to specify a fiber (equivalently, a point on $\mathbb{R}$), and another one (say, $x$) to specify a point on the fiber - that's two degrees of freedom, as expected!
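
To make that count explicit, here is a small numpy sketch of the same example (nothing here is specific to $x + y$; any surjective linear map would do):

```python
import numpy as np

# The map f(x, y) = x + y from the example, written as a 1x2 matrix.
A = np.array([[1.0, 1.0]])

# Walk along the fiber over a = 3, i.e. the line y = 3 - x.
a = 3.0
for x in np.linspace(-2.0, 2.0, 5):
    p = np.array([x, a - x])
    assert np.isclose((A @ p).item(), a)  # every point of the fiber maps to a

# One number (a) picks the fiber, a second number (x) picks the point
# on it: 1 + 1 = 2 = dim R^2, exactly the count the theorem predicts.
print("checked: each point of the fiber maps to", a)
```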

The Rank-Nullity theorem states that for any surjective linear mapping $f: U \to V$, in any dimension, you can use the same trick to uniquely parameterize any point in $U$. The same goes for any non-surjective linear mapping, of course; you just need to corestrict it to its image.

Alternatively, you could draw another line through $0 \in \mathbb{R}^2$ distinct from $\ker f$. You can easily show that it crosses each fiber of $f$ exactly once, so you can use it to parameterize fibers more explicitly: identify this line with $\operatorname{im} f$; then for any two points on $\ker f$ and $\operatorname{im} f$ you can uniquely obtain the corresponding point of $\mathbb{R}^2$ using the parallelogram rule. Rank-Nullity states that you can do this sort of thing in any dimension and for any $f$ (instead of lines you'll have affine subspaces of different dimensions, though).
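
As a sanity check on the parallelogram picture, here is a short numpy sketch; the choice of the $x$-axis as the second line is arbitrary (any line through $0$ other than $\ker f$ works):

```python
import numpy as np

K = np.array([1.0, -1.0])  # direction of ker f, where f(x, y) = x + y
L = np.array([1.0,  0.0])  # an arbitrary second line through 0 (the x-axis)

p = np.array([2.0, 5.0])   # any point of R^2

# Solve p = s*K + t*L for (s, t): the parallelogram rule in coordinates.
s, t = np.linalg.solve(np.column_stack([K, L]), p)

assert np.allclose(s * K + t * L, p)
print(t, p.sum())  # t coincides with f(p) = x + y, so the second line
                   # really does parameterize the fibers, i.e. im f
```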

This is a geometric picture of what's going on.

3

Perhaps think of it in terms of projections? Whatever T does not project into the image must disappear; i.e., it is in the kernel. This is why it is the domain dimension that matters. The image is an injection into the range, so it has the same dimension as the corresponding preimage in the domain. The image of the kernel is just zero, so it is the dimension of the kernel in the domain that matters.

  • 0
    But isn't $\operatorname{Im}T$ part of $W$, and not $V$? What I mean is, by your explanation, $\operatorname{Im}T$ is based off of elements in $V$, whereas by my understanding (and I might have misunderstood), $\operatorname{Im}T$ is based off of elements in $W$. So if $\dim\operatorname{Im}T$ is based on $W$, it should (intuitively) have no relation with $\dim V$. I hope I was clear.2012-10-06
  • 0
    I understand now (partly thanks to EuYu's answer) what you were trying to explain, thank you.2012-10-06
  • 0
    The fiber has the same dimension as the kernel, not the image.2012-10-07
  • 0
    The fiber of the *image* of *T* is not the kernel. Perhaps I'm just phrasing badly--how might you re-word?2012-10-07
  • 0
    It doesn't make sense to speak of "the fibre of the image of $T$", at least not as a subset of $V$. Given a function $f:X\to Y$, there is a fibre $f^{-1}(y)$ over each point $y\in Y$. In the case of a linear transformation $T:V\to W$ and $w\in W$, the fibre over $w$ is the empty set if $w\not\in \operatorname{Im}T$, and if $w\in\operatorname{Im}T$ then the fibre $T^{-1}(w)$ is an affine subspace isomorphic to $\ker T$. Namely, it is $\ker T+v_0$ for any given $v_0\in V$ with $T(v_0)=w$. (For example, $\ker T$ is the fibre over $0$.)2012-10-09
  • 0
    I also don't understand what you mean by "the image is an injection into the range" - the image is a subspace of $W$, whereas an injection is a kind of function. Of course the inclusion $\operatorname{Im}T\to W$ is an injection, but I suspect this isn't what you meant to say, since it doesn't have to do with $T$ or $V$.2012-10-09
  • 0
    @Brad: Then I am phrasing badly by using "fibre"--I'll edit to "preimage". But I do mean "injection" to emphasize the dimension of *Im T*.2012-10-09
  • 0
    "The image is an injection into the range, so it has the same dimension as the corresponding preimage in the domain" U mad bro :)2012-10-09
  • 0
    No, not mad, but trying for an *intuitive* explanation. The problem is that the image also contains zero, as it is a subspace. I'm trying to separate out the preimage *not* mapped into the image (i.e. the kernel). The only way to "intuitively" do this is to "ignore" the zero in the image and make all the zeros "kernel". Yes, this is wrong, but we're trying to separate the image and the kernel. Don't be too literal and this should be an OK description for intuition, but not actual math (which the OP said he understood)!2012-10-10
0

$T$ is defined on all of $V$, so its image can be at most as big as $V$; on the other hand, some of $V$ might go missing along the way. Whatever goes missing "goes to nothing", and "nothing", in this case, is the zero vector.

As trb456 says, one way to think of it is this (and it is essentially the route the proof takes): a vector in $V$ maps to something that is either zero or isn't. When you do this for every vector in $V$, you can check that the set of images is a vector space (just like the kernel), which means that, since everything goes somewhere, a suitably chosen basis of $V$ must split into some vectors that go to zero and some that don't.

This might be a little confusing. If it is, let me know and I'll try to clear it up.
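
In the meantime, here is a small sympy sketch of that basis-splitting; the matrix is a made-up example, and the basis extension follows the standard recipe of taking a basis of the kernel and extending it to a basis of $V$:

```python
import sympy as sp

# A made-up T : R^3 -> R^2 with a 1-dimensional kernel.
A = sp.Matrix([[1, 2, 3],
               [0, 1, 1]])

ker = A.nullspace()                        # basis of Ker T
std = [sp.eye(3).col(i) for i in range(3)]

# Extend the kernel basis to a basis of V = R^3: taking the pivot
# columns of [kernel vectors | standard basis] keeps the kernel
# vectors and fills in just enough standard vectors.
basis = sp.Matrix.hstack(*(ker + std)).columnspace()

for v in basis:
    print((A * v).T)  # exactly one basis vector maps to 0; the images
                      # of the other two are independent and span Im T
```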