It's not self-evident; rather, it's a justification for the epsilon-delta definition of "continuous"! It's the bridge between the epsilon-delta view of the world that we learn in university, and the "point and line" view of the world which we know from high school.
This is the content of two 1-hour lectures, but let me try to summarize it in a nutshell:
The fundamental problem in calculus can be thought of as:
Estimate $f(a)$ using information about $f(b)$ where $b$ is some number near $a$. (You can directly measure only $f(b)$ but not $f(a)$).
So at the most primitive level, before we discuss derivatives or anything, we simply want to characterize functions for which $f(b)$ gives some estimate of $f(a)$ for $b$ close enough to $a$. These are the functions for which it makes some kind of sense to even try to make such an estimate. So what are these functions?
- Function $f$ doesn't jump or snap at $a$. For example, if $f$ is "tension in a string", and the string breaks at $a$, measuring tension just before $a$ won't tell you the tension at $a$.
- There isn't some kind of crazy "resonance" or "shaking itself to pieces" phenomenon at $a$, like what $f(x)=\sin\left(\frac{1}{x}\right)$ for $x\neq 0$, $f(0)=0$ does at $a=0$.
In other words, $f$ is continuous at $a$ ($f(a)$ can be estimated by looking at values of $f$ for inputs close enough to $a$) if no terrible crisis occurs for function $f$ at $a$.
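To make "issue 2" concrete, here is a rough numerical sketch in Python. The helper names `shaky` and `spread`, the sample count, and the particular window sizes are all just choices made for this illustration. It measures how much a function's sampled values vary on $(0,\delta]$: for a well-behaved function the variation shrinks with $\delta$, while for $\sin(1/x)$ it stays near $2$ no matter how far you zoom in, so nearby values tell you nothing about the value at $0$.

```python
import math


def shaky(x):
    """sin(1/x) away from 0, with the value at 0 set to 0."""
    return math.sin(1.0 / x) if x != 0 else 0.0


def spread(func, delta, samples=10_000):
    """Max minus min of func over evenly spaced sample points in (0, delta].

    This is only a sampled estimate of how much func varies near 0,
    not an exact supremum/infimum.
    """
    xs = [delta * (i + 1) / samples for i in range(samples)]
    values = [func(x) for x in xs]
    return max(values) - min(values)


if __name__ == "__main__":
    for delta in (1e-1, 1e-2, 1e-3):
        # math.sin is the "no crisis" comparison: its spread shrinks with delta.
        print(delta, spread(math.sin, delta), spread(shaky, delta))
```

Shrinking the window helps for `math.sin` and does nothing for `shaky`, which is exactly the failure of "estimate $f(a)$ from nearby $f(b)$".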
Good. How do you formalize this? Along comes Arbogast at the end of the 18th century, and suggests that the property you need to demand is the intermediate value property. This is a good first approximation to a formalization of "continuity", because it takes care of "breakage" ("issue 1"). But it does not take care of "shaking" ("issue 2"): the topologist's sine curve satisfies the intermediate value property at $0$, but isn't "continuous". Even worse, it's satisfied by crazy functions like the Conway base 13 function, which shake so badly that it makes no sense to try to estimate them anywhere.
Then, along comes Bernard Bolzano, works out epsilon-delta, and gives the modern definition of continuity. And, because he's a Catholic priest and not a great and famous mathematician like Arbogast, he has to prove that his definition implies the Intermediate Value Property in order for anyone to take him seriously. But, personal reputation of Bolzano aside, why in fact is this property so central?
The epsilon-delta view of the world, put forward by Bolzano, posits that a line is more than just an infinite collection of points; rather, it is made up of interlocking epsilon-delta fuzz (in more technical terms, you're equipping the real line with the metric topology). One way of thinking about this is that, in reality, you never know exactly what a real number is: you can write down the first zillion digits of pi, but never all of them. So a real number is an inherently fuzzy concept, and the real line (as a topological space) is made up of interlocking fuzz rather than being made up of points.
Writing out the epsilon-delta construction of a continuous function (or any epsilon-delta construction) is painful, because you have all of these nested quantifiers to deal with fuzziness (for any epsilon there exists delta such that for any x between bla bla bla). It's not human language- it looks more like some kind of awful computer code. On the other hand, the intermediate value property is talking about points. It tells you that there exists a point c between $a$ and $b$ such that bla bla bla.
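For comparison, here are the two statements written out in their standard textbook forms (the symbols $\varepsilon$, $\delta$, $y$ and $c$ are the usual conventions, not notation fixed earlier in this answer). Continuity of $f$ at $a$, in fuzz language:

$$\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x : \quad |x - a| < \delta \implies |f(x) - f(a)| < \varepsilon .$$

The intermediate value property, in point language: if $f$ is continuous on $[a,b]$ and $y$ lies between $f(a)$ and $f(b)$, then

$$\exists\, c \in [a,b] : \quad f(c) = y .$$

Three nested quantifiers about neighbourhoods in the first, one honest point in the second.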
Punchline: The intermediate value theorem is the bridge between fuzz world and point world.
Every other bridge between fuzz-world and point-world goes through it (all the ones you learn in undergrad calculus do anyway). So it's not at all obvious... it's the statement that the horribly convoluted non-human-language epsilon-delta language captures all the properties of a continuous function that you want; and then some. It's the statement that the convoluted non-intuitive epsilon-delta formalism happens to be the one which captures your geometric intuition. How surprising is that!
Formally, the intermediate value theorem is a consequence of the completeness of the reals, which is roughly the statement that the real line doesn't have microscopic holes in it like the line of rational numbers has. So again, the passage from fuzz to points and back again has to make use of the completeness of the real numbers, and the intermediate value theorem is where that happens.
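One way to see the role of completeness is bisection, which is essentially the standard proof idea turned into a procedure. The sketch below is in Python; the function name `bisect_root` and the tolerance are assumptions made for this illustration, and floating-point numbers only stand in for the reals. It keeps halving an interval on which $f$ changes sign. Each step uses only "fuzzy" information (the sign of $f$ at one midpoint), and completeness is what guarantees that the shrinking intervals close in on an actual point $c$ with $f(c)=0$.

```python
def bisect_root(f, lo, hi, tol=1e-12):
    """Approximate a point c in [lo, hi] with f(c) = 0 by repeated halving.

    Assumes f is continuous on [lo, hi] and that f(lo) and f(hi) have
    opposite signs (or that one of them is zero).
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")

    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0:
            return mid
        if f_lo * f_mid < 0:
            # Sign change is in the left half.
            hi = mid
        else:
            # Sign change is in the right half.
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # x^2 - 2 changes sign on [0, 2]; the limit point is sqrt(2), which is
    # irrational, so over the rationals this search would close in on a "hole".
    print(bisect_root(lambda x: x * x - 2, 0.0, 2.0))
```

Every interval endpoint the loop visits here is rational, yet the point they converge to is not, which is the "microscopic hole" in the rational line that completeness fills.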
The intermediate value theorem isn't important because it's surprising. It's important because it gives you the bridge which you need in order to cross from epsilon-delta world to point-world and back.
Anyway, I've written far too much by now... Sorry for this tl;dr answer!