The question might be old, but I feel as though I might be able to shed some light on this problem: I had a similar qualm when studying QM.
First, we must be clear about what we mean by an inner product and by a bilinear pairing.
- If $V$ is a $\mathbb C$-vector space, an inner product is a sesquilinear, conjugate-symmetric, positive definite (hence nondegenerate) map $\langle \cdot,\cdot\rangle: V \times V \to \mathbb C$ (sometimes written as $\overline{V}\otimes V \to \mathbb C$ to account for sesquilinearity; over $\mathbb R$ the map is bilinear and one writes $V \otimes V \to \mathbb R$).
- If $V$ is a complex vector space, let $V^*$ be its dual space. By definition, we have a non-degenerate bilinear pairing $\langle \cdot,\cdot\rangle : V^* \times V \to \mathbb C$ which acts by taking $\gamma \in V^*$ and $v \in V$ to $\gamma(v) \in \mathbb C$. (again, this is sometimes written as $V^* \otimes V \to \mathbb C$).
Aside: In more general contexts there are even issues with degeneracy, and positive definiteness is a meaningless property. One can extend these ideas to bilinear pairings of arbitrary modules, where if $M$ is an $R$-module, we define $M^* = \operatorname{Hom}_{R\text{-mod}}(M,R)$, and examine the maps in $\operatorname{Hom}_{R\text{-mod}}(M\otimes_R M^*,R)$. There is no reason to suspect such maps should be non-degenerate, and without an ordering on $R$ there is no notion of positivity.
While the two notions above look similar, there is no a priori reason they should be related, and for sufficiently grotesque vector spaces they won't be. The power of the Riesz representation theorem is that it tells us that when $(V, \langle \cdot, \cdot \rangle)$ is a Hilbert space, every continuous linear functional on $V$ is given by taking the inner product with a unique vector. In other words, there is an isometric (complex anti-)isomorphism $\Phi: V \to V^*$, $\Phi(\psi) = \langle \psi, \cdot \rangle$ (with $V^*$ understood as the continuous dual), which identifies the bilinear pairing with the inner product.
The Dirac notation is exploiting this equivalence, by saying that the element $\langle \psi | \phi \rangle$, thought of as the linear functional $\langle \psi|$ acting on $|\phi\rangle$, is precisely the same thing as the inner product $\langle \psi |\phi\rangle$.
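To make the identification concrete, here is a minimal sketch in the finite-dimensional case $V = \mathbb C^n$, where Riesz is elementary. The function names `inner` and `bra` are my own (hypothetical) choices, and I assume the physics convention that the inner product is conjugate-linear in the first slot.

```python
def inner(psi, phi):
    """Inner product on C^n, conjugate-linear in the first slot (assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))


def bra(psi):
    """Riesz map Phi: send the vector psi to the linear functional <psi| on C^n."""
    coeffs = [a.conjugate() for a in psi]  # entries of the conjugated row vector
    return lambda phi: sum(c * b for c, b in zip(coeffs, phi))


psi = [1 + 2j, 3 - 1j]
phi = [2 - 1j, 1j]

pairing = bra(psi)(phi)    # dual pairing: the functional <psi| applied to |phi>
product = inner(psi, phi)  # inner product of the two vectors

# Anti-linearity of the Riesz map: Phi(i * psi) = -i * Phi(psi)
i_psi = [1j * a for a in psi]
scaled = bra(i_psi)(phi)
```

Here `pairing` and `product` agree, which is exactly the statement that $\langle\psi|$ acting on $|\phi\rangle$ equals the inner product $\langle\psi|\phi\rangle$, and `scaled` equals `-1j * pairing`, which is the "anti" in anti-isomorphism.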
Now, the motivation for (separable) Hilbert spaces is, to my knowledge, just that they are very nice. There is some corroboration coming from the fact that the space of wave functions $L^2(\mathbb R^n,\mathbb C)$ and the discrete state spaces $\mathbb C^n$ (or $SU(n)$ if you prefer the operator viewpoint) all have natural Hilbert space structures compatible with the quantum mechanics.
It actually turns out that Hilbert spaces alone are not good enough for quantum mechanics. The reason is that the momentum operator $-i\frac{\partial}{\partial x}$ (in units with $\hbar = 1$) is not a bounded operator and has no eigenvectors in $L^2$: its eigenfunction $x \mapsto e^{ix}$ is not square integrable. Hence the correct place to do things in quantum mechanics involves the Schwartz (tempered) distributions, packaged in what is known as a rigged Hilbert space (Gelfand triple) $\mathcal S \subset L^2 \subset \mathcal S'$.
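As a quick check of the last claim, here is the computation, assuming units with $\hbar = 1$ and the sign convention $\hat p = -i\,\partial_x$ (the opposite sign only flips the eigenvalue and does not affect the conclusion):

$$\hat p\, e^{ikx} = -i\,\frac{\partial}{\partial x}\, e^{ikx} = k\, e^{ikx}, \qquad \int_{-\infty}^{\infty} \left| e^{ikx} \right|^2 dx = \int_{-\infty}^{\infty} 1 \, dx = \infty .$$

So $e^{ikx}$ solves the eigenvalue equation for every real $k$ but does not lie in $L^2(\mathbb R)$. It does, however, define a tempered distribution via $f \mapsto \int_{-\infty}^{\infty} e^{ikx} f(x)\,dx$ for Schwartz functions $f$, which is why the distributional setting is the natural home for it.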