We define
$$TM=\bigsqcup_{p\in M}T_pM$$
with a smooth structure built from the charts of $M$, chosen so that the projection map is smooth.  This is a key point: the tangent bundle's topology and smooth structure capture some of the manifold's topology.  You can easily find a bijection $TM\leftrightarrow M\times\mathbb{R}^n$, but you cannot in general find a fiber-preserving diffeomorphism between the two spaces.
When we trivialize, we require that $F:TM\to M\times\mathbb{R}^n$ be not just a diffeomorphism but a diffeomorphism that restricts to a linear isomorphism on each fiber.  A tangent bundle isn't just some disjoint collection of vector spaces all floating off in abstract mathland; there's more structure than that.  We need two additional things: 
- a projection map $p:TM\to M$ taking a tangent vector to its basepoint, and
- a cover of $M$ by neighborhoods $U$ which obey two conditions:
    - $p^{-1}U$ is diffeomorphic to $U\times\mathbb{R}^n$ (say via $\phi_U$) in a way that respects projection onto $U$ (i.e. $\pi_U\circ\phi_U = p$), and
    - for two such neighborhoods $U$ and $V$, there is a family of vector space isomorphisms which govern the transformation of the fibers: $$U\cap V\ni x\mapsto\theta_{UV}(x):\{x\}\times \mathbb{R}^n\to\{x\}\times\mathbb{R}^n$$ (where the first $\{x\}\times\mathbb{R}^n\subset U\times\mathbb{R}^n$ and the second $\{x\}\times\mathbb{R}^n\subset V\times\mathbb{R}^n$).  This condition is in place so that when we change coordinates, the new fiber still has the structure of a vector space.
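
To make the second condition concrete, here is a sketch in the notation above.  (Treating $\theta_{UV}$ as a matrix-valued map, and the chart names $\varphi_U,\varphi_V$, are the usual conventions and are assumed here.)  On an overlap, the two trivializations differ by
$$\phi_V\circ\phi_U^{-1}(x,v) = \bigl(x,\ \theta_{UV}(x)\,v\bigr),\qquad x\in U\cap V,\ v\in\mathbb{R}^n,$$
where $\theta_{UV}:U\cap V\to GL(n,\mathbb{R})$ is smooth.  For the tangent bundle, with $U$ and $V$ coordinate neighborhoods carrying charts $\varphi_U$ and $\varphi_V$, one can take
$$\theta_{UV}(x) = D\bigl(\varphi_V\circ\varphi_U^{-1}\bigr)\big|_{\varphi_U(x)},$$
the Jacobian of the coordinate change, which is invertible because the coordinate change is a diffeomorphism.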
 
The maps $\phi_U$ are called "local trivializations"; they're analogous to coordinate charts on a manifold.  (In fact, one method of constructing $TM$ is by suitably patching together local trivializations from a cover of coordinate neighborhoods.)  
For $F:TM\to M\times\mathbb{R}^n$ to be a global trivialization, we need not just that $F$ is a diffeomorphism, but that $F$ preserves all of this structure.  In particular, when restricted to a single tangent space $T_xM$, $F$ must be a linear isomorphism onto $\{x\}\times\mathbb{R}^n$.  This is much stronger than simply requiring $F$ be a diffeomorphism between $TM$ and $M\times\mathbb{R}^n$.
The standard counterexample against the idea that all tangent bundles are trivializable is $T\mathbb{S}^2$.  The "hairy ball" theorem states that there is no nowhere-vanishing vector field on $\mathbb{S}^2$.  You can see this from the Poincaré-Hopf index theorem: 
  On any closed smooth manifold $M$, for any vector field $V$ on $M$ with isolated zeros, the Euler characteristic
  $$\chi(M) = \sum_{x\in\{\mbox{zeros of }V\}} \iota_V(x)$$
  where $\iota_V(x)$ is the index of $V$ at $x$: the degree of the normalized vector field $V/|V|$ restricted to a small sphere about $x$ (a small circle when $\dim M = 2$).
Now it's clear that $T\mathbb{S}^2$ is not trivializable: $\chi(\mathbb{S}^2) = 2$, and if we had a trivialization, then we would have a nowhere-vanishing vector field.  Such a field has no zeros, so the sum above would be empty and the Euler characteristic would have to be $0$.
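
For a concrete instance of the index theorem, here is a standard example (the names $R$, $N$, $S$ are introduced only for this illustration).  Let $R$ be the vector field on $\mathbb{S}^2$ generating rotation about the vertical axis.  Its only zeros are the north and south poles $N$ and $S$, and near each pole it looks like a planar rotation, which has index $+1$:
$$\sum_{x\in\{N,S\}}\iota_R(x) = \iota_R(N)+\iota_R(S) = 1+1 = 2 = \chi(\mathbb{S}^2).$$
By contrast, on the torus $\mathbb{S}^1\times\mathbb{S}^1$ the coordinate vector field along the first circle factor has no zeros, so the sum is empty, in agreement with the Euler characteristic of the torus being $0$.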
In fact, we can see from this much more than that $T\mathbb{S}^2$ is nontrivializable: the Euler characteristic is an obstruction to the trivializability of the tangent bundle of a closed manifold.  In order for the tangent bundle to be trivializable, we must be able to find $n$ global sections which form a basis of the tangent space at every point.  Each of these sections would be a nowhere-vanishing vector field, which would imply that the Euler characteristic of the manifold is $0$.  
(Note that, as Jason DeVito points out below, a zero Euler characteristic is necessary but not sufficient for a trivializable tangent bundle.)
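
The correspondence between trivializations and global frames used above can be sketched as follows.  (The names $s_i$, $a^i$ and $e_i$ are introduced here for illustration.)  Given sections $s_1,\dots,s_n$ that form a basis of $T_xM$ at every $x$, each $v\in T_xM$ has a unique expansion $v=\sum_i a^i s_i(x)$, and
$$F(v) = \bigl(x,\ (a^1,\dots,a^n)\bigr)$$
is a global trivialization.  Conversely, a global trivialization $F$ yields such sections via
$$s_i(x) = F^{-1}(x,e_i),$$
where $e_1,\dots,e_n$ is the standard basis of $\mathbb{R}^n$.  Each $s_i$ is nowhere vanishing because $F$ is a linear isomorphism on every fiber and $e_i\neq 0$.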
This is an edited response to your attempt to trivialize $T\mathbb{S}^2$.  Let's be a little more concrete: represent $\mathbb{S}^2$ as $\widehat{\mathbb{C}}$.  Charts are the identity $\widehat{\mathbb{C}}-\{\infty\}\to\mathbb{C}$, and inversion $\widehat{\mathbb{C}}-\{0\}\to\mathbb{C}$ where $p\mapsto \frac{1}{p}$.  (We define $\frac{1}{\infty}=0$.)  Note that the transition maps are given by inversion, $w = z^{-1}$.
Over each of these neighborhoods $T\mathbb{S}^2$ is trivial, so in each of them we can represent a tangent vector as $(v,z)$ where $v$ is the vector and $z$ is the basepoint.  Let's start with $\mathbb{C}$.  Define on this neighborhood $F(v,z) = (v,z)$.  This takes care of the map $F$ for all of $\widehat{\mathbb{C}}-\{\infty\}$.
To extend to infinity, we need to define $F$ on $\widehat{\mathbb{C}}-\{0\}$ so that it agrees with the definition we have given on $\mathbb{C}$.  Note that the differential of the transition function $\frac{1}{z}$ is multiplication by $\frac{-1}{z^2}$.  For every $w$ in the overlap $\widehat{\mathbb{C}}-\{0,\infty\}$, we need to define $F(v,w) = (\frac{-v}{w^2},w^{-1})$ so that it is well-defined under coordinate changes.
Now how should we define $F(v,\infty)$?  We see the problem: since $\frac{-v}{w^2}\to 0$ as $w\to\infty$, we'll have to map $(v,\infty)\mapsto 0$ in order for $F$ to be continuous at $\infty$.  This prevents $F$ from being an isomorphism on $T_\infty\widehat{\mathbb{C}}$, so it's not possible to use this method to trivialize $T\widehat{\mathbb{C}}$.  (In fact, it's not possible by any method, for the reasons discussed above.)
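
As a numerical sanity check tying this back to the index theorem, here is a minimal sketch.  The helper `index_at` and its parameters are hypothetical names introduced here, not part of any library.  It treats a planar vector field as a complex-valued function and estimates its index at a zero as the winding number around a small circle.  Assuming we write $u = 1/z$ for the coordinate near $\infty$, the constant field $v = 1$ of the $z$-chart becomes $-u^2$ in the $u$-chart (multiply by $du/dz = -1/z^2 = -u^2$), so it has a single zero, at $\infty$.

```python
import cmath
import math


def index_at(field, center=0j, radius=1e-2, samples=2000):
    """Estimate the index of a planar vector field at an isolated zero.

    `field` maps a complex point to a complex vector. The index is the
    winding number of field(z) about 0 as z goes once counterclockwise
    around a small circle centered at `center`.
    """
    total = 0.0
    prev = field(center + radius)
    for k in range(1, samples + 1):
        z = center + radius * cmath.exp(2j * math.pi * k / samples)
        cur = field(z)
        # Accumulate the small change in argument between samples.
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))


# Constant field v = 1 from the z-chart, written in the chart at infinity.
print(index_at(lambda u: -u**2))          # 2

# The field z has a zero at 0; in the chart at infinity it reads -u.
print(index_at(lambda z: z))              # 1
print(index_at(lambda u: -u))             # 1

# An orientation-reversing example for comparison.
print(index_at(lambda z: z.conjugate()))  # -1
```

In both sphere examples the indices add up to $2 = \chi(\mathbb{S}^2)$: the constant field has one zero of index $2$ at $\infty$, and the field $z$ has two zeros of index $1$.  The forced zero at $\infty$ is the same degeneration that stops $F$ from being an isomorphism on $T_\infty\widehat{\mathbb{C}}$.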