For a presheaf, two sections $s,t\in\mathcal{F}(U)$ can agree locally everywhere and still be different. That is, if $\{U_i\}$ is an open cover of $U$ and $s|_{U_i}=t|_{U_i}$ for each $i$, then in a presheaf it is still possible that $s\neq t$, even if this condition holds for every single open cover of $U$. In a sheaf, the local data (the restrictions to the sets of a cover) determines the section uniquely.
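To make the failure concrete, here is a small sketch of a presheaf where this goes wrong. The space $X=\{p,q\}$ with the discrete topology and the presheaf $\mathcal{P}$ are chosen purely for illustration:

$$\mathcal{P}(X)=\mathbb{Z},\qquad \mathcal{P}(U)=0\ \text{ for every open } U\subsetneq X,\qquad \text{all restrictions out of } \mathcal{P}(X) \text{ are zero.}$$

Taking the cover $X=\{p\}\cup\{q\}$, the sections $s=0$ and $t=1$ in $\mathcal{P}(X)$ satisfy

$$s|_{\{p\}}=t|_{\{p\}}=0,\qquad s|_{\{q\}}=t|_{\{q\}}=0,\qquad\text{yet } s\neq t.$$

So $\mathcal{P}$ is a perfectly good presheaf, but its local data cannot tell $0$ and $1$ apart.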
The other half of being a sheaf says that sections can be glued together. Suppose you have sections $s_i\in\mathcal F(U_i)$ over open sets $U_i$, such that the restrictions of $s_i$ and $s_j$ to $U_i\cap U_j$ agree for all $i,j$. You would like to be able to glue these sections together, and in general you can do this only if you have a sheaf. If you have a sheaf, then you are guaranteed that all of your sections $s_i$ are just the restrictions of a single section in $\mathcal F(\bigcup_i U_i)$.
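Written out in symbols, with $U=\bigcup_i U_i$, the gluing condition reads roughly as follows (the names $s$ and $s_i$ are just the ones used above):

$$s_i|_{U_i\cap U_j}=s_j|_{U_i\cap U_j}\ \text{ for all } i,j \quad\Longrightarrow\quad \text{there exists } s\in\mathcal F(U) \text{ with } s|_{U_i}=s_i \text{ for all } i.$$

Combined with the uniqueness property, such an $s$ is then the only section of $\mathcal F(U)$ restricting to the given $s_i$.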
If you think of sections as functions (which you should), then the sheaf axioms just say that you can glue compatible functions together, and that any function is determined uniquely by what it does on the open sets of a cover. A corollary of this is that when you have a sheaf, the sections over an open set $U$ are completely determined by the sections over the sets of an open cover of $U$, together with how they restrict to the overlaps. For a presheaf, this information isn't enough to pin down $\mathcal F(U)$.
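One standard way to package both axioms at once, assuming $\mathcal F$ takes values in sets or abelian groups, is to ask that for every open cover $\{U_i\}$ of $U$ the diagram

$$\mathcal F(U)\;\longrightarrow\;\prod_i \mathcal F(U_i)\;\rightrightarrows\;\prod_{i,j}\mathcal F(U_i\cap U_j)$$

be an equalizer. Here the first map is $s\mapsto (s|_{U_i})_i$, and the two parallel maps send $(s_i)_i$ to $(s_i|_{U_i\cap U_j})_{i,j}$ and to $(s_j|_{U_i\cap U_j})_{i,j}$ respectively. Injectivity of the first map is the uniqueness axiom, and the statement that its image is exactly the set of compatible families is the gluing axiom. For instance, if $\mathcal F(U)$ is the set of continuous real-valued functions on $U$, a compatible family $(f_i)$ glues to the function $f(x)=f_i(x)$ for $x\in U_i$, which is well defined and continuous.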
The Wikipedia article on constant sheaves has an example of a presheaf on the discrete space with two points (the constant presheaf), demonstrates exactly how each of the two axioms can fail for it, and then gives a construction of the sheafification.
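Here is a sketch of how such an example can go. This is a summary in my own notation, not a quotation from the article; the names $X$, $p$, $q$, $\mathcal G$ are assumptions for illustration. Let $X=\{p,q\}$ with the discrete topology and put

$$\mathcal G(\emptyset)=0,\qquad \mathcal G(\{p\})=\mathcal G(\{q\})=\mathcal G(X)=\mathbb{Z},$$

with identity restriction maps between non-empty open sets. The sections $0\in\mathcal G(\{p\})$ and $1\in\mathcal G(\{q\})$ are compatible, because $\{p\}\cap\{q\}=\emptyset$ and $\mathcal G(\emptyset)=0$. A glued section $s\in\mathcal G(X)=\mathbb{Z}$ would have to satisfy

$$s=s|_{\{p\}}=0\qquad\text{and}\qquad s=s|_{\{q\}}=1,$$

which is impossible, so gluing fails. If you instead set $\mathcal G(\emptyset)=\mathbb{Z}$ as well, uniqueness already fails over $\emptyset$: the empty set is covered by the empty family, so any two elements of $\mathcal G(\emptyset)$ agree on every set of that cover without being equal. The sheafification repairs both problems by replacing constants with locally constant functions, so its sections over $X$ form $\mathbb{Z}\times\mathbb{Z}$, one integer for each point.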