Since you can recognize the language with a PDA, it must be context-free, so we can do better than a context-sensitive grammar. Here’s a context-free grammar that should work; $S$ is the initial symbol, $\lambda$ is the empty string, $E$ is intended to generate words with an equal number of $0$’s and $1$’s, and $U$ is intended to generate words with more $0$’s than $1$’s.
$\begin{align*} &S\to U\\ &U\to E0U\mid E0E\\ &E\to 0E1E\mid 1E0E\mid\lambda \end{align*}$
(I don’t really need $U$: I could use $S$ instead.)
The basic idea is that there has to be at least one extra $0$; the second production generates it, possibly preceded by a string with an equal number of $0$’s and $1$’s; what follows it may but need not have excess $0$'s.
There is an algorithm for converting a non-deterministic PDA to a CFG, but it’s fairly complicated. (The opposite direction is easier.) One version is outlined in these notes starting at page 9. These notes do it a little differently and in pretty full detail.