Start with a pushdown automaton $M$ that recognizes $L$. Let $s_0$ be the initial state. If there are any transitions from $s_0$ to $s_0$, add a state $s_0'$ that will be the new initial state, and add an $s_0'\to s_0$ transition for each of the original $s_0\to s_0$ transitions. This ensures that $a_1$ is handled properly.
Now suppose that you’re in state $s$ reading some letter $a$. Look at what $M$ would do from state $s$ on reading every possible pair of letters $xa$, and provide an appropriate transition for each of these actions. (Be sure to account for what happens to the stack.)
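The pairing step can be sketched in Python under some simplifying assumptions that are not part of the construction above: $M$ has no $\varepsilon$-moves, each transition is stored in a hypothetical `Transition` record whose `pop` and `push` fields list stack symbols top first, and a single move is allowed to pop several symbols at once. The hypothetical helper `compose` merges an $x$-move followed by an $a$-move into one stack action, and `pair_transitions` collects the resulting moves that read only $a$.

```python
from typing import NamedTuple, Optional, Set, Tuple

Symbols = Tuple[str, ...]


class Transition(NamedTuple):
    """One move of the automaton (illustrative representation, not from the source)."""
    src: str
    letter: str
    pop: Symbols    # symbols removed from the stack, top first
    dst: str
    push: Symbols   # symbols placed on the stack, top first


def compose(first: Transition, second: Transition) -> Optional[Tuple[Symbols, Symbols]]:
    """Return the (pop, push) effect of doing `first` and then `second`,
    or None if `second` cannot follow `first`."""
    if first.dst != second.src:
        return None
    # `second` pops from what `first` just pushed; the overlap must agree.
    k = min(len(first.push), len(second.pop))
    if first.push[:k] != second.pop[:k]:
        return None
    # Anything `second` pops beyond `first.push` comes from deeper in the stack.
    pop = first.pop + second.pop[k:]
    # Anything `first` pushed that `second` did not pop stays under `second.push`.
    push = second.push + first.push[k:]
    return pop, push


def pair_transitions(transitions: Set[Transition]) -> Set[Transition]:
    """For every pair of consecutive moves on letters x, a,
    build one move that reads only a (the letter x is guessed)."""
    result = set()
    for first in transitions:        # the move on the skipped letter x
        for second in transitions:   # the move on the letter a actually read
            effect = compose(first, second)
            if effect is not None:
                pop, push = effect
                result.add(Transition(first.src, second.letter, pop, second.dst, push))
    return result
```

For instance, composing a move that reads $x$, pops `Z`, and pushes `A Z` with a move that reads $a$ and pops `A` gives a single move on $a$ that pops `Z` and pushes `Z` back. Popping several symbols in one move is only a convenience here; it can be simulated in an ordinary pushdown automaton by a chain of $\varepsilon$-moves through fresh intermediate states.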
This is based on the assumption that you just want to accept words that derive from words in $L$ of even length. If you also want to accept $a_1a_2$ when some $a_1b_1a_2\in L$, for instance, you’ll probably find it easier to modify the transitions for input $a$ to mimic the possible $ax$ transitions of $M$.