So how can we construct a list of all possible configurations of a log space TM in log space? There are potentially more than $O(\log n)$ valid configurations, so storing the enumeration seems impossible.
You are correct in saying that there are potentially more than $O(\log n)$ configurations. Indeed, there could be polynomially many (since $2^{O(\log n)}=n^{O(1)}$).
However, we don't need to store the enumeration -- we only need to compute it on demand. This is the key insight to log-space reductions, so it's worth expanding on this point. Functionally, a list is simply a data structure that takes an index and returns the corresponding element. In other words, it's just a function $f:[N]\rightarrow\Sigma^*$. One way of implementing $f$ is to literally list the values. For highly-structured lists, however, it's far more space-efficient to compress the data.
For example, the function $f:[N]\rightarrow\{0,1\}$ defined by $f(x)=0$ implements the all-zeroes list using only constant space. I realize that this is obvious and trivial, but it highlights some surprisingly deep insights. Note that if we query $f(x)$ 100 times, then we'll calculate $f(x)$, from scratch, 100 times. For complicated compression schemes, this could severely hurt the time efficiency of the algorithm. [Fun fact: the optimal compression can take arbitrarily long to unpack.]
Getting back to NL-completeness: We can think of the input as a list, with the input-head implicitly acting as a counter $\textit{index}$ containing the current location. Moving the head left corresponds to subtracting 1, while moving it right corresponds to adding 1. Since the input has length $n$, explicitly storing $\textit{index}$ only requires an additional $\log n$ space. Reading from the input is really querying $x(\textit{index})$.
The transducer effectively intercepts this query to $x(\textit{index})$ and calculates the answer.
One final note: the definition of a log-space transducer doesn't say anything about asking for the $k$-th bit of the output. We can get it in this form by adding a single counter (using $O(\log n)$ bits) that stores how many bits of output have been computed. Whenever the transducer would have written to the output tape, it instead increments the counter. Upon reaching $k$, it returns that value.