Neglecting the binary encoding for the moment, a Turing Machine encoding can be quite simple, just a set of tuples of the transition function, that is, a subset of $Q \times \Gamma \times \Gamma \times \{L,R, N\} \times Q$ where $Q$ are the states, $\Gamma$ the tape alphabet and $L, R$ are move left and right opcodes, while $N$ does nothing, which allows state changes with nothing else happening.  To keep it really simple, say that the first element of $Q$ is the start state and the second element the (only) final state. 
We can say that any set of such tuples is a legal TM. Of course, most of them won't do much, but in fact, most significant questions about what such a set will do as a TM are undecidable. 
But the bottom line is that your set TM could be the power set of ${Q \times \Gamma \times \Gamma \times \{L,R, N\} \times Q}$, each subset suitably encoded as a string to be regular, and further encoded in binary to remain regular. All an automaton need check is the basic syntax of tuples, with $Q$ and $\Gamma$ finite. This can be done with one pass at the input and no auxiliary storage, that is, by a finite state automaton.
Or, as Lucas Cook points out, the encoding can actually be to all strings over a given alphabet, even $\{0,1\}^*$, which is certainly regular.
On the other hand, we could make a fancy encoding in some procedural language, with declared variables and other bells and whistles, and make TM almost arbitrarily complex, though we certainly want to keep it recursive. Typically, real programming languages are formally context sensitive or pretty close, as are most actually useful languages.
All this shows that the syntax of Turing Machine encodings is not very interesting from the standpoint of theoretical hierarchies of formal languages. The fun begins with semantics -- what TM's do when run -- or when you try to find the shortest program for each partial recursive function in a given encoding, which leads to the wonderful world of Kolomogorov/Chaitin complexity theory.