I've designed a quadcopter and had it printed on a 3D printer. Now I need to control it.
I have formulated an MDP (Markov decision process) and want the quadcopter to learn when it is in a stable hovering position.
I have an on-board INS (inertial navigation system) that outputs $(x,y,z), (\dot{x}, \dot{y}, \dot{z}), (\phi,\theta,\psi), (\dot{\phi}, \dot{\theta},\dot{\psi})$: position, velocity, attitude, and attitude rotation speed, respectively.
Therefore, the MDP is formulated as follows:
States:
- $x$
- $y$
- $z$
- $\dot x$
- $\dot y$
- $\dot z$
- $\phi$
- $\theta$
- $\psi$
- $\dot\phi$
- $\dot\theta$
- $\dot\psi$
Actions:
- rotor1
- rotor2
- rotor3
- rotor4
Rewards:
- 1 if hovering; 0 otherwise
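
To make the formulation concrete, here is a minimal sketch of the state, action, and reward as code. The function name `hover_reward`, the tolerance values, and the particular definition of "hovering" (near a target position, near-zero velocity, near-level roll and pitch, near-zero angular rates, yaw left free) are all assumptions for illustration and are not part of the setup described above.

```python
import numpy as np

# Order of the 12 state components, matching the States list above.
STATE_NAMES = ("x", "y", "z", "x_dot", "y_dot", "z_dot",
               "phi", "theta", "psi", "phi_dot", "theta_dot", "psi_dot")
# One command per rotor, matching the Actions list above.
ACTION_NAMES = ("rotor1", "rotor2", "rotor3", "rotor4")


def hover_reward(state, target_pos, pos_tol=0.1, vel_tol=0.1,
                 tilt_tol=0.05, rate_tol=0.1):
    """Return 1.0 if the state counts as hovering, else 0.0.

    All tolerances are hypothetical placeholders (metres, m/s, rad, rad/s).
    """
    state = np.asarray(state, dtype=float)
    if state.shape != (len(STATE_NAMES),):
        raise ValueError("state must have 12 components")
    target_pos = np.asarray(target_pos, dtype=float)

    pos, vel = state[0:3], state[3:6]
    tilt = state[6:8]      # phi, theta; psi (yaw) is left unconstrained
    rates = state[9:12]    # phi_dot, theta_dot, psi_dot

    hovering = (
        np.all(np.abs(pos - target_pos) <= pos_tol)
        and np.all(np.abs(vel) <= vel_tol)
        and np.all(np.abs(tilt) <= tilt_tol)
        and np.all(np.abs(rates) <= rate_tol)
    )
    return 1.0 if hovering else 0.0
```

Note that every state component and every rotor command here is continuous, so the state and action spaces are not finite sets unless they are discretized.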
The trouble is, I don't know how to get a transition probability matrix. Do I get this from flight data? How best to go about this?
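
As a rough illustration of what "getting it from flight data" could look like: because the states above are continuous, one option is to fit a parametric transition model instead of a literal probability matrix. The sketch below assumes logged tuples of (state, action, next state) and assumes a linear-Gaussian model $s_{t+1} \approx A s_t + B a_t + \text{noise}$, which is a simplification and not something established above. The function name `fit_linear_transition` is hypothetical.

```python
import numpy as np


def fit_linear_transition(states, actions, next_states):
    """Least-squares fit of s' ~ A s + B a from logged transitions.

    states:      (N, n) array of states s_t
    actions:     (N, m) array of rotor commands a_t
    next_states: (N, n) array of states s_{t+1}
    Returns A (n, n), B (n, m), and the residual covariance (n, n),
    which serves as an estimate of the transition noise.
    """
    states = np.asarray(states, dtype=float)
    actions = np.asarray(actions, dtype=float)
    next_states = np.asarray(next_states, dtype=float)
    n = states.shape[1]

    # Stack regressors so that next_states ~ X @ W with W of shape (n + m, n).
    X = np.hstack([states, actions])
    W, *_ = np.linalg.lstsq(X, next_states, rcond=None)

    A = W[:n].T
    B = W[n:].T
    residuals = next_states - X @ W
    noise_cov = np.cov(residuals, rowvar=False)
    return A, B, noise_cov
```

With the 12 states and 4 rotor commands listed above, `A` would be 12 by 12 and `B` 12 by 4. Whether a linear model is adequate away from hover is an open question here.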
Is it necessary to build a quadcopter in a physics simulator? (I've never done this before...)
Many thanks in advance,