I am trying to understand Markov processes but am still confused by their definition. In particular, the Wikipedia page gives this example of a non-Markov process: pulling different coins out of a bag at random. The article seems to imply that whenever information about previous states is needed to determine the distribution of the next state, the process is not Markov.
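To make the article's claim concrete for myself, I tried the following sketch in Python (the bag's contents here are made up, not the article's exact example). It conditions on the type of the most recent draw and checks whether the next-draw distribution still depends on the earlier history:

```python
from itertools import permutations
from fractions import Fraction

# A made-up bag (not the article's exact example): two nickels and a
# penny, labelled so that every order of draws is equally likely.
bag = ["nickel_1", "nickel_2", "penny"]

def coin_type(coin):
    return coin.split("_")[0]

def next_nickel_prob(history):
    """P(next draw is a nickel | the sequence of coin types drawn so far)."""
    hits = total = 0
    for order in permutations(bag):
        types = [coin_type(c) for c in order]
        if types[:len(history)] == history:
            total += 1
            hits += types[len(history)] == "nickel"
    return Fraction(hits, total)

# Two histories whose current (most recent) draw is the same coin type:
print(next_nickel_prob(["nickel"]))           # 1/2
print(next_nickel_prob(["penny", "nickel"]))  # 1
# The next-draw distributions differ, so "the last coin drawn" alone is
# not enough state: the process is not Markov with respect to that state.
```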
But isn't that exactly what higher-order Markov chains are for? Couldn't the process in the Wikipedia article be represented by an nth-order Markov chain, where n is the number of coins in the bag? (I feel sure I am wrong here, but I can't see exactly how.)
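Here is a minimal sketch of the reduction I have in mind, with an invented two-state, second-order transition table: as I understand it, any order-m chain can be rewritten as a first-order chain whose states are m-tuples of recent values.

```python
from itertools import product

states = ["A", "B"]
m = 2  # order of the original chain

# P(next value | last m values), keyed by the m-tuple of recent states.
# These numbers are made up purely for illustration.
second_order = {
    ("A", "A"): {"A": 0.9, "B": 0.1},
    ("A", "B"): {"A": 0.4, "B": 0.6},
    ("B", "A"): {"A": 0.5, "B": 0.5},
    ("B", "B"): {"A": 0.2, "B": 0.8},
}

# Equivalent first-order chain over tuple-states: from (x, y) we can
# only move to (y, z), with probability P(z | x, y).
first_order = {}
for hist in product(states, repeat=m):
    first_order[hist] = {
        hist[1:] + (z,): p for z, p in second_order[hist].items()
    }

print(first_order[("A", "B")])
# {('B', 'A'): 0.4, ('B', 'B'): 0.6}
```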
Disclaimer: apologies if this question is below the usual standard of quality; my background is in computer science, not mathematics.
EDIT
To clarify where my confusion lies: why couldn't we represent the coins being removed from the bag with states such as:
- states after the first draw: {coin1, coin2, ..., coinN}
- states after the second draw: {coin1&coin2, coin1&coin3, ..., coin(N-1)&coinN}
...where each state represents the set of coins chosen so far? Transition probabilities could be assigned to these states in a way that reflects the draws already made, yet you would still only need to know the current state in order to predict the next one. So why isn't drawing coins from a bag a Markov process?
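For concreteness, here is a minimal sketch of the construction I mean, with arbitrary coin names: the state is the set of coins drawn so far, and each step's transition distribution is a function of the current state alone.

```python
import random

coins = {"coin1", "coin2", "coin3"}  # arbitrary names, any N works

def transitions(drawn):
    """P(next state | current state) for the set-of-drawn-coins chain."""
    remaining = coins - drawn
    return {frozenset(drawn | {c}): 1 / len(remaining) for c in remaining}

state = frozenset()  # nothing drawn yet
while state != coins:
    nxt = transitions(state)
    state = random.choices(list(nxt), weights=list(nxt.values()))[0]
    print(sorted(state))
# Each step's distribution is computed from the current state alone,
# which is why this looks like a first-order Markov chain to me.
```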