I encountered this problem here (Hidden Markov Chain).
It goes like this:
The task is to compute, given the parameters of the model, the probability of a particular output sequence (the observed sequence). This requires summation over all possible state sequences. The probability of observing a sequence $Y = y(0), y(1), \dots, y(L-1)$
of length $L$ is given by
$$P(Y) = \sum_{X} P(Y \mid X)\, P(X) \tag{*}$$
where the sum runs over all possible hidden-node sequences $X = x(0), x(1), \dots, x(L-1)$.
For more details, see here.
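
To make the sum concrete, here is a minimal Python sketch of how I understand the computation: it enumerates every hidden state sequence of the right length and adds up P(observations | states) * P(states). The two-state model, its state and observation names, and all the probabilities are made up purely for illustration; they are not taken from the linked page.

```python
from itertools import product

# Hypothetical toy model: every name and number here is an assumption
# chosen only to illustrate the sum over hidden state sequences.
states = ("Rainy", "Sunny")
initial = {"Rainy": 0.6, "Sunny": 0.4}
transition = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emission = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}


def prob_states(x):
    """P(X): probability of the hidden state sequence x under the Markov chain."""
    p = initial[x[0]]
    for prev, cur in zip(x, x[1:]):
        p *= transition[prev][cur]
    return p


def prob_obs_given_states(y, x):
    """P(Y | X): each observation depends only on the hidden state at that step."""
    p = 1.0
    for obs, state in zip(y, x):
        p *= emission[state][obs]
    return p


def prob_obs(y):
    """P(Y): sum of P(Y | X) * P(X) over every hidden sequence X of length len(y)."""
    return sum(
        prob_obs_given_states(y, x) * prob_states(x)
        for x in product(states, repeat=len(y))
    )


if __name__ == "__main__":
    print(prob_obs(("walk", "shop")))
```

For an observed sequence of length 2 this adds up 2 * 2 = 4 terms, one per hidden sequence. In general the number of terms is (number of states) ** L, so this brute-force version is only useful for checking small cases by hand.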
My problem is:
How can equation (*) be derived?
It seems to follow from the Total Probability Theorem, but this does not make sense to me.