
Given a set of column vectors $v_1, v_2,...,v_t$, is there a way to calculate a unique transition matrix?

In other words, is there one and only one matrix $A$ such that $Av_{i} = v_{i+1}$?

Additionally, is it possible to find the "transition matrix of best fit" for $v_1, v_2,...,v_t$? In other words, the matrix $A$ that minimizes the error between each predicted vector and the corresponding actual one?

I'm asking because I'm trying to use Markov chains to predict a series of column vectors and I'm curious if there's a more "vectorized" (i.e. linear algebra based) implementation of calculating the transition matrix.
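
For concreteness, this is roughly the kind of "vectorized" fit I have in mind (a rough numpy sketch with made-up data; `V` just stacks the $v_i$ as columns):

```python
import numpy as np

# Made-up data: each column of V is one observed vector v_i.
rng = np.random.default_rng(0)
V = rng.standard_normal((3, 8))

# Look for A with A @ V[:, :-1] ≈ V[:, 1:] in the least-squares sense.
# Transposing gives V[:, :-1].T @ A.T ≈ V[:, 1:].T, which is the standard
# form that np.linalg.lstsq solves one right-hand-side column at a time.
At, *_ = np.linalg.lstsq(V[:, :-1].T, V[:, 1:].T, rcond=None)
A = At.T

print(np.linalg.norm(A @ V[:, :-1] - V[:, 1:]))  # residual of the fit
```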

  • Yep, that was a typo, thanks. What you said about span makes sense; however, for my particular dataset I can't guarantee that will be true. (2012-01-30)

1 Answer


I can't (yet) describe the theory behind it, but I wrote an article on fitting Leslie matrices to data: http://arxiv.org/abs/1203.2313

The above article references the method as presented in Caswell's book on matrix population models, and it only really works for a specific format of transition matrix. The method can actually be made quite general, and it should be framed in terms of Kronecker products and the column-stacking $vec$ operator. I am working on that article currently, and will post a draft if you ask me to.

To get you started, consider a data matrix $D_t$ at time $t$, with the data at time $t+1$ given by multiplication by an unknown transition matrix $X$:

$D_{t+1}=XD_{t}$

This is equivalent to the following with $I$ a conformable identity matrix:

$D_{t+1}=IXD_{t}$ (1)

Now you can use the following formula (from various textbooks):

$vec(ABC)=(C^{T}\otimes A)vec(B)$

For our case, $B$ is the unknown transition matrix $X$, $A$ is the identity matrix $I$, and $C$ is the data matrix $D_{t}$. Take $vec()$ of both sides of (1):

$vec(D_{t+1}) = vec(IXD_{t}) = (D_{t}^{T} \otimes I)vec(X) $
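
As a quick numerical sanity check of that step (a throwaway sketch with random stand-in matrices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 5
X  = rng.standard_normal((n, n))      # stand-in transition matrix
Dt = rng.standard_normal((n, m))      # stand-in data matrix at time t

vec = lambda M: M.reshape(-1, 1, order='F')   # column-stacking vec()

lhs = vec(np.eye(n) @ X @ Dt)                 # vec(I X D_t)
rhs = np.kron(Dt.T, np.eye(n)) @ vec(X)       # (D_t^T ⊗ I) vec(X)
print(np.allclose(lhs, rhs))                  # True
```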

This is a standard matrix equation. Rewrite as

$ (D_{t}^{T} \otimes I)^{-1} vec(D_{t+1}) = vec(X)$

and solve it if that inverse exists. In general $(D_{t}^{T} \otimes I)$ will not be square or invertible (see textbooks etc.), so instead you find the $vec(X)$ that minimizes

$|| (D_{t}^{T} \otimes I)\, vec(X) - vec(D_{t+1}) ||_2$ (2)

Then you de-vec (reshape column-wise) $vec(X)$ to recover the transition matrix $X$.
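
Putting the pieces together, here is a minimal numpy sketch of the plain unconstrained version (illustrative only; the variable names and data are mine, and none of the constraints from the caveats below are applied):

```python
import numpy as np

# Made-up sequence: columns of V are the observed vectors v_1, ..., v_t,
# generated from a known transition matrix so we can check the recovery.
rng = np.random.default_rng(2)
n, t = 4, 12
X_true = 0.5 * rng.standard_normal((n, n))
V = np.empty((n, t))
V[:, 0] = rng.standard_normal(n)
for i in range(1, t):
    V[:, i] = X_true @ V[:, i - 1] + 1e-3 * rng.standard_normal(n)

Dt  = V[:, :-1]   # D_t:     columns v_1 .. v_{t-1}
Dt1 = V[:, 1:]    # D_{t+1}: columns v_2 .. v_t

vec = lambda M: M.reshape(-1, order='F')            # column-stacking vec()
K = np.kron(Dt.T, np.eye(n))                        # (D_t^T ⊗ I)
x, *_ = np.linalg.lstsq(K, vec(Dt1), rcond=None)    # least-squares vec(X), i.e. (2)
X = x.reshape(n, n, order='F')                      # de-vec back into a matrix

print(np.linalg.norm(X - X_true))                   # small when the fit is good
```

(For this unconstrained least-squares case the answer collapses to `X = Dt1 @ np.linalg.pinv(Dt)`; the Kronecker/vec form is what lets you bolt structure and sparsity constraints onto $vec(X)$.)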

Caveats: So far I have found that you need to constrain the heck out of $X$ (and enforce sparsity) to make it usable, and when the system is underdetermined you probably also want the minimum-norm solution of (2). I do the former in the paper, and I am working on articulating how to do the latter with the SVD.

If you are using this, please cite me, and feel free to IM or ask questions here.