I'm reading a paper that jumps from equation (1) to equation (2) without showing the intermediate steps. I'd like to understand what steps allow the transformation. (My linear algebra is a bit rusty...)
(1) $\boldsymbol{V}^\pi = \boldsymbol{R} + \gamma \boldsymbol{P}_{a_1} \boldsymbol{V}^\pi$
to
(2) $\boldsymbol{V}^\pi = (\boldsymbol{I} - \gamma \boldsymbol{P}_{a_1})^{-1} \boldsymbol{R}$
Where:
- $\boldsymbol{V}^\pi$ is a vector of length N
- $\boldsymbol{R}$ is a vector of length N
- $\boldsymbol{P}_{a_1}$ is an N-by-N matrix
- $(\boldsymbol{I} - \gamma \boldsymbol{P}_{a_1})$ is guaranteed to be invertible due to the nature of the problem.
The superscript ($\pi$) and subscript ($a_1$) may be safely ignored, as they just identify particular instances of the vector and matrix. I include them for completeness and in case I'm wrong about them being ignorable. Also, for context: the paper deals with Markov Decision Processes (MDPs), and these are the equations for the value of a policy.
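In case it helps, I checked numerically that a vector computed via equation (2) does satisfy equation (1); it's only the algebra I'm missing. Here's my sanity check (the 3-state transition matrix and rewards are made up for illustration, not from the paper):

```python
import numpy as np

# Made-up 3-state example, purely for illustration.
rng = np.random.default_rng(0)
N = 3
gamma = 0.9

# Row-stochastic transition matrix P and reward vector R.
P = rng.random((N, N))
P /= P.sum(axis=1, keepdims=True)
R = rng.random(N)

# Equation (2): V = (I - gamma * P)^{-1} R.
# Solving the linear system is preferable to forming the inverse explicitly.
V = np.linalg.solve(np.eye(N) - gamma * P, R)

# Equation (1) holds: V == R + gamma * P @ V (up to floating-point tolerance).
assert np.allclose(V, R + gamma * P @ V)
```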
I'm assuming the steps are simple, since the author skips them, so this question is hopefully easy to answer. Thanks!