I have a dynamic programming problem expressed as the following Bellman backup equation:
$ V(\boldsymbol{\theta},T)=\max_{i \in N} \mathbb{E} \left[ x_i + V(\boldsymbol{\theta}_{x_i}, T-1) \right] $
where $\theta_i$ is the expectation of $x_i$, and the vector $\boldsymbol{\theta}$ is updated with the observation $x_i$ using Bayes' rule; $\boldsymbol{\theta}_{x_i}$ denotes the updated vector.
The backup can therefore be expanded recursively as
\begin{align} V(\boldsymbol{\theta},T)&=\max_{i \in N} \left( \theta_i + \int p(x_i) V(\boldsymbol{\theta}_{x_i}, T-1) \, dx_i \right)\\ &=\max_{i \in N} \left( \theta_i + \int p(x_i) \max_{j \in N} \left( \theta_{j,x_i} + \mathbb{E}_{x_j} \left[ V(\boldsymbol{\theta}_{x_i, x_j}, T-2) \right] \right) dx_i \right)\\ &=\dots \end{align}
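To make the recursion concrete, here is a minimal brute-force sketch. It is only an illustration under assumptions that are not part of my actual problem statement: independent arms, a Gaussian prior $N(m_i, v_i)$ on each mean, Gaussian observation noise with known variance, and a fixed 3-point Gauss-Hermite rule in place of the exact integral. The names `value`, `means`, `variances` and `noise_var` are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for a standard normal variable
# (nodes and weights; exact for polynomials up to degree 5).
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def value(means, variances, noise_var, horizon):
    """Approximate V(theta, T) for the ASSUMED Gaussian model.

    means[i], variances[i]: prior mean and variance of arm i (assumption).
    noise_var: known observation noise variance (assumption).
    horizon: number of remaining pulls T.
    """
    if horizon == 0:
        return 0.0
    best = -math.inf
    for i, (m, v) in enumerate(zip(means, variances)):
        pred_std = math.sqrt(v + noise_var)          # predictive std of x_i
        gain = v / (v + noise_var)                   # weight on the observation
        post_var = v * noise_var / (v + noise_var)   # posterior variance
        future = 0.0
        for z, w in zip(NODES, WEIGHTS):
            x = m + pred_std * z                     # quadrature point for x_i
            new_means = list(means)
            new_means[i] = m + gain * (x - m)        # Bayes update, linear in x
            new_vars = list(variances)
            new_vars[i] = post_var
            future += w * value(new_means, new_vars, noise_var, horizon - 1)
        best = max(best, m + future)                 # theta_i + E[V(theta_x, T-1)]
    return best


if __name__ == "__main__":
    print(value([0.0, 0.0], [1.0, 1.0], 1.0, 3))
```

With $N$ arms and 3 quadrature points, this makes on the order of $(3N)^T$ recursive calls, which is exactly the blow-up I am trying to avoid.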
The integral part, for example
$ \int p(x) \max_{j\in N} \theta_{j, x} dx $, is complicated because the maximizing index $j$ changes with $x$, even though each $\theta_{j,x}$ is simply a linear function of $x$. For this single integral I can find the intervals of $x$ on which each $\theta_{j,x}$ attains the maximum by solving a system of inequalities. But over the whole $T$-step recursion this becomes intractable.
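For the single-step integral, the interval-finding step can be sketched as below. This is a rough illustration under assumptions I am adding only for concreteness: $p(x)$ is Gaussian with mean `mean` and standard deviation `std`, and each $\theta_{j,x} = a_j + b_j x$ is given as a hypothetical `(a, b)` pair. The function names are made up for this example.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_envelope(lines):
    """Split the real line into intervals where one line a + b*x is the max.

    lines: iterable of (a, b) = (intercept, slope).
    Returns a list of (a, b, left, right) pieces, ordered left to right.
    """
    best = {}
    for a, b in lines:                      # per slope, keep the largest intercept
        if b not in best or a > best[b]:
            best[b] = a
    hull, starts = [], []                   # active lines and where each takes over
    for b, a in sorted(best.items()):       # increasing slope
        x = -math.inf
        while hull:
            a0, b0 = hull[-1]
            x = (a0 - a) / (b - b0)         # crossing point with the current last line
            if x <= starts[-1]:             # last line is never the maximum: drop it
                hull.pop()
                starts.pop()
            else:
                break
        starts.append(x if hull else -math.inf)
        hull.append((a, b))
    ends = starts[1:] + [math.inf]
    return [(a, b, lo, hi) for (a, b), lo, hi in zip(hull, starts, ends)]


def expected_max_linear(lines, mean, std):
    """E[max_j (a_j + b_j * x)] for x ~ N(mean, std^2) (ASSUMED Gaussian p)."""
    total = 0.0
    for a, b, lo, hi in upper_envelope(lines):
        al, be = (lo - mean) / std, (hi - mean) / std
        mass = _cdf(be) - _cdf(al)          # P(lo < x < hi)
        # Integral of (a + b*x) against the Gaussian density over (lo, hi).
        total += (a + b * mean) * mass - b * std * (_pdf(be) - _pdf(al))
    return total


if __name__ == "__main__":
    print(expected_max_linear([(0.0, 1.0), (0.0, -1.0)], mean=0.0, std=1.0))
```

The trouble is that after one more backup the integrand is no longer a maximum of linear functions of a single $x$, so this closed form does not carry through the recursion as far as I can tell.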
I don't know how I can solve this equation, or approximate the optimal solution. Any help is appreciated! Thanks!