
I'm trying to find the optimal policy for the standard optimal control problem with one small variation, an equality constraint on the sum of the controls: \begin{equation} \min_{\pi} \, \mathbb{E}_0\left[\sum_{t=0}^{N-1}f(x_t, u_t,\epsilon_t)\right] \end{equation} subject to:

\begin{equation} x_{k+1} = g(x_k,u_k,\epsilon_k) \quad \forall k, \\ \sum_{t=0}^{N-1}u_t=K \end{equation} Here the initial state $x_0$ and the total $K$ are known, $\epsilon_k$ is a random variable with a known distribution, and we are looking for the optimal policy, a vector of functions $\pi=\{\mu_1(x_1),\mu_2(x_2),\mu_3(x_3),\ldots,\mu_{N-1}(x_{N-1})\}$.

I have one solution in mind, but I fear that it is computationally inefficient, so I'm looking for suggestions for alternative approaches. Any help would be highly appreciated.

  • 0
    It is unclear what you are asking. What exactly is the problem? What is $\pi$? Should I know this from 'standard optimal control problem'? And what does it mean to have a solution in mind? What does it mean for a solution to be computationally inefficient? As far as I am concerned, your problem generically has a unique solution that can be computed by working backwards. 2017-02-14
  • 0
    @Jan $\pi$ is the vector of optimal control functions, one for each time point, as described in the question. The standard optimal control problem does not include the constraint $\sum_{t=0}^{N-1}u_t=K$, and incorporating it is the difficulty here. I can formulate this as a dynamic programming problem and work backwards, but the optimal policy function at each time point would then need two arguments (as in the knapsack problem), which increases the computation substantially. 2017-02-15
  • 0
    Is [certainty equivalence](https://en.m.wikipedia.org/wiki/Stochastic_control#Certainty_equivalence) applicable in your case? If so, you could also find an optimal policy by ignoring the disturbances and solving a static optimisation problem with an equality constraint, which can be handled using Lagrange multipliers. 2017-02-15
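
For concreteness, the two-argument dynamic program mentioned in the comments could look roughly like the sketch below. It is only an illustration under assumptions that are not part of the question: the controls are non-negative integers, the noise has finite support given as `(value, probability)` pairs, and the states are hashable. The name `solve_budgeted_dp` and the toy `f`, `g`, and `noise` are hypothetical. The second argument `r` is the budget still to be spent, so the value function is indexed by `(t, x, r)` and the last stage is forced to use `u = r`.

```python
from functools import lru_cache

def solve_budgeted_dp(f, g, noise, N):
    """Return V(t, x, r) -> (expected cost-to-go, best control).

    Assumptions (not from the question): u is a non-negative integer,
    `noise` is a list of (eps, probability) pairs, and x is hashable.
    r is the part of the budget K that has not been spent yet.
    """
    @lru_cache(maxsize=None)
    def V(t, x, r):
        if t == N:
            # Past the horizon: feasible only if the budget is used up.
            return (0.0 if r == 0 else float("inf")), None
        # The last stage has no freedom: it must spend what is left.
        choices = [r] if t == N - 1 else range(r + 1)
        best_cost, best_u = float("inf"), None
        for u in choices:
            # Expected stage cost plus expected cost-to-go.
            cost = sum(
                p * (f(x, u, eps) + V(t + 1, g(x, u, eps), r - u)[0])
                for eps, p in noise
            )
            if cost < best_cost:
                best_cost, best_u = cost, u
        return best_cost, best_u

    return V

# Toy instance (hypothetical): quadratic control cost, additive noise.
f = lambda x, u, eps: u ** 2
g = lambda x, u, eps: x + u + eps
noise = [(-1, 0.5), (1, 0.5)]

V = solve_budgeted_dp(f, g, noise, N=3)
cost, u0 = V(0, 0, 3)  # x_0 = 0, K = 3
print(cost, u0)
```

With this formulation each stage evaluates up to $K+1$ controls for every reachable pair $(x, r)$, which is the extra cost relative to the unconstrained problem that the comment above refers to.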
