I have an optimal control problem that can be posed in a two-period, discrete-time setting. It has the following generic form.
Let $w: \mathbb{R} \rightarrow \mathbb{R}$ be a given function modeling the terminal reward (in the example I have in mind, $w$ is negative, bounded, increasing, and concave). At time $0$, the state of the system is known and equal to $x$. The controller chooses a control $t \in [0,\infty)$, and the state of the system at time $1$ is then $f(t,x,Z)$, where $Z$ is a standard random variable. The cost of using strategy $t$ is $g(t,x)$. The goal is to calculate
$\hat{w}(x) = \min_{t \in [0,\infty)} \left( g(t,x) + e^{-\lambda t} \, \mathbb{E}\left[ w(f(t,x,Z)) \right] \right),$
and to find the optimal $t$, where $\lambda$ is a given constant. Is there a reference for explicit or computable solutions to this problem, perhaps subject to certain restrictions on $w$, $f$, and $g$?
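
To make the setup concrete, here is a minimal numerical sketch of the minimization for a single $x$. Everything specific in it is an assumption chosen for illustration and not part of the problem as stated: $Z$ is taken to be standard normal, $w(y) = -e^{-y}$ (negative, increasing, and concave, but bounded only from above), $f(t,x,z) = x + \sqrt{t}\,z$, $g(t,x) = (t-1)^2$, $\lambda = 1$, and the search over $[0,\infty)$ is truncated to $[0, T_{\max}]$. The expectation is approximated by Gauss-Hermite quadrature and the minimum by a two-stage grid search; the names `objective`, `solve`, `LAM`, and `T_MAX` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

LAM = 1.0    # lambda in the discount factor (hypothetical value)
T_MAX = 5.0  # truncation of [0, inf) for the search (assumption)

def w(y):
    # Hypothetical terminal reward: negative, increasing, concave.
    return -np.exp(-y)

def f(t, x, z):
    # Hypothetical state transition driven by the noise z.
    return x + np.sqrt(t) * z

def g(t, x):
    # Hypothetical cost of using control t.
    return (t - 1.0) ** 2

# Gauss-Hermite nodes/weights for the weight exp(-z^2 / 2); dividing by
# sqrt(2*pi) turns weighted sums into expectations under a standard
# normal Z (assumed distribution).
NODES, WEIGHTS = hermegauss(40)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)

def objective(t, x):
    """Return g(t,x) + exp(-LAM*t) * E[w(f(t,x,Z))] for each t (1-D array)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    vals = w(f(t[:, None], x, NODES[None, :]))  # shape (len(t), n_nodes)
    expectation = vals @ WEIGHTS
    return g(t, x) + np.exp(-LAM * t) * expectation

def solve(x, n_grid=2001):
    """Coarse grid search on [0, T_MAX], then one refinement pass
    on the bracket around the best grid point."""
    lo, hi = 0.0, T_MAX
    for _ in range(2):
        grid = np.linspace(lo, hi, n_grid)
        values = objective(grid, x)
        i = int(np.argmin(values))
        lo = grid[max(i - 1, 0)]
        hi = grid[min(i + 1, n_grid - 1)]
    return grid[i], values[i]

t_star, w_hat = solve(x=0.0)
print(t_star, w_hat)
```

For these particular choices the expectation has a closed form, $\mathbb{E}[w(f(t,x,Z))] = -e^{-x + t/2}$, which gives a check on the quadrature. A grid search only finds the global minimum on the truncated interval up to the grid resolution, so it says nothing about explicit solutions in general.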