I have the following system of equations for a function $p(y)$, and I need help debugging my solution:
$\begin{align} 0&=\log(p(y))+1-\lambda-\gamma y^2-\eta \left(\frac{e^{-y^2/2}}{\sqrt{2\pi}}\right)\left(\frac{1}{p(y)}\right)\\ 0&=1-\int_{-\infty}^{\infty}p(y)dy\\ 0&=1-\int_{-\infty}^{\infty}y^2p(y)dy\\ 0&=c+\int_{-\infty}^{\infty}\frac{e^{-y^2/2}}{\sqrt{2\pi}}\log(p(y))dy \end{align} $
For simplicity, all logs are natural and $c>0$ is a given constant. $y\in(-\infty,\infty)$, though I have reason to believe that $p(y)\approx 0$ for $|y|>3.5$. The multipliers $\lambda$, $\gamma$, and $\eta$ are unknown, as is $p(y)$.
Since there probably isn't an analytical solution for this, I took a stab at solving it numerically using MATLAB's Optimization Toolbox (I have version 7.3.0, R2006b). I used trapezoidal approximations for the integrals, integrating over $[-3.5, 3.5]$ with integration points spaced 0.005 apart, and fsolve.
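As a quick, purely illustrative sanity check of that grid (not part of my solver), one can integrate the reference standard Gaussian density on the same points with MATLAB's built-in trapz, which applies the same composite trapezoid rule:

y   = -3.5:0.005:3.5;
phi = exp(-y.^2/2)/sqrt(2*pi);   % standard Gaussian density
trapz(y, phi)                    % approx. 0.9995: mass lost by truncating at |y| = 3.5
trapz(y, (y.^2).*phi)            % approx. 0.993: second moment on the truncated grid

Both values come out slightly below 1, which gives a feel for how much error the $|y|>3.5$ truncation alone injects into the second and third equations.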
My idea was to use fsolve on a MATLAB function (call it outer) that computes the bottom three equations (the integrals) over $y\in[-3.5, 3.5]$, using the first equation as a constraint inside it. So the input to the outer function is the triple $(\lambda, \gamma, \eta)$, and its output is the triple of values of the three integral equations. To obtain $p(y)$ for the computation of the integrals, I put the first of the four equations above (the only non-integral one) into another MATLAB function (call it inner), and then, for each integration point $y$, I used fsolve to find the $p(y)$ that satisfies it for the $(\lambda, \gamma, \eta)$ passed into outer.
I am interested in $p(y)$, and here is the command I actually ran:

>> [x fval]=fsolve(@outer,x0,optimset('Display','iter'))

where x0 is my initial guess at $(\lambda, \gamma, \eta)$. Here is the code for outer:
function F = outer(x)
lambda = x(1); gamma = x(2); eta = x(3);
y = [-3.5:0.005:3.5];
[tmp n] = size(y);
% compute p(y) at each integration point by solving the first (pointwise) equation
f = @(p)inner(p,y(1),lambda,gamma,eta);
py(1) = fsolve(f,0.01,optimset('Display','off'));
for i = 2:n
    f = @(p)inner(p,y(i),lambda,gamma,eta);
    py(i) = fsolve(f,py(i-1),optimset('Display','off'));
end
% compute the three integrals with the trapezoid rule (spacing 0.005)
pyl = py(1:n-1); pyr = py(2:n); pys = pyl + pyr;
y2py = (y.^2).*py;
y2pyl = y2py(1:n-1); y2pyr = y2py(2:n); y2pys = y2pyl + y2pyr;
re = 1./(sqrt(2*pi)).*exp(-y.^2/2).*log(py);
rel = re(1:n-1); rer = re(2:n); res = rel + rer;
F = [sum(pys)*0.005/2 - 1;
     sum(y2pys)*0.005/2 - 1;
     sum(res)*0.005/2 + 3];   % the given constant c is 3 in this run
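A small readability note (not a fix for the convergence problem): on this uniform grid, the hand-rolled trapezoid sums are equivalent to MATLAB's built-in trapz, so the last block of outer could be written, assuming the same py and y, as:

F = [trapz(y, py) - 1;
     trapz(y, (y.^2).*py) - 1;
     trapz(y, exp(-y.^2/2)/sqrt(2*pi).*log(py)) + 3];   % same composite trapezoid rule

Both forms implement the same rule, so this should not change the results; it only makes the residuals easier to read.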
Here is the code for inner:

function a = inner(py, y, lambda, gamma, eta)
% residual of the first (pointwise) equation for a candidate value py = p(y)
a = log(py) + 1 - lambda - gamma*y^2 - eta*exp(-y^2/2)/(sqrt(2*pi)*py);
Unfortunately, though I have used a similar method with success in a similar (toy) scenario (see Background), fsolve in its default state does not seem to converge to a solution here. The answer it gives me is that I need to change my initial guess for $(\lambda, \gamma, \eta)$. I have tried many, many initial guesses, and none of them converged. While I know my calculus, I do not have much experience with numerical methods, so I would appreciate any ideas about the following:
1) Why doesn't the system of equations, as I set it up, converge?
2) Is there a way to reformulate the problem so that it can be solved?
3) Is there another solver I should try, or settings I should change in fsolve? Generally speaking, I am looking for ways to debug this...
4) If convergence is impossible, what is the reason?
5) Maybe there is an analytic solution for $p(y)$? (This, obviously, would be ideal)
Background
This is a constrained optimization problem, and we are looking for an (approximate) probability density function (as you may have guessed from the second equation). The problem is to find the maximum-entropy distribution of a continuous random variable that is constrained not only in variance (third equation) but also in its KL divergence with the standard Gaussian (which is where the fourth equation comes from).
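For completeness: writing $\varphi(y)=e^{-y^2/2}/\sqrt{2\pi}$ for the standard Gaussian density, the fourth equation is just the cross-entropy term of that divergence, read in the direction $D(\varphi\|p)$:
$\begin{align} D(\varphi\,\|\,p)=\int_{-\infty}^{\infty}\varphi(y)\log\frac{\varphi(y)}{p(y)}dy=-\frac{1}{2}\log(2\pi e)-\int_{-\infty}^{\infty}\varphi(y)\log(p(y))dy=c-\frac{1}{2}\log(2\pi e), \end{align}$
so the constraint pins the divergence at $c-\frac{1}{2}\log(2\pi e)$; in particular, since $D(\varphi\|p)\ge 0$, the system can only be feasible if $c\ge\frac{1}{2}\log(2\pi e)\approx 1.42$.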
These equations were inspired by the discussion of Additive Noise in Chapter 7.3 of Gallager's "Information Theory and Reliable Communication". There is an example on pages 334-335 of using the calculus of variations to prove that the maximum-entropy distribution on a finite support is uniform and that, if the support is $\mathbb{R}$ with a variance constraint, the maximum-entropy distribution is Gaussian. Our first equation is really Gallager's Equation (7.4.5) augmented with the variance and KL divergence constraints.
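To spell out that step under one consistent choice of sign conventions (with $\varphi$ as above), the first equation is the pointwise stationarity condition of the Lagrangian for minimizing the negative entropy $\int p\log(p)\,dy$ subject to the three integral constraints:
$\begin{align} J[p]&=\int_{-\infty}^{\infty} p\log(p)\,dy-\lambda\left(\int_{-\infty}^{\infty} p\,dy-1\right)-\gamma\left(\int_{-\infty}^{\infty} y^2p\,dy-1\right)-\eta\left(\int_{-\infty}^{\infty} \varphi\log(p)\,dy+c\right),\\ 0&=\frac{\delta J}{\delta p(y)}=\log(p(y))+1-\lambda-\gamma y^2-\eta\,\frac{\varphi(y)}{p(y)}. \end{align}$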
As a toy exercise, I first solved the case with infinite support and no KL divergence constraint, i.e. the following set of equations:
$\begin{align} 0&=\log(p(y))+1-\lambda-\gamma y^2\\ 0&=1-\int_{-\infty}^{\infty}p(y)dy\\ 0&=1-\int_{-\infty}^{\infty}y^2p(y)dy\\ \end{align} $
I used the method I described (with the outer and inner MATLAB functions) and got the Gaussian for $p(y)$, as expected.
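For reference, this toy case also has the expected closed form: the first toy equation gives $\log(p(y))=\lambda-1+\gamma y^2$, i.e.
$\begin{align} p(y)=e^{\lambda-1}e^{\gamma y^2}, \end{align}$
and the normalization and unit-variance constraints then force $\gamma=-1/2$ and $e^{\lambda-1}=1/\sqrt{2\pi}$, which is exactly the standard Gaussian that the numerical solution reproduced.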