8
$\begingroup$

I have a question about Euler-lagrange equation which you can check this file. http://www.bekirdizdaroglu.com/ceng/Downloads/ISCE10.pdf, specifically equation $8$.

There is a functional $F$ and we want to find a function $f$ which minimizes $F$. Then we attain the Euler-Lagrange function $E-L(f)$. And we iterate by $\frac{\delta f}{\delta t} = -(E - L)$ to get this $f$.

Why do we set up the problem as $\frac{\delta f}{\delta t} = -(E - L)$ and how we choose the $\delta t$?. The $f$ which meets the $E-L=0$ is a nextreme. Although the gradient descent method is also a iteration processing, the right side of its equation is $-\nabla f$ if one want the local minimum. But does $E-L$ is a type of $\nabla f$.

How we choose $\delta t$, if it is too big, the iteration process is not stable, if it is too small, it is time-consuming. There is a necessary condition - CFL condition, not sufficient condition .Also, this condition is decided by the right side of the equation. link In page three, part 3, $\delta$t is decided by the order spatial derivatives. But in my equation, I do not have any spatial derivatives, but I have a integral, $\int f$. So, how should I choose the $\delta t$, I ask a professor in my uni, the answer is 'It is not a analytical function, you need to try the number'.
PS: Could you understand my question?

  • 0
    possible duplicate of [Euler-Lagrange, Gradient Descent, Heat Equation and Image Denoising](http://math.stackexchange.com/questions/29703/euler-lagrange-gradient-descent-heat-equation-and-image-denoising)2011-07-15
  • 3
    @Hans: I'm not very familiar with this area of math (I've only taken an intro course on Lagrangian/Hamiltonian mechanics), but is it really appropriate to close as a duplicate of the other question? It seems that onlybluemoon is asking more of a fundamental question of why we should set things up as $\frac{df}{dt}=-(E-L)$, whereas the other question accepts this setup and is dealing with perhaps more advanced issues. Again, I'm not sure I'm able to judge this area, so please do tell me your thoughts.2011-07-16
  • 0
    Thank you. The truth is that 'delta(f)/delta(t)=-(E-L function)' is difficult to me... I try to find the answer for all days. And then, I found http://math.stackexchange.com/questions/29703/euler-lagrange-gradient-descent-heat-equation-and-image-denoising which have the same problem like me. Then I read the answers, but I can not understand fully. So I register on this web ,post a question and always check it. I try to earn some reputation but I can not answer these complex question...2011-07-16
  • 0
    To you, this is a fundamental question of why we should set things up as dfdt=−(E−L).So could you give me some information about it. I know the greatest gradient method, but my question is different from it.2011-07-16
  • 2
    Sorry, I am having a little bit of difficulty understanding your question. Is your question about "**why** do we set up the problem as $\delta f / \delta t = -(\mathrm{E-L})$ in the gradient descent method", or are you asking about something else?2011-07-16
  • 0
    I voted to close as duplicate because the question explicitly twice states that the OP has the same question. If that's not actually the case, I retract my close vote. @onlybluemoon, in that case I suggest you take out the two statements "I have the same question" and instead independently formulate your question. You can refer to the other question in formulating your question, so you don't have to repeat all the equations that are already there.2011-07-16
  • 0
    Thank you for your reply. Yes, My question is "why do we set up the problem as δf/δt=−(E−L) and how we choose the δt". The f which meets the E-L=0 is a extreme. Althought the gradient descent method is also a iteration processing, the right side of its equation is - $\nabla$f if one want the local minimum. But does E-L is a type of $\nabla$f.2011-07-16
  • 0
    @onlybluemoon: Please write this not only in the comments but edit your question to make it clear and complete. People shouldn't have to read through long exchanges in the comments to understand the question.2011-07-16
  • 0
    Hi, joriki, I describe my question in more detail. I did not delete the link since it is the same as me. The reason why I post a new question and cite this link is I can not understand the answers on that link although I read it several times. As a result, I need new answers. So if there is still inappropriate place, just tell me. Thank you for your suggestion.2011-07-16
  • 0
    @onlybluemoon: There is a direct contradiction. First you state that your question is the same as in that other post, and then you state that your question is "why do we set up the problem as δf/δt=−(E−L) and how we choose the δt". That's not the question in the other post, which is "I do not understand however how the author jumps from the E-L equations to the gradient descent method. How is the time variable t included?" These are quite different questions. Please clarify.2011-07-16
  • 0
    joriki, Thank you for reminding me. I check it again. And I found it is really different with mine ,but they have related though. Maybe that is why I cannot understand the answers on that post.embarrassing... I edit my question now.2011-07-16
  • 0
    onlybluemoon: thank you very much for clarifying your question. I think it is clear now that the question is different but related from the one you've linked to. One more thing: If you want people in a comment thread to see your comments add `@` in front of their name, for example @joriki [Note however that only one person per comment can be notified.] I'll have to run now, but I'll check back a bit later if there is an answer. If not, I'll try to write something.2011-07-16
  • 0
    @Theo Buehler, Thank you. It is a happy thing I have ever met in these days.2011-07-16
  • 0
    No, I'm afraid I don't understand your supplemental question. I stated in my answer that there's no $\delta t$ in the paper you linked to, but you're again asking how to choose $\delta t$. As far as I can tell, there's no choice to be made in that paper. If your question is no longer specifically about that paper, but generally about the numerical solution of differential equations, you should probably pose a new question. If you do so, please put as much effort as you can into formulating it clearly; perhaps get a native speaker to straighten it out; it's difficult to follow what you write.2011-07-17
  • 0
    @joriki, Thank you for reminding me. I try to clarify my question in other question. It is frustrating since my English mark is not bad now.2011-07-18

1 Answers 1

8

Thanks for clarifying your question. I'll assume that you're using $f$ to refer to the function denoted by $u$ in the paper you link to.

First, let me remark that I wouldn't put too much stock in that paper. It goes to a lot of effort just to derive the well-known linear Gaussian blurring convolution for noise reduction. The diffusion approach only adds value if you consider anisotropic diffusion; as long as you stick with the linear approach used in that paper, you might as well just do Gaussian blurring and forget about the whole diffusion aspect.

You seem to be thinking of an iterative gradient-descent scheme, in which some step $\delta t$ is chosen and the solution is modified by adding a corresponding multiple of the gradient to it. This is not what's happening in this paper -- there is no $\delta t$ in the paper; rather, the paper considers a continuous flow where at each point the flow is in the direction of the gradient. A discrete $\delta t$ that would have to be chosen would only enter into this if you were to try to calculate that flow numerically by discretizing it. This is not done in the paper; rather, the paper solves the heat flow equation analytically and derives the Gaussian blurring convolution from it, with the width of the Gaussian blurring kernel increasing with the time $t$.

I interpret the other part of your question to ask why the expression equated to zero in the Euler-Lagrange equations is like a gradient. The Euler-Lagrange equations are derived by varying the function $u$, calculating the resulting variation of the objective function $E$ and setting it to zero. The expression that is being set to zero is the variation of $E$ with respect to $u$ at each point. You can think of this in analogy to a function of several variables, where the gradient is obtained by considering the variation of the function under variation of each of the variables. Each function value of $u$ at a point is like a variable; the difference is that $u$ is defined at uncountably many points while a function of several variables only has a finite number of variables. The paper seems to equivocate somewhat on whether $u$ is defined on a continuum or on the pixels; if you consider it defined on the pixels, then each of its values at one of the pixels would be one of the variables, and the gradient of $u$ would be the collection of the variations of $u$ with respect to each of these variables, and this is what the expression in the Euler-Lagrange equations represents.

There might be more to explain, but I'll stop here and see whether this makes sense for you or whether I'm going off on a wrong track.

  • 2
    +1 for this answer! That's so much better than whatever I would have been able to come up with...2011-07-16
  • 0
    @joriki,Thank you for your explanation. It is so helpful. I guess I understand all of what you was saying. In the fourth paragraph, you inspired me. $\delta$E is calculated according to f, the f which meet the equation $\delta$E=0 is a extrem - a stationary point, just like its gradient =0 .2011-07-17
  • 0
    @Theo Buehler, Thank you for your attention and comments.2011-07-17
  • 0
    @joriki, Hi, For the third paragraph,sorry I did not notice that. So I link the other paper, like [link](http://heim.ifi.uio.no/~kent-and/inf5670/imexrk.pdf)2011-07-17
  • 0
    @joriki, Hi, For the third paragraph,sorry I did not notice that. Let me specify in the above quesion.2011-07-17