59

A real valued function $f$ defined in $(a,b)$ is said to be convex if $$f(\lambda x+(1-\lambda)y)\le \lambda f(x)+(1-\lambda)f(y)$$ whenever $a < x < b,\; a < y < b,\; 0< \lambda <1$.
Prove that every convex function is continuous.

The usual proof uses the following fact:
If $a < s < t < u < b$ then $$\frac{f(t)-f(s)}{t-s}\le \frac{f(u)-f(s)}{u-s}\le\frac{f(u)-f(t)}{u-t}.$$
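
As a hedged sketch of why this fact holds (the weight $\lambda$ below is notation introduced here for illustration, not taken from the original post): write $t=\lambda s+(1-\lambda)u$ with $\lambda=\frac{u-t}{u-s}\in(0,1)$, so that $1-\lambda=\frac{t-s}{u-s}$. Convexity gives $f(t)\le\lambda f(s)+(1-\lambda)f(u)$, and rearranging in two ways yields
$$\begin{aligned} f(t)-f(s)&\le(1-\lambda)\bigl(f(u)-f(s)\bigr)=\frac{t-s}{u-s}\bigl(f(u)-f(s)\bigr),\\ f(u)-f(t)&\ge\lambda\bigl(f(u)-f(s)\bigr)=\frac{u-t}{u-s}\bigl(f(u)-f(s)\bigr). \end{aligned}$$
Dividing the first line by $t-s>0$ and the second by $u-t>0$ gives the left and right inequalities, respectively.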

Does any other version of this proof exist?

  • 2
    All proofs I have seen boil down to something similar. The above fact is useful in that it shows that right- and left-hand derivatives exist at each point, and hence that the function is locally Lipschitz. This is true in $\mathbb{R}^n$ as well. 2012-12-14
  • 14
    Your title is a bit misleading. It is *not* the case that every convex function is continuous. What is true is that every function that is finite and convex *on an open interval* is continuous on that interval (including $\mathbb{R}^n$). But for instance, a function $f$ defined as $f(x)=-\sqrt{x}$ for $x>0$ and $f(0)=1$ is convex on $[0,1)$, but not continuous. 2014-08-15
  • 3
    Furthermore, in convex analysis we frequently refer to so-called "extended valued functions" defined on the extended real line $[-\infty,+\infty]$. Continuing my example above, for instance, we could define $f(x)=+\infty$ for $x<0$. If we define the secant rule above carefully, using sensible conventions for arithmetic on infinities, you will find that it holds for any points $(a,b)\in\mathbb{R}^n$, even $a,b<0$! 2014-08-15
  • 2
    Ha ha! I did not notice that this question is almost two years old! Well. I think the clarifications are still important. 2014-08-15
  • 2
    What is the "usual proof" that uses that fact? 2015-03-08
  • 0
    It also doesn't hold if we are dealing with infinite-dimensional spaces. 2016-11-18
  • 0
    See also: https://math.stackexchange.com/questions/24676/convex-function-in-open-interval-is-continuous 2018-01-22

8 Answers 8

102

The pictorial version. (But it is the same as your inequality version, actually.)

Suppose you want to prove continuity at $a$. Choose points $b,c$ on either side. (This fails at an endpoint; in fact, the result itself fails at an endpoint.)

A1

By convexity, the $c$ point is above the $a,b$ line, as shown:

A2

Again, the $b$ point is above the $a,c$ line, as shown:

A3

The graph lies inside the red region,

A4

so obviously we have continuity at $a$.
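
In inequality form, the red region can be sketched as follows. This assumes the labelling $b<a<c$ (the pictures are not reproduced here, so which side $b$ lies on is an assumption) and considers a point $x$ with $a<x<c$. Applying the slope inequality from the question to $b<a<x$ and to $a<x<c$ gives
$$f(a)+\frac{f(a)-f(b)}{a-b}\,(x-a)\;\le\; f(x)\;\le\; f(a)+\frac{f(c)-f(a)}{c-a}\,(x-a).$$
Both bounds tend to $f(a)$ as $x\to a^+$, so $f(x)\to f(a)$ from the right. The same argument with the roles of $b$ and $c$ exchanged handles $x\to a^-$.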

16

I would be careful to rephrase the query as:

Is there an alternative proof of the fact that a real-valued convex function defined on an open interval of the reals is continuous?

This matters because, in general, convex functions need not be continuous, not even when they are defined on open sets in topological vector spaces.

An alternative might be to call the point of discontinuity $x$. Then there exist points $x'$ arbitrarily close to $x$ whose values $f(x')$ are bounded away from $f(x)$ by a constant. Depending on how you want your proof structured, you may think it sufficient to note that this implies the epigraph of the function is not closed and therefore the function is not lower semicontinuous. But every convex function on the reals is lower semicontinuous on the relative interior of its effective domain, which equals the domain of definition in this case.

A more general proof of this property is given in "Convexity and Optimization in Banach Spaces." The authors prove the proposition that every proper convex function defined on a finite-dimensional separated topological linear space is continuous on the interior of its effective domain. You can likely see the relevant proof using Amazon's or Google Books' "look inside" feature.

12

You can do a proof by contradiction.

Assume $f\in\mathbb{R}^\mathbb{R}$ is convex, but not continuous at some $x_0\in(a,b)$. This means that: $$ \exists_{\epsilon>0}\forall_{\delta>0}\exists_{x\in(x_0-\delta,x_0+\delta)} : |f(x)-f(x_0)|\ge\epsilon$$ This formula implies that once we fix $\delta$, $f$'s graph has infinitely many points in one of the areas: I, II, III or IV, with $x_0$ as an accumulation point of their $x$ coordinates:

[Figure: the four areas I, II, III and IV around the point $(x_0,f(x_0))$]

We split our proof into 2 cases:

$(1)$ The area is either I or II. In this case we select some point on the function's graph from that area: $(x_1,f(x_1))$, and draw a line segment from that point to $(x_0,f(x_0))$. We then select another point on the graph from the same area: $(x_2,f(x_2))$, whose $x$ coordinate is closer to $x_0$ than the intersection of our line segment and $y=f(x_0)+\epsilon$. This contradicts the convexity of $f$, since $(x_2,f(x_2))$ then lies strictly above the segment.
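
Case $(1)$ can be written out as a short computation. This is a sketch in which $\lambda$ and $\lambda^*$ are names introduced here for illustration. Put $\lambda=\frac{x_2-x_0}{x_1-x_0}\in(0,1)$, so that $x_2=\lambda x_1+(1-\lambda)x_0$. The segment meets $y=f(x_0)+\epsilon$ at the parameter $\lambda^*=\frac{\epsilon}{f(x_1)-f(x_0)}\le 1$, and choosing $x_2$ closer to $x_0$ than that intersection means $\lambda<\lambda^*$. Convexity would then give
$$f(x_2)\le \lambda f(x_1)+(1-\lambda)f(x_0)=f(x_0)+\lambda\bigl(f(x_1)-f(x_0)\bigr)<f(x_0)+\epsilon,$$
which is incompatible with $f(x_2)\ge f(x_0)+\epsilon$ for a point in area I or II.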

$(2)$ The area is either III or IV. Assume, without loss of generality, that the area is III. In this case we select some point on the function's graph to the right of $x_0$, say: $(x_1,f(x_1))$. We then draw a ray, which starts at $(x_1,f(x_1))$, and goes through $(x_0,f(x_0))$. We use $x'$ to denote the $x$ coordinate of the intersection of our ray and $y=f(x_0)-\epsilon$. If they do not intersect, we set: $x'=-\infty$. Next, we select another point: $(x_2, f(x_2))$ on $f$'s graph, in area III, with $x' < x_2 < x_0$. Then $(x_2,f(x_2))$ lies below the ray, so $(x_0,f(x_0))$ lies strictly above the segment joining $(x_2,f(x_2))$ and $(x_1,f(x_1))$, which contradicts the convexity of $f$.

2

I like GEdgar's pictorial explanation. The key idea is to show that $f$ is bounded above and below on each closed subinterval $[s,u]\subset(a,b)$. Boundedness above follows directly from the definition of convexity: on $[s,u]$ we have $f\le\max\{f(s),f(u)\}$. For boundedness below, see GEdgar's second and third pictures.

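Once we have proved that $f$ is bounded on $[s,u]$, say $|f(x)|\leq M$ there, fix

$$a < s < u < b$$ and let $$s < x < u \quad;\quad 0<\varepsilon<u-x.$$ Applying the slope inequality from the question to $s<x<x+\varepsilon$ and to $x<x+\varepsilon<u$, we can show $$\dfrac{f(x)-f(s)}{x-s}\,\varepsilon\leq f(x+\varepsilon)-f(x)\leq \dfrac{f(u)-f(x)}{u-x}\,\varepsilon,$$ so that $$|f(x+\varepsilon)-f(x)| \leq \dfrac{2M\varepsilon}{\min(x-s,\,u-x)} \rightarrow 0$$ as $\varepsilon \rightarrow 0$. Hence $f(x+)=f(x)$. In the same way we can show $f(x-)=f(x)$, so $f$ is continuous at $x$.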

2

I think there's a neat proof, but maybe I made a mistake.

Fix an $x\in(a,b)$ and take a $y$ such that $a<y<x$. We suppose that $f$ is discontinuous at $x$. If so, there is an $\epsilon>0$ such that we can choose a sequence $(\lambda_n)$ that satisfies $$0<\lambda_1<\lambda_2<\cdots<1;$$ $$\lambda_n\to 1;$$ $$f(\lambda_nx+(1-\lambda_n)y)\geq f(x)+\epsilon;$$ given that all the $\lambda_n$ are taken sufficiently near $1$ (i.e., you choose points sufficiently near $x$ and take the corresponding $\lambda$). It also holds that $$f(\lambda_nx+(1-\lambda_n)y)\leq \lambda_nf(x)+(1-\lambda_n)f(y).$$ Now, multiplying the first inequality by $-1$ and adding it to the second one, we get: $$0\leq (\lambda_n-1)f(x)+(1-\lambda_n)f(y)-\epsilon.$$ Letting $n\to\infty$: $$ 0\leq-\epsilon, $$ which contradicts $\epsilon>0$.

1

I presume you mean "proper" convexity, as in $(1)$ in the question, and not $(2)$ below.

That is, not merely midpoint convexity (also called Jensen convexity, or convexity in the sense of Jensen), as in $(2)$ below.

Midpoint convexity is hardly a minor condition, though: $(1)$ as defined in the question and $(2)$ below are not equivalent, but they are nonetheless "almost equivalent".

(2) $$F\left(\frac{x+y}{2}\right) \leq \frac{F(x)}{2} +\frac{F(y)}{2}$$

That is because, under relatively mild conditions (measurability, or regularity/boundedness conditions), midpoint convex functions (in the sense of Jensen) are convex in the traditional sense.

Apparently a real-valued midpoint convex function $(2)$ already satisfies the definition of convexity $(1)$ above, with the restriction that $\sigma$ ranges only over the rational numbers in the unit interval (all of them, not just $\tfrac12$ or the dyadic rationals). That holds before any continuity assumption is applied.

That is statement $(1.a)$ below, according to item 7.11 of the chapter "Continuous Convex Functions" in http://link.springer.com/chapter/10.1007%2F978-3-7643-8749-5_7

$$(1.a)\forall \, \sigma \in \mathbb{Q}\cap[0,1];\, \forall (x,y)\, \in\, \text{dom}(F):\, F(\,\sigma x + [1-\sigma] y\,)\, \leq\, \sigma F(x)\,+\,[1-\sigma]F(y).$$
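
As a small illustration of how midpoint convexity $(2)$ reaches weights other than $\tfrac12$ (a sketch for the single dyadic weight $\sigma=\tfrac14$ only, not the general rational case treated in the reference), apply $(2)$ twice:
$$F\left(\tfrac14 x+\tfrac34 y\right)=F\left(\frac{\frac{x+y}{2}+y}{2}\right)\le \frac12 F\left(\frac{x+y}{2}\right)+\frac12 F(y)\le \frac14 F(x)+\frac34 F(y).$$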

It is presumably a bit confusing to use the words "in the sense of Jensen" for midpoint convexity.

This is because I believe that Jensen developed, or helped develop, the inequalities that any function must satisfy in order to be "properly convex", as defined in the question $(1)$,

as well as the weaker notion of midpoint convexity $(2)$, which is apparently equivalent to $(1.a)$, at least in the real-valued case, and which is named after Jensen (midpoint convexity is often called Jensen convexity).

0

The best alternative proof (in my humble opinion) uses the fact that a function is convex if and only if its epigraph is a convex set. If a function is NOT continuous, then the epigraph can't be convex (obviously... draw a picture); but then, by the above, the function can't be convex. This proof uses the contrapositive.

  • 0
    Consider the extended-valued function from my above comments: $f(x)=+\infty$ if $x<0$, $f(x)=1$ if $x=0$, and $f(x)=-\sqrt{x}$ if $x>0$. The epigraph of this function is a convex set. But $f$ is not continuous! And its epigraph is not a closed set. 2014-08-15
  • 0
    Real valued function $\neq$ extended real valued function (the former is the setting of the question and my answer, the latter is your setting). In any case, your epigraph isn't even convex... for example, you can't realize it as the intersection of epigraphs of real valued functions. If you could, then I'd believe it was convex, because the intersection of an arbitrary number of convex sets is convex... but your set is definitely not convex. 2014-08-15
  • 1
    Oh, my epigraph is convex all right. It's not closed, but it's convex. Try and prove otherwise: give me two points in the set for which the secant lies outside of the set. The closure of the set adds the $(x,y)\in[0,0]\times[0,1)$. (Asking me to prove it is the intersection of epigraphs of real valued functions is a circular argument. Besides, the unit disc in $\mathbb{R}^2$ cannot be described as the intersection of epigraphs of convex functions, either.) 2014-08-15
  • 0
    As for real valued $\neq$ extended real valued: extended real valued functions are simply a short-hand way of expressing real-valued functions that are not defined on the entire real line (or $\mathbb{R}^n$, etc., as appropriate). For instance, consider the classic logarithmic barrier $f(x)=-\log x$, defined on $(0,+\infty)$. Is that not a real-valued function? 2014-08-15
  • 0
    Ah! You're right. It is a convex set, because the epigraph to the right of $x=0$ is clearly convex and so now we just have to show that the union of this epigraph with the epigraph when $x\le 0$ is also convex. But epi($f(x):x<0)=\{(x,t):t>f(x)\}=\{(x,t):t>\infty \} = \emptyset$ which makes the desired result vacuously true. The only problem with this example is that we want a function to be convex if and only if its epigraph is convex (and we can normally prove this). 2014-08-16
  • 0
    Well... I see the function is convex as well (as it should be). I guess I just don't like using extended real valued functions. (Also, in my last comment I should have had $t\ge f(x)$ and $t\in \mathbb{R}$, but the result is the same nonetheless.) 2014-08-16
  • 1
    It is *always* the case that a function is convex if and only if its epigraph is convex. That is true for functions defined on all of $\mathbb{R}^n$, and it's true for functions whose domain $\mathop{\textrm{dom}}(f)$ is only a subset thereof, like the logarithmic barrier. Indeed, the epigraph definition properly implies that $\mathop{\textrm{dom}} f$ is a convex set. Convex functions with limited domains are simply far too important in practice to ignore. The extended reals simplify the analysis of these functions, but you can do without that if you prefer. 2014-08-16
  • 0
    For instance, here is how you properly define the secant rule if you want to properly take domains into account: *a function* $f$ *is convex if and only if* $\mathop{\textrm{dom}} f$ *is a convex set, and if for all* $(x,y)\in\mathop{\textrm{relint}}\mathop{\textrm{dom}} f$ *and* $\lambda\in(0,1)$, $$f(\lambda x+(1-\lambda)y)\leq \lambda f(x)+(1-\lambda)f(y).$$ Now, wouldn't it be nice if we could simply jettison all of that domain business? We can with the extended real convention. 2014-08-16
  • 0
    That's not what I was worried about. I was worried about the fact that it was an extended real valued function (I thought the result might fail). But if your extended real valued function is an extension of a convex function, it's clear that it will be convex too (where we define the extension as $\infty$ outside the original domain). 2014-08-16
0

Here is the picture of my proof

Suppose, to the contrary, that $f$ is discontinuous at a point $c$ of the domain. Then there are a $d>0$ and a sequence $\{a_n\}_n$ converging to $c$ with $f(a_n) \notin (f(c)-d,f(c)+d)$ for all $n$. From $\{a_n\}_n$ we can always select a one-sided monotonic subsequence $\{c_n\}_n$, since at least one side of $c$ must contain infinitely many points of $\{a_n\}_n$. WLOG let it be the left-sided sequence and $f(c_1)

Now, take $M_1=(c_1,c)$ any point $x$ in $M_1$ must be $ as $\exists m$ s.t. $c_m\leq x\leq c_{m+1}$ hence $x$ must belong lower area of line joining of $f(c_m)$ & $f(c_{m+1})$ (By convexity of $f$) so for all $x$ in $M_1$ $f(x).

Now join the points $(c_1,f(c_1))$ and $(c,f(c))$ (let the line be $l_1$). It cuts the line $y=f(c)+d$ at a point $B$. Let the $x$-coordinate of $B$ be $c^{'}$.

Now note that $B=(c^{'},f(c^{'}))$ lies above $l_1$, since otherwise it would contradict the position of the point $A=(c,f(c))$ (by convexity of $f$). From $B$ we can always draw a line, with $A$ lying above it, that intersects the line $y=f(c)-d$ at a point $B'$. Let the $x$-coordinate of $B'$ be $c^{''}$. Finally, join $(c^{''},f(c^{''}))$ and $(c^{'},f(c^{'}))$; this contradicts the position of the point $A$.