4

First off: I haven't taken a functional analysis course yet, so please keep that in mind when explaining. I just rather randomly read about this stuff and it started to interest me.

I'm having trouble understanding how the following contorted mappings work. I don't want any more detail than necessary; just a plain explanation of which functions they take to which space and why it makes sense to define the mappings the way they are defined (maybe like a diagram in words).

1) In this Wikipedia article it was mentioned that there is a natural map $F$ from a vector space $V$ over a field $K$ to $V''$ (the dual of the dual, meaning the space of all linear functionals that themselves take linear functionals as arguments and return values in $K$, if I understood it correctly), given by $F(x)(f)=f(x)$.

(The notation $F(x)(f)$ would imply that $F(x)$ is itself a function taking another function $f$ as its argument. But what would be the domain and range of this $f$? And why does it make sense to assign to $F(x)(f)$ the value of $f$ at $x$?)

2) A similar question (maybe more difficult), from these lecture notes, bottom of page 61: Let $I$ be compact and $E,F$ Banach spaces, and consider the mapping $\mathscr{F}:C(I,L(E,F)) \rightarrow L(E,C(I,F))$ defined by $\mathscr{F}(f)=(x\mapsto (t\mapsto f(t)(x)))$. I understand that to each $f$ there is an associated map, let's call it $\mathscr{G}:E \rightarrow C(I,F)$, such that $\mathscr{G}(x)$ is the continuous map $t\mapsto f(t)(x)$ from $I$ to $F$. What I don't understand is why the author says this mapping simulates the change of variables. If it does, why did the constraints to work in the spaces of continuous, respectively linear, maps have to be imposed? And why does this last mapping from $I$ to $F$ take the value $f(t)(x)$?

  • 1
    All of what you're confused about can be understood on the level of sets; it isn't actually about functional analysis. See http://en.wikipedia.org/wiki/Currying .2011-09-02

1 Answers 1

4
  1. $V$ is a vector space; $V^*$ is the dual. So the elements of $V^*$ are linear functions $\mathbf{f}\colon V\to K$ ($K$ the field). The double dual $V^{**}$ is the dual of $V^*$, so the elements of $V^{**}$ are linear functions $\mathbf{x}$ with domain $V^*$ and codomain $K$; i.e., $\mathbf{x}\colon V^*\to K$.

    Since $F$ is a function from the vector space $V$ to the double dual $V^{**}$, that means that for each vector $x\in V$, $F(x)$ is in $V^{**}$. So $F(x)$ is a function whose domain is $V^*$ and whose range is $K$. If $\mathbf{f}\in V^*$, then $[F(x)](\mathbf{f})$ (that is, the function whose name is "$F(x)$" evaluated at $\mathbf{f}\in V^*$) is a scalar. Now, $\mathbf{f}$ is itself a function with domain $V$ and images in $K$, so $\mathbf{f}(x)$ "makes sense" (we are evaluating a function with domain $V$ at an element of $V$), and is in fact a scalar. So we can define $[F(x)](\mathbf{f})$ as $\mathbf{f}(x)$, because both sides are scalars, which is what the value of $F(x)$ at an element of $V^*$ has to be.

    Added. There still seems to be some confusion here, so let's make sure everything works explicitly here. We have now seen that in fact the formula "makes sense" in that both sides are what they ought to be: scalars. Now, why does this define a linear function from $V$ to $V^{**}$?

    Let $x\in V$. We want $F(x)$ to be an element of $V^{**}$, so $F(x)$ needs to be a function with domain $V^*$. How do we describe a function with domain $V^*$? By saying what it does to every vector in $V^*$. So take a vector $\mathbf{f}\in V^*$; then $F(x)$ is the function that is defined by the rule $[F(x)](\mathbf{f}) = \mathbf{f}(x)$. This defines a function, called $F(x)$, with domain $V^*$ and images in $K$. In order to verify that $F(x)$ is in fact an element of $V^{**}$, we need to verify that this function is in fact a linear function with domain $V^*$. To that end, let $\mathbf{f},\mathbf{g}\in V^*$, and let $\alpha\in K$. We need to show that $$[F(x)](\mathbf{f}+\alpha\mathbf{g}) = [F(x)](\mathbf{f}) + \alpha[F(x)](\mathbf{g}).$$ Indeed, the left hand side is, by definition of $F(x)$ and by the definition of what it means to add linear transformations, equal to: $$[F(x)](\mathbf{f}+\alpha\mathbf{g}) = (\mathbf{f}+\alpha\mathbf{g})(x) = \mathbf{f}(x) + \alpha\mathbf{g}(x).$$ The right hand side is equal to, by definition of $[F(x)]$, $$[F(x)](\mathbf{f}) + \alpha[F(x)](\mathbf{g}) = \mathbf{f}(x) + \alpha\mathbf{g}(x).$$ So they are indeed equal, hence $[F(x)]$ is a linear function with domain $V^*$ and image in $K$; that is, $[F(x)]$ is an element of $V^{**}$, as desired.

    Finally, we need to check that the map $F\colon V\to V^{**}$ is itself a linear transformation. That is, we need to check that for all $x,y\in V$ and all $\beta\in K$, $$F(x+\beta y) = F(x) + \beta F(y).$$ Now, this is an equality of functions; both sides are functions with domain $V^*$ and codomain $K$, so in order to verify that the two functions are equal we need to check that they have the same value at every element of the domain, which is $V^*$. So let $\mathbf{g}\in V^*$. We need to check that $[F(x+\beta y)](\mathbf{g}) = [F(x)](\mathbf{g}) + \beta[F(y)](\mathbf{g})$. And indeed, we have: $$\begin{align*} [F(x+\beta y)](\mathbf{g}) & = \mathbf{g}(x+\beta y) &&\text{(by definition of }F\text{)}\\ &= \mathbf{g}(x) + \beta\mathbf{g}(y) &&\text{(because }\mathbf{g}\text{ is linear)}\\ &= [F(x)](\mathbf{g}) + \beta[F(y)](\mathbf{g}) &&\text{(by definition of }F\text{)} \end{align*}$$ So $F$ is linear, and thus we have a linear transformation $F\colon V\to V^{**}$ with domain $V$ and codomain $V^{**}$.
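For readers who find code easier to parse than nested brackets, here is a minimal Python sketch of the rule $[F(x)](\mathbf{f})=\mathbf{f}(x)$. It assumes $V=\mathbb{R}^2$ with vectors stored as tuples and functionals as ordinary Python functions; the names `evaluation_map`, `add_scaled`, `f`, and `g` are illustrative choices, not anything from the question. The numeric spot checks only illustrate the two linearity statements above; they do not prove them.

```python
def evaluation_map(x):
    """Return F(x): the function on V* that sends a functional f to f(x)."""
    return lambda f: f(x)


# Two sample linear functionals on R^2 (hypothetical choices).
def f(v):
    return 2 * v[0] + 3 * v[1]


def g(v):
    return v[0] - 4 * v[1]


def add_scaled(f1, alpha, f2):
    """The functional f1 + alpha*f2, defined pointwise."""
    return lambda v: f1(v) + alpha * f2(v)


x = (1, 2)
y = (5, -1)
alpha = 7
beta = 3

Fx = evaluation_map(x)

# F(x) is linear on V*: F(x)(f + alpha*g) == F(x)(f) + alpha*F(x)(g).
lhs_dual = Fx(add_scaled(f, alpha, g))
rhs_dual = Fx(f) + alpha * Fx(g)

# F is linear on V: F(x + beta*y)(g) == F(x)(g) + beta*F(y)(g).
x_plus_beta_y = (x[0] + beta * y[0], x[1] + beta * y[1])
lhs_vec = evaluation_map(x_plus_beta_y)(g)
rhs_vec = Fx(g) + beta * evaluation_map(y)(g)
```

Note that `evaluation_map` never inspects `f`; it only applies it. That is the sense in which the map needs no choices and is called natural.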

  2. $\mathcal{C}(I,L(E,F))$ is the set of continuous functions whose domain is $I$ and whose values are elements of $L(E,F)$ (the continuous linear functions from $E$ to $F$). So if $\mathbf{f}\in\mathcal{C}(I,L(E,F))$, and $a\in I$, then $\mathbf{f}(a)$ is a continuous linear function from $E$ to $F$.

    Now, let $f\in \mathcal{C}(I,L(E,F))$. We want to define $\mathscr{F}(f)$ to be an element of $L(E,\mathcal{C}(I,F))$. That is, we want $\mathscr{F}(f)$ to be a function with domain $E$ and whose values are continuous functions from $I$ to $F$. How do we describe a function? By saying what its values are at a point in the domain. So let $x\in E$. The notation $x\mapsto A$ means that we want the function to send $x$ in the domain to $A$. So $$\mathscr{F}(f) = \Bigl( x \longmapsto \bigl(t\mapsto f(t)(x)\bigr)\Bigr)$$ is telling you that $\mathscr{F}(f)$ is a function that will take $x$ to... well, the image of $x$ is a function (from $I$ to $F$). How do you describe a function? By saying what it does to every point in the domain. So I need to tell you what $[\mathscr{F}(f)](x)$ does to each $t\in I$. What it does is to send $t$ to $f(t)(x)$. This makes sense because $f(t)$ is a linear function from $E$ to $F$, so evaluating it at $x\in E$ gives an element of $F$.

    In general, one can think of a function of two variables as a function of a single variable whose values are functions of a single variable. For example, given $f(x,y) = x^2-y^2$, you can think of it instead as a bunch of functions, one for each value $a$ of $x$: $f_a(y) = a^2-y^2$. Intuitively, the first coordinate tells you which function to use, and the second coordinate tells you where to evaluate it. So instead of thinking of this function as a function with domain $\mathbb{R}\times\mathbb{R}$ and range $\mathbb{R}$, we can think of it as a function with domain $\mathbb{R}$ and range $\mathbb{R}^{\mathbb{R}}$ (functions from $\mathbb{R}$ to $\mathbb{R}$). This is called currying. But you can also think of $f(x,y)=x^2-y^2$ as follows: the second coordinate tells you which of a family $g_b$ of functions to use, and then you evaluate $g_b(x)=x^2-b^2$ at the first coordinate.
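The two readings of $f(x,y)=x^2-y^2$ can be sketched in a few lines of Python. The helper names `curry_first` and `curry_second` are made up for this illustration; they correspond to the families $f_a$ and $g_b$ in the paragraph above.

```python
def f(x, y):
    return x**2 - y**2


def curry_first(func):
    """Turn func(x, y) into a -> (y -> func(a, y)): x picks the function f_a."""
    return lambda a: lambda y: func(a, y)


def curry_second(func):
    """Turn func(x, y) into b -> (x -> func(x, b)): y picks the function g_b."""
    return lambda b: lambda x: func(x, b)


f_3 = curry_first(f)(3)    # f_3(y) = 9 - y^2
g_2 = curry_second(f)(2)   # g_2(x) = x^2 - 4
```

Evaluating `f_3` at `2` and `g_2` at `3` both recover `f(3, 2)`: the same number, reached by fixing the arguments in a different order.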

    By the same token, a function whose values are functions can be thought of instead as a function of several variables. So an element of $C(I,L(E,F))$ can be thought of in two ways:

    • As a continuous function with domain $I$ and image linear maps from $E$ to $F$; or
    • As a function with domain $I\times E$, continuous in the first component and linear in the second component, and values in $F$.

    Likewise, $L(E,C(I,F))$ can be thought of in two ways:

    • As a function with domain $E$ and image continuous functions from $I$ to $F$; or
    • As a function with domain $E\times I$ and values in $F$, linear in the first coordinate and continuous in the second.

    As you can see by looking at the second descriptions of each, the two sets, $C(I,L(E,F))$ and $L(E,C(I,F))$ "should" be the same thing. The only difference is that in the first case we are thinking of $I$ as giving the "index" of the function you will use, which you then evaluate at a point in $E$, while in the second we think of $E$ as giving the "index" of the function you will use, which you then evaluate at a point in $I$. The description given just formalizes the fact that the first description of each set actually gives the same set "morally".
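As a rough illustration of the map $\mathscr{F}(f)=\bigl(x\mapsto(t\mapsto f(t)(x))\bigr)$, the sketch below forgets continuity and linearity and only tracks which argument is supplied first. The concrete choices ($I=[0,1]$, $E=\mathbb{R}^2$, $F=\mathbb{R}$, and the particular family `f`) are assumptions made for the example, and `swap_arguments` is a hypothetical name.

```python
def swap_arguments(f):
    """Script-F on the level of sets: send t -> (x -> v) to x -> (t -> f(t)(x))."""
    return lambda x: lambda t: f(t)(x)


def f(t):
    """For each t in I = [0, 1], a linear map R^2 -> R (hypothetical example)."""
    return lambda x: t * x[0] + (1 - t) * x[1]


G = swap_arguments(f)      # G plays the role of script-F(f)
path = G((4, 10))          # a function of t alone: t -> 4t + 10(1 - t)
```

Here `f` uses `t` as the index and evaluates at `x`, while `G` uses `x` as the index and evaluates at `t`. Applying `swap_arguments` to `G` gives back a function that behaves like `f`, which is the "same set, morally" remark in code form.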

As for restrictions: in principle you don't need any. If you let $Y^X$ be the set of all functions from $X$ to $Y$, then for any three sets $A$, $B$, and $C$, "currying" tells you that there are natural bijections $$(C^B)^A \longleftrightarrow C^{A\times B} \longleftrightarrow (C^A)^B$$ (one reason for the notation). When dealing with Banach spaces, you are mostly interested in functions that "respect the structure" (linear and continuous); when dealing with topological spaces, you are mostly interested in functions that preserve the structure (continuous). Here, we have a topological space, $I$, and two Banach spaces, $E$ and $F$. So we want to focus only on the functions that "respect the structure": linear when they go between $E$ and $F$, and continuous when the domain is $I$.
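On small finite sets the bijections can be checked by brute force. The sketch below assumes $|A|=2$, $|B|=3$, $|C|=2$ and stores each function as a Python dict; the sets and the helper names `curry`, `uncurry`, and `swap` are choices made for the illustration.

```python
from itertools import product

A = [0, 1]
B = ["p", "q", "r"]
C = [False, True]


def curry(h):
    """C^(A x B) -> (C^B)^A, with functions stored as dicts."""
    return {a: {b: h[(a, b)] for b in B} for a in A}


def uncurry(k):
    """(C^B)^A -> C^(A x B), the inverse of curry."""
    return {(a, b): k[a][b] for a in A for b in B}


def swap(k):
    """(C^B)^A -> (C^A)^B: exchange the order of the two arguments."""
    return {b: {a: k[a][b] for a in A} for b in B}


pairs = list(product(A, B))
# Every function A x B -> C, one dict per choice of values.
all_functions = [dict(zip(pairs, values))
                 for values in product(C, repeat=len(pairs))]

round_trips_ok = all(uncurry(curry(h)) == h for h in all_functions)
# Dicts are unhashable, so compare curried images via their repr strings.
distinct_curried = {repr(curry(h)) for h in all_functions}
```

There are $|C|^{|A|\cdot|B|}=2^6=64$ functions on the product, and `curry` sends them to 64 distinct curried functions, matching $(|C|^{|B|})^{|A|}=(2^3)^2$. That exponent arithmetic is the "one reason for the notation".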

Added. "Natural bijections" above means:

  • Given any sets $A$, $X$, and $C$, a map $f\colon X\to A$ induces a map $C^A\to C^X$ by "precomposition with $f$": given any $h\in C^A$, the map $h\circ f$ is in $C^X$.
  • Given any sets $A$, $C$, and $Y$, a map $g\colon C\to Y$ induces a map $C^A\to Y^A$ by "postcomposition with $g$": given any $h\in C^A$, the map $g\circ h$ is in $Y^A$.

The bijections $(C^B)^A \leftrightarrow C^{A\times B} \leftrightarrow (C^A)^B$ are such that given maps $h\colon C\to Z$, $g\colon Y\to B$, and $f\colon X\to A$, the induced maps $(C^B)^A\to (Z^Y)^X$, $C^{A\times B}\to Z^{X\times Y}$, and $(C^A)^B\to (Z^X)^Y$ commute with the corresponding bijections: $$\begin{array}{ccccc} (C^B)^A & \longleftrightarrow & C^{A\times B} & \longleftrightarrow & (C^A)^B\\ \downarrow &&\downarrow &&\downarrow\\ (Z^Y)^X & \longleftrightarrow & Z^{X\times Y} & \longleftrightarrow & (Z^X)^Y \end{array}$$ is a commutative diagram.
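The left-hand square of the diagram can be spot-checked numerically. In the sketch below all six sets are taken to be the integers, and the maps `f`, `g`, `h`, and `phi` are arbitrary sample choices; this is an assumption-laden illustration of what "commutes" means, not a proof of naturality.

```python
def curry(phi):
    """C^(A x B) -> (C^B)^A."""
    return lambda a: lambda b: phi(a, b)


# Hypothetical sample maps f: X -> A, g: Y -> B, h: C -> Z on integers.
def f(x):
    return x + 1


def g(y):
    return 2 * y


def h(c):
    return -c


def induced_on_pairs(phi):
    """C^(A x B) -> Z^(X x Y): precompose with (f, g), postcompose with h."""
    return lambda x, y: h(phi(f(x), g(y)))


def induced_on_curried(k):
    """(C^B)^A -> (Z^Y)^X: the same recipe, one argument at a time."""
    return lambda x: lambda y: h(k(f(x))(g(y)))


def phi(a, b):
    return a * a - b


# Two routes from C^(A x B) to (Z^Y)^X around the square.
down_then_across = curry(induced_on_pairs(phi))
across_then_down = induced_on_curried(curry(phi))

samples = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
square_commutes = all(down_then_across(x)(y) == across_then_down(x)(y)
                      for x, y in samples)
```

Both routes produce the function sending $x$ and $y$ to $h(\varphi(f(x),g(y)))$, so it does not matter whether you curry first or change sets first.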

  • 2
    You should copyright your answers and put them together into a book of solved problems; I'd steal a scanned copy from the web...er, I mean, I'd buy it.2011-09-02
  • 1
    As a computer scientist and occasional functional programmer, I've always been amazed at how complex and opaque mathematicians' notation manages to make these things look. They are very simple plumbing operations with almost no content -- why the idiosyncratic, unsystematic notation, different each time you encounter it? I mean, capital F's in various fonts?? I'd kill (well, at least maim slightly) for a book that explained functional analysis or differential geometry in Haskell notation, but apparently there's a secret covenant among authors that it _has_ to look forbidding ...2011-09-02
  • 0
    Unlike in other fields, notably Electrical Engineering and the IEEE, there is no glory in math in setting up standards, of notation or otherwise, so this is, unfortunately, not likely to change any time soon.2011-09-02
  • 1
    @gary: There is no glory, perhaps, but good notation *is* important. As Leibniz said, find good notation and it can solve half your problem. Witness how much more flexible calculus is to use with Leibniz's notation than with Newton's. But it really is a matter of usefulness. We tend to use $f$, $g$, $h$ for functions, $x$, $y$ for points, $v$, $w$ for vectors, as a sort of "visual cue". This means that if we have lots of functions around, but want to keep those visual cues, we need to have ways of differentiating among several different $f$s and $g$s and $h$s, $v$s and $w$s, etc.2011-09-02
  • 1
    Actually, now that I've calmed down a bit (sorry for hijacking the comment thread) I think what bothers me most is not how things are named, but that textbook authors almost never deign to spell out the _types_ of definitions explicitly. If the author here had only written "Define $\mathscr F : (I \to V \to W) \to V \to I \to W$ as ...", there would almost have been no need to look at the actual definition, except to verify that it did the natural thing. The arrows can be decorated to distinguish continuous and linear maps where appropriate.2011-09-02
  • 0
    @Arturo: I agree with you completely; I was just agreeing with Henning on his lament on how difficult it can be to navigate through much of mathematical notation. It would be great if there were congresses every few years where mathematical notation would be standardized.2011-09-02
  • 0
    @Henning Makholm: Could you please tell me why you wrote $\mathscr{F}: \ (I \rightarrow V \rightarrow W) \rightarrow V \rightarrow I \rightarrow W$ instead of $\mathscr{F}: \ (I \rightarrow (V \rightarrow W)) \rightarrow ((V \rightarrow I) \rightarrow W)$ ? After reading through the article on currying on Wikipedia it seems that this notation is the precise one...2011-09-03
  • 0
    @Arturo: Thanks for the great answer. I still have two questions left, though. Concerning 1): It is now clear to me how the map $F$ works and that the equation $F(x)(f)=f(x)$ is correct in the sense that both sides are scalars. But does this equation really define the map $F:V \rightarrow V^{**}$, as was said in the Wikipedia article? Shouldn't the definition state precisely which element of $V^{**}$ the value $F(x)$ is?2011-09-03
  • 0
    (Continuation) Concerning 2): You said that there are natural bijections between $(C^B)^A \ldots$ . By "natural", did you mean "obvious/easy to construct once you have fixed $C$, $B$, $A$", or did you mean something with a category-theory flavour? If you meant the latter, could you maybe describe it in layman's terms? (Please bear with me: I just read somewhere that in category theory the word "natural" has a precise meaning, but I personally don't have even the vaguest idea what categories are or how the word "natural" is given a precise meaning there.)2011-09-03
  • 0
    @resu, in the programming language tradition that I'm advocating for here, the $\to$ associates to the right, so $(I\to V\to W)\to V\to I\to W$ is the same as $(I\to(V\to W))\to(V \to(I\to W))$. The reason for this convention is that it minimizes the number of parentheses needed for a curried function. Your reading is _almost_ right, except that $((V\to I)\to W)$ would be something that took a _function_ in $V\to I$ as argument and gave you a $W$, whereas you actually have something that takes a $V$ and gives you a function from $I$ to $W$.2011-09-03
  • 0
    @resu: (1) You need to prove that the image is in fact a linear function (i.e., that $F(x)$ is linear on $V^*$), and that the map itself is linear (linear on $V$); I will add that. (2) Natural in the sense of "natural transformations". E.g., given a map $g\colon C\to Z$, you automatically get a map $C^A\to Z^A$ by "post-composing with $g$". If you have a map $f\colon X\to A$, you get a map $C^A\to C^X$ by "pre-composing with $f$". (cont)2011-09-03
  • 0
    @resu: If you have bijections $X\to A$, $Y\to B$, and $C\to Z$, they induce bijections $(C^B)^A\to (Z^Y)^X$, $(C^A)^B\to (Z^X)^Y$, and $C^{A\times B}\to Z^{X\times Y}$; these bijections commute with the bijections we define in general between the sets $(R^S)^T$, $(R^T)^S$, and $R^{S\times T}$.2011-09-03
  • 0
    @resu: Re-reading your comment: The map does in fact explicitly tell you what element $F(x)$ is! Remember, elements of $V^{**}$ are themselves maps with domain $V^*$ and codomain $K$. In order to specify what function $F(x)$ is, I just need to tell you what it does to any element of $V^*$. That is **precisely** what the equation $[F(x)](\mathbf{f}) = \mathbf{f}(x)$ is saying: it's telling you which scalar $\mathbf{f}$ is sent to under the map $F(x)$.2011-09-03
  • 0
    @resu: Nonetheless, the bijections between $(C^B)^A$, $C^{A\times B}$, and $(C^A)^B$ are pretty easy: they are precisely the ones I described above via "currying": given a function $f\colon A\times B\to C$, think of it as a function from $A$ to the functions from $B$ to $C$ by sending $a$ to $f(a,-)$; or from $B$ to the functions from $A$ to $C$ by sending $b$ to $f(-,b)$.2011-09-03
  • 0
    @Henning: I'm sure there are those who say Haskell notation is forbidding too. Actually, even $\lambda$ notation is not well-known in mathematics, and there are probably mathematicians who balk at the idea of writing functions as $f = \lambda x . \, x^2$. As for spelling out the type signature of a function, I think this is a matter of style. Most of the texts I've read recently have type declarations for functions. The trouble, perhaps, is that not everything is a function...2011-09-04
  • 0
    Great answer Arturo, thanks.2011-09-07