I'm wondering how the concept of convolution can be extended to 2D. As example, let us take a constant function $z=f(x,y)=1$ with support on $[0..1]^2 \in \mathbb{R}^2$ (see Fig. 1).
If we now convolute $f$ with itself (see Fig. 2), in the direction $(1,1)^T$, we should end up with a linear hexagonal hat function (see Fig. 3), which has the value $z=1$ at the center.

There are at least two ways to compute the resulting function/surface, but for both these methods I'm not completely sure how to apply them.
- Integrating, just like convoluting two univariate functions $f(t)$ and $g(t)$: $$(f * g)(t) = \int_{-\infty}^\infty f(\tau) g(t-\tau) \; d \tau$$ However, in the bivariate case I guess we should use a double integral, since we're convoluting surfaces now, not curves. Moreover, we have to consider the direction in which we convolute, which in this example is $(1,1)^T$. 
- Fourier transformation. Since convolution reduces to multiplication in the frequency domain, this seems a useful method. Suppose that we would know the Fourier transform $\hat{f}$ of $f(x,y)$, how should we then involve the direction? 
I hope somebody can demonstrate one or both of these methods, using the example above. References to bivariate convolution, preferably with examples, are of course also welcome!
[Edit]: Ok, using the expression $$(f*g)(x,y) = \int f(x',y')g(x-x',y-y')dx'dy'$$ I end up with different values than expected. For example, at $(.5,.5)$ the computed value is $.25$, instead of the expected $.5$.
Just to be sure, the convolution of the unit square (i.e. a constant value $z=1$ on $[0..1]^2 \in \mathbb{R}^2$) in the direction $(1,1)^T$ should result in the hexagonal hat function, correct?

Additionally, instead of the function $y$, I got the function $xy$ for the highlighted green part of the resulting function. How does one end up with $y$?
 
            
