The importance of the Hessian comes from the following fact:
Let $X$ and $Y$ be normed spaces, and let $L(X,Y)$ denote the space of continuous linear maps from $X$ to $Y$. There is a canonical isomorphism (in fact an isometry) $L(X,L(X,Y)) \simeq \operatorname{Bil}(X\times X\to Y),$ where $\operatorname{Bil}(X\times X\to Y)$ denotes the space of continuous bilinear maps from $X \times X$ to $Y$.
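Concretely, the isomorphism is just currying: a map $A \in L(X,L(X,Y))$ corresponds to the bilinear map $$B(u,v) = (Au)(v), \qquad u,v \in X,$$ and conversely $A$ is recovered from $B$ by freezing the first argument. One checks that $\|A\| = \|B\|$, which gives the isometry.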
Since the second derivative belongs to $L(X,L(X,Y))$, we can identify it with a bilinear map. If $f \colon \mathbb{R}^n \to \mathbb{R}$, then the bilinear maps from $\mathbb{R}^n \times \mathbb{R}^n$ to $\mathbb{R}$ can be represented by $(n\times n)$ matrices, and you get the Hessian matrix. If the codomain of $f$ has higher dimension, a single matrix no longer suffices: for $f \colon \mathbb{R}^n \to \mathbb{R}^m$ you need an $m$-tuple of $(n\times n)$ matrices, i.e. a third-order tensor.
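For instance, take $n = 2$ and $f(x,y) = x^2 y$. The Hessian at $(x,y)$ is $$H = \begin{pmatrix} 2y & 2x \\ 2x & 0 \end{pmatrix},$$ and the bilinear map it represents is $(\mathbf{v}_1, \mathbf{v}_2) \mapsto \mathbf{v}_1^{\top} H \mathbf{v}_2$.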
Additional material: assume that $\mathbf{f} \colon \mathbb{R}^n \to \mathbb{R}^m$ is of class $C^2$. Then the second differential at $\mathbf{x}_0 \in \mathbb{R}^n$ acts on generic vectors $\mathbf{v}_1$, $\mathbf{v}_2 \in \mathbb{R}^n$ by $\mathrm{d}^2 \mathbf{f}(\mathbf{x}_0)\colon (\mathbf{v}_1,\mathbf{v}_2) \mapsto \sum_{i,j=1}^n \frac{\partial^2 \mathbf{f}}{\partial x_{i}\,\partial x_{j}}(\mathbf{x}_0)\, v_{1,i}v_{2,j},$ where $v_{1,i}$ and $v_{2,j}$ are the components of $\mathbf{v}_1$ and $\mathbf{v}_2$, respectively.
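If you want to check this formula numerically, here is a minimal sketch using JAX (my addition, not part of the argument above; the map `f` is an arbitrary $C^2$ example): `jax.hessian` returns all second partials as an array containing one $(n\times n)$ matrix per component of $\mathbf{f}$, and contracting the last two axes with $\mathbf{v}_1$ and $\mathbf{v}_2$ yields $\mathrm{d}^2 \mathbf{f}(\mathbf{x}_0)(\mathbf{v}_1,\mathbf{v}_2)$.

```python
import jax
import jax.numpy as jnp

# An arbitrary C^2 map f: R^2 -> R^2, chosen only for illustration.
def f(x):
    return jnp.stack([x[0] ** 2 * x[1], jnp.sin(x[0]) + x[1] ** 3])

x0 = jnp.array([1.0, 2.0])
v1 = jnp.array([0.5, -1.0])
v2 = jnp.array([2.0, 3.0])

# jax.hessian(f)(x0) has shape (m, n, n): one n x n matrix of second
# partials per component of f -- the m-tuple of matrices from above.
H = jax.hessian(f)(x0)

# d^2 f(x0)(v1, v2) = sum_{i,j} (d^2 f / dx_i dx_j)(x0) v1_i v2_j,
# computed component-wise by contracting the last two axes with v1, v2.
# (The order of those axes is irrelevant: mixed partials commute for
# C^2 maps, so H is symmetric in them.)
d2f = jnp.einsum("kij,i,j->k", H, v1, v2)
print(d2f)
```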