In undergraduate applied vector calculus, the gradient is simply the one-dimensional array of partial derivatives of a function $f$: $(\partial_1 f, \dots, \partial_d f)$.
In differential geometry (at least in the general relativity course I'm taking), the gradient of $f$ (a function on a manifold $M$) is defined as a map $$(df)_p: T_pM \to \mathbb R,$$ $$(df)_p(X) := X(f),$$ where $T_pM$ is the tangent space at the point $p \in M$.
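To make the comparison concrete, here is how I think the second definition unpacks in a local chart. The coordinates $(x^1,\dots,x^d)$ around $p$ and the expansion $X = \sum_i X^i\,\partial_i|_p$ are my own assumed notation, not something taken from the course:

$$(df)_p(X) = X(f) = \sum_{i=1}^{d} X^i\,\frac{\partial f}{\partial x^i}(p),$$

so, if I have this right, the components of $(df)_p$ in the basis dual to $\partial_1|_p,\dots,\partial_d|_p$ are exactly the partial derivatives from the first definition.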
Why do these definitions intuitively capture the same concept of "gradient"?