I was unsure whether to ask this here or on Physics SE.
Wald's "General Relativity" defines parallel transport as follows:
$\nabla$ is a derivative operator (it is linear, obeys the Leibniz rule, commutes with contraction, is torsion-free, and is consistent with the notion of vectors as directional derivatives). A vector $v^b$ given at each point of a curve $C$ is said to be parallelly transported as one moves along the curve if the equation
$t^a \nabla _a v^b =0$
is satisfied along the curve, where $t^a$ is the tangent vector to the curve.
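For concreteness, here is how I understand the equation above in components. This is only a sketch under my own assumptions: a coordinate basis $x^\mu$, a curve parameter $\lambda$ (my notation, not Wald's) with $t^\mu = dx^\mu/d\lambda$, and Christoffel symbols $\Gamma^\nu{}_{\mu\sigma}$ for $\nabla$:

$$
t^a \nabla_a v^b = 0
\quad\Longleftrightarrow\quad
\frac{dv^\nu}{d\lambda} + \Gamma^\nu{}_{\mu\sigma}\, t^\mu v^\sigma = 0 .
$$

In flat space with Cartesian coordinates all $\Gamma^\nu{}_{\mu\sigma}$ vanish, so this reduces to $dv^\nu/d\lambda = 0$: the components of $v$ stay constant along the curve.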
How does this definition reproduce what we intuitively understand as parallel transport? Also, other references use different notation, writing $\nabla_v$ for the derivative along a vector $v$ (the same role $t^a$ plays in the definition above). This is easier to understand, but (1) it seems ill-defined, unlike Wald's definition, and (2) the two definitions should be related in some way.