I am writing a computer program that searches for the minimum of a multivariate function $f: \mathbb{R}^n \to \mathbb{R}$. This function is in fact the sum of many functions:
$f(x) = \sum_{i=1}^m f_i(x)$
The computation starts from some initial guess $x_0$ of a minimum point, and at each iteration $j$ it improves the current guess $x_j$ into a better one, $x_{j+1}$ (using some variant of Newton's method or steepest descent). As $x_j$ approaches the minimum, the gradient approaches 0. The problem is that to compute the gradient, I need to sum many gradients:
$\nabla f = \sum_{i=1}^m \nabla f_i$
and the individual $\nabla f_i$'s are nonzero. Because their sum is close to zero, computing it in floating-point arithmetic suffers from catastrophic cancellation: the accumulated rounding error is large relative to the result, so the computed gradient is very inaccurate. Indeed, when I approximate the gradient by finite differences, evaluating $f$ at points near $x_j$, the approximation comes out quite different from the computed gradient. (By comparison, at points where $\nabla f$ is far from zero, the computed and approximate gradients agree closely.)
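One mitigation I have considered is compensated (Kahan) summation, which carries a correction term that recovers the low-order bits lost in each addition. Here is a minimal sketch in Python/NumPy; the names (`kahan_sum`, `grads`) and the toy magnitudes are placeholders for illustration, not my actual program:

```python
import numpy as np

def kahan_sum(terms):
    """Compensated (Kahan) summation of a sequence of vectors.

    A running compensation term captures the rounding error of each
    addition, so nearly cancelling terms lose far less accuracy than
    under naive left-to-right accumulation.
    """
    total = np.zeros_like(terms[0], dtype=float)
    comp = np.zeros_like(terms[0], dtype=float)  # lost low-order bits
    for g in terms:
        y = g - comp        # corrected next term
        t = total + y       # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost
        total = t
    return total

# Toy demonstration: many large terms that cancel exactly,
# plus one tiny term playing the role of the true gradient.
rng = np.random.default_rng(0)
g = rng.normal(size=(10_000, 3)) * 1e10
grads = list(g) + list(-g)                    # exact sum: zero vector
grads.append(np.array([1e-3, 2e-3, 3e-3]))    # the "true" gradient

print(sum(grads))         # naive accumulation: error can swamp the result
print(kahan_sum(grads))   # compensated sum: close to the true gradient
```

(For scalar sums, Python's `math.fsum` returns a correctly rounded result; for vectors, accumulating in extended precision, e.g. `np.longdouble`, or summing the terms sorted by magnitude are other options I have seen suggested.) Still, this only reduces the summation error and does not address the cancellation inherent in the problem, hence my question: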
Does anyone know of a good method to compute the gradient near extremal points?