The "established rule" you're looking for is related to the Nyquist-Shannon sampling theorem, so I'd suggest researching that term. Non-rigorously, you have to sample things at least twice per up/down wiggle in order to not lose information about the function. This kind of makes sense: you have to catch every peak and trough, or you're not getting the whole picture. More precisely, you have to sample at least twice per period of the highest frequency present in the function. If you calculate the FFT of such a properly sampled sequence, you'll see that it closely approximates the Fourier transform.
Unfortunately, your function is of infinite support in both the $x$ and $k$ domains: it does not have a highest frequency, so the Nyquist theorem does not strictly apply. However, we can still non-rigorously apply it. The results depend on where you truncate $f(x)$, since you can't feed infinite samples into an FFT.
Let's calculate the "instantaneous" frequency of $f(x)$. It is given by the derivative of the cosine argument:

$$ k_\text{inst}(x) = \frac{d}{dx}x^2 = 2x $$

At the endpoints of a finite truncation domain $x\in(-x_c,x_c)$, the highest frequency present is

$$ k_\text{inst-max}=2x_c, $$

and the Nyquist rate is double that:

$$ k_\text{Nyq}\approx 4x_c $$

To convert to a "length" per sample (where "length" really means whatever units you measure $x$ in), we take the reciprocal and multiply by $2\pi$:

$$ \nu_s=\frac{2\pi}{4x_c}=\frac{\pi}{2x_c} $$

The sampling length should be at most that; in practice you take some fraction of that length for finer resolution (oversampling). This is $\Delta x$ in your parlance. Here's an example with $x_c=5$ and no oversampling ($\nu_s$ as above):

You can see that the samples just barely capture the peaks and valleys of the function most of the time. For better resolution, up the oversampling; here's an example with the sampling length halved.

As you can see, this does a better job of capturing the behavior of the function. Note that you'll only see agreement in the $k$ domain with the expected Fourier transform over the limited set of frequencies that are present in your truncation of $f$ (roughly $|k|\le 2x_c$ by the estimate above), and even there the agreement improves as you increase the oversampling.
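The sampling recipe above can be sketched in a few lines of NumPy. This is only an illustration under some assumptions: I'm taking $f(x)=\cos(x^2)$ (which is what the cosine argument $x^2$ suggests), the convention $F(k)=\int f(x)\,e^{-ikx}\,dx$, and the helper names `sample_spacing`, `sample_chirp` and `approx_fourier_transform` are made up for this example.

```python
import numpy as np

def sample_spacing(x_c, oversample=1):
    """Spacing pi / (2 x_c) from the estimate above, divided by an oversampling factor."""
    k_max = 2.0 * x_c                  # k_inst(x) = 2x evaluated at x = x_c
    return np.pi / k_max / oversample  # two samples per period of k_max when oversample = 1

def sample_chirp(x_c, oversample=1):
    """Sample the assumed f(x) = cos(x^2) on [-x_c, x_c) with that spacing."""
    dx = sample_spacing(x_c, oversample)
    n = int(np.ceil(2.0 * x_c / dx))
    x = -x_c + dx * np.arange(n)
    return x, np.cos(x**2), dx

def approx_fourier_transform(x, f, dx):
    """Riemann-sum approximation of F(k) = integral of f(x) exp(-i k x) dx via the FFT."""
    k = 2.0 * np.pi * np.fft.fftfreq(len(x), d=dx)
    # The phase factor accounts for the grid starting at x[0] rather than at 0.
    F = dx * np.exp(-1j * k * x[0]) * np.fft.fft(f)
    return np.fft.fftshift(k), np.fft.fftshift(F)

# Usage: x_c = 5 with no oversampling, then with the spacing halved.
for factor in (1, 2):
    x, f, dx = sample_chirp(5.0, oversample=factor)
    k, F = approx_fourier_transform(x, f, dx)
    print(f"oversample={factor}: dx={dx:.4f}, samples={len(x)}, max |k|={np.abs(k).max():.2f}")
```

With no oversampling the FFT grid only reaches $|k| = \pi/\Delta x = 2x_c$, which is exactly the highest instantaneous frequency in the truncated function; oversampling pushes that limit further out.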
Finally, there are some other things you can do like windowing the function before taking the FFT to smooth out issues in the $k$ domain, so I would suggest looking into window functions on Wikipedia.
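As a rough sketch of what windowing looks like in practice, here is the same kind of sampling with a Hann window applied before the FFT. The choice of a Hann window, the 4x oversampling factor, and $f(x)=\cos(x^2)$ are all assumptions made for illustration; other windows may suit your case better.

```python
import numpy as np

x_c = 5.0
dx = np.pi / (2.0 * x_c) / 4.0   # estimate pi / (2 x_c), oversampled by a factor of 4
n = int(np.ceil(2.0 * x_c / dx))
x = -x_c + dx * np.arange(n)
f = np.cos(x**2)                 # assumed form of f(x)

# Hann window: tapers the samples smoothly to zero at both ends of the
# truncation interval, which softens the sharp cutoff at x = +/- x_c.
window = np.hanning(n)
f_windowed = f * window

k = 2.0 * np.pi * np.fft.fftshift(np.fft.fftfreq(n, d=dx))
# The phase factor from the grid offset x[0] is omitted here because
# only the magnitudes are compared.
F_plain = dx * np.abs(np.fft.fftshift(np.fft.fft(f)))
F_windowed = dx * np.abs(np.fft.fftshift(np.fft.fft(f_windowed)))
```

Plotting `F_plain` and `F_windowed` against `k` lets you compare the two spectra. Keep in mind that the window also scales down the overall amplitude, so the two curves are not directly comparable without normalization.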