
I'm using Python here, but even if you're not a Python expert you may be able to help.

I have a family of curves (cubic splines fitted to data) that look like periodic functions, and I'd like to shift each curve along the x axis so as to minimize the sum of the squared distances between each curve and the mean curve over a specified interval.
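For concreteness, a minimal sketch of the kind of setup I mean (the data here are hypothetical noisy, randomly shifted sine curves; the knot placement is arbitrary):

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
knots = np.linspace(1, 9, 12)  # interior knots, strictly inside the data range

# 50 sine-like curves, each with a random horizontal offset and some noise.
fs = [
    LSQUnivariateSpline(
        x,
        np.sin(x + rng.uniform(-0.5, 0.5)) + rng.normal(0, 0.05, x.size),
        knots,
    )
    for _ in range(50)
]
```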

In other words, for a family of n curves $f_i(x)$ and mean curve $g(x)$, I want to find a vector of $\delta$ values that minimizes the following over region $(a,b)$:

$ \sum_{i=1}^n \int_a^b (f_i(x+\delta_i) - g(x))^2 dx $
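Note that for a *fixed* mean curve $g$, the sum separates over $i$, so each $\delta_i$ can be sought independently:

$$ \min_{\delta_1,\dots,\delta_n} \sum_{i=1}^n \int_a^b (f_i(x+\delta_i) - g(x))^2\, dx = \sum_{i=1}^n \min_{\delta_i} \int_a^b (f_i(x+\delta_i) - g(x))^2\, dx $$

The coupling between the $\delta_i$ enters only through the recomputation of $g$ described below.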

In practice, n is around 50. Since shifting the curves also alters the mean curve, the process is repeated iteratively until no shift exceeds a given threshold.
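One way to structure that iteration (a sketch under my own naming; `align`, the `bound` on each shift, and the convergence test are all my choices, and for a fixed mean curve I solve each $\delta_i$ as an independent 1-D problem with `minimize_scalar` rather than one n-dimensional `minimize` call):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

def align(fs, region, bound=2.0, tol=1e-4, max_iter=20):
    """Iteratively shift each curve toward the mean of the shifted curves."""
    a, b = region
    deltas = np.zeros(len(fs))
    for _ in range(max_iter):
        snapshot = deltas.copy()

        def g(x):
            # Mean curve under the current shifts.
            return sum(f(x + d) for f, d in zip(fs, snapshot)) / len(fs)

        def cost(d, f):
            # Squared distance of one shifted curve to the frozen mean.
            return quad(lambda x: (f(x + d) - g(x)) ** 2, a, b)[0]

        new = np.array([
            minimize_scalar(cost, args=(f,), bounds=(-bound, bound),
                            method="bounded").x
            for f in fs
        ])
        if np.max(np.abs(new - deltas)) < tol:
            deltas = new
            break
        deltas = new
    return deltas
```

Freezing $g$ within each outer iteration is what makes the per-curve problems independent; the outer loop then accounts for the mean curve changing as the shifts change.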

I'm using scipy.optimize.minimize to attempt to minimize the following function:

lambda delta: sum(
    scipy.integrate.quad(lambda x: (f(x + d) - g(x)) ** 2, *day_region)[0]
    for f, d in zip(fs, delta)
)

where

g = lambda x: sum(f(x) for f in fs) / len(fs) 

fs is a list of scipy.interpolate.LSQUnivariateSpline objects. They have an "integrate" method that computes a definite integral quickly, but at the moment I'm not taking advantage of it. Profiling suggests the bottleneck is the sheer number of times the objective function is evaluated.
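One speed-up I've considered (a sketch, not tried at scale; `make_objective` and the grid size are my own choices) is to drop adaptive `quad` in favor of a fixed grid: sample the mean curve once, then each objective call only evaluates the shifted splines on that grid and integrates with Simpson's rule (`scipy.integrate.simpson`; `simps` in older SciPy releases):

```python
import numpy as np
from scipy.integrate import simpson

def make_objective(fs, day_region, npts=257):
    xs = np.linspace(*day_region, npts)
    g_vals = sum(f(xs) for f in fs) / len(fs)  # mean curve, sampled once

    def objective(delta):
        # One vectorized spline evaluation per curve, then Simpson's rule,
        # instead of thousands of scalar calls inside adaptive quad.
        return sum(
            simpson((f(xs + d) - g_vals) ** 2, x=xs)
            for f, d in zip(fs, delta)
        )

    return objective
```

The trade-off is a fixed quadrature error instead of quad's adaptive tolerance, which seems acceptable here since only the location of the minimum matters.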

Is there a smarter way to do this to make it faster, or some way to rework this analytically so that it's easier to solve?

  • How do you define your $a$ and $b$ boundaries? I believe these depend on the $\delta_i$ variables as well, or are infinite. In any case, you could compute the gradient of this expression, at least numerically, and try gradient descent. If the bounds depend on $\Delta = [\delta_i]_i$, then you'll have to use the [Leibniz integral rule](http://en.wikipedia.org/wiki/Leibniz_integral_rule) to compute the gradient. (2012-10-16)
  • $a$ and $b$ are a specific, predefined range of interest. (2012-10-16)
  • And are you sure that, depending on $\Delta$, you will not get out of the domain on which your spline is defined? (2012-10-16)
  • That is possible, but the function is continuous beyond the first and last piece of the spline (where it will approach -inf or inf), and the squared deviations increase rapidly outside the range of the data. Delta could also be bounded to stay within the useful portion of the splines. (2012-10-16)

1 Answer