
In our application we wish to estimate the actual path of objects. We have a set of samples of object locations $(t_i, x_i, y_i, P_i)$ where $t_i$ is the sample time, $(x_i, y_i)$ is the 2D location, and $P_i$ is the error covariance matrix.

We estimate the path using polynomial regression as a function of $t$. That is, we fit two polynomials $x(t) = a_x + b_x t + c_x t^2 + d_x t^3$ and $y(t) = a_y + b_y t + c_y t^2 + d_y t^3$. We find the coefficients $\{a_x, b_x, c_x, d_x, a_y, b_y, c_y, d_y\}$ with a log-likelihood estimator; that is, we minimize: $$\sum_i \begin{pmatrix} x(t_i) - x_i & y(t_i) - y_i \end{pmatrix} P_i^{-1} \begin{pmatrix} x(t_i) - x_i \\ y(t_i) - y_i \end{pmatrix}.$$
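For concreteness, here is a minimal sketch of this minimization as one stacked generalized least-squares problem (hypothetical NumPy code; the function and array names are illustrative, not from our actual system):

```python
import numpy as np

def fit_path(ts, xs, ys, Ps):
    """Fit cubic polynomials x(t), y(t) by minimizing the
    covariance-weighted sum of squared residuals above.

    ts     : (n,) sample times
    xs, ys : (n,) measured positions
    Ps     : (n, 2, 2) per-sample error covariance matrices
    Returns the 8 coefficients (a_x, b_x, c_x, d_x, a_y, b_y, c_y, d_y)
    and their 8x8 covariance matrix.
    """
    n = len(ts)
    # Design matrix: each sample contributes a pair of rows mapping the
    # 8 coefficients to (x(t_i), y(t_i)).
    A = np.zeros((2 * n, 8))
    for i, t in enumerate(ts):
        basis = np.array([1.0, t, t**2, t**3])
        A[2 * i, 0:4] = basis       # row for x(t_i)
        A[2 * i + 1, 4:8] = basis   # row for y(t_i)
    # Stacked measurement vector (x_1, y_1, x_2, y_2, ...).
    z = np.column_stack([xs, ys]).ravel()
    # Block-diagonal weight matrix W = diag(P_1^{-1}, ..., P_n^{-1}).
    W = np.zeros((2 * n, 2 * n))
    for i, P in enumerate(Ps):
        W[2 * i:2 * i + 2, 2 * i:2 * i + 2] = np.linalg.inv(P)
    # Generalized least-squares normal equations: (A^T W A) beta = A^T W z.
    AtW = A.T @ W
    # (A^T W A)^{-1} is the coefficient covariance when the P_i are
    # the true measurement covariances.
    cov_beta = np.linalg.inv(AtW @ A)
    beta = cov_beta @ (AtW @ z)
    return beta, cov_beta
```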

I would like to be able to estimate the approximation error for each $t$. After computing the point $(x(t), y(t))$ from the fitted polynomials, we also want its error covariance matrix $P(t)$. Does anybody know of a method for doing so?
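To make the target concrete (my own formulation via standard linear error propagation, not something we already have): writing the coefficients as $\beta = (a_x, b_x, c_x, d_x, a_y, b_y, c_y, d_y)^T$, the fitted point is linear in $\beta$, namely $(x(t), y(t))^T = T(t)\,\beta$ with
$$T(t) = \begin{pmatrix} 1 & t & t^2 & t^3 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & t & t^2 & t^3 \end{pmatrix},$$
so if $C$ denotes the $8 \times 8$ covariance matrix of the estimated coefficients, then $P(t) = T(t)\, C\, T(t)^T$. What I am missing is how to obtain $C$ when the $P_i$ are not diagonal.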

Note that when the $P_i$ are diagonal matrices this reduces to ordinary weighted linear regression, which has known coefficient error estimates that I can use. Does anybody know how this can be done in the non-diagonal case?
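To spell the diagonal case out: with $P_i = \operatorname{diag}(\sigma_{x,i}^2, \sigma_{y,i}^2)$ the objective decouples into
$$\sum_i \frac{(x(t_i) - x_i)^2}{\sigma_{x,i}^2} + \sum_i \frac{(y(t_i) - y_i)^2}{\sigma_{y,i}^2},$$
so $x(t)$ and $y(t)$ are fitted by two independent weighted least-squares problems, each with the usual coefficient error estimates.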

Alex.

  • I'm puzzled by the $P_i$ notation. A different covariance matrix for each value of the index $i$? That makes it look as if you mean $P_i$ is a _scalar_ for each $i$, maybe the variance of the $i$th error; that would make the sum you're trying to minimize make sense. And is there some difference between what you mean by $x_i$ and what you mean by $x(t_i)$? Or do you mean $P_i$ is a $2\times 2$ matrix of covariances: the variance of $x_i$, the variance of $y_i$, and the covariance between them? Is $P_i$ fully known, or is it to be estimated based on the data? (2011-08-08)
  • @Michael: $x(t_i)$ is the value of the to-be-fitted function $x(t)$ at time $t_i$, whereas $x_i$ is the actually measured value. My understanding of $P_i$ is what you wrote at the end: a known $2\times2$ matrix of covariances between $x_i$ and $y_i$. (2011-08-08)
  • @Alex: I may be missing something, but it seems to me that the coefficient covariance matrix you want is given in [this Wikipedia section](http://en.wikipedia.org/wiki/Numerical_methods_for_linear_least_squares#Parameter_errors.2C_correlation_and_confidence_limits)? (2011-08-08)
  • $P_i$ is indeed a known error covariance matrix for measurement $i$. @joriki, I already read the mentioned Wikipedia section. It seems that section covers "weighted linear least squares"; I don't see how I can represent my case with a diagonal weight matrix. (2011-08-08)
  • In addition, as I mentioned in my original post, when the $P_i$ are diagonal matrices it reduces to simple weighted linear least squares, the case discussed on the Wikipedia page. (2011-08-08)
  • Sorry, I hadn't seen that they assume a diagonal weight matrix in the section before. (I did say I might be missing something ;-) (2011-08-08)
  • @joriki: What you're saying should have been obvious to me; I guess notation can be confusing if one isn't paying close attention. @Alex: To speak of log-likelihood estimators is to have in mind a particular parametrized family of probability distributions for the errors. This sort of generalized least squares coincides with maximum-likelihood estimation ONLY if the distribution of the errors is Gaussian. Where you say "we also want the error covariance matrix", does that mean it is to be estimated, rather than known in advance? (2011-08-08)
  • @Michael, yes, we mean estimated. We want to know the distribution of estimation errors for the estimated point $(x(t), y(t))$ at any given $t$. For simple linear regression this is known as the "prediction band". (2011-08-09)
  • @Alex: I looked at that Wikipedia section again, and I don't see where the assumption of diagonal $W$ is used. Everything there beginning with the normal equations is written in matrix form and seems valid for arbitrary $W$. The step that says "When $W = M^{-1}$ this simplifies to" uses the fact that $M$ is symmetric, as it must be for a covariance matrix. So it seems to me that $W$ is just called "the diagonal matrix of such weights" in the section before because of how the weights are introduced there, and this isn't actually required anywhere. But I may be missing something again... :-) (A small numerical sketch along these lines appears after these comments.) (2011-08-09)
  • I took a very "theoretical" sort of course on the Wishart distribution and related topics; other, more "applied" courses on the same topics (which not only exist but are more numerous than the one I took) might have left me knowing standard algorithms for things like this off the top of my head. Instead, the course I took left me knowing how to derive such things from scratch. A fair amount of work in _some_ cases, and I'm quite rusty. (2011-08-10)
  • @joriki: I don't know how they developed their formulas, so I don't know what properties of $W$ they assume. (2011-08-10)
  • @Michael, do you have any good pointers? Books? Articles? Something else? (2011-08-10)
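Following up on the "prediction band" question and joriki's point that the parameter-covariance formula should hold for arbitrary symmetric $W$, here is a minimal sketch (my own, hypothetical, building on the `fit_path` sketch above) of propagating the coefficient covariance $C = (A^T W A)^{-1}$ to the fitted point:

```python
import numpy as np

def predict_with_cov(beta, cov_beta, t):
    """Evaluate the fitted path at time t and propagate the
    coefficient covariance to a 2x2 covariance of the fitted point.

    beta, cov_beta : coefficients and their 8x8 covariance,
                     e.g. as returned by the fit_path sketch above.
    """
    basis = np.array([1.0, t, t**2, t**3])
    T = np.zeros((2, 8))
    T[0, 0:4] = basis   # x(t) depends on the first four coefficients
    T[1, 4:8] = basis   # y(t) on the last four
    point = T @ beta          # (x(t), y(t))
    P_t = T @ cov_beta @ T.T  # P(t) = T C T^T
    return point, P_t
```

Note that $P(t)$ here reflects only the uncertainty of the estimated coefficients; a prediction band for a new measurement at time $t$ would additionally include that measurement's own error covariance.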

0 Answers