
Say I have the following functions:

  1. $ f(x) = A\sin(Bx) $
  2. $ g(x) = M_1x $
  3. $ h(x) = M_2x $, where $M_2 \approx 0$ and $M_1 > 1000\,M_2$
  4. $ z(x) = C $
  5. $ e(x) \sim N(0,\sigma) $ (Gaussian noise with standard deviation $\sigma$)
  6. $ m_g(x) = f(x) + z(x) - g(x) + e(x) $
  7. $ m_h(x) = f(x) + z(x) - h(x) + e(x) $

where, over the same $x$ interval, the range of $h(x)$ is approximately the same as that of $f(x)$, and both are much smaller than the range of $g(x)$.

If I generate $n$ samples of $m_g(x)$ and $m_h(x)$, using the same $x$ locations and the same $e(x)$ values for each, and regress the data to get "best fit" slope values for $M_1$ and $M_2$, the estimate of $M_1$ is always slightly more accurate (the two differ at the 4th–5th decimal place). That is, regressing the much "steeper" $m_g(x)$ data points estimates the true slope value ($M_1$ in this case) more accurately, even when all other factors are kept the same.
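(For concreteness: by "regress" I mean ordinary least squares, which I believe is what the spreadsheet's linear fit computes. Since the true slope of $m_g$ is $-M_1$, the fitted estimate is

$$\hat{M}_1 = -\,\frac{\sum_i (x_i - \bar{x})\,\big(m_g(x_i) - \bar{m}_g\big)}{\sum_i (x_i - \bar{x})^2},$$

and likewise for $\hat{M}_2$ from the $m_h$ data.)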

Why is this?

As mentioned, the difference in accuracy between the two is very small, but $M_1$ always ends up more accurate (i.e. the error between the regressed slope and the simulated, "known" slope value is smaller).

I would like to know why this happens, mathematically (that's why I'm here), and how each function contributes (e.g. what happens if $f(x)$ or $\sigma$ is scaled up? If $f(x)$ changes shape?).

I have a spreadsheet (LibreCalc) with my "simulated" data if anyone wants to see it.
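In case it's easier than the spreadsheet, here is a minimal Python/NumPy sketch of the same experiment. All parameter values below ($A$, $B$, $M_1$, $M_2$, $C$, $\sigma$, $n$, the $x$ interval) are illustrative choices, not the ones from my spreadsheet:

```python
import numpy as np

# Minimal sketch of the simulation described above. All parameter
# values (A, B, M1, M2, C, sigma, n) are illustrative assumptions.
rng = np.random.default_rng(0)

n = 1000
x = np.linspace(0.0, 100.0, n)

A, B = 1.0, 0.5        # f(x) = A*sin(B*x)
M1, M2 = 50.0, 0.01    # M2 ~ 0 and M1 > 1000*M2
C = 3.0                # z(x) = C
sigma = 0.1            # e(x) ~ N(0, sigma)

f = A * np.sin(B * x)
e = rng.normal(0.0, sigma, size=n)  # one draw, reused for both series

m_g = f + C - M1 * x + e
m_h = f + C - M2 * x + e

# OLS slope for each series; the true slopes are -M1 and -M2.
slope_g, _ = np.polyfit(x, m_g, 1)
slope_h, _ = np.polyfit(x, m_h, 1)

print("slope error for m_g:", abs(slope_g + M1))
print("slope error for m_h:", abs(slope_h + M2))
```

Reusing the single noise draw `e` for both series mirrors the "same $e(x)$ values for each" condition described above.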

Thanks in advance for the insights!

