In the image below, a linear regression with heteroscedastic errors is transformed into a homoscedastic one by dividing the usual regression equation through by $x_i^{\frac{1}{2}}$. I am wondering what justifies dividing by $x_i^{\frac{1}{2}}$ specifically. Is it simply the standard way to transform a heteroscedastic regression into a homoscedastic regression?

In other words, why is $x_i^{\frac{1}{2}}$ the factor used to turn the usual regression into a Weighted Least Squares regression?
statistics
recreational-mathematics
statistical-inference
machine-learning
least-squares
-
It is pretty clear from the definition of the variance, and from how we need to get rid of the linear dependence on $x_i$. – 2017-02-12
1 Answer
It's because they've assumed that the conditional variance is a linear function of $x$ (in fact, proportional to it). This is expressed in the equation $\mathrm{Var}(\epsilon_i|x_i) = \sigma^2x_i.$ (Note that this is a strong assumption made for the purposes of the example; it's not generic at all.)
Then they divide by $\sqrt{x_i}$ so that the new noise $\epsilon_i'= \epsilon_i/\sqrt{x_i}$ has variance $\mathrm{Var}(\epsilon_i'|x_i) = \sigma^2.$ In other words, it has constant variance.
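To spell that step out, here is a sketch assuming the slide's model is $y_i = \alpha + \beta x_i + \epsilon_i$ with $x_i > 0$ (the exact notation on the slide is my assumption). Conditional on $x_i$, the factor $1/\sqrt{x_i}$ is a constant, so it comes out of the variance squared:

$$\frac{y_i}{\sqrt{x_i}} = \alpha\,\frac{1}{\sqrt{x_i}} + \beta\,\sqrt{x_i} + \epsilon_i', \qquad \mathrm{Var}(\epsilon_i'|x_i) = \frac{\mathrm{Var}(\epsilon_i|x_i)}{x_i} = \frac{\sigma^2 x_i}{x_i} = \sigma^2.$$

The parameters $\alpha,\beta$ are unchanged; only the regressors and the response are rescaled.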
You could do the same thing if, say, the variance were quadratic in $x$: $\mathrm{Var}(\epsilon_i|x_i) = \sigma^2x_i^2.$ Then you'd divide through by $x_i$ rather than $\sqrt{x_i}.$
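If it helps to see this numerically, here is a minimal simulation sketch using NumPy. Everything in it is illustrative rather than taken from the slide: the function name `wls_by_rescaling`, its `power` argument, and the simulated values of $\alpha$, $\beta$, $\sigma$ are all hypothetical. It assumes $\mathrm{Var}(\epsilon_i|x_i) = \sigma^2 x_i^{p}$ with $x_i > 0$, divides everything by $x_i^{p/2}$, and runs ordinary least squares on the rescaled data.

```python
import numpy as np

def wls_by_rescaling(x, y, power=1.0):
    """Fit y = alpha + beta*x, assuming Var(eps|x) = sigma^2 * x**power, x > 0.

    Hypothetical helper: divides the whole equation by x**(power/2)
    and then runs ordinary least squares on the rescaled data.
    """
    scale = x ** (power / 2.0)  # sqrt(x) for power=1, x for power=2
    # Rescaled design matrix: the intercept column becomes 1/scale,
    # the slope column becomes x/scale.
    design = np.column_stack([1.0 / scale, x / scale])
    coef, *_ = np.linalg.lstsq(design, y / scale, rcond=None)
    return coef  # array [alpha_hat, beta_hat]

# Simulated data (illustrative values, not from the original slide).
rng = np.random.default_rng(0)
n = 20000
alpha, beta, sigma = 1.0, 2.0, 0.5
x = rng.uniform(1.0, 10.0, size=n)
eps = rng.normal(0.0, sigma * np.sqrt(x))  # Var(eps|x) = sigma^2 * x
y = alpha + beta * x + eps

alpha_hat, beta_hat = wls_by_rescaling(x, y, power=1.0)

# Raw residuals spread out as x grows; rescaled ones should not.
raw_resid = y - alpha_hat - beta_hat * x
rescaled_resid = raw_resid / np.sqrt(x)

small_x = x < 5.5
raw_ratio = raw_resid[~small_x].var() / raw_resid[small_x].var()
rescaled_ratio = rescaled_resid[~small_x].var() / rescaled_resid[small_x].var()

print("estimates:", alpha_hat, beta_hat)
print("variance ratio (large x / small x), raw:", raw_ratio)
print("variance ratio (large x / small x), rescaled:", rescaled_ratio)
```

With `power=2.0` the same helper divides by $x_i$ instead of $\sqrt{x_i}$, which matches the quadratic-variance case above.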
-
Thank you for the explanation. So you are saying that the equations in the image only cover the linear regression case? Would I then need to apply a different transformation factor for a non-linear regression? In that case the equation shown in the image is not generic. – 2017-02-12
-
@user122358 No, this has nothing to do with linear vs. nonlinear regression. This example and my additional example are both linear regressions (the regression function $\alpha +\beta x$ is the same in both cases and is linear in the parameters $\alpha,\beta$). The assumption was in how the variance of the noise $\epsilon_i$ depends on $x_i.$ The slide assumes it goes up as a linear function of $x_i.$ Look at the plot at the top here: https://en.wikipedia.org/wiki/Heteroscedasticity The spread increases with $x_i$ but the mean follows a trend line, just like in the usual homoscedastic case. – 2017-02-12
-
Thank you for the explanation. – 2017-02-12