If given the choice between two statistical models (for argument's sake, let's say Model 1 is $y = \beta_0 + \beta_1 x_1 + \epsilon$ and Model 2 is $y = \beta_0 + \beta_1 x^2_1 + \epsilon$), is there a way to select which model is more appropriate based upon an analysis of the residuals?
Let's stipulate that the models have the same number of parameters, $k$, and that $SSE_{Model 1} = SSE_{Model 2}$. The only difference is that the residuals from Model 1 'appear' to be more systematic than those from Model 2. That is, when I plot the residuals for Model 1 against $x$, it is apparent that there are regions of $x$ where $Prob(\epsilon_i \ge 0 | x_i) > .5$ - and other regions of $x$ where $Prob(\epsilon_j \ge 0 | x_j) < .5$.
Since a comparison of $SSE$ alone would rule that the two models are equivalent (both models have the same overall error measurement) - and since the number of parameters, $k$, is the same for both models - the usual criteria (Mallows' $C_p$, Akaike's AIC, and Schwarz's BIC) wouldn't discriminate between them. Yet I can see by eye that one model's errors are more systematic (or at least less random in pattern) than the other's.
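To spell out why these criteria tie, here are their usual forms for a linear model with Gaussian errors. The symbols $n$ (number of observations) and $\hat\sigma^2$ (an error-variance estimate shared by both candidates) are my own notation, not part of the setup above:

$$
\begin{aligned}
C_p &= \frac{SSE}{\hat\sigma^2} - n + 2k,\\
AIC &= n \ln\!\left(\frac{SSE}{n}\right) + 2k + \text{const},\\
BIC &= n \ln\!\left(\frac{SSE}{n}\right) + k \ln n + \text{const}.
\end{aligned}
$$

Each one depends on the fit only through $SSE$, $k$ and $n$, so two models fitted to the same data with equal $SSE$ and equal $k$ get identical values. None of them looks at how the residuals are arranged along $x$.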
Yes, I know that this probably means that neither model is the 'correct' model, but I'm constrained to select between the given choices of models.
I don't know of a test (other than just 'eyeballing' the residuals) that would allow me to decide between the alternatives.
Is there a formal test, perhaps one based upon the pattern of residuals, that would allow me to discriminate? I could imagine something that steps through successive segments of $x$ and, within each segment, tests the null hypothesis that $Prob(\epsilon_{segment} \ge 0|x_{segment}) = .5$. The model for which this null is rejected most strongly across the segments would then be the model that I should select against.
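To make the idea concrete, here is a rough sketch of the kind of segment-wise sign test I have in mind. It is only an illustration under my own assumptions: the function names (`sign_test_pvalue`, `segment_sign_tests`), the equal-count segmentation and the default of four segments are all hypothetical choices, and the toy residuals at the bottom are made up rather than taken from either model.

```python
from math import comb


def sign_test_pvalue(n_pos, n):
    """Exact two-sided binomial p-value for H0: P(residual >= 0) = 0.5."""
    # Under p = 0.5 the binomial is symmetric, so double the smaller tail.
    k = min(n_pos, n - n_pos)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def segment_sign_tests(x, residuals, n_segments=4):
    """Sort residuals by x, cut them into contiguous segments of roughly
    equal size, and run a sign test inside each segment."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    sorted_res = [residuals[i] for i in order]
    n = len(sorted_res)
    # Segment boundaries as indices into the x-sorted residuals.
    bounds = [round(j * n / n_segments) for j in range(n_segments + 1)]
    pvalues = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        segment = sorted_res[lo:hi]
        n_pos = sum(r >= 0 for r in segment)
        pvalues.append(sign_test_pvalue(n_pos, len(segment)))
    return pvalues


# Toy illustration with invented residuals (not from a fitted model):
x = list(range(40))
systematic = [1.0] * 20 + [-1.0] * 20                       # sign depends on x
patternless = [1.0 if i % 2 == 0 else -1.0 for i in range(40)]

print(segment_sign_tests(x, systematic, n_segments=2))
print(segment_sign_tests(x, patternless, n_segments=4))
```

Both toy residual vectors have the same sum of squares, but only the first one produces small per-segment p-values. What I don't know is how to combine the per-segment results into a single defensible comparison between the two models, or how sensitive the answer is to the choice of segments.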
Any ideas?