I'm watching an MIT lecture on linear approximations: https://youtu.be/BSAA0akmPEU?t=28m30s.
The lecturer states that if a quadratic term appears while computing a linear approximation, we should drop it. The justification he gave was that, since we've been discarding quadratic terms all along, we should also discard any that arise during the computation. However, I don't find this answer satisfying or substantive.
Also, wouldn't keeping the quadratic term improve our approximation?
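For concreteness, here is the kind of situation I mean (my own example, not one from the lecture): if we linearize $(1+x)^2 = 1 + 2x + x^2$ near $x = 0$, the lecturer's rule says to drop the $x^2$ term and keep $1 + 2x$. A quick numerical sketch of how small that dropped term is compared to the linear part:

```python
# Compare (1+x)^2 with its linearization 1 + 2x near x = 0.
# The dropped quadratic term x**2 is exactly the error of the
# linear approximation in this example.
x = 0.01
exact = (1 + x) ** 2       # the true value
linear = 1 + 2 * x         # linear approximation, quadratic term dropped
error = exact - linear     # this equals x**2, up to floating-point rounding

print(exact, linear, error)
```

At $x = 0.01$ the linear part $2x = 0.02$ dwarfs the quadratic part $x^2 = 0.0001$, and the gap only widens as $x \to 0$; that seems to be the sense in which the quadratic term is "negligible," which is what I'd like to understand more rigorously.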
I'd greatly appreciate it if someone could explain the reasoning behind why we drop quadratic terms (if they arise) during the computation of linear approximations.
Thank you.