Recently, I thought of the following interesting problem. For a given set of data, I noticed that as the degree of the fitted polynomial increases, the $R^2$ value generally tends to increase too.
I define the $R^2$ value as follows. For a polynomial $p_k(x)=a_0+a_1 x+\cdots+a_k x^k$, the $R^2$ value for a set of points $(x_i,y_i)$ is $$R^2\equiv 1-\frac{\sum_{i=1}^n \left[y_i-(a_0+a_1\cdot x_i +\cdots + a_k\cdot {x_i}^k)\right]^2}{\sum_{i=1}^n \left(y_i-\bar y\right)^2} \tag{1}$$ where $n$ is the number of points and $\bar y=\frac{1}{n}\sum_{i=1}^n y_i$ is the mean of the $y_i$.
Below I demonstrate the increase of $R^2$ with the following set of data I've made up:
$$\begin{array}{c|c}x&y\\\hline0&-1\\0.5&-0.5\\1.4&-0.9\\2.1&0.2\\2.5&0.7\\3.1&1.7\\4.3&2.3\\5.2&1.5\\5.6&3.5\end{array}$$ Here is an animated GIF I have created showing this:
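The same trend can be checked numerically. Below is a minimal sketch, assuming NumPy is available and assuming the usual normalization of $R^2$ by the total sum of squares about the mean; the helper name `r_squared` is my own and purely illustrative.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Data from the table above.
x = np.array([0, 0.5, 1.4, 2.1, 2.5, 3.1, 4.3, 5.2, 5.6])
y = np.array([-1, -0.5, -0.9, 0.2, 0.7, 1.7, 2.3, 1.5, 3.5])


def r_squared(x, y, k):
    """R^2 of the least-squares polynomial fit of degree at most k."""
    coeffs = P.polyfit(x, y, k)            # coefficients a_0, ..., a_k
    residuals = y - P.polyval(x, coeffs)   # y_i - p_k(x_i)
    ss_res = np.sum(residuals ** 2)        # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares (assumed normalization)
    return 1.0 - ss_res / ss_tot


# Degrees 0 through n - 1, where n = 9 is the number of points.
r2_by_degree = [r_squared(x, y, k) for k in range(len(x))]
for k, r2 in enumerate(r2_by_degree):
    print(f"k = {k}: R^2 = {r2:.6f}")
```

With this normalization the degree-0 fit is just the mean, so it gives $R^2=0$, and the values printed for higher degrees can be compared directly against the plots.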
I realized that each $x$ value must correspond to exactly one $y$ value (that is, the relationship from $x$ to $y$ must be many-to-one or one-to-one) for the $R^2$ value to tend to $1$; otherwise the interpolating polynomial cannot pass through all the points, since a polynomial is a function.
Therefore, I've conjectured the following, and would like to prove it:
Let $p_k(x)$ be a least-squares fitting polynomial of degree $k$. Consider a discrete many-to-one or one-to-one relationship between $x$ and $y$ in which every value of $y$ is finite. Then $$\lim_{k\to \infty} R^2=1$$ for every set of data satisfying the above conditions.
I figured that the $R^2$ value will never decrease as $k$ increases.
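A sketch of why I believe this, assuming the fit at each step is taken over all polynomials of degree at most $k$, and writing $S_k$ for the smallest residual sum of squares achievable at that degree (the symbol $S_k$ is my own shorthand): any polynomial of degree at most $k$ is also a polynomial of degree at most $k+1$ with $a_{k+1}=0$, so $$S_{k+1}=\min_{a_0,\dots,a_{k+1}}\sum_{i=1}^n\left[y_i-\sum_{j=0}^{k+1}a_j\, {x_i}^j\right]^2\le\min_{a_0,\dots,a_{k}}\sum_{i=1}^n\left[y_i-\sum_{j=0}^{k}a_j\, {x_i}^j\right]^2=S_k.$$ Since $R^2$ gets larger as the residual sum of squares gets smaller, this would mean the $R^2$ value at degree $k+1$ is at least the $R^2$ value at degree $k$.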
I know that given $n$ points $(x_i,y_i)$, the following system determines the coefficients $a_0, a_1,\cdots,a_k$: $$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\y_n \end{bmatrix}=\begin{bmatrix} 1 & x_1 & {x_1}^2 & \cdots & {x_1}^k \\ 1 & x_2 & {x_2}^2 & \cdots & {x_2}^k \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & {x_n}^2 & \cdots & {x_n}^k \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_k \end{bmatrix} \tag{2}$$
Now, I was thinking that I could combine $(1)$ and $(2)$ to prove it, but I am unsure how to do so. I also noticed a possible problem: a matrix must be square to be invertible, and the matrix in $(2)$ has $n$ rows and $k+1$ columns. Therefore, I think we may be restricting ourselves to the specific case $n=k+1$ when proving it, which loses the generality of the proof.
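One possible way around the squareness issue, as far as I can tell, is to read $(2)$ in the least-squares sense when $n>k+1$: instead of inverting the matrix, choose the coefficients that minimize the squared residual. The sketch below assumes NumPy and uses `np.vander` and `np.linalg.lstsq`; the names `V`, `a`, and `a_normal` are my own, and the degree $k=3$ is an arbitrary choice for illustration.

```python
import numpy as np

# Data from the table above.
x = np.array([0, 0.5, 1.4, 2.1, 2.5, 3.1, 4.3, 5.2, 5.6])
y = np.array([-1, -0.5, -0.9, 0.2, 0.7, 1.7, 2.3, 1.5, 3.5])

k = 3  # arbitrary illustrative degree, so the matrix is 9 x 4 (not square)

# Matrix from (2): row i is [1, x_i, x_i^2, ..., x_i^k].
V = np.vander(x, k + 1, increasing=True)

# Least-squares solution of V a = y; no matrix inverse is needed.
a, *_ = np.linalg.lstsq(V, y, rcond=None)

# Equivalent route through the (square) normal equations V^T V a = V^T y.
a_normal = np.linalg.solve(V.T @ V, V.T @ y)

print("shape of V:", V.shape)
print("lstsq coefficients:   ", a)
print("normal-eq coefficients:", a_normal)
```

The matrix $V^\mathsf{T}V$ is square with $k+1$ rows and columns, so it can be inverted whenever the columns of the matrix in $(2)$ are linearly independent, without requiring $n=k+1$.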
I was wondering whether proving this may be related to power series, which do this for continuous functions $\forall x \in \mathbb{R}$. If so, I think the proof may be significantly easier.
If this is not a true conjecture or if some clarification is required, please let me know in the comments. Thanks in advance.
