Start with the definition of linearly dependent. A set of elements $\{v_1,\ldots,v_n\}$ of a vector space $V$ over a field $K$ is linearly dependent if there are scalars $\lambda_1,\ldots,\lambda_n \in K$, not all zero, such that $\lambda_1v_1 + \ldots + \lambda_nv_n = 0$. In your case, the vector space is the space $V$ of all linear mappings $\mathbb{R}^2 \rightarrow \mathbb{R}^2$, and the field is $\mathbb{R}$. Note that this space is isomorphic to $\mathbb{R}^4$ (a linear map is determined by the four entries of its matrix). Since $\dim(V) = 4$, every set with more than 4 elements must be linearly dependent.
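A quick numerical sanity check of the dimension argument (a sketch using NumPy; the five maps here are arbitrary random matrices): flatten each map into a vector in $\mathbb{R}^4$ and observe that five such vectors can never be independent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five arbitrary linear maps R^2 -> R^2, each stored as a 2x2 matrix.
maps = [rng.standard_normal((2, 2)) for _ in range(5)]

# Flatten each map to a vector in R^4 and stack them as rows of a 5x4 matrix.
stacked = np.array([m.flatten() for m in maps])

# The rank is at most 4 = dim(V), so the five maps must be linearly dependent.
rank = np.linalg.matrix_rank(stacked)
print(rank)  # at most 4
```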
Now, every matrix $A \in \mathbb{R}^{n\times n}$ has an associated characteristic polynomial $p_A \in \mathbb{R}[x]$ with $\deg p_A = n$, i.e. $p_A(x) = a_nx^n + \ldots + a_1x + a_0$. And, very importantly, you always have $p_A(A) = 0$! (This is the Cayley–Hamilton theorem; one says that the characteristic polynomial annihilates the matrix.) Thus, for every matrix $A \in \mathbb{R}^{2\times 2}$ you can find $\lambda_0,\lambda_1,\lambda_2$, with $\lambda_2 \neq 0$, such that $\lambda_0I + \lambda_1A + \lambda_2A^2 = 0$. This answers questions (2) and (5).
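For a $2\times 2$ matrix the characteristic polynomial is explicitly $p_A(x) = x^2 - \operatorname{tr}(A)\,x + \det(A)$, so the annihilation property is easy to check numerically (a sketch; the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Characteristic polynomial of a 2x2 matrix: p_A(x) = x^2 - tr(A)*x + det(A).
trace, det = np.trace(A), np.linalg.det(A)

# Cayley-Hamilton: plugging A into its own characteristic polynomial gives 0.
residual = A @ A - trace * A + det * np.eye(2)
print(np.allclose(residual, 0))  # True
```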
For the other questions, observe that you can also interpret the above as a way to write $A^2$ as a linear combination of $I$ and $A$, more precisely as $A^2 = -\frac{1}{\lambda_2}(\lambda_0 I + \lambda_1 A)$ (dividing by $\lambda_2$ is allowed because $\deg p_A = 2$ forces $\lambda_2 \neq 0$). You can extend this to any higher power too - for $A^3$ you get $ \begin{eqnarray} A^3&=&AA^2=-\frac{1}{\lambda_2}A(\lambda_0 I + \lambda_1 A) =-\frac{1}{\lambda_2}(\lambda_0 A + \lambda_1 A^2) =-\frac{1}{\lambda_2}\left(\lambda_0 A - \frac{\lambda_1}{\lambda_2}(\lambda_0 I + \lambda_1 A)\right) \\ &=&\left(\frac{\lambda_1^2}{\lambda_2^2}-\frac{\lambda_0}{\lambda_2}\right)A + \frac{\lambda_0\lambda_1}{\lambda_2^2}I \end{eqnarray} $ The exact coefficients are not important - the important fact is that for every $n$ you can find coefficients $\mu_1,\mu_2 \in \mathbb{R}$ such that $A^n = \mu_1I + \mu_2A$. Thus, the subspace spanned by any set of powers of $A$ has dimension at most 2. It follows that every such set with 3 or more members is linearly dependent.
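The reduction above can be iterated mechanically: if $A^n = \mu_1 I + \mu_2 A$, then multiplying by $A$ and substituting $A^2 = \operatorname{tr}(A)\,A - \det(A)\,I$ gives $A^{n+1} = -\mu_2\det(A)\,I + (\mu_1 + \mu_2\operatorname{tr}(A))\,A$. A minimal sketch of this recursion (the function name `power_coeffs` is my own, not from the question):

```python
import numpy as np

def power_coeffs(A, n):
    """Return (mu1, mu2) with A^n = mu1*I + mu2*A, for a 2x2 matrix A and n >= 1.

    Uses A^2 = tr(A)*A - det(A)*I (Cayley-Hamilton for 2x2 matrices)
    to reduce each extra power of A back into span{I, A}.
    """
    t, d = np.trace(A), np.linalg.det(A)
    mu1, mu2 = 0.0, 1.0  # A^1 = 0*I + 1*A
    for _ in range(n - 1):
        # A^(k+1) = mu1*A + mu2*A^2 = -mu2*det(A)*I + (mu1 + mu2*tr(A))*A
        mu1, mu2 = -mu2 * d, mu1 + mu2 * t
    return mu1, mu2

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
mu1, mu2 = power_coeffs(A, 5)
print(np.allclose(np.linalg.matrix_power(A, 5), mu1 * np.eye(2) + mu2 * A))  # True
```

So any power of $A$ stays inside the 2-dimensional subspace $\operatorname{span}\{I, A\}$, exactly as the argument above says.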