Basic definitions
Before you can understand finite fields, you need to understand what a field is. Fields are algebraic structures, meant to generalize things like the real or rational numbers, where you have two operations, addition and multiplication, such that the following hold:
- Addition and multiplication are both commutative and associative operations.
- Multiplication distributes over addition, i.e. $a(b+c)=ab+ac$.
- There are distinguished elements $0$ and $1$ such that $0+a=a$ and $1*a=a$.
- For every $a$ we have an element $-a$ such that $a+(-a)=0$. If $a\neq 0$, we have an element $(1/a)$ such that $a(1/a)=1$.
Phrased differently, $(\mathbb{F},+)$ and $(\mathbb{F}\setminus 0,\cdot)$ are abelian groups, and we the operations distribute.
Given a field, we can define a vector space to be an abelian group such that we can multiply vectors by field elements (and we satisfy various associativity and distributivity conditions).
Some examples of fields are the real numbers $\mathbb{R}$, the complex numbers $\mathbb{C}$, the rational numbers $\mathbb{Q}$, and the the integers modulo a prime $\mathbb{F}_p=\mathbb{Z}/(p)$. If $\mathbb{F}$ is a field, then the collection of $n$-tuples of $\mathbb{F}$, written $\mathbb{F}^n$, is a vector space. Every finite dimensional vector space can be written in this form.
Constructing new fields from old ones
Given a field $\mathbb{F}$, we can construct a new field by adding new elements to the $\mathbb{F}$, creating what is called a field extension of $\mathbb{F}$. For example, if we adjoin $\sqrt[3]{2}$ to $\mathbb{Q}$, we get a new field $\mathbb{Q}[\sqrt[3]{2}]$ which is the subset of $\mathbb{R}$ consisting of all elements of the form $a+b\sqrt[3]{2}+c\sqrt[3]{4}$ with $a,b,c\in \mathbb{Q}$. More generally, if we have an irreducible polynomial, we can formally adjoin the root of the polynomial to the field. In the example above, the polynomial would be $x^3-2$.
A field extension of $\mathbb{F}$ is naturally a vector space over $\mathbb{F}$. If it is finite dimensional, it is called a finite field extension. If we obtained the extension by adjoining a single root of a polynomial, the dimension will be the degree of the polynomial, and is called the degree of the extension. Additionally, every finite field extension can be obtained by adjoining elements, possibly several times (in stages).
An example of all this is $\mathbb{Q}[x]/(x^2+x+1)$, where we have adjoined a root of the polynomial $x^2+x+1$ to the rationals. The elements of the field are polynomials with coefficients in $\mathbb{Q}$, modulo the relation that two polynomials represent the same thing if their difference is a multiple of $x^2+x+1$. For example, $(x+1)(x-1)=x^2-1=(x^2-1)-(x^2+x+1)=-x-2$. The fact that we have multiplicative inverses comes from $x^2+x+1$ being irreducible and the Euclidean algorithm.
Finite fields
Suppose that a field $\mathbb{F}_q$ has a finite number of elements $q$. Then considering the sequence of elements $1, 1+1, 1+1+1, \ldots$, we eventually have to get repetition, so we have a smallest number $p$ such that $1+1+\ldots +1$ ($p$ times) is equal to $0$. Thus, the field contains the ring $\mathbb{Z}/(p)$, and because we are a field, it must be the case that $p$ is prime. We call $p$ the characteristic of the field, and we call $\mathbb{F}_p$ the prime field of $\mathbb{F}_q$. Since every finite dimensional vector space can be written as $\mathbb{F}^n$, we must have that $q=p^n$ for some p.
\mathbb{F}_q$ is a finite field extension of $\mathbb{F}_p$, and so we can obtain $\mathbb{F}_q$ by adjoining roots of polynomials. It can be shown that every element of $\mathbb{F}_q$ is a root of the polynomial $x^q-x$. If we take just one irreducible factor $f(x)\mid x^q-x$ of maximal degree, then we have $\mathbb F_q=\mathbb{F}_p[x]/(f(x))$ (it doesn't matter which factor we took), and so there is a unique finite field with $q=p^n$ elements. We can obtain it by taking ANY irreducible polynomial of degree $n$ over $\mathbb{F}_p, and adjoining the root.
The field used in AES
In AES, they have fixed a particular irreducible polynomial of degree 8 over the field \mathbb{F}_2$. While there are many different choices, you need to have picked one to be able to do computations. By dividing out by this polynomial and taking the remainder, every element of the field is represented uniquely by a polynomial $a_7 x^7 + a_6 x^6 + \ldots +a_0$ with each $a_i= 0 \text{ or } 1. and the coefficients can be viewed as a byte. Addition is accomplished via XOR, and multiplication comes from taking a certain convolution of two bytes, then subtracting off bitshifts of the defining polynomial.
One subtle point to make is that we have essentially defined our field to be polynomials modulo relations. However, we can define polynomials over this field too by writing out a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n$ where each $a_0$ is an element of $\mathbb{F}_{256}$ = GF(2^8)$. So we have polynomials whose coefficients are polynomials. Thinking in this way is NOT helpful, and it is better to think of GF(2^8) abstractly, as just some thing that has elements which can be added and multiplied. If you don't abstract the definition of GF(2^8) away, you will end up confusing yourself.