The point of Qiaochu Yuan's answer is that firstly that in order to be surprised the $a^p-a$ is not divisible by $p$ when $p$ is composite, you should first have a (combinatorial in your case) reason to find it natural that it does hold when $p$ is a prime number. One such reason is that $a^p-a$ counts words (which I find more convenient to talk about than necklaces, but it is the same thing) of length $p$ over an $a$-element alphabet that do not consist of $p$ identical letters. The reason this number is divisible by $p$ is that cyclic rotation partitions the set of such words into packets (orbits) of size $p$. That this is so requires a bit of thought: if there were any non-trivial cyclic rotation (not necessarily by a single place) that transforms a word into itself, then by iterating that one can obtain the cyclic rotation by a single place, and any word invariant by that rotation clearly has all its letters equal. This amounts to the fact that any non-identity element of a cyclic group of $p$ element generates the entire group.
Now the second point of the Qiaochu Yuan's answer is to analyze the situation when $p$ is composite (I'll henceforth call the number $n$ to avoid confusion). Now it is no longer true that any non-identity element of a cyclic group of $n$ element generates the entire group; notably it can generate a subgroup of any order dividing $n$. Consequently there are words that are invariant under some non-trivial cyclic rotation, but not under all of them, and in particular not under the rotation by a single place. The word RBYRBY is an example, invariant under rotation by $3$ places, but not under rotation by one place. You may very well stop here; it is clear why the argument breaks down, and the numerical examples you gave show that there is no other argument that will make $a^n-a$ divisible by $n$ for all $n$ either.
If you insist on finding something that is counted by a number always divisible by $n$, then you can still take all words that are not invariant under any non-trivial cyclic rotation, as these come in packets (orbits) of size $n$. Call this number $f(n)$, which will be useful later. Now rather than having $f(n)=a^n-a$, you need to exclude for every strict divisor $d$ of $n$ the words that come in packets of size $d$. These are harder to count than words that come in singleton packets (as we saw there were $a$ of them), but can be done by the following reasoning. Such words necessarily consist of a sub-word $s$ of $d$ letters that is repeated $d$ times: $w=s.s\ldots s$. However $s$ cannot just be any word, it must not have any internal cyclic symmetry, since if it did it would belong to a shorter orbit.
But words without any cyclic symmetry is just what we were out to count, except that it was for length $n$ and now we are confronted with length $d. Hence the utility of a general function $f$: it is defined implicitly (by induction) by the relation $ f(m)=a^m-\sum_{d\text{ strictly divides }m}f(d) $ for all $m\geq1$. Note that in particular $f(1)=a$ since $1$ has no strict divisors. This equation can be formulated more simply by moving the summation to the other side, which also avoid having to mention "strict": $ \sum_{d\mid m}f(d)=a^m $ for all $m\geq1$. The general technique the will solve this is called Möbius inversion. Since you haven't heard about that, here's its secret. Think of a vectors $F(n)=(f(1),f(2),\ldots,f(n))$, the unknown we are after, and $G(n)=(a,a^2,\ldots,a^n)$ which tabulates the right-hand-sides of our last equation. This equation then says $F(n)=\mathbf D(n)\cdot F(n)$, where $\mathbf D(n)$ is the lower triangular matrix with entries $\mathbf D(n)_{i,j}=1$ whenever $j$ divides $i$, and $\mathbf D(n)_{i,j}=0$ otherwise. Actually the equation gives us the result of multiplying row $m$ of of the matrix $\mathbf D(n)$ and the vector $G(n)$, so the full matrix equation takes care of "for all $m$" (upto $n$). Now all we need is the inverse matrix $\mathbf D(n)$, and it can (rather easily) be shown to be given by $ \mathbf D(n)^{-1}_{i,j} = \begin{cases}(-1)^c & \text{if }i/j\text{ is integer and a product of }c\text{ distinct primes}\\ 0 & \text{otherwise}\\ \end{cases} $ which is usually written as $ \mathbf D(n)^{-1}_{i,j} = \begin{cases}\mu(i/j)&\text{if }j\mid i\\ 0 & \text{otherwise}\\ \end{cases} $ where $\mu$ is the so-called (arithmetic) Möbius function. Now one can write $F(n)=\mathbf D(n)^{-1}\cdot G(n)$, giving $ f(m)=\sum_{d\mid m}\mu(m/d)a^d $ which is the formula Qiaochu gave (with $m=n$) for the number of entirely non-symmetric necklacesof length $n$. (Actually he interchanged $n/d$ and $d$, which can be seen to make no difference; personally I prefer to always put $n/d$ in the argument of $\mu$.)