Alright, I understand the proof of the theorem, but I don't have much intuition for the definition. I think the theorem would work with any function $\psi : \mathbb{N} \rightarrow \mathbb{Z}$ such that $\psi(1) = 1$ and $\displaystyle\sum_{d | n} \psi(d) = 0$ whenever $n > 1$. So is there some specific reason why $\mu$ is defined the way it is?
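To make the two conditions concrete, they can be unrolled into a recursion: $\psi(1) = 1$ and, for $n > 1$, $\psi(n) = -\sum_{d | n,\, d < n} \psi(d)$. Here is a minimal Python sketch of that recursion (the name `psi` is just my own label for illustration, not anything from the book):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def psi(n):
    """Value forced by psi(1) = 1 and sum_{d | n} psi(d) = 0 for n > 1."""
    if n == 1:
        return 1
    # Move the d = n term of the divisor sum to the other side.
    return -sum(psi(d) for d in range(1, n) if n % d == 0)

# First twelve values produced by the recursion.
print([psi(n) for n in range(1, 13)])
# → [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]
```

Each value is computed only from values at smaller arguments, so as far as I can tell the two conditions leave no freedom in choosing $\psi$.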
As another question, I am using the book "A Classical Introduction to Modern Number Theory" by Ireland & Rosen, and in chapter 2 there is exercise 21, which I just cannot make sense of. It reads:
"Define $f(n) = \log p$ if $n$ is a power of $p$ and zero otherwise. Prove that $\sum_{d | n} \mu(n/d) \log d = f(n)$. [Hint: First calculate $\sum_{d | n} f(d)$ and then apply the Möbius inversion formula]"
It doesn't make sense to me: what is this $p$ that the question references? Is it some fixed constant? I guess this would make sense if $\sum_{d | n} f(d) = \log n$, because then the formula would work out, but why would that be true? Well, I guess if $n = p^{\epsilon}$, then the sum would give $\epsilon \log p = \log p^{\epsilon} = \log n$ (if we do not count 1 as a power of $p$), but that's just not a safe assumption, is it? I'm rambling here. Any ideas?
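As a sanity check on my guess, here is a small numerical sketch in Python. It assumes one particular reading of the exercise, which may not be the intended one: $f(n) = \log p$ when $n = p^k$ for some prime $p$ and some $k \geq 1$, and $f(n) = 0$ otherwise (so $f(1) = 0$). All helper names are my own.

```python
import math

def prime_power_base(n):
    """Return p if n = p**k for a prime p and k >= 1, else None."""
    if n < 2:
        return None
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            # n was a prime power only if nothing is left after removing p.
            return p if n == 1 else None
        p += 1
    return n  # no divisor up to sqrt(n), so n itself is prime

def f(n):
    """Assumed reading: log p if n is a power of a prime p, else 0."""
    p = prime_power_base(n)
    return math.log(p) if p is not None else 0.0

def mobius(n):
    """Moebius function computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

for n in range(1, 31):
    # Guess from the hint: the divisor sum of f equals log n.
    total = sum(f(d) for d in divisors(n))
    assert math.isclose(total, math.log(n), abs_tol=1e-12)
    # Statement of the exercise: sum of mu(n/d) * log d equals f(n).
    inverted = sum(mobius(n // d) * math.log(d) for d in divisors(n))
    assert math.isclose(inverted, f(n), abs_tol=1e-9)
```

Under that reading both assertions pass for every $n$ up to 30, including numbers that are not prime powers, which is of course only numerical evidence and not a proof.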