I know that $$P\left(\bigcup_{i=1}^{n} A_i \right)$$
is the sum of the probabilities of all the sample points that are contained in at least one of the $A_{i}$'s. That is, it covers the sample points belonging to exactly 1 event, exactly 2 events, ..., exactly $n$ events. WLOG, this can be written as $$P\left(\bigcup_{i=1}^{n} A_i \right) = P(A_1) + P(A_1 \cap A_2) + \cdots + P(A_1 \cap A_2 \cap \cdots \cap A_n)$$
But the $A_i$'s are arbitrary and we have to account for that. So there are $n$ possibilities for the first kind of term, $\binom{n}{2}$ possibilities for the second, ..., and $1$ possibility for the final one. Is the idea, then, that we add and subtract these terms to prevent overcounting?
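
To make the counting concrete, here is a small brute-force check, offered only as a sketch. It assumes a uniform sample space $\{1, \dots, 12\}$ and three events chosen purely for illustration. The helper names (`prob`, `union_probability`, `inclusion_exclusion`) are hypothetical. It compares the probability of the union, computed directly, with the alternating sum over all $\binom{n}{k}$ intersections of $k$ events.

```python
from fractions import Fraction
from itertools import combinations


def prob(event, omega):
    """Probability of an event under the uniform measure on omega (an assumption)."""
    return Fraction(len(event), len(omega))


def union_probability(events, omega):
    """P(A_1 ∪ ... ∪ A_n), computed directly from the union of the sets."""
    return prob(set().union(*events), omega)


def inclusion_exclusion(events, omega):
    """Alternating sum over every k-fold intersection, k = 1, ..., n."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = 1 if k % 2 == 1 else -1  # add odd-sized intersections, subtract even-sized
        for subset in combinations(events, k):  # C(n, k) choices of k events
            total += sign * prob(set.intersection(*subset), omega)
    return total


# Illustrative sample space and events (not taken from the question).
omega = set(range(1, 13))
events = [
    {n for n in omega if n % 2 == 0},  # A_1: even numbers
    {n for n in omega if n % 3 == 0},  # A_2: multiples of 3
    {n for n in omega if n <= 4},      # A_3: numbers at most 4
]

print(union_probability(events, omega))    # 3/4
print(inclusion_exclusion(events, omega))  # 3/4
```

In this example the single-event terms sum to $14/12$, the pairwise intersections sum to $5/12$, and the triple intersection is empty, so the alternating sum is $14/12 - 5/12 + 0 = 9/12$, which matches the direct count of the union.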