As André and Arthur have pointed out, the labels generated by your algorithm will not be truly independent: knowing that, say, $S(n)[1] = 3$ reduces the probability that any other element of $S(n)$ will have the label $3$.
In particular, suppose we know the labels of $k = k_1 + k_2 + k_3 + k_4 < n$ elements of $S(n)$, where $k_1$ of those elements have the label $1$, $k_2$ have the label $2$, and so on. I'll also assume that we know the value of $n_2$ used to generate the labels. Then the probability that some element of $S(n)$, other than one of those whose label we know, has the label $i$ equals $$P_i = \frac{\frac14 n_2 - k_i}{n_2 - k}.$$ To see why, note that $n_2 - k$ is the total number of labels in $S(n_2)$ that don't belong to any of the $k$ known elements, while $\frac14 n_2 - k_i$ is the number of those labels with the value $i$.
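If you want to sanity-check this formula numerically, here's a small Python sketch. Note the modelling assumption (mine, not stated in the question): I take the algorithm to shuffle a pool of $n_2$ labels containing $\frac14 n_2$ copies of each value and assign them in order, and the helper names `conditional_probs` and `simulate` are made up for illustration.

```python
import random
from collections import Counter

LABELS = (1, 2, 3, 4)

def conditional_probs(n2, known):
    """P_i from the formula above, given a list of k observed labels."""
    k = len(known)
    counts = Counter(known)
    return {i: (n2 / 4 - counts[i]) / (n2 - k) for i in LABELS}

def simulate(n2, known, trials=200_000):
    """Empirical check: shuffle a pool with n2/4 copies of each label
    (my model of the labeling algorithm), keep only the shuffles whose
    first k labels match `known`, and tally the label that follows."""
    pool = [i for i in LABELS for _ in range(n2 // 4)]
    k = len(known)
    hits, total = Counter(), 0
    for _ in range(trials):
        random.shuffle(pool)
        if pool[:k] == known:
            hits[pool[k]] += 1
            total += 1
    return {i: hits[i] / total for i in LABELS}

known = [3, 1, 3]                      # three observed labels, for example
print(conditional_probs(16, known))    # formula: {1: 3/13, 2: 4/13, 3: 2/13, 4: 4/13}
print(simulate(16, known))             # should agree up to sampling noise
```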
This implies that, if we only know the labels of a few elements out of many, the conditional distribution of the remaining labels is still close to uniform: $k \ll n_2 \implies P_i \approx \frac14$. Likewise, if the labels we do know are fairly evenly distributed, so that $k_i \approx \frac14 k$ (and, more precisely, $|k_i - \frac14 k| \ll n_2 - k$), then $P_i$ is also close to $\frac14$. Indeed, a little algebra gives $P_i - \frac14 = \frac{\frac14 k - k_i}{n_2 - k}$, so the deviation from uniformity is exactly the imbalance $\frac14 k - k_i$ divided by the number of labels not yet seen.
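For concreteness, here's a quick numeric illustration of both observations; the particular numbers are my own, chosen only to keep the arithmetic simple:

$$n_2 = 1000,\quad k = 12,\quad k_3 = 5 \;\implies\; P_3 = \frac{250 - 5}{1000 - 12} = \frac{245}{988} \approx 0.248,$$

which differs from $\frac14$ by exactly $\frac{\frac14 \cdot 12 - 5}{988} = -\frac{2}{988} \approx -0.002$.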
Thus, as long as there are many elements in $S(n)$ and we know the labels of only a few of them, the remaining labels will look close to random. However, this only holds if $n_2$ (and thus $n$) is large enough: for example, as Arthur demonstrates, if $n_2 = 4$, even knowing the label of just one element is enough to bias the distribution of the other labels significantly.
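The formula reproduces this extreme case directly (using the hypothetical `conditional_probs` helper from the sketch above): with $n_2 = 4$ there is only one copy of each label, so a single known label rules that value out completely for every other element.

```python
# n2 = 4, one known label: the formula gives P = 0 for the seen label
# and 1/3 for each of the others.
print(conditional_probs(4, [1]))
# -> {1: 0.0, 2: 0.333..., 3: 0.333..., 4: 0.333...}
```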