
I'd be happy if someone could justify the union bound step in this proof more precisely.

At the top of page 5 in this article http://www.gatsby.ucl.ac.uk/~risi/courses/ML2/2008/VC.pdf, it is stated:

[Screenshot of the step from the notes: a union bound is applied over the restriction of the hypothesis class to the double sample $S \cup S'$.]

The union bound is applied over only a finite set of hypotheses, because two hypotheses $h$ and $h'$ need to be counted as distinct only if they differ on one of the samples in $S$ or $S'$. But if $S$ and $S'$ change, so do the distinct hypotheses that we bound over: the aforementioned $h$ and $h'$ may become, or cease to be, distinct. In short, the set of hypotheses over which the bound is applied changes with each sample.

How exactly can this step be justified? What are the events (subsets of the sample space) in the union bound?

1 Answer

There's a book by Martin Anthony and Peter Bartlett, "Neural Network Learning: Theoretical Foundations," which I think contains a proof addressing essentially this question on pages 46-50.

I think it goes like this: consider an equivalent model where we sample $U = (x_1,y_1),\ldots,(x_{2m},y_{2m})$ i.i.d., induce $C\downarrow_U$, and then randomly permute the order of the examples before splitting them into $S$ and $S'$. Since $U$ is drawn i.i.d., the examples are exchangeable, so the probability of the bad event is unchanged (the event is defined in terms of $C$, not $C\downarrow_U$).
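In symbols (a sketch in notation I'm introducing here, not the notes' own: $\sigma$ is a uniformly random permutation of $\{1,\ldots,2m\}$, and $S_\sigma$, $S'_\sigma$ are the first and second halves of the permuted sample):

$$\Pr_{S,S'}\!\left[\exists\, h\in C:\ \mathcal{E}_{S'}(h)-\mathcal{E}_{S}(h) > \frac{\epsilon}{2}\right] \;=\; \Pr_{U,\sigma}\!\left[\exists\, h\in C:\ \mathcal{E}_{S'_\sigma}(h)-\mathcal{E}_{S_\sigma}(h) > \frac{\epsilon}{2}\right].$$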

Given any $U$, the bad event occurs if the random permutation shuffles the examples so that $\mathcal{E}_{S'}(h) - \mathcal{E}_S(h) > \epsilon/2$ for some $h$ whose restriction to $U$ is a member of $C\downarrow_U$. We can bound the probability of this for each member separately and then apply a union bound over the finitely many members. So the sample space is the set of permutations of a fixed $U$, and the events are the sets of permutations under which a particular member of $C\downarrow_U$ violates the bound.
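Concretely (again a sketch in my notation): for each behavior $h \in C\downarrow_U$, let $A_h = \{\sigma : \mathcal{E}_{S'_\sigma}(h) - \mathcal{E}_{S_\sigma}(h) > \epsilon/2\}$. These are the events, and the union bound reads

$$\Pr_\sigma\!\left[\bigcup_{h\in C\downarrow_U} A_h \,\middle|\, U\right] \;\le\; \sum_{h\in C\downarrow_U} \Pr_\sigma\!\left[A_h \,\middle|\, U\right] \;\le\; |C\downarrow_U|\,\max_{h\in C\downarrow_U}\Pr_\sigma\!\left[A_h \,\middle|\, U\right],$$

where $|C\downarrow_U|$ is finite (at most the growth function evaluated at $2m$) no matter which $U$ was drawn. The point is that the union bound is applied conditionally on $U$: once $U$ is fixed, the collection of events is fixed, and it is perfectly fine that this collection differs from one sample to another.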

At least I think that's what the book says.