(I noticed that there was a very recent question which dealt with similar material here; I didn't want to query in the comments, but merged answers might be appropriate.)
My expectation with this proof was, "Well, there are some chains in $X$ which are going to be maximal (under the ordering in $\mathscr{X})$, and these chains will have upper bounds, so those upper bounds will be maximal elements in X. So (loosely) we should just pick any maximal chain using AC and take its upper bound." But the maximal element which was located belonged to something called a tower (see the link for definition).
It seems to me that the smallest tower is uniquely determined by an arbitrary choice function. We have at least one choice function $f: \mathcal{P}(X) \to X$, and given some starting choice from $\hat{A} = \{x \in X: A \cup \{x\} \in \mathscr{X}\}$ for $A$ = $\varnothing$, it seems like the sequence of terms which belong to any "tower" is determined by the arbitrary choice function. I say this because we start necessarily with $\varnothing$, and continue to add elements to any "tower" based on the choice function for whatever set results from the last element we added. (To wit: we get some $a_0$ to union with the empty set from our "adjunction function," $g(A) = A \cup f(\hat{A} - A)$, i.e. $a_0$ = $g(\varnothing)$. Then there is again a predetermined $a_1 = g(\{a_0\})$ which we must union with $\{a_0\}$; and so forth.) So every tower contains $\varnothing$ and has to contain its "successor" under $g$, and so forth.
Call a chain in $\mathscr{X}$ "complete" if it contains its maximal element (under inclusion ordering); denote the complete chain determined by the choice function as $\mathscr{C_0}$. My question is: is the set of towers just the set of unions of any number of complete chains with $\mathscr{C_0}$?