Regarding the subobject classifier construction, why do we need the pullback?
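For reference, this is the square I have in mind, assuming the standard definition: for every mono $j\colon U\rightarrow X$ there is a unique $\chi_j\colon X\rightarrow\Omega$ making the square below a pullback, where $!$ is the unique map to the terminal object $1$.

$$
\begin{array}{ccc}
U & \xrightarrow{\;!\;} & 1 \\
{\scriptstyle j}\downarrow & & \downarrow{\scriptstyle \text{true}} \\
X & \xrightarrow{\;\chi_j\;} & \Omega
\end{array}
\qquad\qquad \chi_j\circ j = \text{true}\circ{!}
$$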
Monos from $U$ to $X$ are called subobjects, but I see that there can be different injections that differ only by a permutation of the elements of $X$ (viewed as a set, as in set theory). As a notion of "subset", this therefore seems somewhat weak.
However, as far as I can see, the elements of $\text{Hom}(X,\Omega)$ are the characteristic functions (set theory terminology), and these are in bijective correspondence with what we understand as subsets. Why then do we need $U$, $j$, etc. to do set theory? The only purpose I can come up with for the $U\rightarrow 1\rightarrow\Omega$ route is to define $\chi_j$ in terms of function composition and thereby associate the $\chi_j$'s with objects like $U$. Is it meant that the subobject classifier enables us to classify certain unique $U$'s, so that we can then associate objects (like $U$) as subobjects of objects (like $X$)? But why would that be necessary? Why not just consider the elements of $\text{Hom}(X,\Omega)$ as the subobjects (plural) of $X$? And in case we just don't want them as morphisms: I've seen hom-sets taken to a new category via a functor, so why doesn't this suffice?
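To make the correspondence I am describing concrete, here is a minimal sketch in finite sets. It is only an illustration in ordinary set theory, not a general topos construction, and all names (`X`, `U`, `chi`, `pullback_of_true`, `j1`, `j2`) are my own hypothetical choices. It shows two different injections with the same image giving the same characteristic function, and the subset being recovered from that function alone.

```python
from itertools import product

# A small finite "object" X and the two-element truth-value set Omega.
X = ["a", "b", "c"]
OMEGA = [False, True]

def chi(j, domain):
    """Characteristic function of the image of the injection j: domain -> X."""
    image = {j[u] for u in domain}
    return {x: (x in image) for x in X}

def pullback_of_true(chi_map):
    """Recover the subset of X classified by chi_map (the preimage of True)."""
    return {x for x in X if chi_map[x]}

# Two different injections U -> X that differ by a permutation of U.
U = [0, 1]
j1 = {0: "a", 1: "b"}
j2 = {0: "b", 1: "a"}

# Every function X -> Omega, i.e. all of Hom(X, Omega) in this finite case.
all_chis = [dict(zip(X, values)) for values in product(OMEGA, repeat=len(X))]

print(chi(j1, U) == chi(j2, U))        # both injections have the same chi
print(pullback_of_true(chi(j1, U)))    # the subset recovered from chi alone
print(len(all_chis))                   # one chi per subset of X
```

In this finite case the number of functions $X\rightarrow\Omega$ equals the number of subsets of $X$, which is the bijection I am referring to.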
Secondly, it's always explicitly stated that we need a terminal object to do all of the above. But don't we also need "true" and "!" as well? Must these always be explicitly required too, or can we be sure that some of them are implied by the existence of a terminal object?
Lastly, it is said that topos theory avoids the stacking of element-inclusion, i.e. $\in$ gets replaced by axioms of function composition. But if Set contains any universe you would want to talk about, then surely these nested sets are to be found there too, and this just means that there are very long chains involving subobject classifiers. Does this really reduce notational ballast compared to any set theory with multiple types?