We typically introduce maximum likelihood by considering a family of distributions $\{P_\theta: \theta \in \Theta\}$ which admit densities $f_\theta$ with respect to some underlying measure $\lambda$. We observe a random vector $X \sim P_{\theta_0}$, where $\theta_0$ intuitively represents the "true, unknown" value of $\theta$. The maximum likelihood estimator of $\theta_0$ is then defined to be $\hat \theta = \arg \max_{\theta \in \Theta} f_\theta(X)$.
How do we resolve the fact that each $f_\theta$ is only unique up to $\lambda$-null sets? Normally we write such things off, but it actually matters here: the null set on which two versions of $f_\theta$ differ can depend on $\theta$, so switching versions can change the entire likelihood function $\theta \mapsto f_\theta(X)$ at the observed $X$. A typical example is $X_1, X_2 \sim \mbox{Uniform}(0, \theta)$, $\theta > 0$. Whether we take $f_\theta(x) \propto I[0 \le x \le \theta]$ or $f_\theta(x) \propto I[0 < x < \theta]$ affects what the estimator is: with the former we have $\hat \theta = \max\{X_1, X_2\}$, whereas with the latter the MLE doesn't exist. Both are valid densities for a $\mbox{Uniform}(0, \theta)$ distribution but give different answers. It seems natural in this case to prefer $I[0 \le x \le \theta]$, since that version is positive and continuous when restricted to the support, but I find it distasteful to define things in terms of densities instead of distributions.
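For concreteness, here is the routine computation behind that claim. Since $X_1, X_2 > 0$ almost surely, the two choices of density give the likelihoods
$$L(\theta) = \frac{1}{\theta^2}\, I\big[\theta \ge \max\{X_1, X_2\}\big] \quad \text{versus} \quad L(\theta) = \frac{1}{\theta^2}\, I\big[\theta > \max\{X_1, X_2\}\big].$$
Both are decreasing in $\theta$ on the set where they are positive, so the first attains its supremum $\max\{X_1, X_2\}^{-2}$ at $\theta = \max\{X_1, X_2\}$, while the second approaches the same supremum as $\theta \downarrow \max\{X_1, X_2\}$ but never attains it.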
My two thoughts for solutions are:

1. Maybe, once we have specified $\{f_\theta: \theta \in \Theta\}$, the MLE is a.s. unique whenever it exists.
2. Maybe we just bite the bullet and take the most natural family of densities for the problem at hand (as in the sketch below); when we want to make use of theoretical results, we make sure to choose $f_\theta$ so that it satisfies the required regularity conditions.
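To illustrate (2) on the uniform example, here is a minimal numerical sketch in Python. It is purely illustrative: the seed, the value $\theta_0 = 2$, and the $\varepsilon$ grid are arbitrary choices of mine, not part of the problem.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, illustration only
theta0 = 2.0                            # "true" parameter theta_0
x = rng.uniform(0.0, theta0, size=2)    # X_1, X_2 ~ Uniform(0, theta0)
m = x.max()

def lik(theta, closed=True):
    """Likelihood of x under Uniform(0, theta), using the closed- or
    open-support version of the density."""
    if closed:
        inside = (x >= 0.0) & (x <= theta)
    else:
        inside = (x > 0.0) & (x < theta)
    return np.prod(np.where(inside, 1.0 / theta, 0.0))

# Closed version: the supremum 1/m^2 is attained at theta = m.
print(lik(m, closed=True), 1.0 / m**2)   # equal

# Open version: L(m) = 0, but L(m + eps) -> 1/m^2 as eps -> 0+,
# so the supremum is approached but never attained.
print(lik(m, closed=False))              # 0.0
for eps in (1e-1, 1e-4, 1e-8):
    print(eps, lik(m + eps, closed=False))
```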
I'm unsure whether this is better suited for stats.stackexchange or here, since it's statistics but focuses on the mathematical formalism more than statisticians typically prefer.