I was reading up on various graph algorithms (Dijkstra's algorithm and some variants) and found the runtime $O(m + n \log n)$, where $m$ is the number of edges in the graph and $n$ is the number of nodes. Intuitively, this makes sense, but I recently realized that I don't know, formally, what this statement means.
The definition of big-O notation that I am familiar with concerns single-variable functions; that is, $f(n) = O(g(n))$ if $\exists n_0, c$ such that $\forall n > n_0. |f(n)| \le c|g(n)|$. However, this definition doesn't make sense for something like $O(m + n \log n)$, since there are two free parameters here: $m$ and $n$. In the context of graphs there are well-specified relations between $m$ and $n$, but in other algorithms (for example, string matching) the runtime might be described as $O(f(m, n))$ where $m$ and $n$ are completely independent of one another.
My question is this: what is the formal definition of the statement $f(m, n) = O(g(m, n))$? Is it a straightforward generalization of the single-variable definition, where we give lower bounds on both $m$ and $n$ that must be simultaneously satisfied, or is there some other definition, perhaps one phrased in terms of limits?
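For concreteness, here is the naive generalization I have in mind; this is just my own guess at the formal statement, not a definition I have found written down anywhere:

$$f(m, n) = O(g(m, n)) \quad \text{if} \quad \exists n_0, c \text{ such that } \forall m > n_0\ \forall n > n_0.\ |f(m, n)| \le c\,|g(m, n)|.$$

Is this the accepted definition, or is the quantification over the two variables handled differently in the standard one?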
Thanks!