$M(n)=n(n+1)(n+2)$ is a somewhat slack but easy-to-prove upper bound on the number of moves necessary to sort the matrix (using only $R_i$, $C_j$, and $S^-_{i,j}$).
To prove it, let a number $q$ be "in place" if every number $p\leq q$ is in the "correct position", i.e. $a_{i,j}=(i-1)n+j$ for all $i,j$ with $(i-1)n+j\leq q$. It is immediate to verify that once $1,...,q$ are in place, no move can swap them out of their position, regardless of the positions of other numbers. Also, remember that if a matrix is sorted row-wise, sorting it column-wise leaves it sorted row-wise, and vice versa.
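The last remark (column-sorting a row-sorted matrix keeps the rows sorted) is easy to sanity-check numerically. The following is only a minimal sketch: the helper names (`sort_rows`, `sort_cols`, `check`) are made up for illustration, and it models applying all the $R_i$ and then all the $C_j$ on random permutation matrices; it does not model $S^-_{i,j}$ at all.

```python
import random

def sort_rows(a):
    """Apply R_i to every row: sort each row in increasing order."""
    return [sorted(row) for row in a]

def sort_cols(a):
    """Apply C_j to every column: sort each column in increasing order."""
    n = len(a)
    cols = [sorted(a[i][j] for i in range(n)) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def rows_sorted(a):
    return all(row[k] <= row[k + 1] for row in a for k in range(len(row) - 1))

def cols_sorted(a):
    n = len(a)
    return all(a[i][j] <= a[i + 1][j] for i in range(n - 1) for j in range(n))

def random_matrix(n, rng):
    """Random n x n arrangement of the numbers 1..n^2."""
    vals = list(range(1, n * n + 1))
    rng.shuffle(vals)
    return [vals[i * n:(i + 1) * n] for i in range(n)]

def check(n, trials=200, seed=0):
    """Row-sort then column-sort random matrices; verify both orders hold
    afterwards and that 1 ends up in the top-left corner."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = sort_cols(sort_rows(random_matrix(n, rng)))
        if not (rows_sorted(a) and cols_sorted(a) and a[0][0] == 1):
            return False
    return True
```

A call such as `check(4)` runs the experiment on $4\times 4$ matrices; a random test like this is of course evidence, not a proof.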
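Let us then begin by spending $2n$ "initialization" moves to sort the matrix row-wise and column-wise, after which $1$ is certainly in place since it can have no smaller number above or to the left of it. All that is left to prove is that if every $p<q$ is in place and the matrix is sorted row-wise and column-wise, then at most $n+3$ further moves put $q$ in place too and leave the matrix sorted row-wise and column-wise.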
Assume $q$ is not already in place (otherwise, we can just move on to $q+1$). Since the matrix is sorted row-wise, $q$ must be in the first column: if it had any number to its left, that number would be smaller and thus in place, and then $q$ would be in place too. Also, since the matrix is sorted column-wise, either $q$ is in the first row or the element above $q$ must be smaller than $q$ and thus in place. So, applying $S^-_{i,1}$ to the position $(i,1)$ that $q$ currently occupies if $i>1$ (which moves it to $a_{i-1,n}$, i.e. the rightmost element of the row immediately above it) and then sorting $q$'s row brings $q$ immediately to the right of a number smaller than it (and thus in place), so with at most $2$ moves we have $q$ in place.
The matrix, however, might no longer be sorted row-wise and/or column-wise. First of all, note that the only rows that might have been altered by the $2$ operations above are rows $i$ and $i-1$, and the second operation was actually to sort one of the two rows, so $1$ more move to sort the other leaves the matrix sorted row-wise. All we have to do is then spend $n$ moves to sort it column-wise. This proves that after the first $2n$ initialization moves, $n+3$ additional moves per number are sufficient, and thus that $n(n+1)(n+2)$ moves in total are sufficient.
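Spelled out, the count behind this bound is the $2n$ initialization moves plus at most $n+3$ moves for each of the $n^2$ numbers (a deliberately generous count, since it also charges $n+3$ moves to $1$):

$$2n + n^2(n+3) = n^3+3n^2+2n = n(n^2+3n+2) = n(n+1)(n+2).$$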
In fact, it's immediate to see that if $p$ "belongs" to the first column, it must automatically be in place as soon as all numbers smaller than it are in place (or immediately after the $2n$ "initialization" moves in the case of $1$), so no moves need to be spent on it. And if $p$ "belongs" to the last column, its row is already sorted as soon as we use $S_{i,j}^-$ to move it there, so we can save $1$ move. Finally, once the last number of the $(n-1)^{th}$ row is in place and the last row is sorted, one can avoid sorting the columns or indeed doing anything else, since all numbers in the last row must be in place. For $n>1$ (obviously $M(1)=0$), this reduces the number of moves by $(2n-1)(n+3)$ for the numbers in the first column and/or last row (which require no moves), by $n-1$ further moves for the remaining numbers in the last column (which save the initial row-sorting), and by $n$ more moves for the last element of the $(n-1)^{th}$ row, which needs no column sorting, i.e. by a total of $(2n-1)(n+4)$ moves. We can then get a slightly tighter but uglier:
$M(1)=0$, $M(n)=n^3+n^2-5n+4$ for $n>1$
Note that this is an upper bound, as requested!
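For reference, the arithmetic behind the tighter formula can be checked as follows, starting from $n(n+1)(n+2)=n^3+3n^2+2n$ and subtracting the three savings described above:

$$(2n-1)(n+3) + (n-1) + n = (2n-1)(n+3) + (2n-1) = (2n-1)(n+4) = 2n^2+7n-4,$$

$$n^3+3n^2+2n - (2n^2+7n-4) = n^3+n^2-5n+4.$$

For example, $n=2$ gives $24-18=6$ and $n=3$ gives $60-35=25$.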
An exact formula seems hard to obtain and unlikely to be "simple". It would nonetheless be interesting to see if one could remove the $n^3$ term and thus make only a constant number of moves per element - or, in fact, exploit the "parallelism" inherent in $R_i$ and $C_j$ to make the number of moves asymptotically smaller than $n^2$.