The quantities are squared, summed, and square-rooted for the same reason that you do these things when finding the length of a vector. Just to pick a number, suppose that you have $10$ data points. Before you took any measurements, there were ten variables $X_1, X_2, \ldots, X_{10}$. Once you take the measurements, you have a vector in $\mathbb{R}^{10}$ denoted $\langle x_1,\ldots, x_{10}\rangle$, and if you wanted its magnitude, you would sum the squares of its components and take a square root.
In the statistics setting, it is assumed that if you took measurements billions upon billions of times, there would be an average value $\mu$. If we knew $\mu$, we'd really be interested in studying all of the deviations $x_i-\mu$ in order to understand how individual values deviate from the mean. We'd study vectors of the form $\langle x_1-\mu,\ldots, x_{10}-\mu\rangle$ instead of $\langle x_1,\ldots, x_{10}\rangle$. Unfortunately, we do not know $\mu$. Instead, the average of the ten values (which we call $\bar{x}$) is taken to be an approximation of $\mu$, and we consider vectors of the form $\langle x_1-\bar{x},\ldots, x_{10}-\bar{x}\rangle$.
The size of these vectors would be found by summing squares and taking a square root. But why divide by $n-1$? It might seem to make more sense to divide by $n$. That would give a sense of finding the average displacement. However, there is a historical reason for working this way. Once upon a time, integrals were approximated by hand and values were tabulated in the back of statistics books. If there were $10$ variables, it made sense to pretend ahead of time that you knew what $\bar{x}$ was and consider all the possible sets of measurements that would lead to that value of $\bar{x}$. This makes for a $9$-dimensional space. You have freedom to imagine $x_1$ through $x_9$, but then if $\bar{x}$ is fixed, you no longer have freedom for $x_{10}$ to be whatever you like. (You may have heard the term "degrees of freedom" applied to this situation. In our example, there are $9$ degrees of freedom.)
So it was actually a $9$-dimensional space that the old-timers did their calculus on. They did average the displacements, but only over the dimension of the sampling space, which was $9$, not $10$.
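To make the vector picture concrete, here is a minimal Python sketch, not a definitive implementation. The ten measurements and the function name `sample_std_as_vector_length` are made up for illustration. It builds the deviation vector $\langle x_1-\bar{x},\ldots, x_{10}-\bar{x}\rangle$, takes its length, and divides by $\sqrt{n-1}$. It also checks that the deviations sum to zero, which is the constraint that removes one degree of freedom.

```python
import math
import statistics


def sample_std_as_vector_length(data):
    """Sample standard deviation computed as a scaled vector length."""
    n = len(data)
    x_bar = sum(data) / n
    # Deviation vector <x_1 - x_bar, ..., x_n - x_bar>
    deviations = [x - x_bar for x in data]
    # Euclidean length: sum the squares, then take the square root
    length = math.sqrt(sum(d * d for d in deviations))
    # Average over the n - 1 free dimensions rather than n
    return length / math.sqrt(n - 1)


# Ten made-up measurements, purely for illustration
data = [4.1, 5.3, 4.8, 5.0, 4.6, 5.2, 4.9, 5.1, 4.7, 5.3]
x_bar = sum(data) / len(data)
deviations = [x - x_bar for x in data]

# The deviations sum to zero (up to rounding), so once nine of them
# are known, the tenth is forced.
print(sum(deviations))

# Both lines should print the same value, since statistics.stdev
# also divides by n - 1.
print(sample_std_as_vector_length(data))
print(statistics.stdev(data))
```

The standard library's `statistics.stdev` uses the same $n-1$ divisor, so the two printed values should agree up to floating-point rounding.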
Another consideration: the deviations of the sample data from $\bar{x}$ are, taken together, a little smaller than the deviations from $\mu$ would be, because $\bar{x}$ is computed from the same data and is exactly the value that makes the sum of squared deviations as small as possible. Dividing by $n-1$ instead of $n$ corrects somewhat for this shortfall, giving a slightly larger value for the standard deviation than it would otherwise be.
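A small simulation can illustrate this correction. The sketch below is only an illustration under assumed settings: the function name `compare_divisors`, the normal distribution, and the values $\mu = 50$, $\sigma = 2$ (so the true variance is $4$) are all made up for the example. It draws many samples of size $10$ and averages the squared-deviation sums divided by $n$ and by $n-1$. The comparison is made on the variance, the squared standard deviation, because that is where the $n-1$ correction is exact on average.

```python
import random


def compare_divisors(n=10, trials=20000, mu=50.0, sigma=2.0, seed=0):
    """Average the variance estimates that divide by n and by n - 1."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_n = 0.0
    total_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Sum of squared deviations from the sample mean, not from mu
        ss = sum((x - x_bar) ** 2 for x in sample)
        total_n += ss / n
        total_n_minus_1 += ss / (n - 1)
    return total_n / trials, total_n_minus_1 / trials


mean_div_n, mean_div_n_minus_1 = compare_divisors()
print("divide by n:    ", mean_div_n)
print("divide by n - 1:", mean_div_n_minus_1)
```

With these assumed settings, the average of the divide-by-$n$ estimates should land near $3.6$, which is $\frac{9}{10}$ of the true variance, while the divide-by-$(n-1)$ average should land near $4$.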
I certainly haven't completely answered this, but I hope you can see the connection between statistics and vector calculus a little now. And for me, that connection is what explains the formula for standard deviation.