The key difference between simplex implementations is how they compute $B^{-1}A_j$. The naive implementation recomputes $B^{-1}$ from scratch at every iteration, which costs something like $O(m^3)$ each time. The revised simplex method instead updates $B^{-1}$ directly from the previous iteration, without recomputing it.
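To make the "update instead of recompute" idea concrete, here is a minimal sketch of one common way to do the update: apply to $B^{-1}$ the row operations that turn $u = B^{-1}A_j$ into a unit vector. The function name `update_inverse`, the index `l` of the replaced basis column, and the numbers are my own illustrative choices, not something from your book.

```python
import numpy as np

def update_inverse(B_inv, u, l):
    """Inverse of the basis after its l-th column is replaced by A_j.

    u must equal B_inv @ A_j, and u[l] must be nonzero.
    """
    new_inv = np.array(B_inv, dtype=float)   # work on a copy
    new_inv[l, :] /= u[l]                    # scale the pivot row
    for i in range(new_inv.shape[0]):
        if i != l:
            # subtract a multiple of the pivot row so that u[i] becomes 0
            new_inv[i, :] -= u[i] * new_inv[l, :]
    return new_inv

B = np.array([[2.0, 1.0],
              [1.0, 3.0]])
A_j = np.array([1.0, 1.0])       # entering column (made-up data)
B_inv = np.linalg.inv(B)

B_new = B.copy()
B_new[:, 0] = A_j                # replace column 0 of the basis
updated = update_inverse(B_inv, B_inv @ A_j, 0)
matches = np.allclose(updated, np.linalg.inv(B_new))
```

The update touches each entry of $B^{-1}$ once, so it costs $O(m^2)$ per iteration rather than the $O(m^3)$ of a fresh inversion.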
The split into basic and non-basic variables is common to all of these methods. You have more variables than equations, so $A$ is not square and you cannot invert it. This means you need to decide which variables to take into the basis so that you get a square matrix $B$ whose inverse $B^{-1}$ you can compute.
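When the non-basic variable $x_j$ enters the basis, the direction $\mathbf{d}$ is defined by $d_j=1$ for the entering variable, $d_i=0$ for every other non-basic index $i$, and $\mathbf{d}_B=-\mathbf{B}^{-1}A_j$ for the basic variables, so the reduced cost is
$\bar{c}_j=c_j-\mathbf{c}_B'\mathbf{B}^{-1}A_j$
which is the same as $\bar c_j=c_j+\mathbf{c}_B'\mathbf{d}_B$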
so you move along this direction for as long as the cost decreases and you stay feasible, that is, until one of the basic variables drops to zero. In standard form problems you always move along the edges of the feasible set, from one extreme point to an adjacent one. Once you arrive, you change the basis again and repeat until the cost function does not decrease anymore. This is sufficient because in convex problems a local optimum is also the global optimum, which is the key premise behind linear optimization.
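The two expressions for the reduced cost can be checked numerically. The sketch below builds the direction for an entering index $j$ and evaluates $c_j$ plus $\mathbf{c}_B'$ times the basic part of the direction. The small LP data and the helper names `basic_direction` and `reduced_cost` are assumptions of mine for illustration.

```python
import numpy as np

def basic_direction(A, basis, j):
    """Direction d with d[j] = 1, basic part -B^{-1} A_j, and 0 elsewhere."""
    d = np.zeros(A.shape[1])
    d[j] = 1.0
    d[basis] = -np.linalg.solve(A[:, basis], A[:, j])
    return d

def reduced_cost(A, c, basis, j):
    """c_j + c_B' d_B, which equals c_j - c_B' B^{-1} A_j."""
    d = basic_direction(A, basis, j)
    return c[j] + c[basis] @ d[basis]

# min -2*x0 - 3*x1  s.t.  x0 + x1 + x2 = 4,  x0 + 3*x1 + x3 = 6,  x >= 0
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
c = np.array([-2.0, -3.0, 0.0, 0.0])
basis = [0, 3]                    # x0 and x3 are basic

d = basic_direction(A, basis, 1)  # x1 enters
c_bar = reduced_cost(A, c, basis, 1)
```

Here `d` comes out as $(-1, 1, 0, -2)$ and `c_bar` as $-1$, so bringing $x_1$ into the basis lowers the cost. Note that $A\mathbf{d}=0$, which is why moving along the direction keeps the equality constraints satisfied.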
The optimality condition is the reduced-cost condition: in a minimization problem, when all reduced costs are nonnegative, the current basic feasible solution is optimal.
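Putting the pieces together, a naive simplex loop might look like the sketch below. It recomputes $B^{-1}$ at every iteration, stops when no reduced cost is negative, and otherwise uses a ratio test to decide which basic variable leaves. The function name `simplex`, the tolerance `1e-9`, the "first negative reduced cost" entering rule, and the example data are my own assumptions; there is no anti-cycling safeguard, so treat it as an illustration only.

```python
import numpy as np

def simplex(A, b, c, basis, max_iter=100):
    """Naive simplex for min c'x s.t. Ax = b, x >= 0, from a feasible basis."""
    m, n = A.shape
    basis = list(basis)
    for _ in range(max_iter):
        B_inv = np.linalg.inv(A[:, basis])   # recomputed every iteration (naive)
        x_B = B_inv @ b
        # Reduced costs c_j - c_B' B^{-1} A_j for all columns at once.
        reduced = c - c[basis] @ B_inv @ A
        entering = [j for j in range(n)
                    if j not in basis and reduced[j] < -1e-9]
        if not entering:                     # no negative reduced cost: optimal
            x = np.zeros(n)
            x[basis] = x_B
            return x, basis
        j = entering[0]                      # first candidate enters the basis
        u = B_inv @ A[:, j]                  # basic variables change by -theta * u
        if np.all(u <= 1e-9):
            raise ValueError("cost is unbounded below along this direction")
        # Ratio test: largest step theta that keeps x_B - theta * u >= 0.
        ratios = [x_B[i] / u[i] if u[i] > 1e-9 else np.inf for i in range(m)]
        l = int(np.argmin(ratios))
        basis[l] = j                         # x_j replaces the l-th basic variable
    raise RuntimeError("iteration limit reached")

# min -2*x0 - 3*x1  s.t.  x0 + x1 + x2 = 4,  x0 + 3*x1 + x3 = 6,  x >= 0
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-2.0, -3.0, 0.0, 0.0])

x_opt, final_basis = simplex(A, b, c, basis=[2, 3])   # start from the slack basis
```

For this data the loop passes through the bases $\{x_2,x_3\}$, $\{x_0,x_3\}$, $\{x_0,x_1\}$ and stops at $x=(3,1,0,0)$ with cost $-9$.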
Answers to your questions
'"$d_j=1$ for basic variables" means that all basic vars will be moved by the same amount in the direction $d$, does it?'
No, that can happen but it is not true in general. If you had the same number of equations and variables, you would get a single point. Here you have more variables than equations, so you must set some variables to zero (the non-basic ones) in order to get an invertible $B$. When one non-basic variable enters the basis, the basic variables change in proportion to the components of $-B^{-1}A_j$, and those components usually differ from one another. This is how I understand it: you move from one extreme point to an adjacent one, along one direction at a time. Of course, there may be methods that choose some heuristic direction, but the basic simplex methods go from one extreme point to another until they reach the optimum. Notice in $\bar c_j=c_j-\mathbf{c}_B'\mathbf{B}^{-1}A_j$ that the change in cost along the direction is determined by the costs and the basis, so the basic variables do affect the movement.
'The phrase "$d_i=0$ for all other nonbasic indices i" means that we won't move at all in the direction of nonbasic variables, does it?'
Yes, the other non-basic variables stay at zero during that step. Notice that this is different from the situation where you bring one of them into the basis. Once a variable is in the basis, its value is free to change as you move along a feasible direction.
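Both answers can be seen in a single step $x + \theta\,\mathbf{d}$. In this sketch (made-up data, with a step length `theta` that I picked small enough to stay feasible) the entering variable grows by $\theta$, the other non-basic variable stays at zero, and the two basic variables change by different amounts.

```python
import numpy as np

# Constraints x0 + x1 + x2 = 4 and x0 + 3*x1 + x3 = 6, with x >= 0.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
basis = [0, 3]        # x0 and x3 are basic; x1 and x2 are non-basic
j = 1                 # x1 is the entering variable

B = A[:, basis]
x = np.zeros(4)
x[basis] = np.linalg.solve(B, b)            # current basic feasible solution

d = np.zeros(4)
d[j] = 1.0                                  # entering variable
d[basis] = -np.linalg.solve(B, A[:, j])     # basic part of the direction

theta = 0.5                                 # illustrative step length
x_new = x + theta * d
change = x_new - x
```

Starting from $x=(4,0,0,2)$, the step gives $x_{\text{new}}=(3.5,\,0.5,\,0,\,1)$: the basic variables $x_0$ and $x_3$ move by $-0.5$ and $-1$ respectively, while the non-basic $x_2$ does not move at all.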