To rephrase the definition: for each $p\in S$, there is a neighborhood $U$ of $p$ in $N$ and a coordinate chart $U\to V\subset \mathbb{R}^n$ of $N$ such that $S\cap U$ is the inverse image of the intersection of $V$ with a $k$-dimensional linear subspace of $\mathbb{R}^n$ (specifically, a subspace of the form $(\star,\star,\star,\cdots, 0,0,0)$).
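Written out in coordinates, this reads as follows. The names $\varphi$ and $x^i$ are my own notation for the chart and its component functions, not something fixed by the definition:

$$
\varphi=(x^1,\dots,x^n)\colon U\to V,\qquad
S\cap U=\{\,q\in U : x^{k+1}(q)=\cdots=x^n(q)=0\,\}=\varphi^{-1}\bigl(V\cap(\mathbb{R}^k\times\{0\})\bigr).
$$

So inside $U$, membership in $S$ is decided entirely by the last $n-k$ coordinates vanishing.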
Locally, $S$ looks like the $xy$-plane embedded in $\mathbb{R}^3$, defined by the vanishing of the $z$-coordinate. Of course, this won't hold for every chart, but we can cover $S$ with charts of $N$, each of which determines, locally, what the points of $S$ are. If you have a point $p\in S$ and such a chart around it, the chart tells you what the other points of $S$ near $p$ are.
By the implicit function theorem, if $S$ is any submanifold of $N$, and $p\in S$, we can find a coordinate chart such that near $p$ the inclusion of $S$ into $N$ looks like the inclusion of $\mathbb{R}^k\subset \mathbb{R}^n$. Phrased differently, the local condition that you wanted to hold happens for EVERY submanifold. So what this definition does is impose a topological restriction: different parts of the submanifold can't come too close together.
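As a sketch of what "looks like" means here (the chart names $\varphi$ on $S$ and $\psi$ on $N$, and $\iota$ for the inclusion, are my notation): near $p$ one can choose the charts so that

$$
\psi\circ\iota\circ\varphi^{-1}(x^1,\dots,x^k)=(x^1,\dots,x^k,0,\dots,0).
$$

Note that this only describes a small piece of $S$ around $p$, small in the topology of $S$ itself. It says nothing about whether some other part of $S$ also wanders into the domain of $\psi$, and that is exactly the gap the definition closes.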
For example, consider a slight modification of the topologist's sine curve, where you include a segment $(-1,1)$ of the $y$-axis and connect it by a path to the right-hand side of the curve. You can view this as the image of the interval $(0,1)$ in $\mathbb{R}^2$ under a smooth, injective map, and so it is a submanifold of sorts. However, it is not a regular submanifold. Indeed, the induced topology (as a subset of $\mathbb{R}^2$) is different from the usual topology on $(-1,1)$. This is what the definition is meant to prevent, I believe.
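If an explicit formula helps, a simpler standard example of the same phenomenon (not the curve described above, just an illustration I am adding) is the figure-eight:

$$
\beta\colon(-\pi,\pi)\to\mathbb{R}^2,\qquad \beta(t)=(\sin 2t,\ \sin t).
$$

This map is injective, and its derivative $\beta'(t)=(2\cos 2t,\ \cos t)$ never vanishes, since $\cos t=0$ forces $\cos 2t=-1$. But $\beta(t)\to(0,0)=\beta(0)$ as $t\to\pm\pi$, so both ends of the interval come arbitrarily close to the middle of the curve. Any neighborhood of the origin in $\mathbb{R}^2$ meets the image in a cross shape rather than in a single arc, so no chart of $\mathbb{R}^2$ around the origin can cut the image out as a slice.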