The following justification is based on material in Johann Cigler’s lecture notes, Chapters 1 and 3.
The ordinary Stirling numbers of the second kind are characterized by the identity $(xD)^n=\sum_k\left\{\matrix{n\\k}\right\}x^kD^k\;,\tag{1}$ where $D$ is the ordinary differentiation operator. Thus, one approach to defining a $q$-analogue $S_q(n,k)$ is to require that it satisfy the analogue of $(1)$ with $D$ replaced by $D_q$, defined by $D_qf(x)=\frac{f(qx)-f(x)}{qx-x}$ and satisfying $D_qx^n=\frac{(qx)^n-x^n}{(q-1)x}=\frac{q^n-1}{q-1}x^{n-1}=(n)_qx^{n-1}$ and the operator identity $D_qx=qxD_q+1\;.$
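As a sanity check on these two properties of $D_q$, here is a minimal Python sketch. It assumes polynomials are stored as coefficient lists and that $q$ is an arbitrary rational sample value; the helper names `q_int`, `D_q` and `times_x` are made up for this illustration. It compares the definition of $D_q$ against the monomial rule and checks the commutation rule in the form $D_q\big(xf(x)\big)=qxD_qf(x)+f(x)$.

```python
from fractions import Fraction

def q_int(n, q):
    """q-integer (n)_q = 1 + q + ... + q^(n-1)."""
    return sum(q**i for i in range(n))

def D_q(coeffs, q):
    """Apply D_q to a polynomial [c_0, c_1, ...] (c_i multiplies x^i).

    Straight from the definition: f(qx) - f(x) has coefficients c_i (q^i - 1),
    and dividing by (q - 1) x lowers every exponent by one.
    """
    return [(q**i - 1) / (q - 1) * c for i, c in enumerate(coeffs)][1:]

def times_x(coeffs):
    """Multiply a polynomial by x."""
    return [0] + list(coeffs)

q = Fraction(3, 2)                          # arbitrary sample value, q != 1
f = [Fraction(c) for c in (5, -1, 4, 2)]    # f(x) = 5 - x + 4x^2 + 2x^3

# Monomial rule: D_q x^3 = (3)_q x^2
monomial_ok = D_q([0, 0, 0, 1], q) == [0, 0, q_int(3, q)]

# Commutation rule: D_q(x f) = q x D_q f + f
lhs = D_q(times_x(f), q)
rhs = [q * a + b for a, b in zip(times_x(D_q(f, q)), f)]
commutation_ok = lhs == rhs

print(monomial_ok, commutation_ok)
```

Exact rational arithmetic via `Fraction` keeps the comparison free of floating-point noise.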
In other words, one might reasonably define $S_q(n,k)$ to satisfy
$(xD_q)^n=\sum_{k=0}^nS_q(n,k)x^kD_q^k\;.\tag{2}$
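To see what $(2)$ asserts in the smallest nontrivial case, one can expand $(xD_q)^2$ by hand, which gives $xD_q+qx^2D_q^2$, i.e. $S_q(2,1)=1$ and $S_q(2,2)=q$. The following Python sketch checks that expansion on a sample polynomial; the coefficient-list representation, the sample value of $q$ and the helper names are assumptions made for this illustration.

```python
from fractions import Fraction

def D_q(coeffs, q):
    """q-derivative of a polynomial [c_0, c_1, ...], via D_q x^i = (i)_q x^(i-1)."""
    return [sum(q**j for j in range(i)) * c for i, c in enumerate(coeffs)][1:]

def x_pow_times(coeffs, k):
    """Multiply a polynomial by x^k."""
    return [0] * k + list(coeffs)

def xD(coeffs, q):
    """Apply the operator x D_q once."""
    return x_pow_times(D_q(coeffs, q), 1)

q = Fraction(2, 3)                              # arbitrary sample value
f = [Fraction(c) for c in (7, 1, -3, 2, 5)]     # sample polynomial

lhs = xD(xD(f, q), q)                           # (x D_q)^2 f
first = x_pow_times(D_q(f, q), 1)               # x D_q f
second = x_pow_times(D_q(D_q(f, q), q), 2)      # x^2 D_q^2 f
rhs = [a + q * b for a, b in zip(first, second)]

print(lhs == rhs)
```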
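Assume that $(2)$ holds for some $n$. Applying the definition of $D_q$ to $x^kf(x)$ gives the operator identity $D_qx^k=q^kx^kD_q+(k)_qx^{k-1}$, and then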
$\begin{align*} xD_q(xD_q)^n&=\sum_kS_q(n,k)xD_qx^kD_q^k\\ &=\sum_kS_q(n,k)x\Big(q^kx^kD_q+(k)_qx^{k-1}\Big)D_q^k\\ &=\sum_kS_q(n,k)q^kx^{k+1}D_q^{k+1}+\sum_k(k)_qS_q(n,k)x^kD_q^k\\ &=\sum_k\left(S_q(n,k-1)q^{k-1}+(k)_qS_q(n,k)\right)x^kD_q^k\;, \end{align*}$
so if we want $(2)$ to hold for $n+1$, we must set
$S_q(n+1,k)=S_q(n,k-1)q^{k-1}+(k)_qS_q(n,k)\;.\tag{3}$
$(2)$ clearly requires that $S_q(n,0)=\delta_{n,0}$; it imposes no constraint on $S_q(0,k)$ for $k>0$, but setting it to $0$ is the natural thing to do and is compatible with $(3)$.
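The recurrence $(3)$ together with these boundary values can be tabulated mechanically. The sketch below is one possible way to do it in Python, assuming each $S_q(n,k)$ is stored as a list of integer coefficients in $q$ (an empty list standing for $0$); all function names are hypothetical. It also tests $(2)$ on monomials: since $xD_q$ multiplies $x^m$ by $(m)_q$ and $x^kD_q^k$ multiplies it by $(m)_q(m-1)_q\cdots(m-k+1)_q$, identity $(2)$ amounts to $(m)_q^n=\sum_kS_q(n,k)(m)_q(m-1)_q\cdots(m-k+1)_q$ for every $m\ge 0$.

```python
from fractions import Fraction
from math import prod

def poly_add(a, b):
    """Add two polynomials in q given as coefficient lists."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def poly_mul(a, b):
    """Multiply two polynomials in q; [] represents the zero polynomial."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def q_stirling_table(n_max):
    """S[n][k] = S_q(n, k) built from recurrence (3), for 0 <= n, k <= n_max."""
    S = [[[] for _ in range(n_max + 1)] for _ in range(n_max + 1)]
    S[0][0] = [1]                      # S_q(0, 0) = 1, every other S_q(0, k) = 0
    for n in range(n_max):
        for k in range(1, n + 2):
            prev = S[n][k - 1]
            shifted = ([0] * (k - 1) + prev) if prev else []   # q^(k-1) S_q(n, k-1)
            scaled = poly_mul([1] * k, S[n][k])                # (k)_q S_q(n, k)
            S[n + 1][k] = poly_add(shifted, scaled)
    return S

def evaluate(p, q):
    """Evaluate a coefficient list at a numeric value of q."""
    return sum(c * q**i for i, c in enumerate(p))

def q_int(m, q):
    """q-integer (m)_q; equals 0 for m <= 0."""
    return sum(q**j for j in range(m))

def q_falling(m, k, q):
    """(m)_q (m-1)_q ... (m-k+1)_q, the factor by which x^k D_q^k scales x^m."""
    return prod(q_int(m - j, q) for j in range(k))

S = q_stirling_table(5)
q = Fraction(3, 2)                     # arbitrary sample value

# Check (2) on the monomials x^0, ..., x^7 for n = 0, ..., 5.
identity_holds = all(
    q_int(m, q) ** n
    == sum(evaluate(S[n][k], q) * q_falling(m, k, q) for k in range(n + 1))
    for n in range(6)
    for m in range(8)
)

print(S[3][2], S[4][2], identity_holds)
```

Here `S[3][2]` is `[0, 2, 1]`, i.e. $S_q(3,2)=2q+q^2$, and summing the coefficients of any entry (setting $q=1$) recovers the ordinary Stirling number of the second kind.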