For $\bf W$, for example, the usual construction gives something like

$$\begin{align} {\bf W} =& \lambda x.\lambda y.(x y) y \\ =& \lambda x. {\bf S}\, x\, {\bf I} \\ =& {\bf S\, S\, (K\, I)} \\ =& {\bf S\, S\, (K\, (S\, K\, K))} \end{align}$$

The only question, I think, is whether this simulates $\bf W$ well enough for your purpose. We have at least

$$\tag{*} {\bf S\, S\, (K\, (S\, K\, K))}\,M\, N \to^* (M\,N)\, ({\bf K\, (S\, K\, K)}\, M\, N)$$

It is true that with a strict head-only reduction strategy, you don't immediately get to reduce

$${\bf K\, (S\, K\, K)}\, M\, N \to {\bf S\, K\, K}\, N \to^* N$$

But is that actually a problem? If we're working in a pure combinator calculus, $M$ will always be some concrete term, which is now in head position and ready to reduce. Eventually ${\bf K\, (S\, K\, K)}\, M\, N$ will either end up in a head position and get reduced, or disappear, in which case it doesn't matter anyway.
For every concrete $M$ and $N$ it holds that ${\bf S\, S\, (K\, (S\, K\, K))}\,M\, N$ has the same normal form (if any) as $(M\,N)\,N$. That will be sufficient for most purposes I can imagine -- unless you're insisting on a bisimulation between $\bf \{BCKW\}$ and $\bf \{SK\}$ (but why would you? Bisimilarity is a tool for showing observational equivalence; if you can get that by other means, what have you lost?)
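As a sanity check, here is a minimal sketch of an $\bf \{SK\}$ reducer in Python. It assumes full leftmost-outermost reduction (arguments get reduced too, not only the head), and the term encoding and the names `app`, `step` and `normalize` are my own choices for illustration, not anything standard.

```python
# Terms: an atom is a string ("S", "K", or a free variable such as "a");
# an application is a 2-tuple (function, argument).

def app(*terms):
    """Left-associated application: app(f, x, y) == ((f, x), y)."""
    result = terms[0]
    for t in terms[1:]:
        result = (result, t)
    return result

def step(t):
    """One leftmost-outermost step; returns None if t is already normal."""
    if isinstance(t, str):
        return None
    f, a = t
    # K x y -> x
    if isinstance(f, tuple) and f[0] == "K":
        return f[1]
    # S x y z -> x z (y z)
    if isinstance(f, tuple) and isinstance(f[0], tuple) and f[0][0] == "S":
        x, y, z = f[0][1], f[1], a
        return ((x, z), (y, z))
    # No redex at the root: try the function part first, then the argument.
    r = step(f)
    if r is not None:
        return (r, a)
    r = step(a)
    if r is not None:
        return (f, r)
    return None

def normalize(t, limit=1000):
    """Reduce until normal; gives up (returns None) after `limit` steps."""
    for _ in range(limit):
        r = step(t)
        if r is None:
            return t
        t = r
    return None

# W encoded as S S (K (S K K))
W = app("S", "S", ("K", app("S", "K", "K")))

# With free variables a and b standing in for M and N:
print(normalize(app(W, "a", "b")) == app("a", "b", "b"))
# With M = S and N = K, compared against M N N = S K K:
print(normalize(app(W, "S", "K")) == normalize(app("S", "K", "K")))
```

The step limit is only there so that a term without a normal form does not loop forever; it is not part of the calculus.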
Edit: Hmm, that doesn't quite hold with head-only reduction. It might be that $M$ is just $\bf S$, and then it still has too few arguments to reduce on the right-hand side of (*). We may have to settle for observational equivalence, where ${\bf W}\,M\,N\,P_1\ldots P_n$ has a normal form exactly when $M\,N\,N\, P_1\ldots P_n$ has one, for all $n\ge 1$.
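The $M = {\bf S}$ case can be played through with a head-only reducer. This is a rough sketch under my own encoding (atoms are strings, applications are pairs); `spine`, `head_step` and `head_normalize` are hypothetical helper names, and "head-only" here means that only a redex at the very head of the term is ever contracted.

```python
def app(*terms):
    """Left-associated application: app(f, x, y) == ((f, x), y)."""
    result = terms[0]
    for t in terms[1:]:
        result = (result, t)
    return result

def spine(t):
    """Split a term into its head atom and the list of its arguments."""
    args = []
    while isinstance(t, tuple):
        args.append(t[1])
        t = t[0]
    args.reverse()
    return t, args

def head_step(t):
    """Contract the head redex if there is one; otherwise return None."""
    head, args = spine(t)
    if head == "K" and len(args) >= 2:
        # K x y rest... -> x rest...
        return app(args[0], *args[2:])
    if head == "S" and len(args) >= 3:
        # S x y z rest... -> x z (y z) rest...
        x, y, z = args[:3]
        return app(((x, z), (y, z)), *args[3:])
    return None

def head_normalize(t, limit=1000):
    """Head-reduce until stuck; gives up (returns None) after `limit` steps."""
    for _ in range(limit):
        r = head_step(t)
        if r is None:
            return t
        t = r
    return None

KI = ("K", app("S", "K", "K"))   # K (S K K)
W = app("S", "S", KI)            # S S (K (S K K))

# M = S, N = K: head reduction stops at S K (K (S K K) S K),
# because the head S has only two arguments.
stuck = head_normalize(app(W, "S", "K"))
print(stuck == app("S", "K", app(KI, "S", "K")))

# One extra argument p (a free variable) unblocks the head,
# and both sides end up at the same term.
print(head_normalize(app(W, "S", "K", "p")) ==
      head_normalize(app("S", "K", "K", "p")))
```

So the two sides differ as head-reduced terms when no extra argument is supplied, but agree as soon as some $P_1$ is, which is the kind of observational equivalence described above.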