This answer deals only with the aspect of distinguishing different distributions over $\mathbb F^k$ using the dot product with a vector $w$; I haven't looked into how to make use of the fact that the distributions are generated by polynomials. So you can use this to distinguish arbitrary distributions (in fact, arbitrary complex-valued functions on $\mathbb F^k$), but there might be better solutions that make use of the polynomials. This also means that, compared to a brute-force solution by enumeration, which would take $O(|\mathbb F|^n+|\mathbb F|^{2k})$ time, I'm only reducing the $O(|\mathbb F|^{2k})$ complexity of trying out all possible $w$, not the $O(|\mathbb F|^n)$ complexity of substituting all possible arguments into the polynomials, so this will only be useful if $2k\gtrsim n$.
I'll provide an $O(|\mathbb F|^k)$-time algorithm to find a suitable $w$ to distinguish two distinct complex-valued functions on $\mathbb F^k$; in justifying the algorithm I'll also be proving that such a $w$ always exists.
Given two distinct complex-valued functions $g$ and $h$ on $\mathbb F^k$, form $f=g-h\neq0$. The goal is to find $w\in\mathbb F^k$ such that
$f_w(y)=\sum_{\scriptstyle x\in\mathbb F^k\atop\scriptstyle w\cdot x=y}f(x)$
is non-zero for some $y\in\mathbb F$. For given $w\neq0$ and $y$, the condition $w\cdot x=y$ picks out a $(k-1)$-dimensional affine subspace of $\mathbb F^k$. More generally, for $V$ any affine subspace of $\mathbb F^k$, let
$f_V=\sum_{x\in V}f(x)\;.$
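For concreteness, here's a minimal sketch of how $f_w$ could be computed by brute force, assuming $\mathbb F=\mathbb Z_p$ for a prime $p$ and representing $f$ as a dictionary from tuples in $\mathbb F^k$ to complex values (the function names and the dictionary representation are my own choices for illustration):

```python
import itertools

def dot(u, v, p):
    """Dot product of two vectors over Z_p."""
    return sum(ui * vi for ui, vi in zip(u, v)) % p

def f_w(f, w, p, k):
    """Return the list [f_w(0), ..., f_w(p-1)], i.e. the sums of f over
    the affine subspaces w.x = y (for w = 0, y = 0 this is the total sum)."""
    sums = [0] * p
    for x in itertools.product(range(p), repeat=k):
        sums[dot(w, x, p)] += f.get(x, 0)
    return sums
```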
Now let $W$ be any affine subspace of $\mathbb F^k$, and for $\dim W\lt d\lt k$ consider
$\sigma_W^d=\sum_{\scriptstyle V>W\atop\scriptstyle\dim V=d}f_V=\sum_{\scriptstyle V>W\atop\scriptstyle\dim V=d}\sum_{x\in V}f(x)\;,$
where $V>W$ means that $W$ is a proper affine subspace of $V$. By symmetry, each $x\in W$ occurs the same number $c_1$ of times in this sum (namely, once for each such $V$), and each $x\notin W$ occurs the same number $c_2$ of times (once for each such $V$ that happens to contain $x$); since $d\lt k$, not every such $V$ contains a given $x\notin W$, so $c_2\lt c_1$, and we have
$\sigma_W^d=c_1\sum_{x\in W}f(x)+c_2\sum_{x\notin W}f(x)=(c_1-c_2)\sum_{x\in W}f(x)+c_2\sum_{x\in\mathbb F^k}f(x)=(c_1-c_2)f_W+c_2f_0(0)\;.$
The second term vanishes if $g$ and $h$ are normalized distributions; if they are arbitrary functions, we can calculate it (in $O(|\mathbb F|^k)$ time) and use $w=0$ if it is non-zero. Thus we may assume that $\sigma_W^d$ is a non-zero multiple of $f_W$. It follows that if $f_W\neq0$ for some $W$, then for any $d>\dim W$ at least one affine subspace $V>W$ with $\dim V=d$ must have $f_V\neq0$, since the sum of all these is a non-zero multiple of $f_W$.
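To make the counting concrete in the case $d=\dim W+1$, which is the one the algorithm below actually uses: the subspaces $V>W$ with $\dim V=d$ correspond to the lines through the origin in the quotient $\mathbb F^{k-\dim W}$ of $\mathbb F^k$ by the direction space of $W$, so $c_1=(|\mathbb F|^{k-\dim W}-1)/(|\mathbb F|-1)$, whereas a given $x\notin W$ maps to a non-zero point of the quotient and thus lies on exactly one of these lines, so $c_2=1$, which is indeed less than $c_1$ for $\dim W\lt k-1$.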
This suggests the following algorithm. Find some $x\in\mathbb F^k$ with $f(x)\neq0$, and set $W_0=\{x\}$. In step $d$, calculate $f_V$ for all $V>W_{d-1}$ with $\dim V=d$, and choose $W_d$ among them such that $f_{W_d}\neq0$. By the above, this is always possible, since $f_{W_{d-1}}\neq0$. Then $W_{k-1}$ is a $(k-1)$-dimensional affine subspace of $\mathbb F^k$, and we can find $w\in\mathbb F^k$ and $y\in\mathbb F$ such that $f_w(y)=f_{W_{k-1}}\neq0$.
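Here's a sketch of that algorithm in Python, again assuming $\mathbb F=\mathbb Z_p$ and the dictionary representation from above; the subspace representation (a base point plus direction vectors) and the brute-force recovery of $w$ at the end are my own shortcuts, corresponding to the naive implementation rather than the optimized one discussed below:

```python
import itertools

def span(basis, p, k):
    """All Z_p-linear combinations of the given direction vectors."""
    pts = set()
    for coeffs in itertools.product(range(p), repeat=len(basis)):
        pts.add(tuple(sum(c * b[i] for c, b in zip(coeffs, basis)) % p
                      for i in range(k)))
    return pts

def find_w(f, p, k):
    """Find (w, y) with f_w(y) != 0 by greedily growing a flag
    W_0 < W_1 < ... < W_{k-1} of affine subspaces with f_{W_d} != 0.
    (With floating-point values, compare against a tolerance instead of 0.)"""
    # Step 0: a point with f(x) != 0, which exists since f = g - h != 0.
    x0 = next(x for x in itertools.product(range(p), repeat=k)
              if f.get(x, 0) != 0)
    basis = []  # direction vectors spanning the current W_d relative to x0
    for d in range(1, k):
        current = span(basis, p, k)
        for v in itertools.product(range(p), repeat=k):
            if v in current:  # adjoining v would not enlarge W_{d-1}
                continue
            # f_V for the candidate V = x0 + span(basis + [v])
            total = sum(f.get(tuple((a + s) % p for a, s in zip(x0, shift)), 0)
                        for shift in span(basis + [v], p, k))
            if total != 0:  # guaranteed for some v by the argument above
                basis.append(v)
                break
    # W_{k-1} is the solution set of w.x = y for any non-zero w orthogonal
    # to all k-1 directions; brute force suffices for this sketch.
    w = next(w for w in itertools.product(range(p), repeat=k)
             if any(w) and all(sum(wi * bi for wi, bi in zip(w, b)) % p == 0
                               for b in basis))
    return w, sum(wi * xi for wi, xi in zip(w, x0)) % p
```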
The search for $x$ takes $O(|\mathbb F|^k)$ time. In step $d$, there are $(|\mathbb F|^{k-d+1}-1)/(|\mathbb F|-1)\in O(|\mathbb F|^{k-d})$ affine subspaces to consider, and in a naive implementation $|\mathbb F|^d$ function values have to be summed for each, taking $O(|\mathbb F|^k)$ time per step. Since all steps after step $d$ sum only over complete cosets of $W_d$, an easy optimization would be to reduce the dimension by $1$ in each step by summing over these cosets and then working in the quotient space. The naive implementation would take $O(k|\mathbb F|^k)$ time in total; the optimization reduces this to $O(|\mathbb F|^k)$. In practice, one would typically find a non-zero sum long before all affine subspaces have been searched, so the average performance may be considerably better than this.
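For example (with made-up numbers), using the sketch above over $\mathbb F_3^2$:

```python
# Hypothetical example: two distributions on (Z_3)^2 that differ.
p, k = 3, 2
g = {(0, 0): 0.5, (1, 2): 0.5}
h = {(0, 0): 0.5, (2, 1): 0.5}
f = {x: g.get(x, 0) - h.get(x, 0) for x in set(g) | set(h)}
w, y = find_w(f, p, k)  # here (1, 0) and 1: g puts mass 0.5 on the line
                        # with first coordinate 1, while h puts none there
```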