
I am given two sequences of multivariate polynomials $\overline{p}=(p_1,p_2,\dots,p_k)$ and $\overline{q}=(q_1,q_2,\dots,q_k)$, all of them on the variables $x_1,\dots,x_n$ over some finite field $\mathbb{F}$ ($\mathbb{F}_2$, if one needs to be concrete).

Every random choice of values for $x_1,\dots,x_n$ yields a vector $v_p\in\mathbb{F}^k$ and a vector $v_q\in\mathbb{F}^k$. Hence we have two distributions $D_{p}$ and $D_{q}$ over vectors from $\mathbb{F}^k$. Assume that the distributions are distinct.

My goal is to find (algorithmically, not just prove existence) a vector $w\in\mathbb{F}^k$ such that the distributions induced by the inner product $w\cdot \overline{p}$ and $w\cdot\overline{q}$ are still distinct.

($w\cdot\overline{p}=\sum_{i=1}^k w_i\cdot p_i(x_1,\dots,x_n)$ and so this is a multivariate polynomial taking values in $\mathbb{F}$ and so inducing a distribution on $\mathbb{F}$)
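For a concrete picture over $\mathbb{F}_2$, the two distributions can be tabulated by brute force. The polynomials below are a made-up toy example with $n=k=2$, not taken from the question:

```python
from itertools import product
from collections import Counter

# Hypothetical toy example over F_2 with n = 2 variables, k = 2 polynomials:
# p-bar = (x1*x2, x1 + x2),  q-bar = (x1, x2)
def p_bar(x1, x2):
    return ((x1 * x2) % 2, (x1 + x2) % 2)

def q_bar(x1, x2):
    return (x1 % 2, x2 % 2)

# A uniformly random assignment to (x1, x2) induces a distribution on F_2^2;
# tallying all 2^n assignments gives the (unnormalized) distributions D_p, D_q.
D_p = Counter(p_bar(*xs) for xs in product((0, 1), repeat=2))
D_q = Counter(q_bar(*xs) for xs in product((0, 1), repeat=2))
# Here D_p and D_q are distinct, so a distinguishing w should exist.
```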


1 Answer


This answer deals only with the aspect of distinguishing different distributions over $\mathbb F^k$ using the dot product with a vector $w$; I haven't looked into how to make use of the fact that the distributions are generated by polynomials. So you can use this to distinguish arbitrary distributions (in fact, arbitrary complex-valued functions on $\mathbb F^k$), but there might be better solutions that make use of the polynomials. This also means that, compared to a brute-force solution by enumeration, which would take $O(|\mathbb F|^n+|\mathbb F|^{2k})$ time, I'm only reducing the $O(|\mathbb F|^{2k})$ complexity of trying out all possible $w$, not the $O(|\mathbb F|^n)$ complexity of substituting all possible arguments into the polynomials, so this will only be useful if $2k\gtrsim n$.

I'll provide an $O(|\mathbb F|^k)$-time algorithm to find a suitable $w$ to distinguish two distinct complex-valued functions on $\mathbb F^k$; in justifying the algorithm I'll also be proving that such a $w$ always exists.

Given two distinct complex-valued functions $g$ and $h$ on $\mathbb F^k$, form $f=g-h\neq0$. The goal is to find $w\in\mathbb F^k$ such that

$f_w(y)=\sum_{\scriptstyle x\in\mathbb F^k\atop\scriptstyle w\cdot x=y}f(x)$

is non-zero for some $y\in\mathbb F$. For given $w\neq0$ and $y$, the condition $w\cdot x=y$ picks out a $(k-1)$-dimensional affine subspace of $\mathbb F^k$. More generally, for $V$ any affine subspace of $\mathbb F^k$, let

$f_V=\sum_{x\in V}f(x)\;.$

Now let $W$ be any affine subspace of $\mathbb F^k$, and for $\dim W<d\le k$ consider

$\sigma_W^d=\sum_{\scriptstyle V>W\atop\scriptstyle\dim V=d}f_V=\sum_{\scriptstyle V>W\atop\scriptstyle\dim V=d}\sum_{x\in V}f(x)\;,$

where $V>W$ means that $W$ is a proper affine subspace of $V$. By symmetry, each $x\in W$ occurs the same number $c_1$ of times in this sum, and each $x\notin W$ occurs the same number $c_2$ of times, with $c_2\lt c_1$, so we have

$\sigma_W^d=c_1\sum_{x\in W}f(x)+c_2\sum_{x\notin W}f(x)=(c_1-c_2)\sum_{x\in W}f(x)+c_2\sum_{x\in\mathbb F^k}f(x)=(c_1-c_2)f_W+c_2f_{\mathbb F^k}\;.$

The second term vanishes if $g$ and $h$ are normalized distributions; if they are arbitrary functions, we can calculate it (in $O(|\mathbb F|^k)$ time) and use $w=0$ if it is non-zero. Thus we may assume that $\sigma_W^d$ is a non-zero multiple of $f_W$. It follows that if $f_W\neq0$ for some $W$, then for any $d>\dim W$ at least one affine subspace $V>W$ with $\dim V=d$ must have $f_V\neq0$, since the sum of all these is a non-zero multiple of $f_W$.
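As a sanity check on the counting argument, one can tabulate $c_1$ and $c_2$ in the smallest case: $\mathbb F_2^2$ with $W$ a single point and $d=1$. This is just a numeric illustration, not part of the proof:

```python
# All 1-dimensional affine subspaces (lines) of F_2^2 through x0 = (0,0);
# points are encoded as bitmasks 0..3, and a line through x0 is {x0, x0 ^ u}.
x0 = 0
lines = [{x0, x0 ^ u} for u in (1, 2, 3)]

# Count how often each point of F_2^2 occurs among these lines.
counts = {x: sum(x in line for line in lines) for x in range(4)}
# x0 occurs c1 = 3 times; every other point occurs c2 = 1 time, so c2 < c1.
```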

This suggests the following algorithm. Find some $x\in\mathbb F^k$ with $f(x)\neq0$, and set $W_0=\{x\}$. In step $d$, calculate $f_V$ for all $V>W_{d-1}$ with $\dim V=d$, and choose $W_d$ among them such that $f_{W_d}\neq0$. By the above, this is always possible, since $f_{W_{d-1}}\neq0$. Then $W_{k-1}$ is a $(k-1)$-dimensional affine subspace of $\mathbb F^k$, and we can find $w\in\mathbb F^k$ and $y\in\mathbb F$ such that $f_w(y)=f_{W_{k-1}}\neq0$.
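Over $\mathbb F_2$ the subspace-growing algorithm can be sketched as follows. Points of $\mathbb F_2^k$ are encoded as bitmasks, $f$ is a list of $2^k$ values whose total sum is zero (a difference of two distributions), and all names are my own; this is a naive sketch, not a definitive implementation:

```python
def dot(a, b):
    # inner product over F_2 of two bitmask-encoded vectors
    return bin(a & b).count("1") % 2

def span(basis):
    # all F_2-linear combinations of the basis vectors
    s = {0}
    for b in basis:
        s |= {v ^ b for v in s}
    return s

def find_w(f, k):
    """Find (w, y) with f_w(y) = sum over {x : w.x = y} of f(x) != 0.
    f is a list of 2^k values with sum(f) == 0 (e.g. a difference of two
    distinct distributions), so the sum over all of F_2^k vanishes."""
    x0 = next(x for x in range(2 ** k) if f[x] != 0)  # start: W_0 = {x0}
    basis = []     # W_d = x0 + span(basis), maintained with f_W != 0
    f_W = f[x0]
    for _ in range(k - 1):
        U = span(basis)
        seen = set()
        for u in range(1, 2 ** k):
            rep = min(u ^ v for v in U)  # canonical representative of u + U
            if rep == 0 or rep in seen:  # u lies in U, or coset already tried
                continue
            seen.add(rep)
            # Candidate V = W ∪ (W + u); its sum is f_W plus the shifted coset.
            f_shift = sum(f[x0 ^ v ^ u] for v in U)
            if f_W + f_shift != 0:       # guaranteed for some u by the argument above
                basis.append(u)
                f_W += f_shift
                break
    # W_{k-1} has dimension k-1, so its direction space has a unique nonzero
    # orthogonal vector w, and then W_{k-1} = {x : w.x = w.x0}.
    w = next(w for w in range(1, 2 ** k)
             if all(dot(w, b) == 0 for b in basis))
    return w, dot(w, x0)
```

This version re-sums whole cosets at every step; the quotient-space optimization described below would avoid that.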

The search for $x$ takes $O(|\mathbb F|^k)$ time. In step $d$, there are $(|\mathbb F|^{k-d+1}-1)/(|\mathbb F|-1)\in O(|\mathbb F|^{k-d})$ affine subspaces to consider, and in a naive implementation $|\mathbb F|^d$ function values have to be summed for each, taking $O(|\mathbb F|^k)$ time per step. Since all steps after step $d$ sum only over complete cosets of $W_d$, an easy optimization would be to reduce the dimension by $1$ in each step by summing over these cosets and then working in the quotient space. The naive implementation would take $O(k|\mathbb F|^k)$ time in total; the optimization reduces this to $O(|\mathbb F|^k)$. In practice, one would typically find a non-zero sum long before all affine subspaces have been searched, so the average performance may be considerably better than this.