2
$\begingroup$

Let $s$ be an unknown bit string of length $n$. Let $p(i, b)$ be the probability that $i$-th bit of $s$ is equal to $b \in \{0,1\}$. What's the fastest method to find $s$, given the distribution $p()$? "Fastest" meaning using the least (in the expected value sense) number of queries to an oracle $O(x)$, such that $O(x)=1$ iff $x=SECRET$.

I'm interested in all observations about this problem, not only the optimal solution.

  • 0
    @zxcv: The optimal search pattern is _always_ in descending order of probability; the challenge is in finding an algorithm to enumerate the strings in this order. If the bits are independent, you may want to look at enumerating the strings in descending order of the log-likelihood $\lambda(s) = \log p(s) = \log \prod_{i=1}^n p(i,s_i) = \sum_{i=1}^n \log p(i,s_i)$, which is a linear map from $\{0,1\}^n$ to $\mathbb R^- \cup \{-\infty\}$; by the monotonicity of the logarithm, p(a) < p(b) iff \lambda(a) < \lambda(b).2012-06-27

1 Answers 1

2

Since the metric is the number of queries to the oracle, how the words are represented doesn't matter so we can simply say that the secrets are words $w_1,\dots,w_n$, with probability $p_1\ge\dots\ge p_n$ (satisfying $\sum_i p_i=1$).

Now the probability of success of any algorithm after $k$ queries to the oracle is at most $\sum_{i=1}^k p_i$. Since the algorithm that simply queries $w_1,\dots,w_n$ in that order achieves this upper bound, that algorithm is optimal in every reasonable sense of the word (in particular, it has minimum expected number of queries, median number of queries, and so on).

So now, back to the original problem: to generate $w_1,\dots,w_n$ with decreasing probability, let's assume (up to flipping and sorting of the bits) that $p(i,0)\ge p(i,1)$, and $p(1,0)\ge p(2,0)\ge\dots\ge p(n,0)$.

Initially, let the candidate set be $\{0^n\}$. Then, generate a new word by selecting the most probable word $w$ from the candidate set (and removing it), and update the candidate set by adding all words $w'$ equal to $w$ except for one bit that is changed from 0 to 1. There are at most $np$ words in the candidate set (where $p$ is the number of words already generated) and computing the probability of a given word is $O(n)$, so the algorithm is polynomial in the number of requests to the oracle (and in $n$).

  • 0
    E$x$actly. Don't forget to upvote when accepting :)2012-06-27