3
$\begingroup$

Given $S$, a set (unsorted) of $n$ real numbers;

and a real number $x$;

what is the fastest algorithm to determine if $S$ contains two numbers whose sum exactly equals $x$?

We could add each element of $S$ with each other element to determine if a pair whose sum is $x$ exists, but that would be $O(n^2)$ and I wonder if a faster approach exists.

  • 1
    Are all of the numbers random? because if they are not you should look for patterns.2012-07-24

3 Answers 3

10

I don't know if it is optimal, but I sorted the numbers ($O(n \log n)$), then added the largest to the smallest. If the sum is too small, move the lower one up. If the sum is too large, move the larger one down. Continue until you find a pair or the ends meet. The second phase is $O(n)$ as you only make $n-2$ changes at most.

  • 3
    Also, add the two largest and the two smallest to see if X is inside that interval.2012-07-24
5

Sort $S$ (in time $O(n \log n)$), and then for each $k \in S$ check to see if $x-k$ is in $S$ via binary search ($O(\log n)$ for each of $n$ searches). So you can do the whole thing in $O(n \log n)$ time (assuming your arithmetic operations are constant-time, but that caveat applies equally to your $O(n^2)$ algorithm).

5

Assume you have a data structure that can act as a "store" for your numbers.

Iterate over the elements of $s_i\in S$ and check if $x-s_i$ is in your store. If so you have found your answer, otherwise add $s_i$ to the store.

If you use a heap data structure as your store then inserting and lookup are $\Theta(\log n)$ and you get @Micah's algorithm (in effect insertion is sorting, lookup is a binary search).

But if you have other conditions on the elements of $S$ then you can use a hash table as your store to do better. In general you should be able to get an average case $O(n)$ (allowing for a worse $O(n^2)$ worst case) for a broad class of "random" numbers.

However, if you know the elements of $S$ come from a finite subset of $\mathbb R$ (e.g. integers within given bounds or 32-bit IEEE floating point represented numbers), then without space constraints you can construct a perfect hash function and get a $\Theta(n)$ worst case performance.