4

It seems to my uneducated mind that if I have $\frac{n}{k}$ subproblems, each of size $\Omega(k \log k)$, then my overall problem must be of size $\Omega(n \log k)$, by the reasoning that if I have $\frac{n}{k}$ things, each of which is at least $k \log k$, I should be able to multiply the number of things by the per-thing bound to get a lower bound on their sum.

However, my TA seems to disagree, stating that addition is not defined for expressions like $\Omega(f(x)) + \Omega(g(x))$, and that I therefore can't sum or multiply the subproblem bounds in this fashion.

I was wondering if someone could explain the error in my reasoning.

  • @Alon: My point was that the OP is not just working with functions of $n$ and adding them. He/she is working with complexities, problems, and algorithms. (2011-01-03)

2 Answers

6

Your original reasoning is indeed problematic, since you're adding together a variable number of things. Suppose you have $n$ subproblems, each of which takes constant time; can you conclude that the whole thing is $O(n)$, or $\Omega(n)$? No, since the constants may vary: the $i$th subproblem may take time $1/i$, or $i^3$; each is still a constant (independent of $n$), but their sum could grow far more slowly or far more quickly than $n$; you can't tell. It depends on the distribution of the constants, and that is something the $O$ and $\Omega$ notation conceals (deliberately, of course).
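To make those two examples concrete (the closed forms below are mine, not part of the original answer): if the $i$th of $n$ sub-problems costs $c_i$, then

$$\sum_{i=1}^{n} \frac{1}{i} = \Theta(\log n) \qquad\text{while}\qquad \sum_{i=1}^{n} i^3 = \Theta(n^4),$$

so the total can land far below $n$ or far above it, even though each individual term is a constant with respect to $n$.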

However, as Moron pointed out, adding together just two (or any constant number of) functions is fine. Thus I'm not sure I understand the TA's claim.

  • @Alon: That was my point too. It depends on what you consider "variable" when you consider the sub-problems. Btw, having a variable number of sub-problems is not unheard of in algorithm design, and the complexities there are usually found by adding together the bounds for the variable number of sub-problems. Anyway, I suppose that is a semantic issue we are talking about... (2011-01-03)
2

What you need to be talking about is $\displaystyle \Omega(f+g)$ and not $\Omega(f) + \Omega(g)$ as $\Omega(f)$ and $\Omega(g)$ are sets.

If $|h(x)| \ge c_1 |f(x)|$ and $|h(x)| \ge c_2 |g(x)|$, let $c = \min\{c_1, c_2\}$. Adding the two inequalities gives $2|h(x)| \ge c\,(|f(x)| + |g(x)|)$, and so, by the triangle inequality,

$$|h(x)| \ge \frac{c}{2} \left(|f(x)| + |g(x)|\right) \ge \frac{c}{2}\, |f(x) + g(x)|.$$

Thus if $\displaystyle h \in \Omega(f)$ and $\displaystyle h \in \Omega(g)$, then $h \in \Omega(f+g)$.

But in your case, even though you have a variable number of sub-problems, I am not really sure I completely agree with Alon.

It really depends on what you consider to be variable when you consider each sub-problem.

If $n,k$ are the variables, and you have a constant $c$ (independent of $n$ and $k$) such that each of the $\frac{n}{k}$ independent sub-problems has size at least $c\,k \log k$, then you can add them up the way you did to get the total size.
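Spelled out, with a single uniform constant $c$ the sum collapses to exactly the bound in the question:

$$\sum_{i=1}^{n/k} c\,k \log k = \frac{n}{k} \cdot c\,k \log k = c\,n \log k = \Omega(n \log k).$$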

For instance, consider the problem where an array is almost sorted: each element is within $k$ positions of its place in the sorted order. You can sort the array with $\frac{n}{k}$ sorts of overlapping chunks of $2k$ elements each, each chunk starting $k$ positions after the previous one. If the sorting algorithm you picked always takes $\Omega(k \log k)$ time on a chunk, then the total time is indeed $\Omega(n \log k)$.
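Here is a minimal sketch of that chunked approach in Python (the function name is mine; it relies on Python's built-in sorted(), which takes $\mathcal{O}(m \log m)$ comparisons on $m$ elements):

    def sort_k_displaced(a, k):
        """Sort a list in which every element is at most k positions
        from its place in the sorted order.

        Sorts n/k overlapping windows of 2k elements, each window
        starting k positions after the previous one. After sorting
        a[i:i+2k], the first k elements of the window are final:
        anything destined for positions i..i+k-1 must already lie
        inside the window.
        """
        n = len(a)
        for i in range(0, n, k):
            # Each window sort costs O(k log k); with about n/k
            # windows the total is O(n log k).
            a[i:i + 2 * k] = sorted(a[i:i + 2 * k])
        return a

    # Example: every element is within k = 2 of its sorted position.
    print(sort_k_displaced([2, 1, 4, 3, 6, 5, 8, 7], k=2))  # [1, 2, ..., 8]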

If you let the constant for the $i^{th}$ sub-problem be a function of $i$, then it is not really a constant: since $i$ ranges up to $\frac{n}{k}$, the "constant" effectively depends on $n$ and $k$.

Now if you allowed the $i^{th}$ sub-problem constant to be independent of $k$, but not $n$, then adding them up becomes kind of meaningless and could lead to incorrect results, as Alon noted.

Note: If you were trying to prove an $\Omega(n \log k)$ lower bound for sorting in the above problem by considering $n/k$ sub-problems of size $k$, just adding like that is not a proof of the overall lower bound. In the worst case each sub-problem costs $\Omega(k \log k)$, but an input that hits the worst case for one sub-problem might hit the best case for another. So even claiming an $\Omega(k \log k)$ lower bound for every sub-problem simultaneously needs more proof.
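One standard way to make that lower bound rigorous (this counting argument is mine, not the answer's) is information-theoretic: permuting each of the $\frac{n}{k}$ size-$k$ blocks independently produces $(k!)^{n/k}$ distinct valid inputs, all of which a comparison sort must distinguish, so any such sort needs at least

$$\log_2\!\left((k!)^{n/k}\right) = \frac{n}{k} \log_2(k!) = \frac{n}{k} \cdot \Theta(k \log k) = \Omega(n \log k)$$

comparisons in the worst case.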

You can add up the upper bounds, though. The algorithm is $\mathcal{O}(n \log k)$ in the worst case (for instance, if you picked heap-sort for the chunks): even though some sub-problem might take $o(k \log k)$ comparisons (note: little oh), no chunk takes more than $\mathcal{O}(k \log k)$, so the total number of comparisons is $\mathcal{O}(n \log k)$.
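Explicitly, if heap-sort never needs more than $c \cdot 2k \log(2k)$ comparisons on a chunk of $2k$ elements (for some fixed $c$), then summing over the $\frac{n}{k}$ chunks gives

$$\sum_{i=1}^{n/k} c \cdot 2k \log(2k) = 2c\,n \log(2k) = \mathcal{O}(n \log k),$$

since $\log(2k) = 1 + \log k = \mathcal{O}(\log k)$.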