4

It seems to my uneducated mind that if I have $\frac{n}{k}$ subproblems, each of size $\Omega (k \log k)$, that my overall problem must be at least of size $\Omega (n \log k)$, by the reasoning that if I have $\frac{n}{k}$ things, each of which is at least $k \log k$, I should be able to multiply those together to get a lower bound for their sum.
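
To spell out the arithmetic I have in mind: if each of the $\frac{n}{k}$ subproblems requires at least $c\, k \log k$ steps for some fixed constant $c > 0$ (independent of $n$ and $k$), then summing over the subproblems would give a total of at least $\frac{n}{k} \cdot c\, k \log k = c\, n \log k$, which is why I expected an $\Omega(n \log k)$ bound overall.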

However, my TA seems to disagree with me, stating that addition is not defined on $\Omega (f(x)) + \Omega (g(x))$, and therefore I can't sum or multiply the subproblems in this fashion.

I was wondering if someone could explain the error in my reasoning.

  • 0
When you add two functions with the same lower bound you can get zero, or something much smaller than the original two functions, if there is a cancellation. In terms of algorithms, it may be possible to solve a big problem more easily than the sum of its smaller versions, although I am not sure this is what you had in mind. I think you should clarify your question. Is it a lower bound for the complexity of a problem, or of a specific algorithm, or is it a lower bound for memory?2011-01-01
  • 0
    @Timur, it's a complexity problem. The particular problem we were assigned had to do with the lower bound on the number of comparisons necessary to sort a set if I knew that it was already pre-sorted in some way.2011-01-01
  • 1
I can say that in general you cannot *just* add complexities; you need some additional reasoning to justify what you are doing. I suggest you think carefully, and it may help not to use undefined phrases such as "sum of problems", because they give a false impression that you should also take the sum of their complexities (unless you have a precise definition of what "sum of problems" means and you do not confuse this definition with the usual sum of numbers, of course).2011-01-01
  • 1
Agree with timur; it seems like the problem is not about what the $\Omega$ will be if we can add the sub-problem sizes, but about whether we can even add them at all, or even whether you can claim $\Omega$ for each independent sub-problem. Would you care to elaborate exactly what your proof was and what the objections were?2011-01-02
  • 0
Again, I don't think I agree. There's a simple and precise mathematical meaning for $f = O(g)$, and it's true that $f_1+f_2 = O(|g_1|+|g_2|)$ whenever $f_1=O(g_1)$ and $f_2=O(g_2)$. No additional reasoning is needed. At the same time, if $f_i$ is a family of functions, then determining a bound for $\sum f_i$ can indeed be tricky, as I try to point out in my answer.2011-01-03
  • 1
    @Alon: Not sure which comment you are referring to. None of the above comments mentions BigOh.2011-01-03
  • 0
    @Alon: My point was that the OP is not just working with functions of $n$ and adding them. He/she is working with complexities, problems, and algorithms.2011-01-03

2 Answers

6

Your original reasoning is indeed problematic, since you're adding together a variable number of things. Suppose you have $n$ subproblems, each of which takes constant time; can you conclude that the whole thing is $O(n)$, or $\Omega(n)$? No, since the constants may vary: the $i$th subproblem may take time $1/i$, or $i^3$; each is still a constant (independent of $n$), but their sum could be a constant, or grow with $n$; you can't tell. It depends on the distribution of the constants, and that's something the $O$ and $\Omega$ notation conceals (deliberately, of course).
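
To make those two cases concrete (a worked version of the examples just mentioned): with per-subproblem constants $c_i = 1/i$ the total is $\sum_{i=1}^{n} 1/i = \Theta(\log n)$, while with $c_i = i^3$ the total is $\sum_{i=1}^{n} i^3 = \Theta(n^4)$; in both cases every individual term is constant with respect to $n$, yet neither total is $\Theta(n)$.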

However, as Moron pointed out, adding together just two (or any constant number) of functions is fine. Thus I'm not sure I understand the TA's claim.

  • 0
    Hi @Alon, thanks for the answer. I'm not sure where Moron's answer went to, though I saw it for a few minutes before I went to bed. In my case, we actually knew that the subproblems were identically sized, so the constant (whatever it was) was the same.2011-01-01
  • 0
$\mathcal{O}(\cdot)$ and the other symbols denote classes of functions. There is indeed no $+$ defined on sets of functions (that I know of); its common use is abuse of notation.2011-01-01
  • 1
    @Alon: If the $i^{th}$ sub-problem constant is $i^3$, then it is not really a 'constant'!2011-01-01
  • 0
    @dsol: I had deleted my answer as it was incomplete. Since I don't completely agree with Alon's answer, I have undeleted mine.2011-01-01
  • 0
@Moron: yes, it is, as it doesn't depend on $n$. The notation $O()$ implies a certain parameter, and anything which doesn't depend on this parameter is $O(1)$. The point is that the 6th subtask takes 216 steps regardless of whether $n$ is 7 or 7,000.2011-01-02
  • 0
@dsolimano: it's very common that your constants are uniform, and I was expecting that to be the case; but formally your TA was right in saying that from $\Omega(k \log k)$ per subtask you cannot conclude $\Omega(n \log k)$ for the whole thing.2011-01-02
  • 1
    @Alon: The constant of the last subtask is $n^3$ for instance.2011-01-02
  • 0
    @Moron: True, but being "last" isn't a feature of the subtask. The subtask which takes $n^3$ steps continues to take the same number of steps as you increase $n$. There's now a different subtask that takes $n^3$ steps for this new value of $n$, but this does nothing to change the fact that the "old" last subtask takes $O(1)$ steps.2011-01-03
  • 1
@Alon: Suppose I had an algorithm which, given an array of size $n$, split the work into tasks of sizes $1$ to $n$, where the $i$th task finds the minimum of the first $i$ elements. According to your logic, each sub-task is $O(1)$!2011-01-03
  • 0
    @Alon: quote "There's now a different subtask that takes $n^3$ steps for this new value of $n$, but this does nothing to change the fact that the "old" last subtask takes $O(1)$ steps." $n$ is the number of subproblems, so if you increase $n$ you have to divide the original problem into even smaller pieces, and the "old" task is no more there.2011-01-03
  • 0
    @Moron: Suppose you had infinitely many machines, each of which is capable of just one task: finding the minimum of some constant number of elements. What's the time complexity of their operation? That's a slightly silly question since each machine solves a problem of fixed size, but if anything, the complexity of each machine is constant. You can implement your algorithm by calling the i'th machine at the i'th step. So, true, each subtask takes $O(1)$ time, and put together they happen to take $O(n^2)$ - this was my point, you can't add up a variable number of bounds.2011-01-03
  • 0
    @timur: why do I have to divide the original problem into even smaller pieces? this may or may not be the case.2011-01-03
  • 0
@Moron, @timur, I feel like I was drawn into an almost semantic discussion that I don't particularly feel good about. My original point was really much simpler. If you add up a fixed number of functions each of which has an $O()$ estimate or $\Omega$ estimate, you can estimate the sum. If you add up a variable number of such functions, you can't. Can we agree on that?2011-01-03
  • 0
@Alon: I agree. I think my uneasiness was because in your setup BigOh means two different things for the subproblems and for the whole thing. If one wants to draw conclusions with BigOh, the implicit constants can depend *only* on parameters that are kept fixed throughout the whole process of reasoning.2011-01-03
  • 0
    @Alon: That was my point too. It depends on what you consider "variable" when you consider the sub-problems. btw, having a variable number of sub-problems is not unheard of in algorithm design and the complexities there are usually found by adding together the bounds for the variable number of sub-problems. Anyway, I suppose that is a semantic issue we are talking about...2011-01-03
2

What you need to be talking about is $\displaystyle \Omega(f+g)$ and not $\Omega(f) + \Omega(g)$ as $\Omega(f)$ and $\Omega(g)$ are sets.

If $\displaystyle |h(x)| \ge c_1 |f(x)|$ and $\displaystyle |h(x)| \ge c_2 |g(x)|$ then we have that

$\displaystyle |h(x)| \ge \frac{c}{2} (|f(x)| + |g(x)|) \ge c' |f(x) + g(x)|$, where $\displaystyle c = \min\{c_1, c_2\}$ and $c' = \frac{c}{2}$; the first step averages the two inequalities above and the second is the triangle inequality.

Thus if $\displaystyle h \in \Omega(f)$ and $\displaystyle h \in \Omega(g)$, then $h \in \Omega(f+g)$.
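
As a concrete instance of this rule (just to illustrate it): if $|h(x)| \ge x$ and $|h(x)| \ge x \log x$ for all sufficiently large $x$, then $h \in \Omega(x + x \log x)$, which is the same class as $\Omega(x \log x)$.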

But in your case, even though you have a variable number of sub-problems, I am not really sure I completely agree with Alon.

It really depends on what you consider to be variable when you consider each sub-problem.

If $n,k$ are the variables, and you have a constant $c$ (independent of $n$ and $k$) such that each of the $\frac{n}{k}$ independent sub-problems is of size at least $c k \log k$, then you can add them up the way you did, to get the total size.

For instance, consider the problem where an array is almost sorted: each element is within $k$ positions of its place in the sorted order. You can sort the array by sorting $\frac{n}{k}$ overlapping chunks of $2k$ elements each, with each chunk starting $k$ positions after the previous one. If the sorting algorithm you picked always takes $\Omega(k \log k)$ time on a chunk, then the total time is indeed $\Omega(n \log k)$.
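
Here is a minimal sketch of that chunked approach in Python, just to make the accounting concrete. The function name and the use of Python's built-in sort (as a stand-in for whichever comparison sort you picked) are my own choices for illustration, not part of the original problem:

    def sort_nearly_sorted(a, k):
        """Sort a list in which every element is at most k positions away
        from its place in the sorted order.

        Sorts overlapping chunks of 2k elements, each chunk starting k
        positions after the previous one.  After a chunk is sorted, its
        first k positions already hold their final values, so roughly n/k
        chunks of O(k log k) work each give O(n log k) comparisons total.
        """
        n = len(a)
        if k <= 0:
            return a
        for start in range(0, n, k):
            chunk = a[start:start + 2 * k]
            chunk.sort()  # stand-in for the comparison sort you picked
            a[start:start + 2 * k] = chunk
        return a

For example, sort_nearly_sorted([2, 1, 4, 3, 6, 5], 1) sorts the list using chunks of size 2.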

If you let the $i^{th}$ sub-problem's constant be a function of $i$, then it is not really a constant, as it depends on $\frac{n}{k}$.

Now if you allowed the $i^{th}$ sub-problem's constant to be independent of $k$ but not of $n$, then adding them up becomes kind of meaningless and could lead to incorrect results, as Alon noted.

Note: If you were trying to prove an $\Omega(n \log k)$ lower bound for sorting in the above problem by considering $n/k$ sub-problems of size $k$, just adding like that is not a proof of the overall lower bound. In the worst case, each sub-problem could require $\Omega(k \log k)$ comparisons, but when you hit the worst case for one sub-problem, you might be hitting the best case for another. So even claiming an $\Omega(k \log k)$ lower bound for each sub-problem needs more proof.

You can add up the upper bounds, though. You can say that the algorithm is $\mathcal{O}(n \log k)$ in the worst case (for instance, if you picked heap-sort for the chunks): even though some sub-problem might take only $o(k \log k)$ comparisons (note: little oh), the total number of comparisons is still $\mathcal{O}(n \log k)$.
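
To spell out that upper-bound arithmetic (under the same setup as above, with heap-sort on chunks of $2k$ elements): each chunk takes $\mathcal{O}(2k \log 2k) = \mathcal{O}(k \log k)$ comparisons in the worst case, and there are $\frac{n}{k}$ chunks, so the total is $\frac{n}{k} \cdot \mathcal{O}(k \log k) = \mathcal{O}(n \log k)$.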