
When my prof introduced us to the P/NP topics, the first thing he mentioned was input size, which he defines as the number of bytes needed to describe and write a problem's input into a file. Could someone please explain to me why the input size matters to these topics? Thank you.

EDIT: I'm concerned about why this way of measuring size is important, especially to the P/NP topic. My prof mentioned the pseudo-polynomial running time of the knapsack problem, which is somewhat relevant to this way of counting input size. I'm not sure how it connects to the NP picture, mostly because right after redefining input size he went into reduction examples, and input size has not been mentioned since. And for NP-hard problems, since there is no known way to solve them efficiently, why should we care about input size at all?
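For reference, here is the kind of pseudo-polynomial knapsack DP I mean (a minimal sketch I put together; the variable names are mine, not my prof's):

```python
def knapsack(values, weights, capacity):
    """0/1 knapsack by dynamic programming over capacities.

    Runs in O(n * capacity) time: polynomial in the numeric value of
    capacity, but exponential in the number of bits (or bytes in a
    file) needed to write capacity down, hence "pseudo-polynomial".
    """
    best = [0] * (capacity + 1)        # best[c] = max value using capacity c
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):   # descending: each item used once
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

# Items (value, weight): (60, 1), (100, 2), (120, 3); capacity 5 -> 220
print(knapsack([60, 100, 120], [1, 2, 3], 5))
```

Doubling the number of digits used to write `capacity` squares its numeric value, so the running time explodes even though the input file barely grows.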

  • @JenniferDylan This is done by directly analyzing the algorithm. At the very least, (essentially) every nontrivial algorithm must read its whole input, and since there are something like $1000000^{1000000}$ graphs on $1000000$ vertices, a general encoding is going to require many bits (certainly over $1000000$), so an algorithm working with such a graph will take much longer on large inputs. (2012-08-14)
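To sanity-check the counting in that comment: there are $2^{\binom{n}{2}}$ labeled simple graphs on $n$ vertices, so any encoding that can represent all of them needs at least $\binom{n}{2}$ bits. A quick Python check (the comment's $1000000$-bit figure is a very loose lower bound):

```python
n = 1_000_000
pairs = n * (n - 1) // 2   # n choose 2 possible edges
# There are 2 ** pairs labeled simple graphs on n vertices, so a
# general encoding needs at least log2(2 ** pairs) = pairs bits.
print(f"{pairs:,} bits")   # 499,999,500,000 -- far more than 10**6
```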

2 Answers


The simplistic answer is that complexity classes are defined in terms of the size of the input (of course there are other ways, via descriptive complexity and so on, but I'll stick to the basics). So introducing the size of the input is essential to defining the classes.

Naturally the question then becomes: why define these things in terms of the size of the input? The whole raison d'être for these classes is to work out how much of a resource we need to expend to solve a problem. Perhaps we're interested in how much time it takes, or how much space, or some other resource, but above all we want to know what's possible and how much it'll cost us. As the other comments and answer have said, it's reasonable to expect that a bigger input will consume more of the resource we're interested in. Moreover, it's rather uninteresting to know how much time/space/etc. it takes to solve a single instance; what we want is a classification that tells us how much we need to expend to solve any possible instance, without just trying it and seeing (imagine checking how long a particularly hard instance of a particularly hard problem takes to solve and finding out the answer is 10 years, or worse, the lifetime of the universe).

So then of course we need some way of relating the instance to the amount of resources needed to solve it. The first natural measure is the size of the instance - if you give me a bigger instance, it takes more time/space/etc.

  • On top of this, the classical split where we call $P$ "efficient" and everything outside $P$ "infeasible" is not really an accurate portrayal of reality: there are $NP$-hard problems that have excellent, practical algorithms (via parameterized complexity, exact exponential algorithms, approximation, randomisation, etc.) and problems in $P$ that have fairly miserable algorithms. In both cases, we still use the size of the input as the basic measure when talking about the complexity; e.g., we can solve SAT in time bounded by roughly $2^{n}$ (yes, there's some detail missing there). (2012-08-14)
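To make the comment's rough $2^{n}$ bound concrete, here is a brute-force SAT sketch (an illustrative toy, not a serious solver; the integer-literal clause encoding is the usual DIMACS-style convention, not anything from the comment):

```python
from itertools import product

def brute_force_sat(num_vars, clauses):
    """Try all 2**num_vars assignments.

    A clause is a list of nonzero ints: literal k means variable |k|,
    negated when k < 0.
    """
    for bits in product([False, True], repeat=num_vars):
        def holds(lit):
            return bits[abs(lit) - 1] != (lit < 0)
        if all(any(holds(lit) for lit in clause) for clause in clauses):
            return True
    return False

# (x1 or x2) and (not x1 or x2): satisfiable with x2 = True
print(brute_force_sat(2, [[1, 2], [-1, 2]]))   # True
```

The outer loop runs $2^{n}$ times in the worst case, with polynomial work per assignment; that is roughly the bound the comment mentions, with $n$ the number of variables (part of the "missing detail").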
---

Let's look at a simple example:

Sort the following numbers in ascending order:

8 485 314 2 146 

Now sort the following numbers in ascending order:

72 277 304 326 287 152 205 68 444 250 70 490 384 81 180 126 397 402 259 295 

I'm willing to bet that you were easily able to sort the first list of numbers in your head, but that the second list took you quite a bit more time.

Okay, so sorting is actually a problem in P, not just NP. Still, a big part of studying the complexity of algorithms is determining how the time to solve a problem is affected as the input size gets larger and larger.
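One way to see that growth empirically is to time a sort at increasing input sizes. Here is a rough timing sketch using Python's built-in sort (the sizes are arbitrary and exact numbers will vary by machine):

```python
import random
import time

for n in (1_000, 10_000, 100_000, 1_000_000):
    data = [random.random() for _ in range(n)]
    start = time.perf_counter()
    data.sort()                                  # Timsort: O(n log n) comparisons
    print(f"n = {n:>9,}: {time.perf_counter() - start:.4f} s")
```

Each tenfold jump in $n$ costs a bit more than tenfold in time, which is exactly the $n \log n$ growth that analysis in terms of input size predicts.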