30
$\begingroup$

I have heard that the set of valid programs in a certain programming language is countably infinite. For instance, the set of all valid C++ programs is countably infinite.

I don't understand why though. A programming language has open curly braces and corresponding closing ones. Wouldn't one need a stack to track the braces? Hence, how can one design a DFA that accepts valid C++ programs?

  • 18
    valid c++ $\subset \Sigma^*$ and $\Sigma^*$ (all strings of alphabet $\Sigma$) is countably infinite,2012-09-29

8 Answers 8

40

Well, a valid C++ program (or really any C++ program) will simply be a finite sequence composed of a finite collection of characters and a few other things (indentation, spaces, etc.). It is a general result that the set of all finite sequences of entries from a finite alphabet will be countably infinite. To show that there are countably infinitely many valid C++ programs, you need only show there is no finite upper bound on the length of valid C++ programs.

Addendum: Another approach (an alternative to showing there is no finite upper bound on length) is to actually explicitly define (in a theoretic sense) countably infinitely many valid C++ programs. For example, for a given positive integer, the program that simply prints said integer, then ends (as I mentioned in the comments below).

The following program template should do the trick:

 #include  using namespace std;   int main ()  {  cout << "___________";  return 0;  } 

That "____" part is the spot where you'd type in whatever positive integer you wanted the program to print out--whether that be $1$, or $23234$, or $1763598730987307865$, or whatever--instead of the underscores.

Now, obviously, no matter how fast you can type, there are integers big enough that you couldn't finish typing in your lifetime, so in practice, there are programs of this type that you could never finish. Even if such a program were handed to you, you'll certainly run into memory problems for sufficiently large integers (depending on the computer), but should still be valid programs. We can say that such programs all exist in a "theoretical" sense. That is, given sufficient memory and power to store and run it--necessarily a finite (though perhaps prohibitively large) amount--and given sufficient time to program and run it--necessarily a finite (though perhaps prohibitively long) amount--this program will do what it's supposed to do.

Please don't give me any grief about the heat death of the universe or anything like that. ;)

  • 0
    @celtschk Yeah, we have to talk about C++ independent of actual implementations. Check MSVS compile error [`C1091`](http://msdn.microsoft.com/en-us/library/f27ch0t1(v=vs.90).aspx) for instance, which says that string constants in can't exceed certain lengths to compile successfully.2013-11-07
28

Countably infinite doesn't mean regular. The C++ grammar isn't regular. In fact, it isn't even context free. Yet, the set of all valid C++ programs is countably infinite. To see why, first notice that it's infinite. No matter what $n \in \mathbb{N}$ you pick, you can always write a C++ program that is longer than $n$. Next, let $S_n$ be the set of all C++ programs of length $n$. Each $S_n$ is finite. The set of all C++ programs (of all possible lengths) is a countable union of sets $S_n$:

$ S = \bigcup_{n=0}^\infty S_n $

Since the countable union of countable (or finite) sets is at most countable, we conclude that the set of all valid C++ programs is countable.

  • 0
    @Tim: Yep! Turns out it is precisely the Axiom of Countable Choice for families of countable sets. (Just learned of that equivalence recently, actually.)2013-10-18
14

A C++ program is a finite sequence of characters in a specified finite alphabet. The set of all finite sequences of characters in that alphabet is countably infinite. The set of all valid C++ programs is a subset of the set of all finite sequences of characters in that alphabet. An infinite subset of a countably infinite set is countably infinite.

(It's infinite because there is no finite upper bound on the lengths of C++ programs.)

6

I propose the following:

  1. Each natural number is a program (a file is nothing but a very large number).

  2. Some of these programs are valid C++ programs.

If we show now, that for every valid C++ program n, there exists a program n + m that is a valid C++ program as well, the number of C++ programs is countable infinite.

  1. Let n_0 be a classical hello world program.

  2. for every n, there is a m that adds a trivial line to n ( cout<<"Hello!";)

  3. Proofed.

2

As several posters already have pointed out, the set of valid c++ programs is countably infinite.

The OP's concern has some merit though. On an actual computer, the memory is finite, so a valid program is not just a certain finite string, but a finite string of bounded length, and thus the set of valid, parsable programs on a specific computer is finite (but extremely large).

  • 0
    While on any given computer the memory is finite, you still can just build a bigger computer. Also, there is nothing in the C++ standard demanding that a valid C++ program must at some instance of time be stored completely on the computer.2012-09-30
1

It is clear that since a C++ program is generated from a finite alphabet that the number of programs is at most countable.

To see that it is countable, consider the programs $P_n$ defined by

int main() { ; } 

where '< n >' is replaced by the decimal representation of $n$.

It is easy to see that $n \mapsto P_n$ is injective.

Only two braces needed.

  • 0
    @skyking: Good point, perhaps `{ /* n times */ 1; 1; ... 1; 1; }` instead?2016-02-02
1

The C++ differs from C in that part as it doesn't define limits of what the compliation or execution environment is required to handle. Consequently there are valid C++ programs that cannot be compiled by any implementation. Say you for example in your main nests $2^{2^{64}}$ curly brackets, the standard allows for this but there is no computer available that can count them, less store the source.

As pointed out in other comments. The set of finite strings with a finite non-empty alphabet is a countable (infinite) set. Some strings composed of the ASCII alphabet are valid C++ programs. So the programs are countable.

To see that they are infinite one could use the construct suggested above. Take a trivial program where main just returns 0. You can write the return statement as return followed by $N$ opening parentheses, a zero, followed by $N$ closing parentheses and followed by a semicolon (where $N$ is any natural number).

The C standard on the other hand puts limits on what the compiler is required to handle - here we end up with (probably) finitely many valid C programs that any compiler is required to handle.

0

The set of programs is countable: Set of all finite strings ( A subset such as set of all valid programs will be countable as well)

The set of all valid programs is not finite: If it is simply add a blank remark at the end of the largest program in the set....so the set of all valid programs has no largest program hence infinite.

Alternatively Since C++ is a Turing complete language and the total no of different Turing computable integer sequences are infinite the total number of valid programs can not be finite.

  • 0
    @skyking I completely agree. The trivial method though serves the purpose.2016-02-01