4
$\begingroup$
   procedure bit count(S: bit string)
    count := 0
          while S != 0
              count := count + 1
              S := S ∧ (S − 1)
    return count {count is the number of 1s in S}

Here S-1 is the bit string obtained by changing the rightmost 1 bit of S to a 0 and all the 0 bits to right of this to 1s.

So I understand why this is correct, and I have write a rough explanation;

After every iteration, the rightmost 1 bit in S, as well as all the bits to the right of it, is set equal to 0. Thus, after each iteration, the next right-most 1 is accounted for and set to 0, until the entire string is 0's and the loop breaks with the count equal to the number of 1's.

I know this kind of answer won't pass in any mathematics community, so I was hoping to write a formal proof, but I don't know how to go about doing that. My proof skills are particularly shoddy, so an explanation of the techniques involved would be greatly appreciated.

  • 0
    Is ^ and or xor?2012-12-05
  • 1
    @copper.hat: It has to be $\land$ for the algorithm to work.2012-12-05

3 Answers 3

2

Here’s one way to make your informal explanation a bit more formal.

Let $\sigma=b_1b_2\dots b_n$ be a bit string of length $n$. For convenience let $[n]=\{1,\dots,n\}$. If $\sigma$ is not the $0$ string, let $k=\max\{i\in[n]:b_i=1\}$, so that $b_k=1$ and $b_i=0$ for $i>k$. Then $\sigma-1$ is the bit string $a_1a_2\dots a_n$ such that for each $i\in[n]$

$$a_i=\begin{cases} b_i,&\text{if }ik\;. \end{cases}$$

Then $$\sigma\land(\sigma-1)=(b_1\land a_1)(b_2\land a_2)\ldots(b_n\land a_n)\;,$$ and for $i\in[n]$ we have

$$b_i\land a_i=\begin{cases} b_i,&\text{if }i

$b_k\land a_k=1\land 0=0$, and $b_i\land a_i=0\land 1=0$ for each $i>k$. In other words, if $$\sigma\land(\sigma-1)=c_1c_2\dots c_n\;,$$ then

$$c_i=\begin{cases} b_i,&\text{if }i

Recalling that $b_k=1$ and $b_i=0$ for $i>k$, we see that $\sigma$ and $\sigma\land(\sigma-1)$ differ only in the $k$-th bit, where $\sigma$ has a $1$ and $\sigma\land(\sigma-1)$ has a $0$.

If $|\sigma|_1$ is the number of $1$’s in $\sigma$, we’ve just shown that

$$|\sigma\land(\sigma-1)|_1=|\sigma|_1-1\;.$$

Thus, each pass through the loop adds $1$ to the count and reduces the number of $1$’s in the string by $1$. (In other words, $\text{count}+|\sigma|_1$ is a loop invariant for this loop.) Since $\text{count}$ starts at $0$ and $|\sigma|_1$ at the number of $1$’s in the input bit string, it’s now clear that after $|\sigma|_1$ passes through the loop, the input string will be the zero string, and $\text{count}$ will be the number of $1$’s in the input string. And since the input is now the zero string, we exit the loop with no further change in $\text{count}$.

1

What you observed is already quite good. If you would like to argue more formal I would suggest to prove first a lemma that says.

Lemma: If $S>0$ then the binary representation of $S$ has one digit 1 more than the binary representation of $S \land (S-1)$.

$Proof.$ Let the binary representation of $S\neq 0$ be the string $s=\text{bin}(S)$. We assume that $s$ ends on $v=10^k$ and that $s=uv$. Subtracting 1 from $S$ gives in the binary representation a string $u01^k$. A bit-wise AND of $S$ and $S-1$ is therefore the string $u0^{k+1}$, which proves the lemma.

Use this lemma and a loop invariant to prove the correctness of your algorithm.

1

Actually, I think you have a very good start.

What I think is needed are these ideas:

(1) Explicit specifying of all the locations in S where the one bits are.

(2) An analysis of what the critical step S := S ∧ (S − 1) does.

Here is my take on (2).

Suppose S, at this point, has k zero bits on the right with the k+1st bit a one. Then $S = a 2^k$, with a an odd integer (if $a$ were even, the j+1st bit would be zero).

Then $S-1 = a 2^k-1 = (a-1)2^k + 2^k-1$ so S ^ (S-1) = $(a-1)2^k$ since (1) the k lower bits of S are zero, and (2) since a is odd, a-1 is even, so the k+1st bit of S-1 is zero. Furthermore, since a is odd, all the bits of a-1 are the same as the bits of a except for the low order bit.

Thus the bits of S ^ (S-1) are the same as the bits of S except that the lowest one bit has been set to zero.

This lemma (which is what it really is) allows you to set up an inductive hypothesis that each time through the loop the current low order one bit has been set to zero and count is incremented by one.

At the end, count equals the number of one bits in the original number.

I have great admiration for whoever first came up with this, and I thank John Taylor for providing me with the opportunity of doing this analysis.

I had seen this algorithm before, but had never seen a proof, and I enjoyed working this out.

Aha! I see that Brian M. Scott has answered as I was entering my answer, and that our two analyses are essentially the same with quite different notations.