Counting bits! A lot of bits!


Tomorrow is an important day for me and I promised to myself that I will just chill all the day before, you know in order to keep the mind clean and fresh.. But.. I found this interesting problem and you know what, let’s work it out 🙂

The problem is: You’re given two numbers reppresented in 2’s complement notation: a and b where –2^31,<=a<=b<=2^31-1, and you’re asked to calculate the amount of bits set to 1 you may encounter when reading the binary rappresentation of all the numbers from a to b (inclusive).

So for example, if a=0 and b=7 then f(a,b)=12. If you count: 000, 001, 010, 011, 100, 101, 110 ,111 and sum all the one’s then you get 12. Since we are dealing with 2’s complement representation then one must remember that for example -1 is ~1+1, so basically it have 32 ones.. LINK for details 🙂

The brute force approach is just to loop from a to b summing the hamming weight of all the numbers, since we can perform the calculation in O(1) and we’re going to loop potentially from -2^31 to 2^31-1 (some 4 billions iteration) well, the performance is going to be a nightmare! Well, O(n) is not really so bad everywhere else, but here we really want to do better.

Anyway, the brute force code is the following:

int bit_count(long n)
     n = n - ((n >> 1) & 0x55555555);
     n = (n & 0x33333333) + ((n >> 2) & 0x33333333);
     return (((n + (n >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;

ll brute_force(long a,long b)
	ll ans{0};
		ans += bit_count(a);
	return ans;

bit_count calculate the hamming weight for the given number n. That’s all. Now let’s solve the very same problem doing some analysis first:


Here we have all the numbers from 0 to 15, now notice that the rightmost column (8-15) is identical to the left column but each number has one additional 1 (MSB), in the same way the numbers from 4-7 have the same amount of bits that the one from 0 to 3 but with one additional MSB. And so on..

Seems we have a pattern, now we need to notice one more thing: Look just a the left column and count the number of bits set to 1 in each subcolumn, you’ll notice that each subcolumn has 4 bits set.


The only exception is the first subcolumn which has only 0. But still, if you count all the 1 the you get 12! How many subcolumn do we have? Well, looks like we have exactly k=log2(n) for each n=2^k-1, this hold also for n=15 which is 2^4-1 and have 4 subcolumn. Please note that 0..7 has 4 bit in each subcolumn, and 0..15 has 8 bit in each subcolumn, or just 2^(k-1) .

So basically to calculate the amount of bits set to 1 in ALL the numbers between 0 and 2^k-1 all we need to do is: k*2^(k-1). But what if n is not a power of 2?? Let’s pick 12, and let’s notice the solution for n=12 we have:

  • For sure it will have more than k*2^(k-1) bits set, where k=floor(log2(n)), as we already have seen before.
  • n-2^k is the amount of numbers for which we need to calculate the amount of bits, which is 4 for n=12, and here we can exploit the first thing we have noticed from the table, the bit pattern between 0..4 and 8..12 is exacly the same for the first 3 leftmost subtables!
  • Now we miss only the additional 1 appearing from 8 to 12, we have n+1-2^k of them

Nasty huh?? The solution is f(n)=k*2^(k-1)+f(n-2^k)+(n-2^k+1), with  f(1)=1 and f(n<=0)=0. To calculate the number of bits between a and b we just can perform f(b)-f(a-1) (remember that a,b inclusive) and were are done.

But what about negative numbers? This is easy, since is sufficient to note that in 2’s complements the amount of bits set between -5 and 0 is the same as from f(2^31-1) – f(2^31-1-abs(-1)) + abs(-5) or in general: f(2^31-1)-f(2^31-1-abs(n))+abs(n). Let’s call this last fn, like f for negative. If a<0 and b>0 the clearly the solution is fn(a)+f(b).  (You may spot this equivalence by writing on paper -1,-2,-3 etc).

Here is finally the code:

using ll = long long;
constexpr long mx = 2147483647; //2^31-1

ll mpow(long n,long p)
	if(p<0) return 0; 	
        ll r{1}; 	
        while(p-->0) r*=n;
	return r;

long calc_positive(long n)
	if(n==1) return 1;
	if(n<=0) return 0;
	long x = log2(n);
	long p = mpow(2,x-1);
	return x*p + calc_positive(n-p*2) + (n-p*2+1);

long calc_negative(long n)
       return (calc_positive(mx) - calc_positive(mx - abs(n))+ abs(n));

int main()
	long n;
        ll ans{0};
	cin >> n;
	vector data(n);
	for(long i{0};i<n;i++) 		
		cerr<<"This is not good!n";
		for(ll i{0};i<data.size();i+=2) 		
 			long a = data[i];
 			long b = data[i+1];
 			if(a>=0 && b>=0)
			else if(a<=0 && b<0)
				cout<<calc_negative(a) - calc_negative(b+1)<<endl;
			else if(a=0)
				cout<<calc_negative(a) + calc_positive(b)<<endl;
	return 0;

And if you feed the code with the following input:

0 561651
-5616511 85545
-21154844 -22115
321 65816351
0 0

You’ll get:

you may check the correctness of the code by using the brute force approach, but remember, the brute force approach is way more slower, the claver implementation is O(log2(n)). No matter how many number we have between a and b, the amount of steps is virtually the same.

Thanks for reading 🙂

Leave a Reply