You probably think you know everything about what the Greatest Common Divisor is and how to calculate it quickly! Well, that’s most probably the case, but just in case you do not remember all the details then here’s a short post about the GCD! (*All the code is available here*)

### Simple GCD

Let’s start with the basic procedure everybody learn in the first week of any introduction to computer science class:

ll safe_gcd(ll u,ll v) { if( v == 0 ) { return u; } return safe_gcd( v, u % v ); }

I call this the *safe_gcd* since is very easy to remember and hard to screw up when coding. This code implements the original algorithm described in the book *Elements* from Euclid which is supposedly being one of his inventions, although there are people which believe that it might be already know some two hundred years before Euclid, or at least by a certain Eudoxus about 100 years before Euclid.. But that’s not that relevant after all..

The procedure works perfectly and it’s simple to code, an iterative implementation follows immediately from the recursive code:

ll safe_gcd_it(ll u,ll v) { while(v) { u %= v; v ^= u; //This is just a swap! u ^= v; v ^= u; } return u; }

This algorithm works perfectly and has served humanity for hundreds of years, it will require at most steps to calculate the GCD of a,b where . Things get’s interesting if one attempt to make some improvement on this basic algorithm.

### Optimized simple GCD

One possible optimization designed to reduce the amount of step required to calculate the GCD is obtained by noting that given the division algorithm: where () if *a* and *b* are integers with then there is a unique pair of integers *q* and *r* such that , which in turn means that if then can be substituted by or depending on* b* being greater or smaller than zero (*Make sure to always keep the absolute value of r – b*)

Let’s see how this work calculating the GCD of two very big numbers

This is list of steps performed by the standard algorithm:

And this is the list of steps performed by the improved algorithm:

Well, looks good! from the *36* steps required to calculate the GCD with the standard algorithm to just *23* with the improved one, on average this implementation will be faster of a factor of where , but keep in mind that the increased amount of operations which the computer need to perform during each iteration will reduce slightly the advantage of this procedure over the basic implementation, let’s see the code:

ll optimized_gcd(ll u,ll v) { if( v == 0 ) { return std::abs(u); } const ll t = u % v; if( t > std::abs(v) / 2 ) { if( v > 0 ) return optimized_gcd( v , std::abs(t - v) ); return optimized_gcd( v, t + v ); } return optimized_gcd( v , t ); }

As you can see there’s some more computing here, but at least on paper this is faster the the basic algorithm, see the last paragraph for some benchmarks results.

### Binary GCD

In 1961 Josef Stein discovered a cleaver way of improving the GCD algorithm by exploiting four simple facts about positive integers and GCD:

*a*and*b*are both even then- a is even and b is odd then
- from Euclid’s algorithm:
- if
*a*and*b*are both even then is even and

For more details you might have a look at Wikipedia or much preferably to the venerable Knuth bible. Given those facts a very basic implementation of the binary algorithm is the following:

ll binary_gcd(ll u,ll v) { if(!u) return v; if(!v) return u; if(u==v) return u; ll k{ 0 }; while( u % 2 == 0 && v % 2 == 0 )//Rule n. 1 u /= 2 , v /= 2, ++k; ll t{ u - v };//Rule n. 3 while( t ) { while( t % 2 == 0 )//Rule n. 2 t /= 2; if( t < 0 )//Rule n. 4 (Replacing the larger of the two) v = -t; else u = t; t = u - v;//Rule n. 3 } return std::pow(2,k)*u;//From rule n. 1, remember we've reduced u and v by 2^k }

This is a little more complex implementation that the previous one, but should be reasonable faster in real applications even if the asymptotic bound is the same as for the original GCD algorithm. After a short evaluation you might notice that this is exactly the same code as the *safe_gcd_it* with one additional loop for exploiting the rule n. *1/2* and few additional operations for the rule *4* and *3*.

### Conclusion

To verify which of those algorithms is faster let’s run a benchmark, you might test this code by yourself cloning it from my repo here.

To test those algorithms I’ll generate a certain amount of random pair of numbers* {a,b}* which then I’ll use to feed each of the functions we have seen here, the time required to execute the procedures is taken by the venerable *chrono::high_resolution_clock* functionality.

Running this simple bench program on a set of *5000000* pair of numbers with the code compiled with *-O3* using the latest clang, the output is the following:

```
Generating data
Running benchmarks
Results:
binary_gcd 2037ms 1
optimized_gcd 2192ms 1.07609
safe_gcd_it 2265ms 1.11193
safe_gcd 2294ms 1.12617
```

Looks like the safe and simple implementation of the GCD algorithm is the slower one! The faster algorithm seems to be the one discovered by Stein, that is somehow expected since even from the theoretical standpoint the average complexity of the binary GCD is out-performing the basic GCD.

I attempted a second test changing the input data, this time on a sequence of increasing integers where where *n* is the amount of input. This time the sequence will not contain very big numbers but at most *n *and thus the difference between two *a,b* would be no at most *n*, whereas in the previous test the difference might have been at most* numeric_limits<long long>::max()*, the output is the following:

```
Generating data
Running benchmarks
Results:
binary_gcd 838ms 1
optimized_gcd 893ms 1.06563
safe_gcd 928ms 1.1074
safe_gcd_it 935ms 1.11575
```

In this particular execution *safe_gcd_it* is slower than *safe_gcd*, but running this experiment multiple times you’ll notice that those two functions frequently switch between themselves the last two position of the result list (*Are always slower than the other algorithms*)

I ran this latest test with the same amount of input (*5000000*) but as you can see now the execution time for all of them is lower than the previous test, the reason is that is much less frequent to encounter extreme values of input where or vice-versa, which means than on average the amount of steps will be smaller (*For all the algorithms*).

But still, seems that the binary version of the GCD algorithm is the faster, followed by the optimized Euclid algorithm, that’s nice to know.

Thanks for reading!