You probably think you know everything about what the Greatest Common Divisor is and how to calculate it quickly! That's most probably the case, but just in case you don't remember all the details, here's a short post about the GCD! (All the code is available here)
Simple GCD
Let's start with the basic procedure everybody learns in the first week of any introduction to computer science class:
using ll = long long;

ll safe_gcd(ll u, ll v) {
    if (v == 0) {
        return u;
    }
    return safe_gcd(v, u % v);
}
I call this the safe_gcd since it is very easy to remember and hard to screw up when coding. This code implements the original algorithm described in Euclid's Elements, which is supposedly one of his inventions, although there are people who believe it might have been known some two hundred years before Euclid, or at least by a certain Eudoxus about 100 years before him. But that's not that relevant after all.
The procedure works perfectly and it's simple to code; an iterative implementation follows immediately from the recursive one:
ll safe_gcd_it(ll u, ll v) {
    while (v) {
        u %= v;
        v ^= u;  // This is just a swap of u and v,
        u ^= v;  // done with three XOR operations
        v ^= u;
    }
    return u;
}
This algorithm works perfectly and has served humanity for hundreds of years; it will require at most $5d$ steps to calculate the GCD of $a, b$, where $d$ is the number of decimal digits of $\min(a, b)$ (this bound is known as Lamé's theorem).
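For example, consecutive Fibonacci numbers are the worst case for this bound: computing $\gcd(21, 13)$ walks the full chain $(21, 13) \to (13, 8) \to (8, 5) \to (5, 3) \to (3, 2) \to (2, 1) \to (1, 0)$, six division steps, comfortably within the $5 \cdot 2 = 10$ allowed for a two-digit smaller argument.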
Things get interesting if one attempts to make some improvements on this basic algorithm.
Optimized simple GCD
One possible optimization, designed to reduce the number of steps required to calculate the GCD, is obtained by noting the division algorithm: if $a$ and $b$ are integers with $b \neq 0$, then there is a unique pair of integers $q$ and $r$ such that $a = qb + r$ and $0 \le r < |b|$. This in turn means that whenever $r > |b|/2$, $r$ can be substituted by $r - b$ or $r + b$, depending on $b$ being greater or smaller than zero (make sure to always keep the absolute value of $r - b$), so that each step works with the remainder of least absolute value, which is never larger than $|b|/2$. For example, instead of reducing $\gcd(34, 21)$ to $\gcd(21, 13)$ we can reduce it to $\gcd(21, |13 - 21|) = \gcd(21, 8)$.
Let's see how this works by calculating the GCD of two very big numbers.
This is the list of steps performed by the standard algorithm:
And this is the list of steps performed by the improved algorithm:
Well, looks good! From the 36 steps required to calculate the GCD with the standard algorithm down to just 23 with the improved one. This implementation will be faster by a factor of up to $\ln(1+\sqrt{2})/\ln(\varphi) \approx 1.83$, where $\varphi = (1+\sqrt{5})/2$ is the golden ratio (this is the ratio between the worst-case step counts of the two variants); but keep in mind that the increased amount of work the computer needs to perform during each iteration will slightly reduce the advantage of this procedure over the basic implementation. Let's see the code:
#include <cstdlib>  // std::abs

ll optimized_gcd(ll u, ll v) {
    if (v == 0) {
        return std::abs(u);
    }
    const ll t = u % v;
    if (t > std::abs(v) / 2) {
        // The remainder is larger than |v|/2: use the
        // remainder of least absolute value instead.
        if (v > 0)
            return optimized_gcd(v, std::abs(t - v));
        return optimized_gcd(v, t + v);
    }
    return optimized_gcd(v, t);
}
As you can see there's some more computing going on here, but at least on paper this is faster than the basic algorithm; see the last paragraph for some benchmark results.
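If you want to reproduce this kind of step comparison on inputs of your own, here is a minimal sketch that counts the division steps of both variants (count_steps is my own illustrative helper, not part of the linked repo):

#include <cstdlib>
#include <iostream>

using ll = long long;

// Count the division steps performed by Euclid's algorithm,
// optionally applying the least-absolute-remainder optimization.
ll count_steps(ll u, ll v, bool least_abs_remainder) {
    ll steps = 0;
    while (v != 0) {
        ll t = u % v;
        if (least_abs_remainder && t > std::abs(v) / 2)
            t = (v > 0) ? std::abs(t - v) : t + v;
        u = v;
        v = t;
        ++steps;
    }
    return steps;
}

int main() {
    const ll a = 1234567890123456789LL, b = 987654321987654321LL;
    std::cout << "standard : " << count_steps(a, b, false) << " steps\n";
    std::cout << "improved : " << count_steps(a, b, true) << " steps\n";
}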
Binary GCD
In 1961 Josef Stein discovered a clever way of improving the GCD algorithm by exploiting four simple facts about positive integers and the GCD:
- if a and b are both even, then $\gcd(a, b) = 2 \cdot \gcd(a/2, b/2)$
- if a is even and b is odd, then $\gcd(a, b) = \gcd(a/2, b)$
- from Euclid's algorithm: $\gcd(a, b) = \gcd(a - b, b)$
- if a and b are both odd, then $a - b$ is even and $|a - b| < \max(a, b)$
For more details you might have a look at Wikipedia or, much preferably, at the venerable Knuth bible. Given those facts, a very basic implementation of the binary algorithm is the following:
ll binary_gcd(ll u, ll v) {
    if (!u) return v;
    if (!v) return u;
    if (u == v) return u;
    ll k{ 0 };
    while (u % 2 == 0 && v % 2 == 0)  // Rule n. 1
        u /= 2, v /= 2, ++k;
    ll t{ u - v };  // Rule n. 3
    while (t) {
        while (t % 2 == 0)  // Rule n. 2
            t /= 2;
        if (t < 0)  // Rule n. 4 (replacing the larger of the two)
            v = -t;
        else
            u = t;
        t = u - v;  // Rule n. 3
    }
    return u << k;  // From rule n. 1, remember we've reduced u and v by 2^k
}
This is a slightly more complex implementation than the previous ones, but it should be reasonably faster in real applications even if the asymptotic bound is the same as for the original GCD algorithm. After a short evaluation you might notice that this is essentially the same code as safe_gcd_it, with one additional loop exploiting rules 1 and 2 and a few additional operations for rules 3 and 4.
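As a side note, a common refinement of the binary algorithm (a sketch of mine, not code from the post or the repo) strips all trailing zero bits at once with the count-trailing-zeros intrinsic available in GCC and Clang, instead of halving one bit per iteration:

using ll = long long;

// Binary GCD for non-negative inputs, relying on __builtin_ctzll
// to remove all factors of two in a single operation.
ll binary_gcd_ctz(ll u, ll v) {
    if (!u) return v;
    if (!v) return u;
    const int shift = __builtin_ctzll(u | v);  // Rule n. 1: common factor 2^shift
    u >>= __builtin_ctzll(u);                  // Rule n. 2: make u odd
    while (v) {
        v >>= __builtin_ctzll(v);              // Rule n. 2: make v odd
        if (u > v) { const ll t = u; u = v; v = t; }  // keep u <= v
        v -= u;  // Rule n. 3: the difference of two odd numbers is even
    }
    return u << shift;  // restore the common power of two
}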
Conclusion
To verify which of those algorithms is the fastest, let's run a benchmark; you might test this code by yourself by cloning it from my repo here.
To test those algorithms I'll generate a certain amount of random pairs of numbers {a,b}, which I'll then use to feed each of the functions we have seen here; the time required to execute the procedures is measured with the venerable chrono::high_resolution_clock facility.
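The real harness lives in the repo, but the measurement loop boils down to something like this sketch (the benchmark helper and the volatile sink are my own illustrative choices, not necessarily what the repo does):

#include <chrono>
#include <utility>
#include <vector>

using ll = long long;

// Time one GCD implementation over a pre-generated set of input
// pairs, returning the elapsed milliseconds.
template <typename GcdFn>
ll benchmark(const std::vector<std::pair<ll, ll>>& data, GcdFn gcd_fn) {
    volatile ll sink = 0;  // prevents the compiler from discarding the calls
    const auto start = std::chrono::high_resolution_clock::now();
    for (const auto& p : data)
        sink = sink + gcd_fn(p.first, p.second);
    const auto stop = std::chrono::high_resolution_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

Each implementation is then run over the same data set, e.g. benchmark(data, safe_gcd).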
Running this simple bench program on a set of 5000000 pairs of numbers, with the code compiled with -O3 using the latest clang, the output is the following:
Generating data
Running benchmarks
Results:
binary_gcd 2037ms 1
optimized_gcd 2192ms 1.07609
safe_gcd_it 2265ms 1.11193
safe_gcd 2294ms 1.12617
Looks like the safe and simple implementation of the GCD algorithm is the slowest one! The fastest algorithm seems to be the one discovered by Stein, which is somewhat expected since even from the theoretical standpoint the average complexity of the binary GCD outperforms the basic GCD.
I attempted a second test changing the input data, this time on a sequence of increasing integer pairs $\{a_i, b_i\}$ whose values grow up to $n$, where $n$ is the amount of input. This time the sequence will not contain very big numbers, only values of at most $n$, and thus the difference between $a$ and $b$ will be at most $n$, whereas in the previous test the difference might have been as large as numeric_limits<long long>::max(). The output is the following:
Generating data
Running benchmarks
Results:
binary_gcd 838ms 1
optimized_gcd 893ms 1.06563
safe_gcd 928ms 1.1074
safe_gcd_it 935ms 1.11575
In this particular execution safe_gcd_it is slower than safe_gcd, but running this experiment multiple times you'll notice that those two functions frequently swap the last two positions of the result list (they are always slower than the other algorithms).
I ran this latest test with the same amount of input (5000000), but as you can see the execution time for all of them is now lower than in the previous test. The reason is that it is much less frequent to encounter extreme inputs where $a \gg b$ or vice versa, which means that on average the number of steps will be smaller (for all the algorithms).
But still, it seems that the binary version of the GCD algorithm is the fastest, followed by the optimized Euclid algorithm; that's nice to know.
Thanks for reading!