RSA is the most widespread public-key cryptosystem on the Internet. Suppose we are given a massive collection of 1024-bit RSA keys. Each of these keys contains a product N = pq of two 512-bit random-looking primes, and the security of the cryptosystem depends on the difficulty of factoring N (that is, deriving p and q given only N). Factoring any one N is a seriously hard problem. But in practice, if we start with a large enough set of keys, then we can often factor some of them. In an ideal world, no two keys share a p or a q. But in the real world, many keys are generated using poorly-configured (or compromised) random number generators, which means that occasionally the same prime will pop up in two different keys, and then we can easily find that common prime as the greatest common divisor (GCD) of the two keys, using the classic Euclidean algorithm. Computing all of the GCDs pairwise is slow (the difficulty increases quadratically with the number of keys), but we can do much better using a "batch GCD" algorithm. The aim of this project is to implement an efficient distributed batch GCD algorithm, and apply it to collections of millions of RSA keys. The sequential version of this algorithm mirrors a real world attack from 2012, which broke tens of thousands of deployed RSA keys. While the algorithm appears simple enough, creating a distributed version involves several subtle choices.
More details in the report file.