Topics Register • News • History • How to • Sequences statistics • Template prototypes

The multiple polynomial quadratic sieve (MPQS) is a factorization method.

## Introduction

Let $\displaystyle{ N }$ be the number to be factored. This number must not be a perfect power. If somehow we find two integers $\displaystyle{ X }$ and $\displaystyle{ Y }$ such that $\displaystyle{ X^2\equiv Y^2\,\pmod N }$ and $\displaystyle{ X\not\equiv \pm Y\pmod N }$, then $\displaystyle{ \gcd(X+Y,\, N) }$ will reveal a proper factor of $\displaystyle{ N }$.

In order to find these values $\displaystyle{ X }$ and $\displaystyle{ Y }$ the method finds relations which have the form $\displaystyle{ t^2 \equiv u\,\pmod N }$ where $\displaystyle{ u }$ is the product of small prime numbers. The set of these primes is the factor base. These relations will be found using sieving as will be explained below.

The relations found are combined using multiplications. The left hand side will always be a square because is a product of squares, so the goal is to have a square at the right hand side. A number is a square when all its prime factors appear an even number of times.

For example: let the number to factor be N = 1817 and we have found the following relations with factor base = {2, 7, 13}:

$\displaystyle{ 45^2\,\equiv \,2^4*7^0*13^1 }$
$\displaystyle{ 123^2\,\equiv \,2^{10}*7^0*13^1 }$

Both relations have non-square RHS, since the exponent of 13 is not even. But multiplying them we get:

$\displaystyle{ 84^2\,\equiv \,2^{14}*13^2 }$
$\displaystyle{ 84^2\,\equiv \,(2^7*13)^2 }$

Since $\displaystyle{ 2^7*13\,=\,1664 }$ we get the factor $\displaystyle{ \gcd(84+1664,\,1817) = 23 }$.

Which relations have to multiplied to find a square in the RHS is a linear algebra problem.

## Sieving

We will sieve in the interval [-M, +M] using several polynomials $\displaystyle{ g_{a,b}(x)\, =\, (ax + b)^2 - N }$ where $\displaystyle{ b^2 - n\, =\, ac }$ so $\displaystyle{ g_{a,b}(x)\, =\,a(ax^2+bx+c) }$ with different values of $\displaystyle{ a }$ and $\displaystyle{ b }$. The advantage here is that we know the factorization of $\displaystyle{ a }$, so the other factor is small compared to $\displaystyle{ N }$, thus increasing the probability that all its prime factors are included in the factor base. Using the notation of the previous section, $\displaystyle{ t^2\equiv u\,\pmod N }$ where $\displaystyle{ t = ax+b }$ and $\displaystyle{ u = g_{a,b}(x) }$.

### Generating the factor base

In order to determine the factor base, we have to notice that it includes all the possible factors of $\displaystyle{ g_{a,b}(x) }$. Since $\displaystyle{ g_{a,b}(x) = t^2 - N }$, we get that $\displaystyle{ 0 \equiv t^2 - N }$ or $\displaystyle{ t^2 \equiv N }$ modulo any prime of the factor base. This means that about half of the primes cannot be in the factor base because $\displaystyle{ N }$ cannot be a quadratic residue modulo that prime. Notice that the content of this paragraph does not depend on the particular values of $\displaystyle{ a }$ or $\displaystyle{ b }$, so the factor base can be computed at the beginning of MPQS. The upper bound for the factor base is determined from experimentation, for example:

• Factoring a 60 digit number: F = 60000
• Factoring a 70 digit number: F = 350000
• Factoring a 80 digit number: F = 900000

The next step consists in computing the modular square roots of $\displaystyle{ N }$ modulo the different primes in the factor base and computing the logarithms in base 2 of the different primes rounded to the nearest integer, storing these values in the arrays $\displaystyle{ \operatorname{tsqrt} }$ and $\displaystyle{ \operatorname{tlog} }$.

### Generating a polynomial

We have to compute the coefficients $\displaystyle{ a }$ and $\displaystyle{ b }$. The first coefficient will be the square of a prime near $\displaystyle{ \sqrt{\sqrt{2N}/M} }$ such that $\displaystyle{ N }$ is a quadratic residue modulo this prime, and the second will be a modular square root of $\displaystyle{ a }$. All these operations will require multiple precision arithmetic.

For each prime $\displaystyle{ p }$ in the factor base:

• Compute $\displaystyle{ \operatorname{ainv} }$ as the inverse of $\displaystyle{ a }$ modulo $\displaystyle{ p }$.
• Compute the first solution as $\displaystyle{ \operatorname{soln1} = \operatorname{ainv} * (\operatorname{tsqrt}[p] - b) \bmod p }$ and add $\displaystyle{ \operatorname{tlog}[p] }$ to all locations $\displaystyle{ \operatorname{soln1} + k*p }$ of the sieve array.
• Compute the second solution as $\displaystyle{ \operatorname{soln2} = \operatorname{ainv} * (-\operatorname{tsqrt}[p] - b) \bmod p }$ and add $\displaystyle{ \operatorname{tlog}[p] }$ to all locations $\displaystyle{ \operatorname{soln2} + k*p }$ of the sieve array.

### Obtaining relations

Walk through the sieve array. For each element greater than about $\displaystyle{ \log(2x\sqrt {N}) }$ minus a small error term (because the logarithms are rounded and we do not sieve with powers of prime numbers), perform a trial division of $\displaystyle{ g_{a,b}(x)/a }$ by the elements of the factor base. If all prime factors are less than or equal to F, save the relation. If there are enough relations (generally about 5% more relations than elements of the factor base), perform the linear algebra stage. Otherwise more relations must be collected. When the sieve stage is exhausted, change polynomial.

## Large prime variation

A variation that makes the sieving about twice as fast but requiring more memory, named PMPQS, consists in selecting a large prime bound (about 64 F). If after the trial division by the primes in the factor base the quotient is less than the large prime bound, we collect this partial relation in a different location than full relations (when the final quotient is 1). If the large prime found is the same than the large prime found in a previous partial relation, we can combine them:

Let $\displaystyle{ t^2\equiv u * p\,\pmod{N} }$ and $\displaystyle{ r^2\equiv v * p\,\pmod{N} }$ be two partial relations with the same large prime $\displaystyle{ p }$. Then a full relation will be:

$\displaystyle{ \large\left(\frac {t*r}{p}\right)^2\,\equiv u*v\,\pmod{N} }$.

This full relation can be stored with the other full relations found.

At the beginning of PMPQS the number of full relations found in this way will be small, but it increases as the algorithm progresses, and then there will be more relations found from partial relations than without large primes.