A General Number Field Sieve (GNFS) implementation written from the ground up in C++, C, and x86_64 NASM.
This project exists because using external libraries for number theory feels like cheating. Every component, from the arbitrary-precision arithmetic to the relation filtering, was implemented manually to understand the underlying mechanics of integer factorization at scale.
The General Number Field Sieve is the most efficient known algorithm for factoring large integers (over 100 digits). This implementation follows the standard pipeline: polynomial selection, relation collection (sieving), linear algebra over GF(2), and square root extraction.
Standard integer types are insufficient for GNFS. We implemented a custom BigInt library that handles numbers of arbitrary length.
-
Assembly Core: Critical loops for addition, subtraction, and multiplication are implemented in hand-written x86_64 assembly (
asm/bigint_asm.asm). -
Karatsuba Multiplication: High-performance
$O(N^{1.585})$ recursive multiplication for large bit-lengths. - Memory Management: Magnitude is stored as a vector of 64-bit limbs, with safe assembly wrappers for memory protection.
Since no libraries like GMP or NTL are used, all required number-theoretic functions were built from scratch:
- Miller-Rabin: Primality testing for factor base validation.
- Tonelli-Shanks: Robust square roots modulo primes and composite numbers.
- Jacobi Symbol: Required for the square root phase and Legendre-style checks.
- Quadratic Characters: Legendre symbol filtering to ensure dependencies are perfect squares in the algebraic field.
The sieving engine implements a logarithm-based lattice sieve.
- Parallel Execution: Multi-threaded sieve using all available CPU cores for maximum throughput.
- Log-Sieving: Uses a fixed-point logarithm array for efficient prime distribution.
- Factor Bases: Generates rational, algebraic, and quadratic character bases.
Once relations are collected, they are converted into a large sparse matrix over GF(2).
- Matrix Construction: Primes are mapped to indices, and parity is tracked.
- Solver: A bit-packed Gaussian Elimination engine finds the nullspace of the relation matrix, yielding dependencies for the square root phase.
The final phase combines the found dependencies to find
- Rational Side: Square root is computed by halving the exponents of the prime factorization.
- Algebraic Side: Currently implements the framework for algebraic square root extraction in the number field.
- GCC: C++17 compliant.
- NASM: For the assembly arithmetic core.
- Make: Standard build automation.
To build the library and the main binary:
make clean && make -j4To run the factorizer:
./gnfs <number>Testing the components:
make test_bigint
make test_math
make test_pipelineBuilding this was a lesson in pain. Debugging carry-propogation in NASM and tracking down sign errors in the sieving array required hundreds of hours of manual trace analysis. There are no safety nets here; if the math is wrong, the factorization fails.
The current implementation is optimized for x86_64 environments. While the code is portable C++, the performance relies heavily on the bigint_asm.asm routines.
None. Use it for whatever you want. If you break something, you own both pieces.