dcb2e3b3fff0b287d576842aabe5c79f2fe4df30 variable signing precompute table (djb) Pull request description: This pull request gives an option to reduce the precomputed table size for the signing context (`ctx`) by setting `#define ECMULT_GEN_PREC_BITS [N_BITS]`. Motivation: Per #251 and #254, the static table can be reduced to 64kB. However, this is still too big for some of my embedded applications. Setting `#define ECMULT_GEN_PREC_BITS 2` produces a 32kB table at a tradeoff of about 75% of the signing speed. Not defining this value will default to the existing implementation of 4 bits. Statistics: ``` ECMULT_GEN_PREC_BITS = 1 Precomputed table size: 32kB ./bench_sign ecdsa_sign: min 195us / avg 200us / max 212us ECMULT_GEN_PREC_BITS = 2 Precomputed table size: 32kB ./bench_sign ecdsa_sign: min 119us / avg 126us / max 134us ECMULT_GEN_PREC_BITS = 4 (default) Precomputed table size: 64kB ./bench_sign ecdsa_sign: min 83.5us / avg 89.6us / max 95.3us ECMULT_GEN_PREC_BITS = 8 Precomputed table size: 512kB ./bench_sign ecdsa_sign: min 96.4us / avg 99.4us / max 104us ``` Only values of 2 and 4 make sense. 8 bits causes a larger table size with no increase in speed. 1 bit runs, actually, but does not reduce table size and is slower than 2 bits. ACKs for top commit: real-or-random: ACK dcb2e3b3fff0b287d576842aabe5c79f2fe4df30 verified that all changes to the previous ACKed 1d26b27ac90092306bfbc9cdd5123e8a5035202a were due to the rebase jonasnick: ACK dcb2e3b3fff0b287d576842aabe5c79f2fe4df30 read the code and tested various configurations with valgrind Tree-SHA512: ed6f68ca23ffdc4b59d51525336b34b25521233537edbc74d32dfb3eafd8196419be17f01cbf10bd8d87ce745ce143085abc6034727f742163f7e5f13f26f56e
libsecp256k1
Optimized C library for EC operations on curve secp256k1.
This library is a work in progress and is being used to research best practices. Use at your own risk.
Features:
- secp256k1 ECDSA signing/verification and key generation.
- Adding/multiplying private/public keys.
- Serialization/parsing of private keys, public keys, signatures.
- Constant time, constant memory access signing and pubkey generation.
- Derandomized DSA (via RFC6979 or with a caller provided function.)
- Very efficient implementation.
Implementation details
- General
- No runtime heap allocation.
- Extensive testing infrastructure.
- Structured to facilitate review and analysis.
- Intended to be portable to any system with a C89 compiler and uint64_t support.
- Expose only higher level interfaces to minimize the API surface and improve application security. ("Be difficult to use insecurely.")
- Field operations
- Optimized implementation of arithmetic modulo the curve's field size (2^256 - 0x1000003D1).
- Using 5 52-bit limbs (including hand-optimized assembly for x86_64, by Diederik Huys).
- Using 10 26-bit limbs.
- Field inverses and square roots using a sliding window over blocks of 1s (by Peter Dettman).
- Optimized implementation of arithmetic modulo the curve's field size (2^256 - 0x1000003D1).
- Scalar operations
- Optimized implementation without data-dependent branches of arithmetic modulo the curve's order.
- Using 4 64-bit limbs (relying on __int128 support in the compiler).
- Using 8 32-bit limbs.
- Optimized implementation without data-dependent branches of arithmetic modulo the curve's order.
- Group operations
- Point addition formula specifically simplified for the curve equation (y^2 = x^3 + 7).
- Use addition between points in Jacobian and affine coordinates where possible.
- Use a unified addition/doubling formula where necessary to avoid data-dependent branches.
- Point/x comparison without a field inversion by comparison in the Jacobian coordinate space.
- Point multiplication for verification (aP + bG).
- Use wNAF notation for point multiplicands.
- Use a much larger window for multiples of G, using precomputed multiples.
- Use Shamir's trick to do the multiplication with the public key and the generator simultaneously.
- Optionally (off by default) use secp256k1's efficiently-computable endomorphism to split the P multiplicand into 2 half-sized ones.
- Point multiplication for signing
- Use a precomputed table of multiples of powers of 16 multiplied with the generator, so general multiplication becomes a series of additions.
- Intended to be completely free of timing sidechannels for secret-key operations (on reasonable hardware/toolchains)
- Access the table with branch-free conditional moves so memory access is uniform.
- No data-dependent branches
- Optional runtime blinding which attempts to frustrate differential power analysis.
- The precomputed tables add and eventually subtract points for which no known scalar (private key) is known, preventing even an attacker with control over the private key used to control the data internally.
Build steps
libsecp256k1 is built using autotools:
$ ./autogen.sh
$ ./configure
$ make
$ make check
$ sudo make install # optional
Exhaustive tests
$ ./exhaustive_tests
With valgrind, you might need to increase the max stack size:
$ valgrind --max-stackframe=2500000 ./exhaustive_tests
Description
Experimental fork of libsecp256k1 with support for pedersen commitments and range proofs.
Languages
C
93.2%
Sage
1.6%
CMake
1.2%
M4
1.2%
Assembly
1.1%
Other
1.7%