Bugfix: mark outputs as early clobber in scalar x86_64 asm

In the existing code, the compiler is allowed to allocate the RSI register for outputs m0, m1, or m2, which are written before the input in RSI has been read for the last time. Fix this by marking them as early clobber. Reported by ehoffman2 in https://github.com/bitcoin-core/secp256k1/issues/766
2023-05-12 05:15:05 -04:00
commit 0c729ba70d
parent 3353d3c753
1 changed file with 1 addition and 1 deletion
--- a/src/scalar_4x64_impl.h
+++ b/src/scalar_4x64_impl.h
@@ -383,7 +383,7 @@ static void secp256k1_scalar_reduce_512(secp256k1_scalar *r, const uint64_t *l)
    "movq %%r10, %q5\n"
    /* extract m6 */
    "movq %%r8, %q6\n"
-    : "=g"(m0), "=g"(m1), "=g"(m2), "=g"(m3), "=g"(m4), "=g"(m5), "=g"(m6)
+    : "=&g"(m0), "=&g"(m1), "=&g"(m2), "=g"(m3), "=g"(m4), "=g"(m5), "=g"(m6)
    : "S"(l), "i"(SECP256K1_N_C_0), "i"(SECP256K1_N_C_1)
    : "rax", "rdx", "r8", "r9", "r10", "r11", "r12", "r13", "r14", "cc");
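
The hazard described in the commit message can be reproduced in miniature. The following is a minimal sketch, not code from libsecp256k1: the helper name `load_pair` and its operands are hypothetical, and it uses `"r"` rather than `"g"` because its outputs are loaded directly from memory. Output `a` is written while RSI is still needed for a later load, so it is marked `"=&r"`. Output `b` is written by the last instruction that reads RSI, so a plain `"=r"` is enough, mirroring how m3 through m6 are left unchanged in the patch. On targets other than x86_64 the sketch falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: stores l[0] in *lo and l[0] + l[1] in *sum. */
static void load_pair(uint64_t *lo, uint64_t *sum, const uint64_t *l) {
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t a, b;
    __asm__ __volatile__(
        /* Output 0 is written here, but RSI is read again below.
         * Without '&' the compiler could place output 0 in RSI,
         * and this load would overwrite the pointer. */
        "movq 0(%%rsi), %q0\n"
        /* Last read of RSI: output 1 may safely share RSI. */
        "movq 8(%%rsi), %q1\n"
        "addq %q0, %q1\n"
        : "=&r"(a), "=r"(b)
        : "S"(l)
        : "cc", "memory");
    *lo = a;
    *sum = b;
#else
    /* Portable fallback so the sketch builds on other targets. */
    *lo = l[0];
    *sum = l[0] + l[1];
#endif
}
```

Dropping the `&` from the first output would not necessarily fail on a given compiler, since the register allocator may still pick a different register by chance. That is what makes this class of bug easy to miss.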