Add RSA/DH SP non-blocking support for C/Small 2048/3072/4096 #10394
dgarske wants to merge 1 commit into wolfSSL:master
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Extends wolfCrypt’s non-blocking (*_NONBLOCK) support to RSA and Diffie-Hellman using the SP “small” backends, enabling async/bare-metal TLS handshakes to make incremental progress without long big-int stalls.
Changes:
- Adds SP non-blocking RSA public/private and DH/modexp APIs for 2048/3072/4096 (C32/C64) and wires them into RSA/DH compute paths.
- Introduces DH non-block API (`DhNb`, `wc_DhSetNonBlock`) and TLS-layer per-SSL nb-context allocation/freeing.
- Updates build/config and CI matrix to exercise the new non-block configurations (including forced `SP_WORD_SIZE=32`).
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| wolfssl/wolfcrypt/sp_int.h | Adds fixed-size SP non-blocking context buffers for modexp/RSA/DH. |
| wolfssl/wolfcrypt/sp.h | Declares new SP non-blocking RSA/DH entry points. |
| wolfssl/wolfcrypt/rsa.h | Expands WC_RSA_NONBLOCK backend validation and extends RsaNb for SP context. |
| wolfssl/wolfcrypt/dh.h | Adds WC_DH_NONBLOCK validation, DhNb, and wc_DhSetNonBlock() API. |
| wolfcrypt/test/test.c | Adds DH non-block test path driven via wc_DhSetNonBlock() / async wait loop. |
| wolfcrypt/src/sp_c64.c | Implements SP small non-blocking modexp + RSA/DH wrappers for 2048/3072/4096 (C64). |
| wolfcrypt/src/sp_c32.c | Implements SP small non-blocking modexp + RSA/DH wrappers for 2048/3072/4096 (C32). |
| wolfcrypt/src/rsa.c | Dispatches RSA non-block compute through SP non-block state machines when available. |
| wolfcrypt/src/dh.c | Adds DH non-block context binding and routes agree() through SP non-block wrapper. |
| src/tls.c | Allocates/binds DhNb contexts for TLS 1.3 key share paths when async SW is active. |
| src/internal.c | Allocates/frees per-SSL RSA/DH nb contexts alongside existing key allocation flows. |
| configure.ac | Adds --enable-rsa=nonblock / --enable-dh=nonblock and auto-enables async SW shim. |
| .github/workflows/os-check.yml | Extends CI matrix to build/test RSA/DH non-block (including SP_WORD_SIZE=32). |
Comments suppressed due to low confidence (2)
wolfcrypt/src/sp_c64.c:1
- The BIT_INIT logic can read the wrong exponent limb (and potentially out-of-bounds) when `bits` is an exact multiple of the limb bit-size (e.g., `bits == 61` makes `i == 1` and reads `e[1]` even though the top bits are in `e[0]`). A safer derivation is to base indexing on `bits - 1` (e.g., `i = (bits - 1) / WORD_BITS`, `c = (bits - 1) % WORD_BITS + 1`) or explicitly handle `c == 0` by decrementing `i` and setting `c = WORD_BITS` before loading `e[i]`. The same pattern appears in the other `*_mod_exp_*_nb` implementations (sp_c64.c/sp_c32.c) and should be updated consistently.
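The `bits - 1` derivation suggested in this comment can be sketched as below. This is a minimal illustration, not the PR's code: `WORD_BITS`, `top_bit_pos`, `top_bit_naive`, and `top_bit_safe` are hypothetical names, and the "naive" variant is only one plausible form of the pattern the comment describes.

```c
#include <assert.h>

/* Hypothetical limb width; the review comment uses 61 as its example. */
#define WORD_BITS 61

/* Position of the most significant exponent bit.
 * i: index of the limb that holds bit (bits - 1)
 * c: number of exponent bits stored in that limb */
typedef struct {
    int i;
    int c;
} top_bit_pos;

/* Naive derivation (assumed form of the flagged pattern): when bits is an
 * exact multiple of WORD_BITS, i points one limb past the top limb and
 * c becomes 0. */
static top_bit_pos top_bit_naive(int bits)
{
    top_bit_pos p;
    p.i = bits / WORD_BITS;
    p.c = bits % WORD_BITS;
    return p;
}

/* Derivation based on (bits - 1): i always indexes the limb that holds the
 * top bit, and c is always in the range 1..WORD_BITS. */
static top_bit_pos top_bit_safe(int bits)
{
    top_bit_pos p;
    assert(bits > 0);
    p.i = (bits - 1) / WORD_BITS;
    p.c = (bits - 1) % WORD_BITS + 1;
    return p;
}
```

For `bits == 61` the naive form yields `i == 1, c == 0`, while the `bits - 1` form yields `i == 0, c == 61`, which matches the case the comment calls out.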
wolfcrypt/src/sp_c64.c:1
- Casting `&ctx->mod_exp_ctx` (a concrete state struct) to `sp_modexp_ctx_t*` is undefined behavior in C because the pointed-to object is not actually a `sp_modexp_ctx_t` (even if the first byte address matches). This is especially risky under optimization/strict-aliasing rules. Prefer storing a real `sp_modexp_ctx_t` in the wrapper and placing the state struct inside its `data` buffer, or change `sp_*_mod_exp_*_nb` to accept a `void*`/byte-buffer and cast directly to the concrete ctx type without dereferencing as `sp_modexp_ctx_t`.
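The second option in this comment (pass a `void*` and cast straight back to the concrete type) can be sketched as follows. All names here (`demo_modexp_state`, `demo_rsa_nb_ctx`, `demo_mod_exp_nb`, `demo_rsa_nb`, `DEMO_WOULDBLOCK`) are hypothetical stand-ins and do not reflect the actual wolfSSL types; the point is only that the pointer is never dereferenced as a type other than the one the object really has.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_WOULDBLOCK (-4) /* hypothetical stand-in for the yield code */

/* Concrete state for the (toy) modexp state machine. */
typedef struct {
    int      state;
    unsigned remaining;
} demo_modexp_state;

/* Wrapper context that embeds the concrete state by value. */
typedef struct {
    demo_modexp_state mod_exp_ctx;
    int               started;
} demo_rsa_nb_ctx;

/* Takes an opaque pointer and converts it back to the concrete type.
 * The object passed in really is a demo_modexp_state, so the access is
 * well-defined; no unrelated "generic ctx" type is ever dereferenced. */
static int demo_mod_exp_nb(void* opaque)
{
    demo_modexp_state* s = (demo_modexp_state*)opaque;
    assert(s != NULL);
    if (s->state == 0) {          /* first call: set up and yield */
        s->remaining = 3;
        s->state = 1;
        return DEMO_WOULDBLOCK;
    }
    if (s->remaining > 0) {       /* one chunk of work per call */
        s->remaining--;
        return DEMO_WOULDBLOCK;
    }
    s->state = 0;                 /* finished: reset for reuse */
    return 0;
}

/* Wrapper passes the address of its embedded concrete state. */
static int demo_rsa_nb(demo_rsa_nb_ctx* ctx)
{
    ctx->started = 1;
    return demo_mod_exp_nb(&ctx->mod_exp_ctx);
}
```

A caller would zero-initialize a `demo_rsa_nb_ctx` and call `demo_rsa_nb` repeatedly until it stops returning `DEMO_WOULDBLOCK`.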
9e193cd to df805fd
df805fd to 2309476
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 14 comments.
    ctx->n = e[ctx->i--] << (61 - ctx->c);
    ctx->state = 5;
    break;
    case 5: /* BIT_NEXT: refill on word boundary, peel one exponent bit */
    if (ctx->c == 0) {

    break;
    }
    ctx->n = e[ctx->i--];
    ctx->c = 57;
    }

    ctx->c = 60;
    }
    ctx->y = (byte)((ctx->n >> 59) & 1);
    ctx->n <<= 1;
    ctx->state = 6;

    sp_3072_mont_mul_54(ctx->t[ctx->y ^ 1], ctx->t[0], ctx->t[1],
    m, ctx->mp);
    ctx->state = 7;
    break;
    case 7: /* COPY_OUT: constant-time copy &t[y] -> t[2] */

    break;
    case 7: /* COPY_OUT: constant-time copy &t[y] -> t[2] */
    XMEMCPY(ctx->t[2], (void*)(((size_t)ctx->t[0] & addr_mask[ctx->y ^ 1]) +
    ((size_t)ctx->t[1] & addr_mask[ctx->y])),
    sizeof(sp_digit) * 70 * 2);

    sp_4096_mont_mul_142(ctx->t[ctx->y ^ 1], ctx->t[0], ctx->t[1],
    m, ctx->mp);
    ctx->state = 7;
    break;
    case 7: /* COPY_OUT: constant-time copy &t[y] -> t[2] */

    sizeof(sp_digit) * 162 * 2);
    ctx->state = 8;
    break;
    case 8: /* SQR: t[2] = t[2]^2 in Montgomery form */
    sp_4096_mont_sqr_162(ctx->t[2], ctx->t[2], m, ctx->mp);

    /* Always validate peer public key (2 <= y <= p-2) per SP 800-56A */
    if (wc_DhCheckPubKey(key, otherPub, pubSz) != 0) {
    WOLFSSL_MSG("wc_DhAgree wc_DhCheckPubKey failed");
    return DH_CHECK_PUB_E;
    }

    * is available, drive the chunked state machine here. wc_AsyncSimulate
    * (line "if (ret == MP_WOULDBLOCK) ret = WC_PENDING_E;" at the bottom
    * of the SW switch in wolfcrypt/src/async.c) translates per-yield
    * MP_WOULDBLOCK into WC_PENDING_E so the TLS / async event loop can
    * drive the operation to completion. */

    /* Async marker takes precedence: when wc_AsyncSimulate re-enters the
    * compute path, wc_DhAgree_Async dispatches to the SP nonblock wrapper
    * if key->nb is attached, and per-yield MP_WOULDBLOCK is translated to
    * WC_PENDING_E by wc_AsyncSimulate so the TLS event loop drives it. */
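The state-machine shape visible in the snippets above (bit extract, multiply, square, one yield per step) can be illustrated with a toy version. This is a sketch under simplifying assumptions, not the SP implementation: it uses a 32-bit modulus with plain `%` reduction instead of Montgomery form, it indexes `t[y]` directly and is therefore not constant-time, and every name (`demo_exp_ctx`, `demo_mod_exp_nb`, `DEMO_WOULDBLOCK`) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_WOULDBLOCK (-4) /* hypothetical stand-in for the yield code */

typedef struct {
    int      state;  /* 0 = INIT, 1 = BIT_NEXT, 2 = MUL, 3 = SQR */
    int      bit;    /* index of the next exponent bit to process */
    int      y;      /* current exponent bit */
    uint64_t t[2];   /* ladder registers; invariant: t[1] == t[0] * b mod m */
} demo_exp_ctx;

/* Computes *r = b^e mod m one small step per call.
 * Returns DEMO_WOULDBLOCK until the result is ready, then 0.
 * ctx must be zero-initialized before the first call. */
static int demo_mod_exp_nb(demo_exp_ctx* ctx, uint32_t b, uint32_t e,
                           uint32_t m, uint32_t* r)
{
    assert(m != 0);
    switch (ctx->state) {
    case 0: /* INIT: t[0] = 1, t[1] = b */
        ctx->t[0] = 1 % m;
        ctx->t[1] = b % m;
        ctx->bit = 31;
        ctx->state = 1;
        break;
    case 1: /* BIT_NEXT: peel one exponent bit, or finish */
        if (ctx->bit < 0) {
            *r = (uint32_t)ctx->t[0];
            ctx->state = 0;
            return 0;
        }
        ctx->y = (int)((e >> ctx->bit) & 1);
        ctx->bit--;
        ctx->state = 2;
        break;
    case 2: /* MUL: t[y ^ 1] = t[0] * t[1] mod m */
        ctx->t[ctx->y ^ 1] = (ctx->t[0] * ctx->t[1]) % m;
        ctx->state = 3;
        break;
    case 3: /* SQR: t[y] = t[y]^2 mod m */
        ctx->t[ctx->y] = (ctx->t[ctx->y] * ctx->t[ctx->y]) % m;
        ctx->state = 1;
        break;
    default:
        return -1; /* corrupted context */
    }
    return DEMO_WOULDBLOCK;
}
```

Each call does a bounded amount of work, so the caller decides how to interleave other tasks between calls. The real code adds a constant-time copy step (the `addr_mask` selection in the COPY_OUT state) that this toy omits.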
Summary
Extends `WOLFSSL_SP_NONBLOCK` (already covering ECC and Curve25519) to RSA and Diffie-Hellman so a bare-metal loop or async-driven TLS handshake never blocks for more than ~1 ms at 100 MHz on a single big-int op. Targets the `WOLFSSL_SP_SMALL` C32/C64 backend; assembly variants are unchanged.

The chunked state machine yields per Montgomery op and per inner bit-extract step (mirrors TFM's `fp_exptmod_nb` design). A 2048-bit RSA private op yields ~10240 times (5 * 2048 + ~13).

What's added
- `sp_RsaPublic_<bits>_nb` and `sp_RsaPrivate_<bits>_nb`: D-only path (`RSA_LOW_MEM` / `SP_RSA_PRIVATE_EXP_D`); CRT private is unsupported in non-block mode (configure-time `#error`).
- `sp_DhExp_<bits>_nb` and `sp_ModExp_<bits>_nb`: byte-buffer base for nb-friendly TLS use.
- `WC_DH_NONBLOCK` API: `DhNb` context + `wc_DhSetNonBlock(DhKey*, DhNb*)`, parallel to `WC_RSA_NONBLOCK` / `WC_ECC_NONBLOCK`.
- `wc_RsaFunctionAsync` / `wc_DhAgree_Sync` dispatch to the SP non-block state machine in their compute path. `wc_AsyncSimulate` already translates per-yield `FP_WOULDBLOCK` into `WC_PENDING_E`, so the TLS state machine drives completion via the standard async-event loop.
- Per-SSL RSA/DH nb contexts are allocated and freed in `AllocKey` / `FreeKey` and the TLS 1.3 keyShare paths, gated on `ssl->devId != INVALID_DEVID` (mirrors the ECC / Curve25519 hooks).

Configure
`--enable-rsa=nonblock` and `--enable-dh=nonblock` auto-enable `--enable-asynccrypt --enable-asynccrypt-sw` and define `RSA_LOW_MEM`. Build-time `#error` checks in `rsa.h` / `dh.h` enforce the required `WOLFSSL_SP_SMALL`, `WOLFSSL_SP_NO_MALLOC`, and `!WOLFSSL_SP_FAST_MODEXP` companion flags.

Tests
- `wolfcrypt/test/testwolfcrypt`: `RSA non-block sign: 10249 times, verify 94 times, inline verify 94 times`; `DH non-block agree: 1 times` (drives via `wc_AsyncWait`); both `RSA test passed!` and `DH test passed!`.
- `make check`: 5 PASS / 3 SKIP / 0 FAIL.
- TLS handshake (`TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384`) verified end-to-end with `examples/server` + `examples/client`; server's RSA private signature is driven through ~10437 SP non-block yields by the SW-shim translation in `wc_AsyncSimulate`.

CI
`.github/workflows/os-check.yml` extends the existing nonblock matrix entry to add `--enable-rsa=nonblock --enable-dh=nonblock`, plus a companion entry that forces `-DSP_WORD_SIZE=32` so both `sp_c32.c` and `sp_c64.c` are exercised on every push.
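The yield-to-pending translation described in the summary can be pictured with a small mock. This is only a sketch of the control flow: `demo_compute_step`, `demo_async_simulate`, `demo_drive`, and the `DEMO_*` codes are hypothetical stand-ins, and none of them are wolfSSL functions or guaranteed to match wolfSSL's error values.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_WOULDBLOCK (-4)   /* hypothetical: "one chunk done, call again" */
#define DEMO_PENDING    (-108) /* hypothetical: "operation pending" for the event loop */

/* Mock non-blocking operation with a fixed number of chunks left. */
typedef struct {
    int remaining;
} demo_op;

/* Compute path: performs one chunk per call. */
static int demo_compute_step(demo_op* op)
{
    assert(op != NULL);
    if (op->remaining > 0) {
        op->remaining--;
        return DEMO_WOULDBLOCK;
    }
    return 0;
}

/* Shim layer: maps the low-level yield code onto the pending code that
 * the higher-level event loop already understands. */
static int demo_async_simulate(demo_op* op)
{
    int ret = demo_compute_step(op);
    if (ret == DEMO_WOULDBLOCK)
        ret = DEMO_PENDING;
    return ret;
}

/* Event loop: keeps polling while the operation reports pending.
 * A real loop would service other work between polls. */
static int demo_drive(demo_op* op, int* polls)
{
    int ret;
    *polls = 0;
    do {
        ret = demo_async_simulate(op);
        (*polls)++;
    } while (ret == DEMO_PENDING);
    return ret;
}
```

With this split, the compute path only needs to know its own yield code, and the caller only needs to know the pending code.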