Adds new STM32 Bare support for Hash, SAES/AES and PKA #10395

Draft
dgarske wants to merge 1 commit into wolfSSL:master from dgarske:stm32_bare


dgarske commented May 4, 2026

Summary

Adds WOLFSSL_STM32_BARE, a third STM32 crypto port flavor that talks directly to CRYP/AES/HASH/RNG/PKA peripherals through CMSIS device-header registers, with no stm32xxxx_hal_* or Standard Peripheral Library dependency. Selecting WOLFSSL_STM32_BARE is mutually exclusive with WOLFSSL_STM32_CUBEMX and uses the existing WOLFSSL_STM32_RNG_NOLIB direct-register RNG path.
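The selection rules in the summary could be expressed with preprocessor guards along these lines. This is a minimal sketch, not the actual settings.h text: the `#error` wording and the `sketch_rng_nolib_selected()` helper are hypothetical and exist only so the outcome can be checked with a host compiler.

```c
/* Sketch only: mirrors the selection rules from the summary, not the real
 * settings.h. The sketch_* helper is hypothetical. */

/* Normally provided by the build (e.g. user_settings.h or CFLAGS). */
#define WOLFSSL_STM32_BARE

/* BARE and CubeMX/HAL are mutually exclusive port flavors. */
#if defined(WOLFSSL_STM32_BARE) && defined(WOLFSSL_STM32_CUBEMX)
    #error "WOLFSSL_STM32_BARE and WOLFSSL_STM32_CUBEMX are mutually exclusive"
#endif

/* BARE reuses the existing direct-register RNG path. */
#if defined(WOLFSSL_STM32_BARE) && !defined(WOLFSSL_STM32_RNG_NOLIB)
    #define WOLFSSL_STM32_RNG_NOLIB
#endif

/* Reports whether the no-lib RNG path ended up selected. */
static int sketch_rng_nolib_selected(void)
{
#ifdef WOLFSSL_STM32_RNG_NOLIB
    return 1;
#else
    return 0;
#endif
}
```

Defining WOLFSSL_STM32_CUBEMX alongside WOLFSSL_STM32_BARE in this sketch stops the build at the `#error` line instead of silently picking one flavor.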

What's covered

  • AES: ECB / CBC / CTR / GCM on the CRYP IP (F2/F4/F7/H7/MP13) and on the TinyAES IP (L4/L5/U3/U5/H5/G4/WB/WL/WBA/G0). GCM uses the HW phase machine for a 12-byte IV with whole-block plaintext and falls back to SW GHASH + HW ECB for partial-block or non-12-byte-IV cases. The per-family final-phase byte-order quirk is handled (H7 rev.B+ vs F2/F4/F7).
  • HASH: existing direct-register HASH datapath now also drives the clock-enable line via RCC->...ENR directly when BARE is set.
  • RNG: routed through the existing WOLFSSL_STM32_RNG_NOLIB path; per-family WC_STM32_RNG_CLK_ENABLE() so G0 (AHBENR) is correct alongside F4/F7/L4/U5/H5/H7.
  • PKA (ECC scalar mul): new direct-register driver covering V1 (WB) and V2 (H5/U3/U5/G4/WBA) layouts. Bare-metal stand-ins are provided for the HAL_PKA_* symbols that wc_ecc_mulmod_ex2() calls, so the existing PKA call sites compile unchanged. ECDSA sign/verify under BARE+PKA still go through the SW ECDSA helper paths (which call HW-accelerated wc_ecc_mulmod_ex2() underneath); a !WOLFSSL_STM32_BARE guard on the existing bypass keeps BARE builds on that path.
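The GCM dispatch rule from the AES bullet (HW phase machine only for a 12-byte IV with whole-block plaintext) can be sketched as a small predicate. The function and constant names are hypothetical, and treating zero-length plaintext as "whole-block" is an assumption of this sketch rather than something the PR states.

```c
#include <stddef.h>

#define SKETCH_AES_BLOCK_SZ 16u /* AES block size in bytes */
#define SKETCH_GCM_IV_SZ    12u /* IV length the HW phase machine accepts */

/* Returns 1 when the HW GCM phase machine would be used, 0 when the
 * SW GHASH + HW ECB fallback applies.
 * Assumption: ptSz == 0 counts as whole-block. */
static int sketch_gcm_use_hw(size_t ivSz, size_t ptSz)
{
    return (ivSz == SKETCH_GCM_IV_SZ) &&
           ((ptSz % SKETCH_AES_BLOCK_SZ) == 0u);
}
```

For example, a 12-byte IV with 32 bytes of plaintext takes the HW path, while a 20-byte plaintext or a 16-byte IV takes the fallback.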

Files

  • wolfssl/wolfcrypt/settings.h: gate HAL-header includes on !WOLFSSL_STM32_BARE; pull in only the CMSIS device header (e.g. stm32h5xx.h); auto-define WOLFSSL_STM32_RNG_NOLIB; enforce mutual exclusion with WOLFSSL_STM32_CUBEMX.
  • wolfssl/wolfcrypt/port/st/stm32.h: per-family WC_STM32_AES/HASH/RNG/PKA_CLK_ENABLE() macros, HASH ALGO value table for the new H5/U3/MP13/N6/H7S generation, BARE-only HAL_PKA stand-in typedefs.
  • wolfcrypt/src/port/st/stm32.c: ~600 LOC bare-metal AES driver (CRYP + TinyAES), HASH clock-enable override, PKA driver (V1 + V2). Plus a STM32_AES_CLEAR_CCF() macro that picks AES_ICR_CCF or AES_CR_CCFC based on which the device header defines (newer vs WB/WL/G0 TinyAES).
  • wolfcrypt/src/aes.c: WOLFSSL_STM32_BARE arms next to the existing WOLFSSL_STM32_CUBEMX blocks in ECB / CBC / GCM-enc / GCM-dec dispatchers, and a small wc_AesSetKey byte-reverse so HW key registers see big-endian words.
  • wolfcrypt/src/random.c: replace the hard-coded RCC->AHB2ENR |= RCC_AHB2ENR_RNGEN write with the per-family macro from port/st/stm32.h.
  • wolfcrypt/src/ecc.c: existing WOLFSSL_STM32_PKA -> SW-helper-bypass guards become ... && !WOLFSSL_STM32_BARE. This first version of the bare PKA driver implements only ECCMul in HW; the SW ECDSA path then drives that HW underneath.
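The wc_AesSetKey byte-reverse mentioned for aes.c amounts to presenting each 32-bit key word in big-endian order on a little-endian Cortex-M core. A minimal sketch of that idea follows; the helper names are hypothetical and this is not the code from the PR.

```c
#include <stdint.h>

/* Reverses the byte order of one 32-bit word. */
static uint32_t sketch_bswap32(uint32_t w)
{
    return ((w & 0x000000FFu) << 24) | ((w & 0x0000FF00u) << 8) |
           ((w & 0x00FF0000u) >> 8)  | ((w & 0xFF000000u) >> 24);
}

/* Builds the big-endian word for four consecutive key bytes, independent
 * of the host byte order. On a little-endian core this equals
 * sketch_bswap32() applied to the natively loaded word. */
static uint32_t sketch_load_be32(const uint8_t* p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

With key bytes 00 01 02 03, a little-endian load yields 0x03020100, and the reversed word 0x00010203 is what the key register would be given in this sketch.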

Validated on hardware

  • NUCLEO-F439ZI (Cortex-M4F, 144 MHz PLL, CRYP+HASH+RNG): full wolfcrypt_test PASSES; bench shows BARE 11.4 MiB/s AES-CBC, 25.8 MiB/s SHA-256.
  • NUCLEO-H753ZI (Cortex-M7F, 480 MHz PLL, CRYP+HASH+RNG): full wolfcrypt_test PASSES; bench 19.2 MiB/s AES-CBC, 25.9 MiB/s SHA-256.
  • NUCLEO-H563ZI (Cortex-M33, HASH+RNG, no AES): builds clean. Runtime hits a board-side flash ECC fault around 0x08002000 that reproduces on multiple physical boards even with CONFIG=c (no BARE drivers); see STM32_BARE_BOARD_STATUS.md in this repo for the full reproduction recipe.
  • NUCLEO-U575ZI-Q (Cortex-M33, HASH+RNG, no AES, no PKA): wolfcrypt_test PASSES with HASH+RNG via BARE.
  • NUCLEO-U385RG-Q (Cortex-M33, TinyAES+HASH+RNG+PKA, V2 layout): full wolfcrypt_test PASSES; PKA HW path exercised end-to-end on P-256 ECDHE.
  • NUCLEO-WB55RG (Cortex-M4F, 64 MHz PLL, TinyAES+RNG+PKA, V1 layout): full wolfcrypt_test PASSES; PKA HW path validated. Note: WB55 PKA HW @ 64 MHz is slower per-op than SP-ECC software for P-256 (ST publishes the same observation); driver is correctness-validated, performance is silicon-bound.
  • NUCLEO-G491RE (Cortex-M4F, 170 MHz PLL, no AES/HASH/PKA on this variant): full wolfcrypt_test PASSES in pure software; serves as a software baseline until G4A1 (which has the full crypto block) is brought up.

Companion

Matching multi-board example: wolfSSL/wolfssl-examples-stm32#13. Selectable per-board with BOARD={h5,h7,u3,u5,wb55,g491,f439} and CONFIG={c,asm,bare} × TARGET={test,bench} on the make command line. Reproduces every result in the test matrix below.

Test matrix (validated on real hardware)

Run via make BOARD=<x> CONFIG=bare TARGET=test (full wolfcrypt_test) and make BOARD=<x> CONFIG=bare TARGET=bench. Pass = wolfcrypt_test exits with Result: 0 (PASS). Bench numbers are from the wolfcrypt benchmark.c block-1024 default.

| Board | Chip | CPU / Clock | HW used by BARE | wolfcrypt_test | AES-128-CBC enc | SHA-256 | ECDHE secp256r1 |
|---|---|---|---|---|---|---|---|
| NUCLEO-F439ZI | STM32F439 | M4F / 144 MHz PLL | CRYP+HASH+RNG | PASS | 11.401 MiB/s | 25.757 MiB/s | n/a (no PKA HW) |
| NUCLEO-H753ZI | STM32H753 | M7F / 480 MHz PLL | CRYP+HASH+RNG | PASS | 19.165 MiB/s | 25.928 MiB/s | n/a (no PKA HW) |
| NUCLEO-U385RG-Q | STM32U385 | M33 / 96 MHz HSI | TinyAES+HASH+RNG+PKA | PASS | TinyAES HW | HASH HW | 1.115 ops/s (PKA) |
| NUCLEO-WB55RG | STM32WB55 | M4F / 64 MHz PLL | TinyAES+RNG+PKA | PASS | 7.237 MiB/s | 1.243 (sw) | 4.83 ops/s (PKA) |
| NUCLEO-U575ZI-Q | STM32U575 | M33 / 160 MHz | HASH+RNG (no AES, no PKA) | PASS | (sw) | HASH HW | n/a |
| NUCLEO-G491RE | STM32G491 | M4F / 170 MHz PLL | RNG only (G491 has no AES/HASH/PKA) | PASS | 1.017 MiB/s (sw) | 3.037 (sw) | n/a |
| NUCLEO-H563ZI | STM32H563 | M33 / 250 MHz | HASH+RNG+PKA -- PKA v2 path | Builds clean; runtime blocked by board-side flash ECC fault (see STM32_BARE_BOARD_STATUS.md in this branch) | -- | -- | -- |

@dgarske dgarske self-assigned this May 4, 2026
Copilot AI review requested due to automatic review settings May 4, 2026 23:16

Copilot AI left a comment


Pull request overview

This PR introduces a new STM32 “bare-metal” crypto port flavor (WOLFSSL_STM32_BARE) that uses CMSIS device-header register access (no HAL/StdPeriph dependency) and wires it into wolfCrypt’s AES/HASH/RNG paths, plus a direct-register PKA implementation used by the existing STM32 PKA integration.

Changes:

  • Add WOLFSSL_STM32_BARE selection in settings to include only CMSIS device headers and auto-enable the no-lib RNG path.
  • Add per-family bare-metal clock-enable macros and HAL/PKA stand-in types to support a direct-register PKA driver.
  • Add bare-metal AES (CRYP + TinyAES), HASH clock enable override, and bare PKA shims/driver, plus AES dispatcher updates in aes.c and RNG clock-enable macro use in random.c.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
|---|---|
| wolfssl/wolfcrypt/settings.h | Adds WOLFSSL_STM32_BARE selection, CMSIS header includes, and mutual exclusion with CubeMX. |
| wolfssl/wolfcrypt/port/st/stm32.h | Adds BARE clock-enable macros, HASH ALGO defines for new IP, and PKA stand-in types. |
| wolfcrypt/src/port/st/stm32.c | Implements bare-metal AES (CRYP/TinyAES), HASH clock enable override, and bare-metal PKA shims/driver. |
| wolfcrypt/src/aes.c | Routes ECB/CBC/CTR and GCM-encrypt through the BARE STM32 implementation with SW fallback behavior. |
| wolfcrypt/src/random.c | Uses a per-family RNG clock-enable macro (for BARE) instead of a fixed RCC register bit. |
| wolfcrypt/src/ecc.c | Adjusts STM32 PKA guards so BARE uses SW ECDSA paths while still leveraging HW scalar mul. |


Comment on lines +32 to +52
#ifdef WOLFSSL_STM32_BARE
/* Per-family direct-register clock-enable macros. CMSIS device header is
 * already included via settings.h. RCC->...ENR bit names come from CMSIS. */
#if defined(WOLFSSL_STM32H5)
    #define WC_STM32_AES_CLK_ENABLE() \
        do { RCC->AHB2ENR |= RCC_AHB2ENR_AESEN; (void)RCC->AHB2ENR; } while (0)
    #define WC_STM32_AES_CLK_DISABLE() \
        do { RCC->AHB2ENR &= ~RCC_AHB2ENR_AESEN; } while (0)
    #define WC_STM32_HASH_CLK_ENABLE() \
        do { RCC->AHB2ENR |= RCC_AHB2ENR_HASHEN; (void)RCC->AHB2ENR; } while (0)
    #define WC_STM32_HASH_CLK_DISABLE() \
        do { RCC->AHB2ENR &= ~RCC_AHB2ENR_HASHEN; } while (0)
    #define WC_STM32_RNG_CLK_ENABLE() \
        do { RCC->AHB2ENR |= RCC_AHB2ENR_RNGEN; (void)RCC->AHB2ENR; } while (0)
#elif defined(WOLFSSL_STM32F4) || defined(WOLFSSL_STM32F7) || \
      defined(WOLFSSL_STM32H7)
    #define WC_STM32_AES_CLK_ENABLE() \
        do { RCC->AHB2ENR |= RCC_AHB2ENR_CRYPEN; (void)RCC->AHB2ENR; } while (0)
    #define WC_STM32_AES_CLK_DISABLE() \
        do { RCC->AHB2ENR &= ~RCC_AHB2ENR_CRYPEN; } while (0)
    #define WC_STM32_HASH_CLK_ENABLE() \
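The enable macros in the snippet above set the enable bit and then read the register back, so the write has reached the RCC before the peripheral is first touched. The pattern can be tried on a host with a mock register block. Everything named `Sketch*`/`SKETCH_*` below is hypothetical, including the bit position, which stands in for the CMSIS definition.

```c
#include <stdint.h>

/* Host-side stand-in for the CMSIS RCC block (hypothetical). */
typedef struct {
    volatile uint32_t AHB2ENR;
} SketchRcc;

static SketchRcc sketch_rcc; /* zero-initialized: all clocks off */
#define SKETCH_RCC           (&sketch_rcc)
#define SKETCH_AHB2ENR_AESEN (1u << 16) /* made-up bit position */

/* Set the enable bit, then read back to force the write to complete. */
#define SKETCH_AES_CLK_ENABLE() \
    do { SKETCH_RCC->AHB2ENR |= SKETCH_AHB2ENR_AESEN; \
         (void)SKETCH_RCC->AHB2ENR; } while (0)
#define SKETCH_AES_CLK_DISABLE() \
    do { SKETCH_RCC->AHB2ENR &= ~SKETCH_AHB2ENR_AESEN; } while (0)

/* Switches the mock AES clock and returns the resulting state (1 = on). */
static int sketch_aes_clk_set(int on)
{
    if (on) {
        SKETCH_AES_CLK_ENABLE();
    }
    else {
        SKETCH_AES_CLK_DISABLE();
    }
    return (SKETCH_RCC->AHB2ENR & SKETCH_AHB2ENR_AESEN) != 0u;
}
```

The `do { ... } while (0)` wrapper keeps each macro usable as a single statement, for example in an unbraced `if`.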
Comment on lines +142 to +176
/* Bare-metal stand-ins for the small subset of HAL types/values that
 * wolfcrypt/src/port/st/stm32.c references in the PKA path. These mirror
 * the ST HAL definitions; the bare-metal HAL_PKA_* shims in stm32.c
 * implement the actual register sequence. */
#ifdef WOLFSSL_STM32_PKA

#include <stdint.h>

typedef enum {
    HAL_OK      = 0x00U,
    HAL_ERROR   = 0x01U,
    HAL_BUSY    = 0x02U,
    HAL_TIMEOUT = 0x03U
} HAL_StatusTypeDef;

#ifndef HAL_MAX_DELAY
    #define HAL_MAX_DELAY 0xFFFFFFFFU
#endif

typedef struct {
    PKA_TypeDef *Instance;
} PKA_HandleTypeDef;

typedef struct {
    uint32_t       modulusSize;
    uint32_t       coefSign;
    const uint8_t *coefA;
    const uint8_t *coefB;       /* V2 only */
    const uint8_t *modulus;
    const uint8_t *primeOrder;  /* V2 only */
    uint32_t       scalarMulSize;
    const uint8_t *scalarMul;
    const uint8_t *pointX;
    const uint8_t *pointY;
} PKA_ECCMulInTypeDef;
Comment on lines +446 to +448
/* Note: under WOLFSSL_STM32_BARE the GCM HW phase machine is not yet
 * implemented. GCM falls back to the software path which still uses HW
 * AES (via wc_AesEncrypt -> wc_Stm32_Aes_Ecb) for the underlying blocks. */
Comment on lines 607 to +611
 static WC_INLINE void wc_Stm32_Hash_Clock_Enable(STM32_HASH_Context* stmCtx)
 {
-#ifdef WOLFSSL_STM32_CUBEMX
+#if defined(WOLFSSL_STM32_BARE)
+    WC_STM32_HASH_CLK_ENABLE();
+#elif defined(WOLFSSL_STM32_CUBEMX)
Comment on lines +233 to +251
HAL_StatusTypeDef HAL_PKA_Init(PKA_HandleTypeDef *hpkah)
{
    if (hpkah == NULL) {
        return HAL_ERROR;
    }
    if (hpkah->Instance == NULL) {
        hpkah->Instance = PKA;
    }

#ifdef WC_STM32_PKA_CLK_ENABLE
    WC_STM32_PKA_CLK_ENABLE();
#endif

    /* Reset CR, enable the PKA, clear any pending flags. */
    hpkah->Instance->CR = PKA_CR_EN;
    hpkah->Instance->CLRFR = PKA_CLRFR_PROCENDFC | PKA_CLRFR_RAMERRFC |
                             PKA_CLRFR_ADDRERRFC;
    return HAL_OK;
}
Comment on lines +297 to +305
    /* Wait for end-of-operation flag. */
    while ((p->SR & PKA_SR_PROCENDF) == 0U) {
        /* No timeout in BARE -- HAL_MAX_DELAY semantics. The hardware
         * completes WB55 P-256 scalar multiplication in well under
         * 100 ms; if we're here forever something is wrong upstream. */
    }

    /* Clear all status flags. */
    p->CLRFR = PKA_CLRFR_PROCENDFC | PKA_CLRFR_RAMERRFC | PKA_CLRFR_ADDRERRFC;
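For comparison with the unbounded wait in the snippet above, a bounded poll could look like the following. This is only a sketch of one possible alternative, not what the PR implements: the poll-count budget, the status enum, the flag bit position and the two mock status registers are all hypothetical.

```c
#include <stdint.h>

#define SKETCH_SR_PROCENDF (1u << 17) /* made-up flag position */

typedef enum {
    SKETCH_OK      = 0,
    SKETCH_TIMEOUT = 3
} SketchStatus;

/* Mock status registers for host-side checks (hypothetical). */
static volatile uint32_t sketch_sr_done = SKETCH_SR_PROCENDF;
static volatile uint32_t sketch_sr_idle = 0u;

/* Polls *sr until the end-of-operation flag is set or maxPolls extra
 * reads have been spent; returns SKETCH_TIMEOUT if the budget runs out. */
static SketchStatus sketch_wait_procend(const volatile uint32_t* sr,
                                        uint32_t maxPolls)
{
    while ((*sr & SKETCH_SR_PROCENDF) == 0u) {
        if (maxPolls == 0u) {
            return SKETCH_TIMEOUT;
        }
        maxPolls--;
    }
    return SKETCH_OK;
}
```

A poll count is only a rough bound because its wall-clock length depends on the core clock; a tick-based deadline would be the usual choice where a time base is available.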

github-actions Bot commented May 5, 2026

MemBrowse Memory Report

No memory changes detected for:
