tests/eclient: rotate controller cert twice in ctrl_cert_change#1153
Draft
eriknordmark wants to merge 1 commit intolf-edge:masterfrom
Draft
tests/eclient: rotate controller cert twice in ctrl_cert_change#1153eriknordmark wants to merge 1 commit intolf-edge:masterfrom
eriknordmark wants to merge 1 commit intolf-edge:masterfrom
Conversation
Extend the ctrl_cert_change test to perform two consecutive controller signing certificate rotations between the initial eclient1 deploy and the eclient2 deploy. The first rotation runs while the device has no /persist/checkpoint/controllercerts.bak, so EVE's auth-verify fallback in VerifyAuthContainerHeader cannot use the backup file and must reach SenderStatusCertMiss to trigger a /certs fetch. The second rotation runs after MaybeSaveControllerCerts has populated .bak, so the backup-load path is actually exercised before the fetch is triggered, and MaybeSaveControllerCerts also has to refresh an existing .bak in place. This catches a regression class where the cert-miss status returned by the first verify is silently overwritten by the .bak fallback's status, which caused the test to time out waiting for the "updating app connectivity" log when .bak did not exist. Each rotation waits for the rebuild log produced by zedrouter's UpdateAppConn so that a second TestLog invocation only matches a log emitted after the second rotation (LogChecker is started in LogNew mode by the test bus). Signed-off-by: eriknordmark <erik@zededa.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Extend
ctrl_cert_changeto perform two consecutive controller signingcertificate rotations between the initial
eclient1deploy and theeclient2deploy, so that both states of/persist/checkpoint/controllercerts.bakon the device are covered:
.bakdoes not yet exist. EVE'sauth-verify fallback in
VerifyAuthContainerHeader(
pkg/pillar/controllerconn/authen.go) cannot find the backupfile and must surface
SenderStatusCertMissso thatzedagent triggers a
/certsfetch. Saving the new chain viaMaybeSaveControllerCertspopulates.bakas a side effect ofWriteRenameWithBackup..bakhas been populated by thefirst rotation. The backup-fallback path actually loads a cert
(the previous-but-one cert), determines that its hash still does
not match the message, and only after that fails does EVE trigger
/certsagain.MaybeSaveControllerCertsalso has to refresh anexisting
.bakin place.Both
TestLoginvocations matchRebuilding intended global config, reasons: updating app connectivity.Because the test bus subscribes with
elog.LogNew(
pkg/testcontext/testProc.go:197), the second invocation only seeslogs that arrive after the second rotation, so the two events do not
alias.
Motivation
The single-rotation form of this test was masking a real EVE bug:
when
.bakdoes not exist, the backup attempt inVerifyAuthContainerHeaderreturnsSenderStatusNonefromLoadSavedServerSigningCert(file-not-found), which clobbers theoriginal
SenderStatusCertMissproduced by the primary attempt.Callers in zedagent then fall through to the
defaultcase insteadof
case types.SenderStatusCertMiss:, sotriggerControllerCertEventnever fires, EVE never fetches the rotated cert, and the test ends up
timing out waiting for the rebuild log. See
run 25392481670
for an example failure.
The fix is on the EVE side and is in flight separately. With both
rotations in place, this test will:
so that any future regression that breaks either branch fails the
test rather than silently passing.
Test plan
authen.goSenderStatusCertMiss-preservation fix applied — both rotationsshould complete inside the 15m windows.
should still time out (regression confirmation).
existing matrix entries.
MaybeSaveControllerCerts updated the certsappears after eachrotation and that
controllercerts.bak workedappears on thesecond rotation (or, equivalently, that the
controllercerts.bakfile size is non-zero after the run).
Notes for reviewers
gen-signing-certalways readssigning.pem/signing-key.pemfromedenHome and only varies
NotBefore/NotAfter(
pkg/utils/x509.go:GenServerCertFromPrevCertAndKey). Twoback-to-back invocations produce two distinct certs because the
timestamps differ, so the second
change-signing-certis a realrotation, not a no-op.
check_sign_cert.shis unchanged. It compares EVE's installedcert against
/tmp/signing-new.pem, which after the secondrotation holds the second new cert (overwritten by the second
gen-signing-cert -o /tmp/signing-new.pem), so the existing checkstill validates the correct end state.
the worst case. The window can be tightened later if warranted.
🤖 Generated with Claude Code