ci: add no-public-api-changes label based on public olean diff#38230
ci: add no-public-api-changes label based on public olean diff#38230kim-em wants to merge 4 commits intoleanprover-community:masterfrom
Conversation
This PR adds a CI step that compares the public `.olean` files produced by the PR build against `master`. When they are byte-identical, the label `no-public-api-changes` is applied; otherwise it is removed. It also introduces a contributor/reviewer-controlled label `expect-no-public-api-changes`; when present, CI fails unless `no-public-api-changes` is applicable. This is intended to make proof-only golfing PRs cheap to review with confidence that nothing downstream of the touched files is affected. With the module system, theorems are exported as axioms in the public `.olean`, so proof-body changes only affect `.olean.private`. See `scripts/public_olean_diff.sh` for the comparison logic, and the discussion at https://leanprover.zulipchat.com/#narrow/channel/180721-mathlib-maintainers/topic/yuanyi-350.20golfing.20PRs for context. Caveats: - runs only on main-repo PRs (build_fork.yml is not yet updated); - silently skips when the master cache is cold; - labels `no-public-api-changes` and `expect-no-public-api-changes` must be created on the repo manually. 🤖 Prepared with Claude Code
PR summary caf1dee8e5Import changes for modified filesNo significant changes to the import graph Import changes for all files
Declarations diffNo declarations were harmed in the making of this PR! 🐙 You can run this locally as follows## summary with just the declaration names:
./scripts/pr_summary/declarations_diff.sh <optional_commit>
## more verbose report:
./scripts/pr_summary/declarations_diff.sh long <optional_commit>The doc-module for No changes to technical debt.You can run this locally as
|
* Tri-state diff result (match/differs/unavailable) so script errors don't masquerade as public-API diffs. * Validate the fetched master baseline with `lake build --no-build Mathlib.Init`; a bare `cache get` isn't sufficient evidence of a warm cache. Skip the diff entirely when validation fails. * Only label when the PR head still points at this run's SHA, to avoid stale-commit TOCTOU after a fast follow-up push. * Tighten the guard to same-repo PR-branch pushes on leanprover-community/mathlib4: exclude master, bors `staging`/`trying`, and any run that overrides `tools_branch_ref`. 🤖 Prepared with Claude Code
Address second-round review on public-olean diff:
* Validate the cached baseline over `Mathlib` (the full diff scope)
instead of just `Mathlib.Init`. A `--no-build` check of a single
module only proves a prefix is cached; a partial cache would
otherwise produce spurious added:/removed: entries in the diff.
* Qualify the PR lookup as `${owner}:${branch}` so we cannot match a
same-named head branch on an unrelated fork. The TOCTOU SHA check
also catches this, but explicitness is cheap.
🤖 Prepared with Claude Code
|
Since this only requires a boolean answer, how about comparing sha256 hashes of the public oleans rather than their contents? |
|
I think the disk usage is not a big deal since all our self-hosted runners are ephemeral now; everything will get cleaned up after each run. However, after discussing with @jcommelin, I think we might want a different version of this:
I'm working on that now in a branch based off of this one. |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@bryangingechen, please feel free to close this once you have an alternative up! |
This PR adds a CI step that compares the public
.oleanfiles produced by the PR build againstmaster. When they are byte-identical, the labelno-public-api-changesis applied; otherwise the label is removed.It also introduces a contributor/reviewer-controlled label
expect-no-public-api-changes; when present, CI fails unlessno-public-api-changesalso applies. Together these are intended to make proof-only golfing PRs (see the mathlib-maintainers thread) cheap to review with confidence that nothing downstream of the touched files is affected.The current implementation fetches master's cached oleans via a second
lake exe cache getin the existingtools-branchcheckout. I'd like feedback on whether the steady-state cost is acceptable before investing further.Measured cost (sampled from master run 24620430633, 2026-04-19):
cache get(8,241 files from Azure)lake build --no-build Mathlib(baseline validation)public_olean_diff.sh(byte-comparison of ~8k files).lake/build/lib/lean/Mathlib)So the wall-clock hit is small, but disk usage on PR runners roughly doubles (on top of the ~5.8 GB the PR build already materializes).
Three concrete questions for reviewers:
expect-no-public-api-changes, losing the informationalno-public-api-changeslabel on most PRs) or delaying this PR until we extendlake exe cacheto fetch public*.oleanonly (and skip.olean.private/.olean.server/.ir.olean, which would cut egress and disk substantially)?How it works
With the module system, a Mathlib build produces three files per module:
Foo.olean(public/exported),Foo.olean.server(server data), andFoo.olean.private(full private data). Theorems are exported as axioms in the public.olean(seeaddDeclatsrc/Lean/AddDecl.lean:100-104), so a proof-only change leavesFoo.oleanbyte-identical and only updatesFoo.olean.private. A byte-level diff of public oleans therefore certifies "nothing exported changed" — signatures, exposed definition bodies, and public environment-extension state all included.New pieces:
scripts/public_olean_diff.sh— compares two.lake/build/lib/leantrees, ignoring.olean.private,.olean.server, and.ir.olean. Tri-state exit:0=match,1=differs,2=error/unavailable..github/workflows/build_template.yml:lake exe cache getinsidetools-branch.lake build --no-build Mathlibover the full diff scope (not justMathlib.Init, which would only prove a prefix is cached).no-public-api-changeslabel (only when the diff state ismatchordiffers;unavailableleaves existing labels alone).expect-no-public-api-changesis set but the state isdiffers.Gated to same-repo PR-branch pushes on
leanprover-community/mathlib4, excludingmaster, borsstaging/trying, and any run overridingtools_branch_ref.Caveats
build_fork.ymlneeds an analogous change in a follow-up.unavailableand labels are left untouched (no spurious flipping).no-public-api-changes,expect-no-public-api-changes).🤖 Prepared with Claude Code