Commit 3c631e5
fix(cli): retry upload-complete and cancel orphaned server deploys (#15371)
Server deploys could get permanently stuck in 'queued' status if the
upload-complete call (Step 3 of `uploadServerDeploy`) hit a transient
failure — network error, 5xx, or expired token. There was no retry and
no cleanup, so the deploy just sat there forever. A customer hit this
twice in a row.
Now the CLI retries upload-complete up to 3 times with exponential
backoff (1s, 2s, 4s), refreshes the auth token on 401, and cancels the
orphaned deploy if retries are exhausted or if artifact upload fails.
Also added console.warn messages so the user can actually see what's
happening during retries and cleanup instead of staring at a silent
spinner.
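The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `CallFailure`, `looksRetryable`, `backoffDelayMs`, `retryWithBackoff`, and the warning text are hypothetical names chosen for the example, and the token refresh on 401 is left out to keep it short.

```typescript
// Hypothetical failure shape for illustration; not the CLI's real error type.
interface CallFailure {
  status?: number; // HTTP status, when a response was received
  code?: string; // Node-style network error code, when the request never completed
}

const NETWORK_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "ECONNREFUSED", "ENOTFOUND"]);

// 5xx, 401, and connection-level failures count as retryable, as the commit describes.
function looksRetryable(failure: CallFailure): boolean {
  if (failure.status !== undefined) {
    return failure.status >= 500 || failure.status === 401;
  }
  return failure.code !== undefined && NETWORK_CODES.has(failure.code);
}

// Delay before retry number `retry` (1-based): 1000, 2000, 4000 ms.
function backoffDelayMs(retry: number, baseMs = 1000): number {
  return baseMs * 2 ** (retry - 1);
}

interface RetryOptions {
  maxRetries?: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not have to wait
}

async function retryWithBackoff<T>(
  attempt: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const maxRetries = options.maxRetries ?? 3;
  const sleep = options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let retry = 0; ; retry++) {
    try {
      return await attempt();
    } catch (err) {
      // Give up on non-retryable failures or once all retries are used.
      if (retry >= maxRetries || !looksRetryable(err as CallFailure)) throw err;
      const delay = backoffDelayMs(retry + 1);
      console.warn(`upload-complete failed, retrying in ${delay}ms (${retry + 1}/${maxRetries})`);
      await sleep(delay);
    }
  }
}
```

With the defaults, a call that keeps failing is attempted four times in total, and the last error is rethrown so the caller can run cleanup.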
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## ELI5
If the final step of uploading a deploy fails (network hiccup or expired
login), the deployment could stay stuck. This PR makes the CLI retry the
confirmation step, refresh the login if needed, and cancel
failed/orphaned deploys so nothing remains stuck.
## Changes
### Changeset
- Adds a patch changeset describing a fix for deploys getting stuck in
`queued` when upload confirmation fails. Notes retries, user-visible
retry/cleanup logs, and automatic cancellation of orphaned deploys.
### CLI Platform API (server & studio)
- uploadServerDeploy / uploadDeploy improvements:
  - Retry upload-complete on transient failures:
    - Retries up to 3 times (4 attempts total) with exponential backoff
      delays of 1s, 2s, and 4s.
    - Treats HTTP 5xx, 401, and network-level polling/connection errors as
      retryable.
  - Refresh authentication on 401:
    - Calls getToken() and recreates the typed API client before retrying;
      if token refresh fails, cleanup is attempted and the error is surfaced.
  - Handle network-level errors:
    - Uses isRetryablePollingError to include connection-level failures
      (ECONNRESET, ETIMEDOUT, ECONNREFUSED, ENOTFOUND, fetch failures) in
      retryable handling.
  - Cancel orphaned/failing deploys:
    - Adds a best-effort cancelDeploy helper that POSTs
      /v1/{server,studio}/deploys/{id}/cancel and logs warnings if
      cancellation fails.
    - Attempts cancellation when artifact upload fails, when confirmation
      retries are exhausted, or on non-retryable failures; cancellation
      failures are logged and do not mask the original error.
  - User-visible logging:
    - Adds console.warn messages that surface retry attempts, token refresh
      attempts, and cleanup actions, so users are no longer left watching a
      silent spinner.
  - API client migration:
    - Replaces raw platformFetch usage with a typed createApiClient() for
      create-deploy and upload-complete calls; tests updated to mock the typed
      client.
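
The "best-effort" cancellation described in the list above, where a failed cleanup is logged but never hides the original error, might look something like this sketch. The `DeployClient` interface, `cancelBestEffort`, `withOrphanCleanup`, and the log messages are assumptions made for illustration; the actual cancelDeploy helper in this commit may be structured differently.

```typescript
// Hypothetical minimal client exposing only the call this sketch needs.
interface DeployClient {
  cancel(deployId: string): Promise<void>;
}

// Best-effort cancellation: never throws, only warns. Returns whether it succeeded.
async function cancelBestEffort(client: DeployClient, deployId: string): Promise<boolean> {
  try {
    await client.cancel(deployId);
    console.warn(`Cancelled orphaned deploy ${deployId}`);
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    console.warn(`Could not cancel deploy ${deployId}: ${reason}`);
    return false;
  }
}

// Run one deploy step; if it fails, attempt cleanup and rethrow the ORIGINAL error.
async function withOrphanCleanup<T>(
  client: DeployClient,
  deployId: string,
  step: () => Promise<T>,
): Promise<T> {
  try {
    return await step();
  } catch (original) {
    await cancelBestEffort(client, deployId);
    throw original;
  }
}
```

Because `cancelBestEffort` swallows its own failures, the caller always sees the error from the step that actually broke, which is the property the release notes call out.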
### Types / API contract updates
- Regenerated platform-api.d.ts with multiple request/response typing
  updates:
  - Added cancel endpoints for studio deploys and extended deploy status
    enums to include 'cancelled'.
  - Adjusted pause/restart response contracts and added optional help_url
    to many error payloads.
  - Added/updated several requestBody schemas (e.g., studio deploy
    creation now accepts optional envVars).

No exported/public function signatures were intentionally changed.
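
To illustrate how the type changes above could affect calling code, here is a small sketch. Only the 'queued' and 'cancelled' status values and the optional `help_url` field come from this commit message; the other status members, the `error` field name, and both helper functions are hypothetical and are not taken from platform-api.d.ts.

```typescript
// Only "queued" and "cancelled" appear in the commit message; "running" and
// "failed" are hypothetical placeholders for illustration.
type DeployStatus = "queued" | "running" | "failed" | "cancelled";

// Hypothetical error payload; `help_url` is optional per the release notes.
interface ApiErrorPayload {
  error: string;
  help_url?: string;
}

// Exhaustive switch: adding a new status without handling it becomes a compile error.
function isTerminal(status: DeployStatus): boolean {
  switch (status) {
    case "queued":
    case "running":
      return false;
    case "failed":
    case "cancelled":
      return true;
    default: {
      const unhandled: never = status;
      return unhandled;
    }
  }
}

// Include the help link only when the server sent one.
function formatApiError(payload: ApiErrorPayload): string {
  return payload.help_url ? `${payload.error} (see ${payload.help_url})` : payload.error;
}
```

The `never` assignment in the default branch is a common way to make the compiler flag any status value that a later regeneration of the types introduces.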
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Mastra Code (anthropic/claude-opus-4-6) <noreply@mastra.ai>
1 parent 620e0b2 commit 3c631e5
File tree
7 files changed, +629 -141 lines changed
- .changeset
- packages/cli/src
- commands
- server
- studio
- utils