Add ability to save optimizer and resume while training #275

Merged
coolkp merged 1 commit into main from optimizer-resume on Oct 22, 2025

Conversation

coolkp (Collaborator) commented on Oct 22, 2025

This PR:

  1. Adds the optimizer state to the checkpoint (saved at the checkpoint path; the whole state is saved).
  2. Loads the optimizer state and step, and overwrites the optimizer in the training loop. Be careful with shardings: they are inherited.
  3. Resumes training from the last checkpointed step.
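
For readers who want the idea without the diff, here is a minimal sketch of the save/load/resume flow described above. It is not the code in this PR: it uses plain Python with `pickle` and a scalar momentum optimizer, it does not model shardings at all, and every name in it (`save_checkpoint`, `load_checkpoint`, `train`, `demo`) is hypothetical.

```python
import os
import pickle
import tempfile


def sgd_momentum_step(param, velocity, grad, lr=0.1, beta=0.9):
    """One SGD-with-momentum update; `velocity` is the optimizer state."""
    velocity = beta * velocity + grad
    return param - lr * velocity, velocity


def save_checkpoint(path, param, opt_state, step):
    """Write parameters, the whole optimizer state, and the step together."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump({"param": param, "opt_state": opt_state, "step": step}, f)
    os.replace(tmp, path)  # rename into place so readers never see a partial file


def load_checkpoint(path):
    """Return (param, opt_state, step) from a checkpoint file."""
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["param"], ckpt["opt_state"], ckpt["step"]


def train(total_steps, ckpt_path=None, save_at=None, resume=False):
    """Minimize f(x) = x**2, optionally saving at `save_at` or resuming."""
    param, velocity, start = 5.0, 0.0, 0  # fresh initialization
    if resume:
        # Overwrite the freshly initialized optimizer state and step
        # with the values stored in the checkpoint.
        param, velocity, start = load_checkpoint(ckpt_path)
    for step in range(start, total_steps):
        grad = 2.0 * param  # gradient of x**2
        param, velocity = sgd_momentum_step(param, velocity, grad)
        if save_at is not None and step + 1 == save_at:
            # Store the number of completed steps so resume starts at the next one.
            save_checkpoint(ckpt_path, param, velocity, step + 1)
    return param


def demo():
    """Compare an uninterrupted run with a run that stops and resumes."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ckpt.pkl")
        uninterrupted = train(20)
        train(10, ckpt_path=path, save_at=10)  # run stops after step 10
        resumed = train(20, ckpt_path=path, resume=True)
        return uninterrupted, resumed
```

In this toy setup the resumed run matches the uninterrupted one because the momentum term and the step counter are restored along with the parameter. Restoring only the parameter would restart the momentum at zero and the two runs would diverge.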

coolkp merged commit 662d501 into main on Oct 22, 2025
2 of 3 checks passed

2 participants