Fixes for FLUX #258

Merged
coolkp merged 3 commits into AI-Hypercomputer:main from hx89:hx89/gpu_fix_20251001 on Oct 7, 2025
Contributor

@hx89 hx89 commented Oct 2, 2025

Fix FLUX config and issues with the latest TE.

Comment thread src/maxdiffusion/train_flux.py Outdated
"""If TransformerEngine is available, this context manager will provide the library with MaxText-specific details needed for correct operation."""
try:
  from transformer_engine.jax.sharding import global_shard_guard, MeshResource
  # Inform TransformerEngine of MaxText's physical mesh resources.
Collaborator

nit: maxdiffusion

Comment thread src/maxdiffusion/configs/base_flux_schnell.yml

if __name__ == "__main__":
app.run(main)
with transformer_engine_context():
Collaborator

Is this needed for sdxl too?

Contributor Author

Yes, it's needed for any model that uses TE, so that TE knows the physical mesh axes.

Collaborator

Then can you add it to train_sdxl as well?
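
For readers following this thread, here is a minimal sketch of the pattern under discussion: a context manager that tells TransformerEngine (TE) which physical mesh axes to use and is a no-op when TE is not installed. It is an illustration under stated assumptions, not the code merged in this PR. The `global_shard_guard` and `MeshResource` imports come from the diff above; the `MeshResource` keyword arguments and the axis names `"data"`, `"tensor"`, and `"fsdp"` are assumptions that would have to match the installed TE version and the mesh defined in the training config.

```python
import contextlib


@contextlib.contextmanager
def transformer_engine_context():
  """Registers mesh axes with TransformerEngine if it is importable, else does nothing."""
  try:
    from transformer_engine.jax.sharding import global_shard_guard, MeshResource
  except ImportError:
    # TE is not installed (or this TE version lacks these names): run the body unchanged.
    yield
    return

  # Assumed axis names; they must match the physical mesh used by the trainer.
  mesh_resource = MeshResource(
      dp_resource="data",
      tp_resource="tensor",
      fsdp_resource="fsdp",
  )
  with global_shard_guard(mesh_resource):
    yield


def run_training(train_fn):
  """Hypothetical helper: runs any trainer entry point inside the TE context."""
  with transformer_engine_context():
    return train_fn()
```

Keeping the import in its own `try` block, separate from the `yield`, means an `ImportError` raised inside the training code is not swallowed by the context manager. The same wrapper could be applied to any entry point whose model uses TE, which is what the request for `train_sdxl` amounts to.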

coolkp previously approved these changes Oct 6, 2025
@coolkp coolkp merged commit 230c460 into AI-Hypercomputer:main Oct 7, 2025
1 of 2 checks passed
2 participants