From 65fe07943a0434e9c02b5a8012ba679ced9f17af Mon Sep 17 00:00:00 2001
From: "serenagu@google.com"
Date: Mon, 28 Jul 2025 23:55:49 +0000
Subject: [PATCH 1/7] ltx instruction update

---
 README.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 68b887f9f..e2992431c 100644
--- a/README.md
+++ b/README.md
@@ -171,7 +171,16 @@ To generate images, run the following command:
 ```bash
 python -m src.maxdiffusion.generate src/maxdiffusion/configs/base21.yml run_name="my_run"
 ```
-
+- **LTX Video**
+  1. In the folder src/maxdiffusion/models/ltx_video/utils, run:
+  ```bash
+  python convert_torch_weights_to_jax.py --ckpt_path [LOCAL DIRECTORY FOR WEIGHTS] --transformer_config_path ../xora_v1.2-13B-balanced-128.json
+  ```
+  2. In the repo folder, run:
+  ```bash
+  python src/maxdiffusion/generate_ltx_video.py src/maxdiffusion/configs/ltx_video.yml output_dir="[SAME DIRECTORY]" config_path="src/maxdiffusion/models/ltx_video/xora_v1.2-13B-balanced-128.json"
+  ```
+  3. Other generation parameters can be set in ltx_video.yml file.
 ## Flux
 First make sure you have permissions to access the Flux repos in Huggingface.

From 8af7225b4b78a99373ab4c26a63de4df414faab1 Mon Sep 17 00:00:00 2001
From: "serenagu@google.com"
Date: Tue, 29 Jul 2025 18:59:35 +0000
Subject: [PATCH 2/7] updated whatsnew

---
 README.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index e2992431c..64b043410 100644
--- a/README.md
+++ b/README.md
@@ -24,6 +24,7 @@
 - **`2024/10/22`**: LoRA support for Hyper SDXL.
 - **`2024/8/1`**: Orbax is the new default checkpointer. You can still use `pipeline.save_pretrained` after training to save in diffusers format.
 - **`2024/7/20`**: Dreambooth training for Stable Diffusion 1.x,2.x is now supported.
+- **`2025/7/29`**: LTX-Video text2video generation is now supported.
 
 # Overview
 
@@ -41,6 +42,7 @@ MaxDiffusion supports
 * Load Multiple LoRA (SDXL inference).
 * ControlNet inference (Stable Diffusion 1.4 & SDXL).
 * Dreambooth training support for Stable Diffusion 1.x,2.x.
+* LTX-Video (inference).
 
 # Table of Contents
 
@@ -172,15 +174,15 @@ To generate images, run the following command:
 python -m src.maxdiffusion.generate src/maxdiffusion/configs/base21.yml run_name="my_run"
 ```
 - **LTX Video**
-  1. In the folder src/maxdiffusion/models/ltx_video/utils, run:
+  - In the folder src/maxdiffusion/models/ltx_video/utils, run:
   ```bash
   python convert_torch_weights_to_jax.py --ckpt_path [LOCAL DIRECTORY FOR WEIGHTS] --transformer_config_path ../xora_v1.2-13B-balanced-128.json
   ```
-  2. In the repo folder, run:
+  - In the repo folder, run:
   ```bash
   python src/maxdiffusion/generate_ltx_video.py src/maxdiffusion/configs/ltx_video.yml output_dir="[SAME DIRECTORY]" config_path="src/maxdiffusion/models/ltx_video/xora_v1.2-13B-balanced-128.json"
   ```
-  3. Other generation parameters can be set in ltx_video.yml file.
+  - Other generation parameters can be set in ltx_video.yml file.
 ## Flux
 First make sure you have permissions to access the Flux repos in Huggingface.

From 14b8c1f1a56043ee9601d23c7d8ae0c6b6047145 Mon Sep 17 00:00:00 2001
From: "serenagu@google.com"
Date: Tue, 29 Jul 2025 22:16:02 +0000
Subject: [PATCH 3/7] updated table of contents

---
 README.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 64b043410..929ea2146 100644
--- a/README.md
+++ b/README.md
@@ -24,7 +24,7 @@
 - **`2024/10/22`**: LoRA support for Hyper SDXL.
 - **`2024/8/1`**: Orbax is the new default checkpointer. You can still use `pipeline.save_pretrained` after training to save in diffusers format.
 - **`2024/7/20`**: Dreambooth training for Stable Diffusion 1.x,2.x is now supported.
-- **`2025/7/29`**: LTX-Video text2video generation is now supported.
+- **`2025/7/29`**: LTX-Video text2vid generation is now supported.
 
 # Overview
 
@@ -42,7 +42,7 @@ MaxDiffusion supports
 * Load Multiple LoRA (SDXL inference).
 * ControlNet inference (Stable Diffusion 1.4 & SDXL).
 * Dreambooth training support for Stable Diffusion 1.x,2.x.
-* LTX-Video (inference).
+* LTX-Video text2vid (inference).
 
 # Table of Contents
 
@@ -55,6 +55,7 @@ MaxDiffusion supports
 - [Training](#training)
   - [Dreambooth](#dreambooth)
 - [Inference](#inference)
+  - [LTX-Video](#ltx-video)
   - [Flux](#flux)
   - [Fused Attention for GPU:](#fused-attention-for-gpu)
   - [Hyper SDXL LoRA](#hyper-sdxl-lora)
@@ -173,7 +174,7 @@ To generate images, run the following command:
 ```bash
 python -m src.maxdiffusion.generate src/maxdiffusion/configs/base21.yml run_name="my_run"
 ```
-- **LTX Video**
+  ## LTX-Video
   - In the folder src/maxdiffusion/models/ltx_video/utils, run:
   ```bash
   python convert_torch_weights_to_jax.py --ckpt_path [LOCAL DIRECTORY FOR WEIGHTS] --transformer_config_path ../xora_v1.2-13B-balanced-128.json
   ```
@@ -216,7 +217,6 @@ To generate images, run the following command:
 ```bash
 python src/maxdiffusion/generate_flux.py src/maxdiffusion/configs/base_flux_schnell.yml jax_cache_dir=/tmp/cache_dir run_name=flux_test output_dir=/tmp/ prompt="photograph of an electronics chip in the shape of a race car with trillium written on its side" per_device_batch_size=1 ici_data_parallelism=1 ici_fsdp_parallelism=-1 offload_encoders=False
 ```
-
 ## Fused Attention for GPU:
 
 Fused Attention for GPU is supported via TransformerEngine. Installation instructions:
@@ -333,3 +333,5 @@ This script will automatically format your code with `pyink` and help you identi
 
 The full suite of -end-to end tests is in `tests` and `src/maxdiffusion/tests`. We run them with a nightly cadance.
+
+

From 06a8f55596c87f487ca9d60f3c8ad60260bf87af Mon Sep 17 00:00:00 2001
From: "serenagu@google.com"
Date: Tue, 29 Jul 2025 22:27:23 +0000
Subject: [PATCH 4/7] changed order

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 929ea2146..e3d7ab075 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,7 @@
 [![Unit Tests](https://github.com/google/maxtext/actions/workflows/UnitTests.yml/badge.svg)](https://github.com/google/maxdiffusion/actions/workflows/UnitTests.yml)
 
 # What's new?
+- **`2025/7/29`**: LTX-Video text2vid generation is now supported.
 - **`2025/04/17`**: Flux Finetuning.
 - **`2025/02/12`**: Flux LoRA for inference.
 - **`2025/02/08`**: Flux schnell & dev inference.
@@ -24,7 +25,6 @@
 - **`2024/10/22`**: LoRA support for Hyper SDXL.
 - **`2024/8/1`**: Orbax is the new default checkpointer. You can still use `pipeline.save_pretrained` after training to save in diffusers format.
 - **`2024/7/20`**: Dreambooth training for Stable Diffusion 1.x,2.x is now supported.
-- **`2025/7/29`**: LTX-Video text2vid generation is now supported.
 
 # Overview
 

From 16f6b1f66627183a5b61832bbccf698ed45e0d98 Mon Sep 17 00:00:00 2001
From: Serenagu525 <41308432+Serenagu525@users.noreply.github.com>
Date: Tue, 5 Aug 2025 15:28:12 -0700
Subject: [PATCH 5/7] Fix logging error

---
 .../pipelines/ltx_video/ltx_video_pipeline.py | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/maxdiffusion/pipelines/ltx_video/ltx_video_pipeline.py b/src/maxdiffusion/pipelines/ltx_video/ltx_video_pipeline.py
index 0ca816f9e..1e0abe698 100644
--- a/src/maxdiffusion/pipelines/ltx_video/ltx_video_pipeline.py
+++ b/src/maxdiffusion/pipelines/ltx_video/ltx_video_pipeline.py
@@ -60,12 +60,10 @@ def validate_transformer_inputs(prompt_embeds, fractional_coords, latents, encoder_attention_segment_ids):
   # Note: reference shape annotated for first pass default inference parameters
-  max_logging.log("prompts_embeds.shape: ", prompt_embeds.shape, prompt_embeds.dtype)  # (3, 256, 4096) float32
-  max_logging.log("fractional_coords.shape: ", fractional_coords.shape, fractional_coords.dtype)  # (3, 3, 3072) float32
-  max_logging.log("latents.shape: ", latents.shape, latents.dtype)  # (1, 3072, 128) float 32
-  max_logging.log(
-      "encoder_attention_segment_ids.shape: ", encoder_attention_segment_ids.shape, encoder_attention_segment_ids.dtype
-  )  # (3, 256) int32
+  max_logging.log(f"prompts_embeds.shape: {prompt_embeds.shape}")  # (3, 256, 4096) float32
+  max_logging.log(f"fractional_coords.shape: {fractional_coords.shape}")  # (3, 3, 3072) float32
+  max_logging.log(f"latents.shape: {latents.shape}")  # (1, 3072, 128) float 32
+  max_logging.log(f"encoder_attention_segment_ids.shape: {encoder_attention_segment_ids.shape}")  # (3, 256) int32
 
 
 class LTXVideoPipeline:

From bf4a64635d87cf6ecae6cfa288526f370b9ddb4a Mon Sep 17 00:00:00 2001
From: "serenagu@google.com"
Date: Wed, 6 Aug 2025 17:00:47 +0000
Subject: [PATCH 6/7] renamed json

---
 .../ltx_video/{xora_v1.2-13B-balanced-128.json => ltxv-13B.json} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename src/maxdiffusion/models/ltx_video/{xora_v1.2-13B-balanced-128.json => ltxv-13B.json} (100%)

diff --git a/src/maxdiffusion/models/ltx_video/xora_v1.2-13B-balanced-128.json b/src/maxdiffusion/models/ltx_video/ltxv-13B.json
similarity index 100%
rename from src/maxdiffusion/models/ltx_video/xora_v1.2-13B-balanced-128.json
rename to src/maxdiffusion/models/ltx_video/ltxv-13B.json

From 738afda0a617d839e9860690d32fea95f6320aba Mon Sep 17 00:00:00 2001
From: "serenagu@google.com"
Date: Wed, 6 Aug 2025 18:56:35 +0000
Subject: [PATCH 7/7] renamed files

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index e3d7ab075..4d776ca21 100644
--- a/README.md
+++ b/README.md
@@ -177,11 +177,11 @@ To generate images, run the following command:
   ## LTX-Video
   - In the folder src/maxdiffusion/models/ltx_video/utils, run:
   ```bash
-  python convert_torch_weights_to_jax.py --ckpt_path [LOCAL DIRECTORY FOR WEIGHTS] --transformer_config_path ../xora_v1.2-13B-balanced-128.json
+  python convert_torch_weights_to_jax.py --ckpt_path [LOCAL DIRECTORY FOR WEIGHTS] --transformer_config_path ../ltxv-13B.json
   ```
   - In the repo folder, run:
   ```bash
-  python src/maxdiffusion/generate_ltx_video.py src/maxdiffusion/configs/ltx_video.yml output_dir="[SAME DIRECTORY]" config_path="src/maxdiffusion/models/ltx_video/xora_v1.2-13B-balanced-128.json"
+  python src/maxdiffusion/generate_ltx_video.py src/maxdiffusion/configs/ltx_video.yml output_dir="[SAME DIRECTORY]" config_path="src/maxdiffusion/models/ltx_video/ltxv-13B.json"
   ```
   - Other generation parameters can be set in ltx_video.yml file.
 ## Flux
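A note on the `Fix logging error` change in [PATCH 5/7]: assuming `max_logging.log` is a thin wrapper over Python's stdlib `logging` (an assumption — the wrapper itself is not shown in this series), extra positional arguments are interpreted as %-style format arguments for the message. A message like `"latents.shape: "` has no placeholders, so passing the shape and dtype as separate arguments breaks message formatting; pre-formatting with an f-string, as the patch does, avoids this. A minimal sketch with a hypothetical `log` stand-in:

```python
import logging

logging.basicConfig(level=logging.INFO)
_logger = logging.getLogger("max_logging")


def log(msg, *args):
    # Hypothetical stand-in for max_logging.log: forwards to stdlib
    # logging, where *args become %-format arguments for msg.
    _logger.info(msg, *args)


shape, dtype = (1, 3072, 128), "float32"

# The underlying failure mode of the old calls, shown directly: a message
# with no %-placeholders cannot consume extra format arguments.
try:
    "latents.shape: " % (shape, dtype)
except TypeError as e:
    print(e)  # not all arguments converted during string formatting

# Fixed pattern from the patch: format eagerly with an f-string and pass
# a single, already-complete message.
log(f"latents.shape: {shape} {dtype}")
```

Inside `logging` itself the same `TypeError` is raised while building the record's message and is reported to stderr by the handler's error hook, so the intended log line never appears — which is why the patch switches every call to a single f-string argument.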