- **`2026/04/16`**: Support for the Tokamax Ring Attention kernel is now added.
- **`2026/03/31`**: Wan2.2 SenCache inference is now supported for T2V and I2V (up to 1.4x speedup).
- **`2026/03/25`**: Wan2.1 and Wan2.2 Magcache inference is now supported.
- **`2026/03/25`**: LTX-2 Video inference is now supported.
Supports both Text2Vid and Img2Vid pipelines.
**Note**: The product of `per_device_batch_size` and `num_devices` must be a whole number.
The command below uses 4 devices and `per_device_batch_size=0.25`, so 4 * 0.25 = 1 and a single video is generated. Setting `per_device_batch_size` to 0.5 generates 2 videos, and so on.
If using 8 devices, `per_device_batch_size=0.125` generates 1 video and `per_device_batch_size=0.25` generates 2 videos.
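The arithmetic above can be sketched as a small check (`total_videos` is a hypothetical helper for illustration, not part of MaxDiffusion):

```python
# Hypothetical helper illustrating the rule above: the total number of
# generated videos is num_devices * per_device_batch_size, and that
# product must be a whole number.
def total_videos(num_devices: int, per_device_batch_size: float) -> int:
    total = num_devices * per_device_batch_size
    if abs(total - round(total)) > 1e-9:
        raise ValueError(
            f"num_devices * per_device_batch_size must be a whole number, got {total}"
        )
    return int(round(total))

print(total_videos(4, 0.25))   # 1 video on 4 devices
print(total_videos(4, 0.5))    # 2 videos
print(total_videos(8, 0.125))  # 1 video on 8 devices
```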
The following command will run Wan2.1 T2V:
MaxDiffusion supports automated profiling and performance tracking via [Google Cloud ML Diagnostics](https://docs.cloud.google.com/tpu/docs/ml-diagnostics/sdk).
## 1. Manual Installation
To keep the core MaxDiffusion repository lightweight and ensure it runs without extra dependencies for users who don't need profiling, the ML Diagnostics packages are **not** installed by default.
To use this feature, you must manually install the required package in your environment:
```bash
pip install google-cloud-mldiagnostics
```
## 2. Configuration Settings
To enable ML Diagnostics for your training or generation jobs, update your configuration by adding these settings directly to your `.yml` config file or passing them as command-line arguments:
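As a minimal sketch, only `profiler_gcs_path` is confirmed later in this document; the bucket path below is a placeholder you must replace:

```yaml
# Illustrative fragment for the .yml config; substitute your own bucket.
profiler_gcs_path: "gs://<your-bucket>/profiles"
```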
The GCS bucket you provide in `profiler_gcs_path` **must** have the correct IAM permissions to allow the Hypercompute Cluster service account to write data.
If permissions are not configured correctly, your job will fail with an error similar to this:
> `message: 'service-32478767326@gcp-sa-hypercomputecluster.iam.gserviceaccount.com does not have storage.buckets.get access to the GCS bucket <your-bucket>: permission denied'`
**Fix:** Ensure you grant the required Storage roles (e.g., `Storage Object Admin`) to the service account mentioned in your error message for your specific GCS bucket.
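As an illustrative sketch only (copy the exact service account from your own error message; the bucket name is a placeholder), the grant might look like:

```bash
# Grant Storage Object Admin on the bucket to the Hypercompute Cluster
# service account reported in the error message.
gcloud storage buckets add-iam-policy-binding gs://<your-bucket> \
  --member="serviceAccount:service-32478767326@gcp-sa-hypercomputecluster.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
```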
## 4. Viewing Your Runs
Once your job is running with diagnostics enabled, you can monitor the profiles, execution times, and metrics in the Cluster Director console here:
**`src/maxdiffusion/configs/base_wan_14b.yml`** (6 additions, 1 deletion):

```yaml
eval_data_dir: ""
enable_generate_video_for_eval: False  # This will increase the used TPU memory.
eval_max_number_of_samples_in_bucket: 60  # The number of samples per bucket for evaluation. This is calculated by num_eval_samples / len(timesteps_list).
```
**`src/maxdiffusion/configs/base_wan_1_3b.yml`** (5 additions):

```yaml
enable_generate_video_for_eval: False  # This will increase the used TPU memory.
eval_max_number_of_samples_in_bucket: 60  # The number of samples per bucket for evaluation. This is calculated by num_eval_samples / len(timesteps_list).
```