Commit 2d590a7
Update user docs to drop config file path
1 parent 0fe1adf commit 2d590a7

19 files changed: 37 additions & 38 deletions

PREFLIGHT.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -7,12 +7,12 @@ Before you run ML workload on Multihost with GCE or GKE, simply apply `bash pref
 
 Here is an example for GCE:
 ```
-bash preflight.sh PLATFORM=GCE && python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?}
+bash preflight.sh PLATFORM=GCE && python3 -m maxtext.trainers.pre_train.train run_name=${YOUR_JOB_NAME?}
 ```
 
 Here is an example for GKE:
 ```
-bash preflight.sh PLATFORM=GKE && python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?}
+bash preflight.sh PLATFORM=GKE && python3 -m maxtext.trainers.pre_train.train run_name=${YOUR_JOB_NAME?}
 ```
 
 # Optimization 2: Numa binding (You can only apply this to v4 and v5p)
@@ -22,14 +22,14 @@ For GCE,
 [preflight.sh](https://github.com/google/maxtext/blob/main/preflight.sh) will help you install `numactl` dependency, so you can use it directly, here is an example:
 
 ```
-bash preflight.sh PLATFORM=GCE && numactl --membind 0 --cpunodebind=0 python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?}
+bash preflight.sh PLATFORM=GCE && numactl --membind 0 --cpunodebind=0 python3 -m maxtext.trainers.pre_train.train run_name=${YOUR_JOB_NAME?}
 ```
 
 For GKE,
 `numactl` should be built into your docker image from [maxtext_tpu_dependencies.Dockerfile](https://github.com/google/maxtext/blob/main/src/dependencies/dockerfiles/maxtext_tpu_dependencies.Dockerfile), so you can use it directly if you built the maxtext docker image. Here is an example
 
 ```
-bash preflight.sh PLATFORM=GKE && numactl --membind 0 --cpunodebind=0 python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${YOUR_JOB_NAME?}
+bash preflight.sh PLATFORM=GKE && numactl --membind 0 --cpunodebind=0 python3 -m maxtext.trainers.pre_train.train run_name=${YOUR_JOB_NAME?}
 ```
 
 1. `numactl`: This is the command-line tool used for controlling NUMA policy for processes or shared memory. It's particularly useful on multi-socket systems where memory locality can impact performance.
````
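The bindings in the commands above pin everything to NUMA node 0. Before choosing a node, it can help to inspect the machine's topology first; a minimal sketch, assuming `numactl` is already installed (via `preflight.sh` on GCE or the Docker image on GKE):

```sh
# Print the NUMA topology: how many nodes exist, and which CPUs and how
# much memory belong to each, so you can pick a sensible node to bind to.
if command -v numactl >/dev/null 2>&1; then
  numactl --hardware
  numactl --show   # the NUMA policy the current shell runs under
else
  echo "numactl not found; run preflight.sh (GCE) or rebuild the image (GKE)"
fi
```

On a machine with a single NUMA node, `--hardware` reports one node and binding to node 0 is effectively a no-op.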

docs/guides/checkpointing_solutions/convert_checkpoint.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -70,7 +70,7 @@ Finally, run below command to complete the conversion
 # Optional: If run out of disk space when downloading HuggingFace safetensors,
 # customize your "HF_HOME" to redirect the cache to a larger or mounted disk (e.g., on a TPU VM).
 # export HF_HOME="/dev/shm/huggingface_tmp"
-python3 -m maxtext.checkpoint_conversion.to_maxtext maxtext/configs/base.yml \
+python3 -m maxtext.checkpoint_conversion.to_maxtext \
 model_name=${MODEL_NAME?} \
 hf_access_token=${HF_TOKEN?} \
 base_output_directory=${MODEL_CHECKPOINT_DIRECTORY?} \
@@ -108,7 +108,7 @@ Use the `to_huggingface.py` script to convert a MaxText checkpoint into the Hugg
 The following command converts a MaxText checkpoint and saves it locally, to GCS, or uploads it directly to the Hugging Face Hub.
 
 ```bash
-python3 -m maxtext.checkpoint_conversion.to_huggingface src/maxtext/configs/base.yml \
+python3 -m maxtext.checkpoint_conversion.to_huggingface \
 model_name=<MODEL_NAME> \
 load_parameters_path=<path-to-maxtext-checkpoint> \
 base_output_directory=<path-to-save-converted-checkpoint> \
````
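The commented `HF_HOME` hint in the first hunk can be expanded into a small preamble run before the conversion; a sketch, assuming any writable path with enough free space (the directory name is illustrative):

```sh
# Redirect the Hugging Face download cache to a larger or mounted disk
# before converting, so large safetensors downloads don't fill the root disk.
export HF_HOME="/dev/shm/huggingface_tmp"
mkdir -p "$HF_HOME"
echo "Hugging Face cache directory: $HF_HOME"
```

Libraries that honor `HF_HOME` will then place all downloaded model shards under this directory instead of the default `~/.cache/huggingface`.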

docs/run_maxtext/run_maxtext_localhost.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -58,7 +58,7 @@ bash tools/setup/setup.sh DEVICE={tpu|gpu}
 After the installation is complete, run a short training job using synthetic data to confirm everything is working correctly. This command trains a model for just 10 steps. Remember to replace `$YOUR_JOB_NAME` with a unique name for your run and `gs://<my-bucket>` with the path to the GCS bucket you configured in the prerequisites.
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 run_name=${YOUR_JOB_NAME?} \
 base_output_directory=gs://<my-bucket> \
 dataset_type=synthetic \
@@ -72,7 +72,7 @@ python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
 To demonstrate model output, run the following command:
 
 ```bash
-python3 -m maxtext.inference.decode src/maxtext/configs/base.yml \
+python3 -m maxtext.inference.decode \
 run_name=${YOUR_JOB_NAME?} \
 base_output_directory=gs://<my-bucket> \
 per_device_batch_size=1
@@ -92,7 +92,7 @@ To use a pre-configured model for TPUs, you override the `model_name` parameter,
 <summary><strong>llama3-8b (TPU)</strong></summary>
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 model_name=llama3-8b \
 run_name=${YOUR_JOB_NAME?} \
 base_output_directory=gs://<my-bucket> \
@@ -106,7 +106,7 @@ python3 -m maxtext.trainers.pre_train.train maxtext/configs/base.yml \
 <summary><strong>qwen3-4b (TPU)</strong></summary>
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 model_name=qwen3-4b \
 run_name=${YOUR_JOB_NAME?} \
 base_output_directory=gs://<my-bucket> \
````
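These commands rely on the `${YOUR_JOB_NAME?}` form of shell parameter expansion: the trailing `?` makes the shell abort with a "parameter not set" error when the variable is unset, instead of silently passing an empty `run_name` to the trainer. A minimal sketch of the behavior:

```sh
# ${VAR?} aborts the (sub)shell when VAR is unset, rather than expanding
# to an empty string and letting a misconfigured job start.
unset RUN_NAME
if (echo "run_name=${RUN_NAME?}") 2>/dev/null; then
  echo "expanded"
else
  echo "aborted: RUN_NAME is unset"
fi

RUN_NAME=demo-run
echo "run_name=${RUN_NAME?}"   # prints run_name=demo-run once RUN_NAME is set
```

The subshell around the first `echo` keeps the expansion error from terminating the enclosing script, so the failure can be observed and handled.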

docs/run_maxtext/run_maxtext_single_host_gpu.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -148,7 +148,7 @@ Hardware: GPU
 ```
 
 ```bash
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=gpu01 base_output_directory=/deps/output \
+python3 -m maxtext.trainers.pre_train.train run_name=gpu01 base_output_directory=/deps/output \
 dataset_type=synthetic enable_checkpointing=True steps=10 attention=cudnn_flash_te scan_layers=False \
 use_iota_embed=True hardware=gpu per_device_batch_size=12
 ```
````

docs/run_maxtext/run_maxtext_via_multihost_job.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -68,7 +68,7 @@ The `multihost_job.py` script:
 
 ```sh
 RUN_NAME=${YOUR_JOB_NAME?} # You may set this to any unique name for a fresh run.
-python3 multihost_job.py --NUM_SLICES=${NODE_COUNT?} --RUN_NAME=${RUN_NAME?} --BUCKET_NAME=${BUCKET_NAME?} --CQR_EXTRA_ARGS="--reserved" --COMMAND="bash tools/setup/setup.sh && python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${RUN_NAME?}"
+python3 multihost_job.py --NUM_SLICES=${NODE_COUNT?} --RUN_NAME=${RUN_NAME?} --BUCKET_NAME=${BUCKET_NAME?} --CQR_EXTRA_ARGS="--reserved" --COMMAND="bash tools/setup/setup.sh && python3 -m maxtext.trainers.pre_train.train run_name=${RUN_NAME?}"
 ```
 
 We tell `multihost_job` to target the `reserved` pool by including `--reserved` as extra arguments to the CQR request, but you may instead target the `on-demand` pool by removing the `--CQR_EXTRA_ARGS` flag (on-demand is the default), or the pre-emptible pool with `--CQR_EXTRA_ARGS="--best-effort"`, which may be necessary if your reservation is full.
````
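The three capacity pools mentioned in that context line differ only in what is passed through `--CQR_EXTRA_ARGS`. A sketch of the selection logic (the `POOL` variable is hypothetical, used only to enumerate the cases):

```sh
# Map a capacity-pool choice onto the extra multihost_job.py flag it needs.
POOL="reserved"   # one of: reserved | on-demand | best-effort
case "$POOL" in
  reserved)    CQR_ARGS='--CQR_EXTRA_ARGS=--reserved' ;;
  on-demand)   CQR_ARGS='' ;;   # on-demand is the default: omit the flag
  best-effort) CQR_ARGS='--CQR_EXTRA_ARGS=--best-effort' ;;
esac
echo "extra multihost_job.py args: ${CQR_ARGS:-<none>}"
```

The resulting value would be spliced into the `python3 multihost_job.py ...` invocation shown in the diff.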

docs/run_maxtext/run_maxtext_via_multihost_runner.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -106,7 +106,7 @@ Although there are several steps below, most are for the initial setup. Once set
 Set config values for `base_output_directory` and `dataset_path` in `configs/base.yml` if not set already.
 
 ```
-python3 multihost_runner.py --TPU_PREFIX=${TPU_PREFIX?} --COMMAND="python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${RUN_NAME?}"
+python3 multihost_runner.py --TPU_PREFIX=${TPU_PREFIX?} --COMMAND="python3 -m maxtext.trainers.pre_train.train run_name=${RUN_NAME?}"
 ```
 
 If you are running the `multihost_runner.py` script from a TPUVM, you will need to set `--INTERNAL_IP=true`.
````

docs/run_maxtext/run_maxtext_via_pathways.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -96,7 +96,7 @@ xpk workload create-pathways \
 --project=${PROJECT?} \
 --zone=${ZONE?} \
 --docker-image=${DOCKER_IMAGE?} \
---command="python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+--command="python3 -m maxtext.trainers.pre_train.train \
 base_output_directory=gs://${BUCKET_NAME?} \
 per_device_batch_size=1 \
 enable_checkpointing=false \
@@ -154,7 +154,7 @@ export JAX_PLATFORMS=proxy
 export JAX_BACKEND_TARGET=grpc://127.0.0.1:29000
 
 # Run the training script
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 base_output_directory=gs://${BUCKET_NAME?} \
 per_device_batch_size=1 \
 enable_checkpointing=false \
````

docs/run_maxtext/run_maxtext_via_xpk.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -187,7 +187,7 @@ For instance, to run a job across **four TPU slices**, you would change `--num-s
 --base-docker-image maxtext_base_image\
 --tpu-type v5litepod-256\
 --num-slices 1\
---command "python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${USER}-tpu-job base_output_directory=${BASE_OUTPUT_DIR?} dataset_path=${DATASET_PATH?} steps=100"
+--command "python3 -m maxtext.trainers.pre_train.train run_name=${USER}-tpu-job base_output_directory=${BASE_OUTPUT_DIR?} dataset_path=${DATASET_PATH?} steps=100"
 ```
 
 - **On your GPU cluster:**
@@ -199,7 +199,7 @@ For instance, to run a job across **four TPU slices**, you would change `--num-s
 --base-docker-image maxtext_base_image\
 --device-type h100-80gb-8\
 --num-nodes 2\
---command "python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml run_name=${USER}-gpu-job base_output_directory=${BASE_OUTPUT_DIR?} dataset_path=${DATASET_PATH?} steps=100"
+--command "python3 -m maxtext.trainers.pre_train.train run_name=${USER}-gpu-job base_output_directory=${BASE_OUTPUT_DIR?} dataset_path=${DATASET_PATH?} steps=100"
 ```
 
 ______________________________________________________________________
````

docs/tutorials/first_run.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -49,7 +49,7 @@ pre-commit install
 4. After installation completes, run training on synthetic data with the following command:
 
 ```sh
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 run_name=${YOUR_JOB_NAME?} \
 base_output_directory=gs://<my-bucket> \
 dataset_type=synthetic \
@@ -61,7 +61,7 @@ Optional: If you want to try training on a Hugging Face dataset, see [Data Input
 5. To demonstrate model output, run the following command:
 
 ```sh
-python3 -m maxtext.inference.decode src/maxtext/configs/base.yml \
+python3 -m maxtext.inference.decode \
 run_name=${YOUR_JOB_NAME?} \
 base_output_directory=gs://<my-bucket> \
 per_device_batch_size=1
@@ -83,7 +83,7 @@ You can use [demo_decoding.ipynb](https://github.com/AI-Hypercomputer/maxtext/bl
 2. After installation is complete, run training with the following command on synthetic data:
 
 ```sh
-python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
+python3 -m maxtext.trainers.pre_train.train \
 run_name=${YOUR_JOB_NAME?} \
 base_output_directory=gs://<my-bucket> \
 dataset_type=synthetic \
@@ -93,7 +93,7 @@ python3 -m maxtext.trainers.pre_train.train src/maxtext/configs/base.yml \
 3. To demonstrate model output, run the following command:
 
 ```sh
-python3 -m maxtext.inference.decode src/maxtext/configs/base.yml \
+python3 -m maxtext.inference.decode \
 run_name=${YOUR_JOB_NAME?} \
 base_output_directory=gs://<my-bucket> \
 per_device_batch_size=1
````

docs/tutorials/posttraining/full_finetuning.md

Lines changed: 0 additions & 1 deletion
````diff
@@ -101,7 +101,6 @@ Below is a sample training script.
 
 ```sh
 python3 -m maxtext.trainers.pre_train.train \
-src/maxtext/configs/base.yml \
 run_name=${RUN_NAME?} \
 base_output_directory=${BASE_OUTPUT_DIRECTORY?} \
 load_parameters_path=${MODEL_CKPT_PATH?} \
````
