Commit cb620bb

Author: Sharon Yu (committed)

Commit message: fix

1 parent 5834a4d · commit cb620bb

1 file changed: 2 additions & 17 deletions

docs/tutorials/posttraining/full_finetuning.md
@@ -69,6 +69,7 @@ If your model checkpoint is from Hugging Face, you need to run a conversion scri
 
 ```sh
 export MODEL_CKPT_DIRECTORY=${BASE_OUTPUT_DIRECTORY}/maxtext-checkpoint
+export MODEL_CKPT_PATH=${MODEL_CKPT_DIRECTORY}/0/items
 ```
 
 2. **Run the Conversion Script:** Execute the following command that downloads the specified Hugging Face model and converts its weights into the MaxText format. The conversion script only supports official versions of models from Hugging Face. To see the specific models and versions currently supported for conversion, please refer to the `HF_IDS` dictionary in the MaxText utility file [here](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/utils/utils.py).
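The conversion command itself falls outside this hunk. For orientation, here is a rough sketch of the shape such an invocation takes (MaxText tools are generally driven by `base.yml` plus `key=value` overrides); the `to_maxtext` module path and the exact flag names are assumptions, not text quoted from this file:

```sh
# Sketch only: module path and flag names are assumed, not quoted from
# full_finetuning.md. MaxText tools take a config file plus key=value overrides.
python3 -m MaxText.utils.ckpt_conversion.to_maxtext \
    src/MaxText/configs/base.yml \
    model_name=${MODEL_NAME} \
    hf_access_token=${HF_TOKEN} \
    base_output_directory=${MODEL_CKPT_DIRECTORY}
```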
@@ -106,7 +107,7 @@ The above will download the c4 dataset to the GCS BUCKET.
 
 ## Sample Full Fine tuning script
 
-Below is a sample training script with an existing MaxText checkpoint (Option 1: Using an existing MaxText checkpoint).
+Below is a sample training script.
 
 ```sh
 python3 -m MaxText.train \
@@ -122,22 +123,6 @@ python3 -m MaxText.train \
 steps=10 per_device_batch_size=1
 ```
 
-Below is a sample training script with a converted a Hugging Face checkpoint (Option 2: Converting a Hugging Face checkpoint).
-
-```sh
-python3 -m MaxText.train \
-src/MaxText/configs/base.yml \
-run_name=${RUN_NAME} \
-base_output_directory=${BASE_OUTPUT_DIRECTORY} \
-load_parameters_path=${MODEL_CKPT_DIRECTORY}/0/items \
-model_name=${MODEL_NAME} \
-dataset_path=${DATASET_GCS_BUCKET} \
-async_checkpointing=False \
-tokenizer_path=${MODEL_TOKENIZER} \
-hf_access_token=${HF_TOKEN} \
-steps=10 per_device_batch_size=1
-```
-
 You can find some [end to end scripts here](https://github.com/AI-Hypercomputer/maxtext/tree/main/end_to_end/tpu).
 These scripts can provide a reference point for various scripts.
 
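The surviving sample script loads the converted checkpoint through a parameters path. Below is a minimal sketch of how it would reference the newly exported variable; the flag values mirror the removed Option 2 block above, and substituting `${MODEL_CKPT_PATH}` for the literal `${MODEL_CKPT_DIRECTORY}/0/items` is an assumption:

```sh
# Sketch: same flags as the removed Option 2 example; the only change is that
# load_parameters_path goes through the new MODEL_CKPT_PATH variable (assumed).
python3 -m MaxText.train \
    src/MaxText/configs/base.yml \
    run_name=${RUN_NAME} \
    base_output_directory=${BASE_OUTPUT_DIRECTORY} \
    load_parameters_path=${MODEL_CKPT_PATH} \
    model_name=${MODEL_NAME} \
    dataset_path=${DATASET_GCS_BUCKET} \
    async_checkpointing=False \
    tokenizer_path=${MODEL_TOKENIZER} \
    hf_access_token=${HF_TOKEN} \
    steps=10 per_device_batch_size=1
```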
