Commit ec26bfd

Merge pull request #2918 from AI-Hypercomputer:rl_stable_note
PiperOrigin-RevId: 853369708
2 parents f0bf728 + a4de80c

Showing 2 changed files with 11 additions and 10 deletions.

docs/tutorials/posttraining/rl.md (5 additions & 3 deletions)
@@ -48,7 +48,9 @@ install_maxtext_github_deps
 
 ## Install Post-Training dependencies
 
-### From PyPI releases
+### Option 1: From PyPI releases
+
+> **Caution:** RL in MaxText is currently broken with PyPI releases of post-training dependencies. We are working on fixing this and recommend following [Option 2: From Github](#option-2-from-github) in the meantime.
 
 Next, run the following bash script to get all the necessary installations inside the virtual environment (for e.g., `maxtext_venv`).
 This will take few minutes. Follow along the installation logs and look out for any issues!
@@ -57,9 +59,9 @@ This will take few minutes. Follow along the installation logs and look out for
 bash tools/setup/setup_post_training_requirements.sh
 ```
 
-Primarily, it installs `vllm-tpu` which is [vllm](https://github.com/vllm-project/vllm) and [tpu-inference](https://github.com/vllm-project/tpu-inference) and thereby providing TPU inference for vLLM, with unified JAX and PyTorch support.
+Primarily, it installs `Tunix` and `vllm-tpu`, which combines [vllm](https://github.com/vllm-project/vllm) and [tpu-inference](https://github.com/vllm-project/tpu-inference) to provide TPU inference for vLLM with unified JAX and PyTorch support.
 
-### From Github
+### Option 2: From Github
 
 You can also locally git clone [tunix](https://github.com/google/tunix) and install using the instructions [here](https://github.com/google/tunix?tab=readme-ov-file#installation). Similarly install [vllm](https://github.com/vllm-project/vllm) and [tpu-inference](https://github.com/vllm-project/tpu-inference) from source following the instructions [here](https://docs.vllm.ai/projects/tpu/en/latest/getting_started/installation/#install-from-source).

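For readers following the new "Option 2: From Github" path, the from-source install might look like the sketch below. This is an editor's illustration, not part of the commit: it assumes an activated virtual environment (e.g. `maxtext_venv`), that each project supports an editable `pip install` from a local checkout, and that vLLM's `VLLM_TARGET_DEVICE` environment variable selects the TPU build; the installation instructions linked above are authoritative.

```bash
# Illustrative sketch only; defer to each project's linked install docs.
# Assumes an active virtual environment (e.g. maxtext_venv).
git clone https://github.com/google/tunix.git
pip install -e ./tunix                            # editable install from checkout

git clone https://github.com/vllm-project/vllm.git
VLLM_TARGET_DEVICE="tpu" pip install -e ./vllm    # assumption: env var picks the TPU build

git clone https://github.com/vllm-project/tpu-inference.git
pip install -e ./tpu-inference
```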
docs/tutorials/posttraining/rl_on_multi_host.md (6 additions & 7 deletions)
@@ -86,7 +86,7 @@ python3 -m MaxText.utils.ckpt_conversion.to_maxtext MaxText/configs/base.yml \
   skip_jax_distributed_system=true --lazy_load_tensors=true
 ```
 
-## Build and Upload MaxText Docker Image with Tunix, vLLM, tpu-inference dependencies
+## Build and upload MaxText Docker image with post-training dependencies
 Before building the Docker image, authenticate to [Google Artifact Registry](https://docs.cloud.google.com/artifact-registry/docs/docker/authentication#gcloud-helper) for permission to push your images and other access.
 ```bash
 # Authenticate your user account for gcloud CLI access
@@ -100,20 +100,19 @@ docker run hello-world
 
 You can install the required dependencies using either of the following two options:
 
-### Option 1: Installing stable releases of tunix and vllm-tpu
-Run the following bash script to create a docker image with all the dependencies of MaxText, Tunix, vLLM and tpu-inference installed.
+### Option 1: Install stable releases of post-training dependencies
+> **Caution:** RL in MaxText is currently broken with stable releases of post-training dependencies. We are working on fixing this and recommend following [Option 2: Install from Git repositories of post-training dependencies](#option-2-install-from-git-repositories-of-post-training-dependencies) in the meantime.
 
-In addition to MaxText dependencies, primarily, it installs `vllm-tpu` which is [vllm](https://github.com/vllm-project/vllm) and [tpu-inference](https://github.com/vllm-project/tpu-inference) and thereby providing TPU inference for vLLM, with unified JAX and PyTorch support. This build process takes approximately 10 to 15 minutes.
+Run the following bash script to create a Docker image with MaxText dependencies plus all the post-training dependencies installed. For the post-training dependencies, it primarily installs `Tunix` and `vllm-tpu`, which combines [vllm](https://github.com/vllm-project/vllm) and [tpu-inference](https://github.com/vllm-project/tpu-inference) to provide TPU inference for vLLM with unified JAX and PyTorch support. This build process takes approximately 10 to 15 minutes.
 
 ```
 bash dependencies/scripts/docker_build_dependency_image.sh WORKFLOW=post-training
 ```
 
 You can also use `bash dependencies/scripts/docker_build_dependency_image.sh WORKFLOW=post-training-experimental` to try out new features via experimental dependencies such as improved pathwaysutils resharding API.
 
-### Option 2: Install from locally git cloned repositories
-
-You can also locally git clone [tunix](https://github.com/google/tunix), [tpu-inference](https://github.com/vllm-project/tpu-inference), [vllm](https://github.com/vllm-project/vllm.git) and then use the following command to build a docker image using them:
+### Option 2: Install from Git repositories of post-training dependencies
+You can also locally git clone [tunix](https://github.com/google/tunix), [tpu-inference](https://github.com/vllm-project/tpu-inference), [vllm](https://github.com/vllm-project/vllm) and then use the following command to build a Docker image using them:
 ```
 bash dependencies/scripts/docker_build_dependency_image.sh WORKFLOW=post-training POST_TRAINING_SOURCE=local
 ```
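
As an editor's illustration of the Option 2 flow above (not part of the commit), preparing the local checkouts might look like this. Where the build script expects the clones to live is an assumption; check `dependencies/scripts/docker_build_dependency_image.sh` for the paths it actually reads.

```bash
# Illustrative sketch only: clone the three dependencies, then build the
# image from local sources. The expected checkout location is an assumption;
# verify it against the build script itself.
git clone https://github.com/google/tunix.git
git clone https://github.com/vllm-project/tpu-inference.git
git clone https://github.com/vllm-project/vllm.git

bash dependencies/scripts/docker_build_dependency_image.sh WORKFLOW=post-training POST_TRAINING_SOURCE=local
```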
