Commit 9c9a145

Update documentation to build docker image with pip install
1 parent 5d9e57f commit 9c9a145

15 files changed

Lines changed: 205 additions & 335 deletions

docs/build_maxtext.md

Lines changed: 137 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,137 @@
1+
<!--
2+
Copyright 2023-2026 Google LLC
3+
4+
Licensed under the Apache License, Version 2.0 (the "License");
5+
you may not use this file except in compliance with the License.
6+
You may obtain a copy of the License at
7+
8+
https://www.apache.org/licenses/LICENSE-2.0
9+
10+
Unless required by applicable law or agreed to in writing, software
11+
distributed under the License is distributed on an "AS IS" BASIS,
12+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
See the License for the specific language governing permissions and
14+
limitations under the License.
15+
-->
16+
17+
# Build and Upload MaxText Docker Images
18+
19+
This guide covers setting up a MaxText development environment and building container images for TPU and GPU workloads. These images can be used to run MaxText on GKE clusters with TPUs or GPUs, and are also required for running MaxText through XPK.
20+
21+
## Prerequisites
22+
23+
Before starting, ensure you have the following tools installed and configured:
24+
25+
1. Environment Prep: Install and configure all [XPK prerequisites](https://github.com/AI-Hypercomputer/xpk/blob/main/docs/installation.md#1-prerequisites).
26+
27+
2. Docker Permissions: Follow the steps to [configure sudoless Docker](https://docs.docker.com/engine/install/linux-postinstall/) to run Docker without `sudo`.
28+
29+
3. Artifact Registry Access: Authenticate with [Google Artifact Registry](https://docs.cloud.google.com/artifact-registry/docs/docker/authentication#gcloud-helper) so that Docker can push and pull your images.
30+
31+
4. Authentication & Access: Run the following commands to authenticate your account and configure Docker:
32+
33+
```bash
34+
# Authenticate your user account for gcloud CLI access
35+
gcloud auth login
36+
37+
# Configure application default credentials for Docker and other tools
38+
gcloud auth application-default login
39+
40+
# Configure Docker credentials and test your access
41+
gcloud auth configure-docker
42+
docker run hello-world
43+
```
44+
45+
## Installation Modes
46+
47+
We recommend building MaxText inside a Python virtual environment using `uv` for speed and dependency management.
48+
49+
### Option 1: From PyPI (Recommended)
50+
51+
This is the easiest way to get started with the latest stable version.
52+
53+
```bash
54+
# Install uv, a fast Python package installer
55+
pip install uv
56+
57+
# Create virtual environment
58+
export VENV_NAME=<your virtual env name> # e.g., docker_venv
59+
uv venv --python 3.12 --seed ${VENV_NAME?}
60+
source ${VENV_NAME?}/bin/activate
61+
62+
# Install MaxText with the [runner] extra
63+
# This enables Docker image building and workload scheduling via XPK
64+
uv pip install maxtext[runner] --resolution=lowest
65+
```
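
The commands in this guide write variables as `${VENV_NAME?}` rather than `$VENV_NAME`. In POSIX shells, the `?` form makes the expansion fail with an error when the variable is unset, instead of silently expanding to an empty string. The following is a minimal sketch of that behavior; the `check_venv_name` helper and the error message are illustrative assumptions and are not part of MaxText.

```bash
# Hypothetical helper (not part of MaxText): reports whether VENV_NAME is set.
check_venv_name() {
  # Run the expansion in a subshell so a failure does not exit the caller.
  ( : "${VENV_NAME?VENV_NAME must be set}" ) 2>/dev/null
}

# With the variable unset, the ${VAR?} expansion fails.
unset VENV_NAME
if ! check_venv_name; then
  echo "VENV_NAME is unset; export it before creating the virtual environment"
fi

# Once the variable is exported, the same expansion succeeds.
export VENV_NAME=docker_venv
check_venv_name && echo "using virtual environment: ${VENV_NAME?}"
```

Note that `${VAR?}` only guards against an unset variable; `${VAR:?}` also rejects an empty value.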
66+
67+
> **Note:** The `maxtext[runner]` extra includes all necessary dependencies for building MaxText Docker images and running workloads through XPK. It automatically installs XPK, so you do not need to install it separately to manage your clusters and workloads.
68+
69+
### Option 2: From Source
70+
71+
If you plan to contribute to MaxText or need the latest unreleased features, install from source.
72+
73+
```bash
74+
# Clone the repository
75+
git clone https://github.com/AI-Hypercomputer/maxtext.git
76+
cd maxtext
77+
78+
# Create virtual environment
79+
export VENV_NAME=<your virtual env name> # e.g., docker_venv
80+
uv venv --python 3.12 --seed ${VENV_NAME?}
81+
source ${VENV_NAME?}/bin/activate
82+
83+
# Install MaxText with the [runner] extra in editable mode
84+
uv pip install -e .[runner] --resolution=lowest
85+
```
86+
87+
> **Note:** The `maxtext[runner]` extra includes all necessary dependencies for building MaxText Docker images and running workloads through XPK. It automatically installs XPK, so you do not need to install it separately to manage your clusters and workloads.
88+
89+
## Build MaxText Docker Image
90+
91+
Select the appropriate build commands based on your hardware (`TPU` or `GPU`) and your specific workflow (`pre-training` or `post-training`). Each of these commands will generate a local Docker image named `maxtext_base_image`.
92+
93+
### TPU Pre-Training Docker Image
94+
95+
```bash
96+
# Option 1: Build with the stable versions of dependencies (default)
97+
build_maxtext_docker_image
98+
99+
# Option 2: Build with latest nightly versions of jax/jaxlib
100+
build_maxtext_docker_image MODE=nightly
101+
102+
# Option 3: Build with the specified jax/jaxlib version
103+
build_maxtext_docker_image MODE=nightly JAX_VERSION=$JAX_VERSION
104+
```
105+
106+
### GPU Pre-Training Docker Image
107+
108+
```bash
109+
# Option 1: Build with the stable versions of dependencies (default)
110+
build_maxtext_docker_image DEVICE=gpu
111+
112+
# Option 2: Build with latest nightly versions of jax/jaxlib
113+
build_maxtext_docker_image DEVICE=gpu MODE=nightly
114+
115+
# Option 3: Build with base image as `ghcr.io/nvidia/jax:base-2024-12-04`
116+
build_maxtext_docker_image DEVICE=gpu MODE=pinned
117+
118+
# Option 4: Build with the specified jax/jaxlib version
119+
build_maxtext_docker_image DEVICE=gpu MODE=nightly JAX_VERSION=$JAX_VERSION
120+
```
121+
122+
### TPU Post-Training Docker Image
123+
124+
```bash
125+
# This build process takes approximately 10 to 15 minutes.
126+
build_maxtext_docker_image WORKFLOW=post-training
127+
```
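
The pre-training build commands in this section differ only in their `KEY=VALUE` arguments. As a rough sketch of how the variants relate, the helper below prints (but does not run) the command for a given device and mode. The `maxtext_build_cmd` function is a hypothetical illustration, not part of MaxText, and it assumes that `tpu` and `stable` are the defaults, as the commands above suggest.

```bash
# Hypothetical helper (not part of MaxText): prints the build command for a
# device and mode, using only the KEY=VALUE arguments shown in this guide.
maxtext_build_cmd() {
  local device="${1:-tpu}"
  local mode="${2:-stable}"
  local cmd="build_maxtext_docker_image"

  # Assumption: TPU is the default device, so DEVICE is only added otherwise.
  if [ "$device" != "tpu" ]; then
    cmd="$cmd DEVICE=$device"
  fi

  # Assumption: stable is the default mode, so MODE is only added otherwise.
  if [ "$mode" != "stable" ]; then
    cmd="$cmd MODE=$mode"
  fi

  echo "$cmd"
}

# Print the commands for a few of the combinations listed above.
maxtext_build_cmd
maxtext_build_cmd tpu nightly
maxtext_build_cmd gpu pinned
```

Printing the command first makes it easy to review the arguments before starting a build that can take several minutes.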
128+
129+
## Upload MaxText Docker Image to Artifact Registry
130+
131+
> **Note:** You will need the [**Artifact Registry Writer**](https://docs.cloud.google.com/artifact-registry/docs/access-control#permissions) role to push Docker images to your project's Artifact Registry and to allow the cluster to pull them during workload execution. If you don't have this permission, contact your project administrator to grant you this role through "Google Cloud Console -> IAM -> Grant access".
132+
133+
```bash
134+
# Make sure to replace <Docker Image Name> with your desired image name.
135+
export CLOUD_IMAGE_NAME=<Docker Image Name>
136+
upload_maxtext_docker_image CLOUD_IMAGE_NAME=${CLOUD_IMAGE_NAME?}
137+
```

docs/guides/data_input_pipeline/data_input_grain.md

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ Grain ensures determinism in data input pipelines by saving the pipeline's state
3434

3535
1. Grain currently supports two data formats: [ArrayRecord](https://github.com/google/array_record) (random access) and [Parquet](https://arrow.apache.org/docs/python/parquet.html) (partial random-access through row groups). Only the ArrayRecord format supports the global shuffle mentioned above. For converting a dataset into ArrayRecord, see [Apache Beam Integration for ArrayRecord](https://github.com/google/array_record/tree/main/beam). Additionally, other random access data sources can be supported via a custom [data source](https://google-grain.readthedocs.io/en/latest/data_sources.html) class.
3636
- **Community Resource**: The MaxText community has created [ArrayRecord Documentation](https://array-record.readthedocs.io/). Note: we appreciate this community contribution, but it has not yet been verified by the MaxText or ArrayRecord developers.
37-
2. When the dataset is hosted on a Cloud Storage bucket, Grain can read it through [Cloud Storage FUSE](https://cloud.google.com/storage/docs/gcs-fuse). The installation of Cloud Storage FUSE is included in [setup.sh](https://github.com/google/maxtext/blob/main/tools/setup/setup.sh). The user then needs to mount the Cloud Storage bucket to a local path for each worker, using the script [setup_gcsfuse.sh](https://github.com/google/maxtext/blob/main/tools/setup/setup_gcsfuse.sh). The script configures some parameters for the mount.
37+
2. When the dataset is hosted on a Cloud Storage bucket, Grain can read it through [Cloud Storage FUSE](https://cloud.google.com/storage/docs/gcs-fuse). The installation of Cloud Storage FUSE is included in [setup.sh](https://github.com/google/maxtext/blob/main/src/dependencies/scripts/setup.sh). The user then needs to mount the Cloud Storage bucket to a local path for each worker, using the script [setup_gcsfuse.sh](https://github.com/google/maxtext/blob/main/tools/setup/setup_gcsfuse.sh). The script configures some parameters for the mount.
3838

3939
```sh
4040
bash tools/setup/setup_gcsfuse.sh \

docs/index.md

Lines changed: 12 additions & 6 deletions
@@ -17,7 +17,9 @@
1717
# MaxText
1818

1919
```{raw} html
20-
:file: index.html
20+
---
21+
file: index.html
22+
---
2123
```
2224

2325
:link: reference/api
@@ -26,18 +28,22 @@
2628
<section class="latest-news">
2729

2830
```{include} ../README.md
29-
:start-after: <!-- NEWS START -->
30-
:end-before: <!-- NEWS END -->
31+
---
32+
start-after: <!-- NEWS START -->
33+
end-before: <!-- NEWS END -->
34+
---
3135
```
3236

3337
</section>
3438
</div>
3539

3640
```{toctree}
37-
:maxdepth: 2
38-
:hidden:
39-
41+
---
42+
maxdepth: 2
43+
hidden:
44+
---
4045
install_maxtext
46+
build_maxtext
4147
tutorials
4248
run_maxtext
4349
guides

docs/install_maxtext.md

Lines changed: 8 additions & 24 deletions
@@ -17,7 +17,7 @@
1717
# Install MaxText
1818

1919
This document discusses how to install MaxText. We recommend installing MaxText inside a Python virtual environment.
20-
MaxText offers three installation modes:
20+
MaxText offers the following installation modes:
2121

2222
1. maxtext[tpu]. Used for pre-training and decode on TPUs.
2323
2. maxtext[cuda12]. Used for pre-training and decode on GPUs.
@@ -37,18 +37,18 @@ uv venv --python 3.12 --seed maxtext_venv
3737
source maxtext_venv/bin/activate
3838

3939
# 3. Install MaxText and its dependencies. Choose a single
40-
# installation option from this list to fit your use case.
40+
# installation option from this list to fit your use case.
4141

4242
# Option 1: Installing maxtext[tpu]
43-
uv pip install "maxtext[tpu]>=0.2.0" --resolution=lowest
43+
uv pip install maxtext[tpu] --resolution=lowest
4444
install_maxtext_tpu_github_deps
4545

4646
# Option 2: Installing maxtext[cuda12]
47-
uv pip install "maxtext[cuda12]>=0.2.0" --resolution=lowest
47+
uv pip install maxtext[cuda12] --resolution=lowest
4848
install_maxtext_cuda12_github_dep
4949

5050
# Option 3: Installing maxtext[tpu-post-train]
51-
uv pip install "maxtext[tpu-post-train]>=0.2.0" --resolution=lowest
51+
uv pip install maxtext[tpu-post-train] --resolution=lowest
5252
install_maxtext_tpu_post_train_extra_deps
5353

5454
# Option 4: Installing maxtext[runner]
@@ -91,7 +91,7 @@ uv pip install -e .[tpu-post-train] --resolution=lowest
9191
install_maxtext_tpu_post_train_extra_deps
9292

9393
# Option 4: Installing maxtext[runner]
94-
uv pip install .[runner] --resolution=lowest
94+
uv pip install -e .[runner] --resolution=lowest
9595
```
9696

9797
After installation, you can verify the package is available with `python3 -c "import maxtext"` and run training jobs with `python3 -m maxtext.trainers.pre_train.train ...`.
@@ -176,22 +176,6 @@ After generating the new requirements, you need to update the files in the MaxTe
176176

177177
Finally, test that the new dependencies install correctly and that MaxText runs as expected.
178178

179-
1. **Create a clean environment:** It's best to start with a fresh Python virtual environment.
180-
181-
```bash
182-
uv venv --python 3.12 --seed maxtext_venv
183-
source maxtext_venv/bin/activate
184-
```
185-
186-
2. **Run the setup script:** Execute `bash setup.sh` to install the new dependencies.
187-
188-
```bash
189-
pip install uv
190-
# install the tpu package
191-
uv pip install -e .[tpu] --resolution=lowest
192-
# or install the gpu package by running the following line:
193-
# uv pip install -e .[cuda12] --resolution=lowest
194-
install_maxtext_github_deps
195-
```
179+
1. **Install MaxText and dependencies**: For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/maxtext-v0.2.0/install_maxtext.html#from-source).
196180

197-
3. **Run tests:** Run MaxText tests to ensure there are no regressions.
181+
2. **Verify the installation**: Run MaxText tests to ensure everything is working as expected with the newly installed dependencies and there are no regressions.

docs/run_maxtext/run_maxtext_localhost.md

Lines changed: 1 addition & 16 deletions
@@ -36,22 +36,7 @@ Local development on a single host TPU/GPU VM is a convenient way to run MaxText
3636

3737
1. Create and SSH to the single host VM of your choice. You can use any available single host TPU, such as `v5litepod-8`, `v5p-8`, or `v4-8`. For GPUs, you can use `nvidia-h100-mega-80gb`, `nvidia-h200-141gb`, or `nvidia-b200`. For setting up a TPU VM, use the Cloud TPU documentation available at https://cloud.google.com/tpu/docs/managing-tpus-tpu-vm. For a GPU setup, refer to the guide at https://cloud.google.com/compute/docs/gpus/create-vm-with-gpus.
3838

39-
2. Clone MaxText onto that VM.
40-
41-
```bash
42-
git clone https://github.com/google/maxtext.git
43-
cd maxtext
44-
```
45-
46-
3. Once you have cloned the repository, you have two primary options for setting up the necessary dependencies on your VM: Installing in a Python Environment, or building a Docker container. For single host workloads, we recommend to install dependencies in a python environment, and for multihost workloads we recommend the containerized approach.
47-
48-
Within the root directory of the cloned repo, create a virtual environment and install dependencies and the pre-commit hook by running:
49-
50-
```bash
51-
python3.12 -m venv ~/venv-maxtext
52-
source ~/venv-maxtext/bin/activate
53-
bash tools/setup/setup.sh DEVICE={tpu|gpu}
54-
```
39+
2. For instructions on installing MaxText on your VM, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/install_maxtext.html).
5540

5641
#### Run a Test Training Job
5742

docs/run_maxtext/run_maxtext_single_host_gpu.md

Lines changed: 1 addition & 31 deletions
@@ -60,39 +60,9 @@ If you get the NVML Error: Please follow these instructions.
6060

6161
https://stackoverflow.com/questions/72932940/failed-to-initialize-nvml-unknown-error-in-docker-after-few-hours
6262

63-
## Install MaxText
64-
65-
Clone MaxText:
66-
67-
```bash
68-
git clone https://github.com/AI-Hypercomputer/maxtext.git
69-
```
70-
7163
## Build MaxText Docker image
7264

73-
This builds a docker image called `maxtext_base_image`. You can retag to a different name.
74-
75-
1. Check out the code changes:
76-
77-
```bash
78-
cd maxtext
79-
```
80-
81-
2. Run the following commands to build and push the docker image:
82-
83-
```bash
84-
export LOCAL_IMAGE_NAME=<docker_image_name>
85-
sudo bash docker_build_dependency_image.sh DEVICE=gpu
86-
docker tag maxtext_base_image ${LOCAL_IMAGE_NAME?}
87-
docker push ${LOCAL_IMAGE_NAME?}
88-
```
89-
90-
Note that when running `bash docker_build_dependency_image.sh DEVICE=gpu`, it
91-
uses `MODE=stable` by default. If you want to use other modes, you need to
92-
specify it explicitly:
93-
94-
- using nightly mode: `bash docker_build_dependency_image.sh DEVICE=gpu MODE=nightly`
95-
- using pinned mode: `bash docker_build_dependency_image.sh DEVICE=gpu MODE=pinned`
65+
For instructions on building the MaxText Docker image, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/build_maxtext.html).
9666

9767
## Test
9868

docs/run_maxtext/run_maxtext_via_pathways.md

Lines changed: 2 additions & 22 deletions
@@ -35,27 +35,7 @@ Before you can run a MaxText workload, you must complete the following setup ste
3535

3636
2. **Create a GKE cluster** configured for Pathways.
3737

38-
3. **Build and upload a MaxText Docker image** to your project's Artifact Registry.
39-
40-
[Follow the steps to configure sudoless Docker](https://docs.docker.com/engine/install/linux-postinstall/) before running the commands below.
41-
42-
Step 1: Build the Docker image for a TPU device. This image contains MaxText and its dependencies.
43-
44-
```shell
45-
bash src/dependencies/scripts/docker_build_dependency_image.sh DEVICE=tpu MODE=stable
46-
```
47-
48-
Step 2: Configure Docker to authenticate with Google Cloud
49-
50-
```shell
51-
gcloud auth configure-docker
52-
```
53-
54-
Step 3: Upload the image to your project's registry. Replace `$USER_runner` with your desired image name.
55-
56-
```shell
57-
bash src/dependencies/scripts/docker_upload_runner.sh CLOUD_IMAGE_NAME=$USER_runner
58-
```
38+
3. **Build and upload a MaxText Docker image** to your project's Artifact Registry. For instructions on building and uploading the MaxText Docker image, please refer to the [official documentation](https://maxtext.readthedocs.io/en/latest/build_maxtext.html).
5939

6040
## 2. Environment configuration
6141

@@ -76,7 +56,7 @@ export WORKLOAD_NODEPOOL_COUNT=1 # Number of TPU slices for your job
7656
export BUCKET_NAME="your-gcs-bucket-name"
7757
export RUN_NAME="maxtext-run-1"
7858
# The Docker image you pushed in the prerequisite step
79-
export DOCKER_IMAGE="gcr.io/${PROJECT?}/${USER}_runner"
59+
export DOCKER_IMAGE="gcr.io/${PROJECT?}/${CLOUD_IMAGE_NAME?}"
8060
```
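
The `DOCKER_IMAGE` value above is the project ID and the uploaded image name joined under `gcr.io`. A minimal sketch of composing and checking that path before launching a workload follows; the `docker_image_uri` helper and the example values `my-project` and `my_runner` are placeholders for illustration, not names defined by MaxText.

```bash
# Hypothetical helper (not part of MaxText): builds the registry path in the
# same gcr.io/<project>/<image name> shape as the DOCKER_IMAGE export above.
docker_image_uri() {
  # ${N:?} stops with an error if an argument is missing or empty.
  local project="${1:?project id required}"
  local name="${2:?image name required}"
  echo "gcr.io/${project}/${name}"
}

# Placeholder values; substitute your own project ID and image name.
DOCKER_IMAGE="$(docker_image_uri my-project my_runner)"
echo "workloads will pull: ${DOCKER_IMAGE}"
```

Passing the same name you gave to `CLOUD_IMAGE_NAME` at upload time keeps the path in sync with the image that was pushed.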
8161

8262
## 3. Running a batch workload
