Commit 2f07b41

Merge pull request #923 from RishiP2006/docs/colab-tpu-runtime
Docs: Add Google Colab TPU runtime installation guide and troubleshooting
2 parents b160cc2 + 25f3f72 commit 2f07b41

2 files changed

Lines changed: 117 additions & 0 deletions

docs/source/colab_tpu.rst

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
1+
Using malariagen_data on Google Colab (TPU Runtime)
2+
===================================================
3+
4+
Overview
5+
--------
6+
7+
When using a Google Colab **v2-8 TPU runtime**, installing ``malariagen_data`` may fail due to a dependency conflict with a preinstalled system package.
8+
9+
Colab TPU runtimes ship with:
10+
11+
- ``blinker==1.4`` installed via distutils/system packages
12+
13+
``malariagen_data`` depends on ``dash``, which depends on ``Flask``, which in turn requires:
14+
15+
- ``blinker>=1.6.2``
16+
17+
Because the preinstalled version is a distutils-installed package, ``pip`` cannot uninstall it, and installation fails with:
18+
19+
::
20+
    error: uninstall-distutils-installed-package

    × Cannot uninstall blinker 1.4
    ╰─> It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
25+
26+
This issue appears specific to the TPU runtime image.
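
As a rough illustration of the conflict described above, the following Python sketch compares an installed ``blinker`` version against the ``1.6.2`` minimum. The helper names (``parse_release``, ``blinker_conflict``, ``installed_blinker``) are hypothetical and not part of ``malariagen_data``, and the parser assumes plain dotted release numbers only.

```python
from importlib import metadata

# Minimum blinker release required by Flask, as stated in this guide.
REQUIRED = (1, 6, 2)


def parse_release(version: str) -> tuple:
    """Turn '1.6.2' into (1, 6, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def blinker_conflict(installed: str) -> bool:
    """Return True if the given version is older than the required minimum."""
    return parse_release(installed) < REQUIRED


def installed_blinker():
    """Return the installed blinker version string, or None if absent."""
    try:
        return metadata.version("blinker")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_blinker()
    if found is None:
        print("blinker is not installed")
    else:
        print(f"blinker {found}: conflict={blinker_conflict(found)}")
```

On a runtime with ``blinker==1.4`` this check would be expected to report a conflict; it does not say anything about whether ``pip`` is able to replace the package.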
27+
28+
Reproducing the Issue
29+
---------------------
30+
31+
1. Open Google Colab
32+
2. Select **Runtime → Change runtime type**
33+
3. Choose **TPU**
34+
4. Run:
35+
36+
::
37+
    pip install malariagen_data
39+
40+
Installation fails due to the ``blinker`` conflict described above.
41+
42+
Recommended Workaround
43+
----------------------
44+
45+
Step 1 — Install core package without dependencies:
46+
47+
::
48+
    pip install malariagen_data --no-deps
50+
51+
Step 2 — Install required dependencies manually:
52+
53+
::
54+
    pip install \
        "anjl>=1.2.0" \
        bed_reader \
        biopython \
        "dash<3.0.0" \
        "dash-cytoscape>=1.0.0" \
        distributed \
        gcsfs \
        "igv-notebook>=0.2.3" \
        "ipinfo!=4.4.1" \
        "ipyleaflet>=0.17.0" \
        "numcodecs<0.16" \
        "protopunica>=0.14.8.post2" \
        s3fs \
        statsmodels \
        yaspin \
        "zarr<3.0.0,>=2.11" \
        "bokeh<3.7.0" \
        "numpy<2.2" \
        xarray \
        scikit-allel
76+
77+
After installation, restart the runtime.
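
The two steps above could also be driven from a single notebook cell. The sketch below is one possible way to do that with ``subprocess``; the function names are hypothetical, the dependency list is copied from Step 2, and it assumes ``pip`` is available for the notebook's Python interpreter.

```python
import subprocess
import sys

# Dependency specifiers copied from Step 2 of this guide.
DEPENDENCIES = [
    "anjl>=1.2.0", "bed_reader", "biopython", "dash<3.0.0",
    "dash-cytoscape>=1.0.0", "distributed", "gcsfs",
    "igv-notebook>=0.2.3", "ipinfo!=4.4.1", "ipyleaflet>=0.17.0",
    "numcodecs<0.16", "protopunica>=0.14.8.post2", "s3fs",
    "statsmodels", "yaspin", "zarr<3.0.0,>=2.11", "bokeh<3.7.0",
    "numpy<2.2", "xarray", "scikit-allel",
]


def build_pip_commands():
    """Return the two pip invocations: core package first, then dependencies."""
    pip = [sys.executable, "-m", "pip", "install"]
    return [
        pip + ["--no-deps", "malariagen_data"],
        pip + DEPENDENCIES,
    ]


def install(run=subprocess.check_call):
    """Run each command in order; `run` can be swapped out for a dry run."""
    for command in build_pip_commands():
        run(command)
```

Calling ``install()`` performs both steps, while ``install(run=print)`` only prints the commands. Passing arguments as a list avoids shell quoting issues with specifiers such as ``"zarr<3.0.0,>=2.11"``.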
78+
79+
Cloud Data Access (GCS)
80+
-----------------------
81+
82+
Most datasets are hosted on Google Cloud Storage.
83+
84+
If you see errors such as:
85+
86+
::
87+
    403: Permission denied on storage.objects.get
89+
90+
Authenticate your Colab session:
91+
92+
::
93+
    from google.colab import auth
    auth.authenticate_user()
96+
97+
You may also need to request access to certain datasets:
98+
https://forms.gle/d1NV3aL3EoVQGSHYA
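
If the same notebook is sometimes run outside Colab, the authentication call can be guarded so that it does not fail on import. This is a minimal sketch; ``authenticate_if_colab`` is a hypothetical helper name, not something provided by ``malariagen_data``.

```python
def authenticate_if_colab() -> bool:
    """Authenticate with Google only when running inside Colab.

    Returns True if authentication was attempted, False if the
    google.colab module is not available (i.e. not a Colab runtime).
    """
    try:
        from google.colab import auth
    except ImportError:
        return False
    auth.authenticate_user()
    return True
```

Outside Colab the function simply returns ``False``, leaving any other credential setup to the caller.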
99+
100+
Troubleshooting
101+
---------------
102+
103+
Check which version of ``blinker`` is installed:
104+
105+
::
106+
    pip show blinker
    python -c "from importlib.metadata import version; print(version('blinker'))"
109+
110+
If version ``1.4`` is installed under ``/usr/lib/python3/dist-packages``, this indicates the TPU system package.
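
The same check can be scripted. The sketch below reports the version and install location of a package and flags the system path named above; ``is_system_install`` and ``describe`` are hypothetical helper names, and the path comparison assumes a POSIX filesystem such as the Colab runtime.

```python
from importlib import metadata
from pathlib import PurePosixPath

# System site directory named in this guide.
SYSTEM_SITE = PurePosixPath("/usr/lib/python3/dist-packages")


def is_system_install(location: str) -> bool:
    """Return True if the location is the system site directory or inside it."""
    path = PurePosixPath(location)
    return path == SYSTEM_SITE or SYSTEM_SITE in path.parents


def describe(name: str = "blinker"):
    """Return (version, location, is_system) for a package, or None if absent."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    # locate_file("") resolves to the directory holding the package metadata.
    location = str(dist.locate_file(""))
    return dist.version, location, is_system_install(location)


if __name__ == "__main__":
    print(describe("blinker"))
```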

docs/source/index.rst

Lines changed: 7 additions & 0 deletions
@@ -79,6 +79,13 @@ via pip::
79 79

80 80
    pip install malariagen_data
81 81

82+
.. note::
83+
84+
    If you are using Google Colab with a **TPU runtime**, installation may fail due to
85+
    a dependency conflict with a preinstalled system package (``blinker==1.4``).
86+
87+
    See :doc:`colab_tpu` for detailed instructions and troubleshooting.
88+
82 89
For accessing data in Google Cloud Storage (GCS) you will also need to authenticate with Google Cloud.
83 90

84 91
If you are using ``malariagen_data`` from within Google Colab, authentication will be automatically
