| 1 | +Using malariagen_data on Google Colab (TPU Runtime) |
| 2 | +=================================================== |
| 3 | + |
| 4 | +Overview |
| 5 | +-------- |
| 6 | + |
| 7 | +When using a Google Colab **v2-8 TPU runtime**, installing ``malariagen_data`` may fail due to a dependency conflict with a preinstalled system package. |
| 8 | + |
| 9 | +Colab TPU runtimes ship with: |
| 10 | + |
| 11 | +- ``blinker==1.4`` installed via distutils/system packages |
| 12 | + |
| 13 | +``malariagen_data`` depends on ``dash``, which in turn depends on ``Flask``, and ``Flask`` requires: |
| 14 | + |
| 15 | +- ``blinker>=1.6.2`` |
| 16 | + |
| 17 | +Because the preinstalled version is a distutils-installed package, ``pip`` cannot uninstall it, and installation fails with: |
| 18 | + |
| 19 | +:: |
| 20 | + |
| 21 | + error: uninstall-distutils-installed-package |
| 22 | + |
| 23 | + × Cannot uninstall blinker 1.4 |
| 24 | + ╰─> It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall. |
| 25 | + |
| 26 | +This issue appears specific to the TPU runtime image. |
| 27 | + |
| 28 | +Reproducing the Issue |
| 29 | +--------------------- |
| 30 | + |
| 31 | +1. Open Google Colab |
| 32 | +2. Select **Runtime → Change runtime type** |
| 33 | +3. Choose **v2-8 TPU** |
| 34 | +4. Run: |
| 35 | + |
| 36 | +:: |
| 37 | + |
| 38 | + pip install malariagen_data |
| 39 | + |
| 40 | +Installation fails due to the ``blinker`` conflict described above. |
| 41 | + |
| 42 | +Recommended Workaround |
| 43 | +---------------------- |
| 44 | + |
| 45 | +Step 1 — Install core package without dependencies: |
| 46 | + |
| 47 | +:: |
| 48 | + |
| 49 | + pip install malariagen_data --no-deps |
| 50 | + |
| 51 | +Step 2 — Install required dependencies manually: |
| 52 | + |
| 53 | +:: |
| 54 | + |
| 55 | + pip install \ |
| 56 | + "anjl>=1.2.0" \ |
| 57 | + bed_reader \ |
| 58 | + biopython \ |
| 59 | + "dash<3.0.0" \ |
| 60 | + "dash-cytoscape>=1.0.0" \ |
| 61 | + distributed \ |
| 62 | + gcsfs \ |
| 63 | + "igv-notebook>=0.2.3" \ |
| 64 | + "ipinfo!=4.4.1" \ |
| 65 | + "ipyleaflet>=0.17.0" \ |
| 66 | + "numcodecs<0.16" \ |
| 67 | + "protopunica>=0.14.8.post2" \ |
| 68 | + s3fs \ |
| 69 | + statsmodels \ |
| 70 | + yaspin \ |
| 71 | + "zarr<3.0.0,>=2.11" \ |
| 72 | + "bokeh<3.7.0" \ |
| 73 | + "numpy<2.2" \ |
| 74 | + xarray \ |
| 75 | + scikit-allel |
| 76 | + |
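| 77 | +After installation, restart the runtime so that the newly installed package versions are imported instead of any versions already loaded in the session. |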
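
As a quick sanity check after the restart, the sketch below lists the installed versions of a few of the packages named in the install command above. It uses only the standard library's ``importlib.metadata``. The ``PACKAGES`` list and the ``installed_versions`` helper are illustrative names chosen for this example and are not part of ``malariagen_data``.

```python
from importlib.metadata import PackageNotFoundError, version

# Illustrative subset of the distributions installed above; extend as needed.
PACKAGES = ["malariagen_data", "dash", "zarr", "numpy", "bokeh", "numcodecs", "blinker"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:16} {ver or 'NOT INSTALLED'}")
```

Any package reported as ``NOT INSTALLED`` was probably skipped or failed during Step 2 and can be installed again on its own.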
| 78 | + |
| 79 | +Cloud Data Access (GCS) |
| 80 | +----------------------- |
| 81 | + |
| 82 | +Most datasets are hosted on Google Cloud Storage. |
| 83 | + |
| 84 | +If you see errors such as: |
| 85 | + |
| 86 | +:: |
| 87 | + |
| 88 | + 403: Permission denied on storage.objects.get |
| 89 | + |
| 90 | +Authenticate your Colab session: |
| 91 | + |
| 92 | +:: |
| 93 | + |
| 94 | + from google.colab import auth |
| 95 | + auth.authenticate_user() |
| 96 | + |
| 97 | +You may also need to request access to certain datasets: |
| 98 | +https://forms.gle/d1NV3aL3EoVQGSHYA |
| 99 | + |
| 100 | +Troubleshooting |
| 101 | +--------------- |
| 102 | + |
| 103 | +Check which version of ``blinker`` is installed: |
| 104 | + |
| 105 | +:: |
| 106 | + |
| 107 | + pip show blinker |
| 108 | + python -c "from importlib.metadata import version; print(version('blinker'))" |
| 109 | + |
| 110 | +If ``pip show`` reports version ``1.4`` with a location of ``/usr/lib/python3/dist-packages``, the preinstalled system package from the TPU runtime image is still the one in use. |
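
The same check can be run from a notebook cell in Python. This is a minimal sketch that assumes only the standard library's ``importlib.metadata``. The ``describe_blinker`` helper and the ``SYSTEM_DIST_PACKAGES`` constant are hypothetical names introduced for this example; the path is the one quoted above.

```python
from importlib.metadata import PackageNotFoundError, distribution

# Path quoted in the text above for the preinstalled system package.
SYSTEM_DIST_PACKAGES = "/usr/lib/python3/dist-packages"


def describe_blinker():
    """Return (version, install_dir) for blinker, or (None, None) if absent."""
    try:
        dist = distribution("blinker")
    except PackageNotFoundError:
        return None, None
    # locate_file("") resolves to the directory holding the package metadata.
    return dist.version, str(dist.locate_file(""))


found_version, location = describe_blinker()
if found_version is None:
    print("blinker is not installed")
else:
    print(f"blinker {found_version} found in {location}")
    if location.rstrip("/") == SYSTEM_DIST_PACKAGES:
        print("This looks like the preinstalled system package.")
```

Reading the version from the package metadata avoids relying on a ``__version__`` attribute, which not every ``blinker`` release provides.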