Fix #1308: Remove dead dask.config.set() at module import time #1309
khushthecoder wants to merge 1 commit into malariagen:master
The module-level call in ag3.py:
dask.config.set(**{"array.slicing.split_native_chunks": False})
was historically present to silence a dask `PerformanceWarning` emitted
on certain slicing operations. On current dask (>=2025), the
`array.slicing.split_native_chunks` config key is unrecognised (returns
no default) and the associated `PerformanceWarning` has been removed
from `dask.array.slicing`. The setting therefore has no observable
effect in any currently-supported dask version.
Removing the line directly addresses the filed bug — `import
malariagen_data` no longer modifies global dask configuration — without
introducing scoped context managers or any other new machinery.
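As a rough illustration of why the import-time call counts as a global side effect, the sketch below uses `dask.config.set` both scoped and unscoped. It assumes dask is installed. The `"DEFAULT"` sentinel and the manual `__exit__` cleanup exist only for this demonstration and are not part of the project's code.

```python
import dask

# Key name taken from the commit message above.
KEY = "array.slicing.split_native_chunks"

# Scoped use: the value is only visible inside the with-block.
with dask.config.set({KEY: False}):
    inside = dask.config.get(KEY, "DEFAULT")
after_scoped = dask.config.get(KEY, "DEFAULT")

# Unscoped use (what a module-level call amounts to): the value stays set
# for every other dask user in the same process.
handle = dask.config.set({KEY: False})
after_unscoped = dask.config.get(KEY, "DEFAULT")

# Demo-only cleanup: exit the context manually so the example does not leak.
handle.__exit__(None, None, None)
after_cleanup = dask.config.get(KEY, "DEFAULT")

print(inside, after_scoped, after_unscoped, after_cleanup)
```

The unscoped call behaves like a context manager that is never exited, which is the behaviour reported in #1308.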
Verified:
- `import malariagen_data` leaves
  `dask.config.get('array.slicing.split_native_chunks', 'DEFAULT') == 'DEFAULT'`.
- `tests/anoph/test_snp_data.py` (193 tests), `test_frq.py`,
`test_cnv_data.py`, `test_hapclust.py`, `test_dipclust.py`,
`test_hap_data.py`, `test_aim_data.py` all pass with no new warnings;
in particular no `PerformanceWarning` is emitted from any of the
`da.take` / `da.compress` call sites in `util.py` or `snp_data.py`.
- pre-commit, ruff, ruff-format, mypy all clean.
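The first check in the list above can be repeated in a fresh interpreter, so that configuration already set in the current session cannot mask an import-time side effect. The helper below is a minimal sketch of that idea. `probe_after_import` is a hypothetical name, not something in the repository, and it assumes the imported module prints nothing to stdout.

```python
import subprocess
import sys


def probe_after_import(module_name: str, expression: str) -> str:
    """Import `module_name` in a fresh interpreter, then return the printed expression."""
    # A new process guarantees that module-level code actually runs and that
    # no state from the calling session influences the result.
    code = f"import {module_name}\nprint({expression})"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the import or expression fails
    )
    return result.stdout.strip()


# Hypothetical usage for this PR (needs malariagen_data and dask installed):
# probe_after_import(
#     "malariagen_data",
#     "__import__('dask').config.get('array.slicing.split_native_chunks', 'DEFAULT')",
# )
# With the module-level call removed, this should return the sentinel "DEFAULT".
```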
Force-pushed from 9da7fc3 to f4dbd66
Hi @jonbrenas, gentle follow-up. I've force-pushed a simpler approach to this PR than the
New approach: just deletes the 4 lines in
Why this turned out to be the right shape: on current dask (
Diff: 1 file, 4 deletions, 0 additions.
Verified locally:
CI is green on the new commit across all 10 checks. No rush on review, whenever it fits
Closes #1308.
Summary
`ag3.py:10` executed the following at module import time:
This line was added in an older dask era to silence a `PerformanceWarning` emitted by certain slicing operations. On current dask it is dead code: removing it directly fixes the bug (`import malariagen_data` no longer mutates global dask configuration).
Why just remove the line
Verified empirically on dask `2025.11.0` (what the project resolves to since `dask = "*"` in `pyproject.toml`):
- `dask.config.get("array.slicing.split_native_chunks", "DEFAULT")` returns `"DEFAULT"`: the config key is no longer recognised.
- `from dask.array.slicing import PerformanceWarning` raises `ImportError`: the warning class has been removed.
- `da.take` / `da.compress` / fancy indexing with out-of-order indices emit zero warnings with the config set to `True`, `False`, or unset.
So the line suppresses a warning that no longer exists, gated by a config key that no longer exists. Scoping it to context managers (the previous approach in this PR) preserved dead code unnecessarily; deleting it is the cleaner fix.
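The "zero warnings" observation can be checked with the standard `warnings` machinery. The sketch below is one way to do it, not necessarily how the checks above were run. `count_warnings` is a hypothetical helper, and the dask call in the trailing comment is an assumed example of an out-of-order `da.take`.

```python
import warnings


def count_warnings(fn, category=Warning):
    """Call fn() and return how many warnings of `category` it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" stops the default once-per-location filter from hiding repeats.
        warnings.simplefilter("always")
        fn()
    return sum(1 for w in caught if issubclass(w.category, category))


# Hypothetical usage against dask (needs numpy and dask installed):
# import numpy as np
# import dask.array as da
# x = da.arange(1_000, chunks=100)
# idx = np.array([900, 5, 450])  # deliberately out of order
# count_warnings(lambda: da.take(x, idx, axis=0).compute())
```

Running the same callable with the config key set to `True`, set to `False`, and unset gives the three-way comparison described in the last bullet.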
Diff
1 file changed, 4 deletions.
Test plan
- `python -c "import malariagen_data; import dask; assert dask.config.get('array.slicing.split_native_chunks', 'DEFAULT') == 'DEFAULT'"`: passes (import no longer mutates dask config).
- `pytest tests/anoph/test_snp_data.py`: 193 passed, no `PerformanceWarning` emitted.
- `pytest tests/anoph/test_frq.py tests/anoph/test_cnv_data.py tests/anoph/test_hapclust.py tests/anoph/test_dipclust.py tests/anoph/test_hap_data.py tests/anoph/test_aim_data.py`: 133 passed.
- `pre-commit run --files malariagen_data/ag3.py`: ruff, ruff-format, trailing-whitespace, end-of-file all pass.
- `mypy malariagen_data/ag3.py`: no issues.
- `==2.0.2` and `>=2.0.2,<2.1`).
Backwards compatibility
`pyproject.toml` (`dask = "*"`), so modern dask is the supported target.