Fix #1308: Remove dead dask.config.set() at module import time #1309

Open
khushthecoder wants to merge 1 commit into malariagen:master from khushthecoder:fix/issue-1308-dask-config-scope

Contributor

@khushthecoder commented Apr 17, 2026

Closes #1308.

Summary

ag3.py:10 executed the following at module import time:

# silence dask performance warnings
dask.config.set(**{"array.slicing.split_native_chunks": False})

This line was added for an older dask release to silence a PerformanceWarning emitted by certain slicing operations. On current dask it is dead code, so removing it fixes the bug directly: import malariagen_data no longer mutates global dask configuration.

Why just remove the line

Verified empirically on dask 2025.11.0 (what the project resolves to since dask = "*" in pyproject.toml):

  • dask.config.get("array.slicing.split_native_chunks", "DEFAULT") returns the fallback "DEFAULT": dask has no value configured for this key, so it is no longer recognised.
  • from dask.array.slicing import PerformanceWarning raises ImportError: the warning class is no longer importable from dask.array.slicing.
  • da.take / da.compress / fancy indexing with out-of-order indices emit zero warnings whether the config key is set to True, set to False, or left unset.

So the line suppresses a warning that no longer exists, gated by a config key that no longer exists. Scoping it to context managers (the previous approach in this PR) preserved dead code unnecessarily; deleting it is the cleaner fix.
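
The three observations above can be re-checked against whichever dask is installed. The following is a minimal sketch, not the script used for this PR: `probe_dask` and the report keys are hypothetical names, the array size and chunking are arbitrary, and only `da.take` with shuffled indices is exercised.

```python
import contextlib
import importlib
import warnings

KEY = "array.slicing.split_native_chunks"


def probe_dask():
    """Re-run the checks listed above and return the raw observations."""
    # Imported lazily so this module loads even where dask is not installed.
    import dask
    import dask.array as da
    import numpy as np

    slicing = importlib.import_module("dask.array.slicing")
    report = {
        "dask_version": dask.__version__,
        # Falls back to "DEFAULT" when no value is configured for the key.
        "key_lookup": dask.config.get(KEY, "DEFAULT"),
        # Same outcome as `from dask.array.slicing import PerformanceWarning`.
        "warning_importable": hasattr(slicing, "PerformanceWarning"),
        "warnings_by_setting": {},
    }

    x = da.arange(1_000, chunks=100)
    shuffled = np.random.default_rng(0).permutation(1_000)

    for setting in (True, False, None):  # None means "leave the key unset"
        scope = (
            contextlib.nullcontext()
            if setting is None
            else dask.config.set({KEY: setting})
        )
        with scope, warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            da.take(x, shuffled).compute()
        # Keep only warnings whose class is named PerformanceWarning.
        report["warnings_by_setting"][setting] = [
            str(w.message)
            for w in caught
            if w.category.__name__ == "PerformanceWarning"
        ]
    return report
```

Calling `probe_dask()` returns a dict that can be printed or pasted into the PR. An empty list under every entry of `warnings_by_setting` would match the third bullet.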

Diff

1 file changed, 4 deletions.

-import dask
 import pandas as pd
 ...
-# silence dask performance warnings
-dask.config.set(**{"array.slicing.split_native_chunks": False})

Test plan

  • python -c "import malariagen_data; import dask; assert dask.config.get('array.slicing.split_native_chunks', 'DEFAULT') == 'DEFAULT'" — passes (import no longer mutates dask config).
  • pytest tests/anoph/test_snp_data.py — 193 passed, no PerformanceWarning emitted.
  • pytest tests/anoph/test_frq.py tests/anoph/test_cnv_data.py tests/anoph/test_hapclust.py tests/anoph/test_dipclust.py tests/anoph/test_hap_data.py tests/anoph/test_aim_data.py — 133 passed.
  • pre-commit run --files malariagen_data/ag3.py — ruff, ruff-format, trailing-whitespace, end-of-file all pass.
  • mypy malariagen_data/ag3.py — no issues.
  • CI matrix on PR (linting, coverage, tests across Python 3.10 / 3.11 / 3.12 × numpy ==2.0.2 and >=2.0.2,<2.1).

Backwards compatibility

  • No public API changed.
  • If anyone pins a very old dask version where the warning still fires, they may see it again; the project does not currently pin a minimum dask version in pyproject.toml (dask = "*"), so modern dask is the supported target.

The module-level call in ag3.py:

    dask.config.set(**{"array.slicing.split_native_chunks": False})

was historically present to silence a dask `PerformanceWarning` emitted
on certain slicing operations. On current dask (>=2025), the
`array.slicing.split_native_chunks` config key is unrecognised (a lookup
finds no value for it) and the associated `PerformanceWarning` has been removed
from `dask.array.slicing`. The setting therefore has no observable
effect in any currently-supported dask version.

Removing the line directly addresses the filed bug — `import
malariagen_data` no longer modifies global dask configuration — without
introducing scoped context managers or any other new machinery.

Verified:
- `import malariagen_data` leaves
  `dask.config.get('array.slicing.split_native_chunks', 'DEFAULT') == 'DEFAULT'`.
- `tests/anoph/test_snp_data.py` (193 tests), `test_frq.py`,
  `test_cnv_data.py`, `test_hapclust.py`, `test_dipclust.py`,
  `test_hap_data.py`, `test_aim_data.py` all pass with no new warnings;
  in particular no `PerformanceWarning` is emitted from any of the
  `da.take` / `da.compress` call sites in `util.py` or `snp_data.py`.
- pre-commit, ruff, ruff-format, mypy all clean.
@khushthecoder force-pushed the fix/issue-1308-dask-config-scope branch from 9da7fc3 to f4dbd66 on April 24, 2026 16:21
@khushthecoder changed the title from "Fix #1308: Scope dask.config.set() to specific operations instead of module import" to "Fix #1308: Remove dead dask.config.set() at module import time" on Apr 24, 2026
@khushthecoder
Contributor Author

Hi @jonbrenas — gentle follow-up. I've force-pushed a simpler approach to this PR than the
original scoped-context-manager version; quick summary so the change isn't confusing.

New approach: just deletes the 4 lines in ag3.py. No scoped context managers added anywhere.

Why this turned out to be the right shape: on current dask (2025.11.0, what poetry.lock
resolves to), the config key array.slicing.split_native_chunks no longer exists in dask's
schema — grep across the installed dask package returns zero matches, and
from dask.array.slicing import PerformanceWarning raises ImportError. So the original line
in ag3.py was silently setting a dead config key to suppress a warning class that no longer
exists. Wrapping dead code in context managers (the earlier approach) felt like overkill once I
verified that, so the PR is now just the deletion.
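
For anyone who wants to repeat the grep without a shell, a rough Python equivalent is sketched below. `files_mentioning` and `TEXT_SUFFIXES` are hypothetical names, and the suffix list is an assumption about which files are worth searching (Python sources plus YAML/JSON config and schema files).

```python
from importlib.util import find_spec
from pathlib import Path

# Assumed set of text files to search; extend it if a package keeps config elsewhere.
TEXT_SUFFIXES = {".py", ".yaml", ".yml", ".json"}


def files_mentioning(package: str, needle: str) -> list[Path]:
    """List files inside an installed package whose text contains `needle`."""
    spec = find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        raise ValueError(f"{package!r} is not an importable package")
    hits = []
    for root in spec.submodule_search_locations:
        for path in sorted(Path(root).rglob("*")):
            if path.is_file() and path.suffix in TEXT_SUFFIXES:
                text = path.read_text(encoding="utf-8", errors="ignore")
                if needle in text:
                    hits.append(path)
    return hits
```

With dask installed, `files_mentioning("dask", "split_native_chunks")` returning an empty list would correspond to the zero-match grep described above.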

Diff: 1 file, 4 deletions, 0 additions.

Verified locally:

  • import malariagen_data no longer mutates dask.config.
  • pytest tests/anoph/test_snp_data.py (193 tests) plus test_frq, test_cnv_data,
    test_hapclust, test_dipclust, test_hap_data, test_aim_data — all pass, zero
    PerformanceWarning emitted from any da.take / da.compress call site.
  • pre-commit + ruff + mypy clean.

CI is green on the new commit across all 10 checks. No rush on review — whenever it fits
your queue.

Development

Successfully merging this pull request may close these issues.

dask.config.set() at Module Import Time in ag3.py Silently Modifies Global Dask Configuration
