Description
Both hapclust.py:89 and dipclust.py:92 call sys.setrecursionlimit(10_000) as a side effect of plotting dendrograms, but never restore the original limit. This permanently modifies the global Python recursion limit for the entire process, affecting all subsequent code — not just malariagen_data.
How to Reproduce
import sys
print(sys.getrecursionlimit()) # 1000 (default)
import malariagen_data
ag3 = malariagen_data.Ag3()
# Call any function that triggers hapclust or dipclust:
ag3.plot_haplotype_clustering(region="2L:28,545,000-28,550,000", analysis="gamb_colu")
print(sys.getrecursionlimit()) # 10000 — permanently changed!
# Now ALL code in this notebook/session uses the elevated limit,
# potentially masking stack overflow bugs in unrelated code.
Why It Is Important
- Side-effect pollution: A plotting function should not permanently modify global interpreter state. Users working in long-running Jupyter sessions have no idea their recursion limit changed.
- Masks bugs: Legitimate
RecursionError bugs in user code or other libraries will be silently suppressed because the limit is now 10x higher.
- Memory risk: A deeper call stack consumes more memory per thread. In environments with many threads (e.g., dask distributed workers), this compounds.
Proposed Fix
The fix is trivial — wrap the recursion limit change in a try/finally block to restore the original limit:
# Before (current behaviour)
sys.setrecursionlimit(10_000)
# ... dendrogram computation ...
# After (proposed fix)
_original_limit = sys.getrecursionlimit()
try:
sys.setrecursionlimit(10_000)
# ... dendrogram computation ...
finally:
sys.setrecursionlimit(_original_limit)
This pattern should be applied in both:
hapclust.py line 89
dipclust.py line 92
Expected Impact After Resolution
The recursion limit is temporarily elevated only for the duration of the dendrogram computation and immediately restored afterward, eliminating global state pollution.
Description
Both
hapclust.py:89anddipclust.py:92callsys.setrecursionlimit(10_000)as a side effect of plotting dendrograms, but never restore the original limit. This permanently modifies the global Python recursion limit for the entire process, affecting all subsequent code — not justmalariagen_data.How to Reproduce
Why It Is Important
RecursionErrorbugs in user code or other libraries will be silently suppressed because the limit is now 10x higher.Proposed Fix
The fix is trivial — wrap the recursion limit change in a
try/finallyblock to restore the original limit:This pattern should be applied in both:
hapclust.pyline 89dipclust.pyline 92Expected Impact After Resolution
The recursion limit is temporarily elevated only for the duration of the dendrogram computation and immediately restored afterward, eliminating global state pollution.