Description
All remote data access (GCS via gcsfs, S3 via s3fs, and HTTP via fsspec) relies entirely on the default timeout/retry behavior of the underlying filesystem libraries. The codebase has zero explicit timeout, retry, or backoff configuration — confirmed by searching the entire malariagen_data/ directory for these terms.
How to Reproduce
import malariagen_data
ag3 = malariagen_data.Ag3()
# On a slow/unreliable network (e.g., field station in sub-Saharan Africa):
# This will hang indefinitely if GCS is unreachable, with no timeout:
ds = ag3.snp_calls(region="3R", sample_sets="3.0")
# No retry on transient 503/429 errors from GCS:
# A single failed request causes the entire operation to fail.
Why It Is Important
- MalariaGEN's primary user base includes researchers in malaria-endemic regions (sub-Saharan Africa, Southeast Asia) where internet connectivity can be unreliable.
- The library accesses multi-GB zarr stores over the network; a single transient error (GCS 503, network timeout, DNS failure) causes the entire operation to fail with a cryptic OSError.
- gcsfs and s3fs support configurable retries, timeout, and retry_delay parameters that are simply never set.
- The base.py constructor at lines 100-105 already silently swallows network errors from ipinfo; the same pattern of "hope the network works" pervades the entire data access layer.
- Adding retries=3, timeout=60 to the filesystem initialization in util.py would cover all downstream operations.
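One way to centralize the defaults proposed above is sketched here. The names DEFAULT_NETWORK_OPTIONS and with_network_defaults are hypothetical and do not exist in malariagen_data. Whether each backend constructor accepts keywords named retries and timeout is an assumption to check against the gcsfs and s3fs documentation before wiring this in.

```python
# Hypothetical defaults, taken from the values suggested in this issue.
DEFAULT_NETWORK_OPTIONS = {"retries": 3, "timeout": 60}


def with_network_defaults(storage_options=None):
    """Return storage options with network defaults filled in.

    Caller-supplied values win, so users on very slow links can raise
    the timeout without patching the library.
    """
    merged = dict(DEFAULT_NETWORK_OPTIONS)
    merged.update(storage_options or {})
    return merged
```

The merged dictionary would then be passed to the filesystem initialization in util.py, assuming the backend accepts those keyword names.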
Expected Impact After Resolution
- Transient network errors are automatically retried with exponential backoff.
- Operations have configurable timeouts (with sensible defaults).
- Users on unreliable connections get clear timeout errors instead of indefinite hangs.
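The retry behavior described above could look roughly like the following standard-library sketch. retry_with_backoff and its parameters are illustrative names, not existing malariagen_data, gcsfs, or s3fs APIs. It covers only retries with backoff; request timeouts would still need to be configured on the filesystem itself.

```python
import random
import time


def retry_with_backoff(func, retries=3, base_delay=1.0, max_delay=30.0,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff.

    OSError also covers TimeoutError and ConnectionError in Python 3.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            # Delay doubles on each attempt, capped at max_delay,
            # with up to 10% random jitter to avoid synchronized retries.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
```

A caller would wrap the operation that actually touches the network, for example retry_with_backoff(lambda: store["some/key"]) for a hypothetical store mapping. Lazily evaluated results may only hit the network when computed, so the wrapper would need to sit around that step.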