Description
base.py lines 100–105 perform a network call to ipinfo.io to geolocate the user, but catch all OSError exceptions silently:
    self._client_details = None
    if check_location:
        try:
            self._client_details = ipinfo.getHandler().getDetails()
        except OSError:
            pass  # Complete silence: no logging, no warning
When this call fails, _client_details remains None, which cascades to _get_gcp_region() returning None, causing the GCS URL optimisation (choosing a region-local bucket) to silently fall back to the default. The user receives no indication that the location check failed or that their downloads may be coming from a suboptimal region.
Steps to Reproduce
    import malariagen_data

    # In an environment where ipinfo.io is blocked
    # (corporate firewall, institutional network, etc.):
    ag3 = malariagen_data.Ag3()

    # No warning, no log message, but data may be downloading from a distant GCS region.
    print(ag3._gcp_region)  # None: silently fell back to default
Root Cause
The except OSError: pass block discards the exception entirely — no log entry, no user-facing warning, no debug message. The resulting None value for _client_details propagates silently through _get_gcp_region(), meaning the GCS bucket selection optimisation is disabled without the user's knowledge.
The failure mode is indistinguishable from a successful lookup that returned no location data.
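The cascade described above can be illustrated with a small stand-in. This is a minimal sketch, not the library's code: `lookup_details`, `get_gcp_region`, the three lookup functions and the `gcp_region` attribute are hypothetical, and the real `_get_gcp_region()` may apply different rules. It only shows that a blocked lookup and a successful lookup with no usable location both end at `None`.

```python
from types import SimpleNamespace


def lookup_details(get_details):
    """Mirror the try/except from base.py; `get_details` is an injected callable."""
    try:
        return get_details()
    except OSError:
        return None  # the failure is swallowed, as in the current code


def get_gcp_region(details):
    """Hypothetical stand-in for _get_gcp_region(); the real rules may differ."""
    if details is None:
        return None
    return getattr(details, "gcp_region", None)


def blocked_lookup():
    # Simulates ipinfo.io being unreachable behind a firewall.
    # ConnectionRefusedError is a subclass of OSError.
    raise ConnectionRefusedError("ipinfo.io is blocked")


def empty_lookup():
    # Simulates a successful lookup that carries no usable location.
    return SimpleNamespace(gcp_region=None)


def located_lookup():
    # Simulates a successful lookup that resolves to a region.
    return SimpleNamespace(gcp_region="us-central1")


region_when_blocked = get_gcp_region(lookup_details(blocked_lookup))
region_when_empty = get_gcp_region(lookup_details(empty_lookup))
region_when_located = get_gcp_region(lookup_details(located_lookup))

# The failed and the empty lookup both yield None, so a caller cannot tell them apart.
print(region_when_blocked, region_when_empty, region_when_located)
```

Without a log entry or warning on the `except` path, nothing downstream can recover which of the two `None` cases occurred.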
Why This Is a Problem
1. Performance Impact
Researchers in regions with nearby GCS buckets (e.g., us-central1) may unknowingly download multi-GB datasets from a distant bucket. The latency increase can be dramatic for large genomic data access, with no indication that anything is wrong.
2. Debuggability
When users report slow data access, there is no log entry to indicate the location check failed. Support and debugging become guesswork — there is nothing in the application state or logs to correlate slow downloads with a failed ipinfo.io lookup.
3. Prevalence in Target Environments
Institutional and corporate networks — the environments where many researchers operate — frequently block external lookup services like ipinfo.io. These are precisely the users most likely to be affected, and least likely to understand why performance is degraded.
Expected Behavior
A failed location check should:
- Be logged at DEBUG level so that users running with debug=True can diagnose the issue.
- Optionally emit a user-facing warning explaining that region-optimal storage selection is unavailable and how to manually specify a region-local URL.
The fallback behaviour itself is acceptable — the fix is about making the failure visible and diagnosable, not about preventing it.
Proposed Fix
Before (current behaviour — base.py, lines 100–105):
    self._client_details = None
    if check_location:
        try:
            self._client_details = ipinfo.getHandler().getDetails()
        except OSError:
            pass  # Complete silence: no logging, no warning
After (proposed fix):
    # Assumes `import warnings` is present at the top of base.py.
    self._client_details = None
    if check_location:
        try:
            self._client_details = ipinfo.getHandler().getDetails()
        except OSError as exc:
            self._log.debug("Location check failed: %s", exc)
            warnings.warn(
                "Could not determine client location via ipinfo.io "
                f"({exc}). Region-optimal GCS bucket selection is unavailable. "
                "You can manually specify a region-local URL using the `url` parameter.",
                UserWarning,
                stacklevel=2,
            )
The debug-level log ensures the failure is captured when diagnostic logging is enabled. The UserWarning gives researchers in restricted network environments an actionable message without being intrusive in normal usage.
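One way to guard the fix against regression is a test that forces the lookup to fail and checks both signals. The sketch below is self-contained and uses a hypothetical `Client` stand-in rather than the real class in base.py; the injected `get_details` callable, the `ListHandler` helper and the logger name are assumptions made so the example runs without `ipinfo` or network access.

```python
import logging
import warnings


class Client:
    """Hypothetical stand-in that reproduces only the proposed except block."""

    def __init__(self, get_details, check_location=True):
        self._log = logging.getLogger("location_check_demo")
        self._client_details = None
        if check_location:
            try:
                self._client_details = get_details()
            except OSError as exc:
                self._log.debug("Location check failed: %s", exc)
                warnings.warn(
                    f"Could not determine client location via ipinfo.io ({exc}). "
                    "Region-optimal GCS bucket selection is unavailable.",
                    UserWarning,
                    stacklevel=2,
                )


def failing_lookup():
    # Simulates the network failure that ipinfo.io would surface as an OSError.
    raise OSError("network unreachable")


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected afterwards."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.records = []

    def emit(self, record):
        self.records.append(record)


handler = ListHandler()
logger = logging.getLogger("location_check_demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not deduplicated
    client = Client(failing_lookup)

logged = [record.getMessage() for record in handler.records]
warned = [str(w.message) for w in caught if issubclass(w.category, UserWarning)]

# The fallback is unchanged, but the failure is now visible in both channels.
assert client._client_details is None
assert len(logged) == 1 and len(warned) == 1
```

In the real test suite the same checks would more likely patch the `ipinfo` handler with `unittest.mock` and use the project's own test helpers; that wiring is left out here because it depends on details of base.py not shown in this report.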
Expected Impact After Resolution
- Failed location checks are logged at DEBUG level, visible when debug=True.
- Users in restricted network environments receive an informative UserWarning explaining the fallback and how to manually specify a region-local URL.
- Support burden is reduced — slow download reports can be immediately correlated with the warning rather than requiring infrastructure-level investigation.
- No change to fallback behaviour; the fix is purely additive and fully backward-compatible.
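Because the proposed signal is an ordinary `UserWarning`, callers could tune it with Python's standard `warnings` filters. The following is a sketch under assumptions: `construct_client` is a hypothetical stand-in that only emits the proposed warning text, and the message pattern would need to match whatever wording is finally merged.

```python
import warnings


def construct_client():
    """Hypothetical stand-in: emits the warning the proposed fix would issue."""
    warnings.warn(
        "Could not determine client location via ipinfo.io (network unreachable). "
        "Region-optimal GCS bucket selection is unavailable.",
        UserWarning,
        stacklevel=2,
    )
    return "client"


# Silence only this warning, e.g. in a batch job on a network known to block ipinfo.io.
with warnings.catch_warnings(record=True) as silenced:
    warnings.simplefilter("always")
    # Added after "always", so this filter sits first in the list and takes precedence.
    warnings.filterwarnings(
        "ignore",
        message=r"Could not determine client location",
        category=UserWarning,
    )
    construct_client()

# Escalate it to an error, e.g. in CI where the fallback should never go unnoticed.
escalated = False
with warnings.catch_warnings():
    warnings.filterwarnings(
        "error",
        message=r"Could not determine client location",
        category=UserWarning,
    )
    try:
        construct_client()
    except UserWarning:
        escalated = True
```

Matching on the message prefix keeps unrelated `UserWarning`s from other libraries untouched.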
Affected File
| File | Lines | Code |
|---|---|---|
| base.py | 100–105 | except OSError: pass |
Severity
Medium — No crash or data corruption, but causes silent, undiagnosable performance degradation for users on restricted networks, which is a common deployment environment for the target research audience.