Fix bugs and improve fetch performance in process.py #19

Merged
AlexCatarino merged 3 commits into QuantConnect:master from AlexCatarino:bug-deprecated-key
Apr 17, 2026

AlexCatarino commented Apr 16, 2026

Summary

  • Reuse a single requests.Session and cap page fetches with ThreadPoolExecutor(max_workers=8) instead of spawning one unbounded thread per page.
  • Fix the per-date file write being nested inside the per-response loop (every date file was rewritten once per page).
  • Fix country_states scope bug that dropped state names when an agency had multiple countries or none (the extend was outside the country loop).
  • Drop the Python 3.6 %z string workaround, the hardcoded REGALYTICS_API_KEY fallback, and the in_federal_register boolean coercion.
  • Log response page count and per-date article counts; exit 1 with an error when more than one date is produced.
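The first bullet can be illustrated with a minimal sketch, not the code from process.py. The names `fetch_page`, `fetch_all_pages`, the `page` query parameter, and the use of a GET request are assumptions made for illustration; only the shared session and the `max_workers=8` cap come from the summary above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # concurrency cap named in the PR summary


def fetch_page(session, url, page):
    """Fetch one page of results.

    `session` is any object with a requests-style .get() method, for
    example a requests.Session. The `page` parameter name is hypothetical.
    """
    response = session.get(url, params={"page": page}, timeout=30)
    response.raise_for_status()
    return response.json()


def fetch_all_pages(session, url, total_pages):
    """Fetch pages 1..total_pages with at most MAX_WORKERS requests in flight."""
    pages = range(1, total_pages + 1)
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        # pool.map() yields results in page order, regardless of which
        # request finishes first.
        return list(pool.map(lambda page: fetch_page(session, url, page), pages))
```

With the requests library installed, a caller would create one `requests.Session()`, pass it to `fetch_all_pages`, and close it afterwards, so connections are pooled across pages instead of being reopened for every request.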

Test plan

  • Run python process.py for a known date and confirm output matches the prior file contents (no articles lost, states/agencies shape unchanged).
  • Run a date that spans the UTC/EST boundary and confirm the process exits 1 with the date count error.
  • Run a date with total_pages > 1 and confirm all pages are fetched (logged page count matches API).
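The second test case depends on how articles are bucketed by date. The sketch below shows one way the check could work; it is an illustration under assumptions, not the code in process.py. The `created_at` field name, the timestamp format, and the function names are hypothetical. It relies on `%z` accepting offsets such as `-04:00` directly, which holds on Python 3.7 and later.

```python
import sys
from collections import defaultdict
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S%z"  # assumed format


def group_by_date(articles):
    """Bucket articles by the calendar date of their (assumed) created_at field."""
    by_date = defaultdict(list)
    for article in articles:
        created = datetime.strptime(article["created_at"], TIMESTAMP_FORMAT)
        by_date[created.strftime("%Y%m%d")].append(article)
    return dict(by_date)


def exit_code_for(by_date):
    """Log per-date article counts; return 1 if more than one date is present."""
    for date_key, items in sorted(by_date.items()):
        print(f"{date_key}: {len(items)} articles")
    if len(by_date) > 1:
        print(f"error: expected 1 date, got {len(by_date)}", file=sys.stderr)
        return 1
    return 0
```

A script would end with `sys.exit(exit_code_for(group_by_date(articles)))`, so a run whose articles straddle midnight fails visibly instead of writing two date files.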

AlexCatarino and others added 2 commits April 16, 2026 23:52
- Reuse a single requests.Session and bound concurrency with a
  ThreadPoolExecutor(max_workers=8) instead of spawning one thread per
  page with no limit.
- Move the per-date file write out of the per-response loop; previously
  every date file was rewritten once per page.
- Fix country_states scope bug that dropped state names when an agency
  had multiple countries or none.
- Drop the Python 3.6 timezone string workaround; %z now parses the raw
  value directly.
- Drop hardcoded REGALYTICS_API_KEY fallback and the in_federal_register
  boolean coercion.
- Log response page count and per-date article counts; exit 1 if more
  than one date is produced.
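The country_states fix is a loop-scope issue. The following sketch shows the corrected shape under an assumed data layout; the `country`, `states`, and `name` keys and the `collect_states` helper are hypothetical, not taken from process.py. The point is that the `extend` runs once per country, inside the country loop.

```python
def collect_states(agencies):
    """Map each country name to the state names listed under it.

    Assumed shape: each agency has a "country" list, and each country
    has a "name" and an optional "states" list of {"name": ...} dicts.
    """
    states = {}
    for agency in agencies:
        for country in agency.get("country", []):
            country_states = [state["name"] for state in country.get("states", [])]
            # Extend inside the country loop. If this ran after the loop,
            # only the last country's states would be kept, and an agency
            # with no countries would reuse a stale or undefined list.
            states.setdefault(country["name"], []).extend(country_states)
    return states
```

An agency with no countries simply contributes nothing, and an agency with several countries contributes the states of each one.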

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AlexCatarino merged commit b97be5d into QuantConnect:master Apr 17, 2026
1 check passed