Skip to content

Commit 12ee7aa

Browse files
Register find_leader done-callback before slot insertion
The single-flight body in find_leader previously inserted the task into the shared slot map, then registered the done-callback. A signal-driven interrupt (KeyboardInterrupt / SystemExit raised by a signal handler at any bytecode boundary between the two statements) could leave the slot pointing at a task whose completion is never observed by _clear_slot — surfacing as a "Task exception was never retrieved" warning at GC time. Reorder so add_done_callback runs before the slot insertion. The callback is safe on a not-yet-started task; _clear_slot's "only delete if it still points at us" guard already handles the case where the slot was never set. The race window is closed by construction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 0cfb00e commit 12ee7aa

2 files changed

Lines changed: 47 additions & 1 deletion

File tree

src/dqliteclient/cluster.py

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -205,7 +205,6 @@ async def find_leader(self, *, trust_server_heartbeat: bool = False) -> str:
205205
task = asyncio.create_task(
206206
self._find_leader_impl(trust_server_heartbeat=trust_server_heartbeat)
207207
)
208-
self._find_leader_tasks[key] = task
209208

210209
def _clear_slot(t: asyncio.Task[str]) -> None:
211210
# Clear the slot only if it still points at THIS task —
@@ -228,7 +227,18 @@ def _clear_slot(t: asyncio.Task[str]) -> None:
228227
with contextlib.suppress(BaseException):
229228
t.exception()
230229

230+
# Register the done-callback BEFORE inserting into the
231+
# shared slot so a signal-driven interrupt (KI / SystemExit
232+
# raised by a signal handler at any bytecode boundary)
233+
# between ``create_task`` and ``add_done_callback`` cannot
234+
# leave the slot pointing at a task whose completion is
235+
# never observed. ``add_done_callback`` is safe on a
236+
# not-yet-started task; the callback fires regardless of
237+
# slot state, and ``_clear_slot``'s "only delete if it
238+
# still points at us" guard handles the case where the
239+
# slot was never set.
231240
task.add_done_callback(_clear_slot)
241+
self._find_leader_tasks[key] = task
232242
# Shield so a caller's outer cancel does not kill the shared
233243
# task; the cancel still propagates to the calling coroutine
234244
# via ``await asyncio.shield``.
Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
"""Pin: ``find_leader`` registers its done-callback BEFORE inserting
2+
the task into the shared slot map.
3+
4+
The reverse order (slot insert → add_done_callback) opens a
5+
race window: a signal-driven interrupt (KeyboardInterrupt /
6+
SystemExit raised by a signal handler at any bytecode boundary
7+
between the two statements) leaves the slot pointing at a task
8+
whose completion is never observed by ``_clear_slot``.
9+
10+
This test inspects the source order via the closure-captured
11+
function bytecode order; a future refactor that reverts the
12+
ordering will fail this pin.
13+
"""
14+
15+
from __future__ import annotations
16+
17+
import inspect
18+
19+
from dqliteclient.cluster import ClusterClient
20+
21+
22+
def test_done_callback_registered_before_slot_insert() -> None:
23+
"""Inspect the find_leader source: ``add_done_callback`` must
24+
appear before ``self._find_leader_tasks[key] = task`` in the
25+
``if task is None or task.done():`` branch."""
26+
source = inspect.getsource(ClusterClient.find_leader)
27+
callback_idx = source.find("add_done_callback")
28+
slot_assign_idx = source.find("self._find_leader_tasks[key] = task")
29+
assert callback_idx > 0, "add_done_callback registration site must exist"
30+
assert slot_assign_idx > 0, "slot insert site must exist"
31+
assert callback_idx < slot_assign_idx, (
32+
"find_leader must register the done-callback before inserting "
33+
"the task into the shared slot map; otherwise a signal-driven "
34+
"interrupt between the two statements leaks an unobserved-"
35+
"exception task"
36+
)

0 commit comments

Comments
 (0)