Skip to content

Commit 8df2c28

Browse files
fix(dbapi): bundle D — input validation, PEP 249 wrapping, cleanups
Nine issues from cycle 3 bundle D, all in the dbapi layer. Input validation: - ISSUE-82: fetchmany() rejects negative size with ProgrammingError rather than silently returning [] (range(-5) is empty). Mirror the arraysize-setter's >= 1 guard on the sync+async cursors. - ISSUE-86: _reject_non_sequence_params now rejects str/bytes/ bytearray/memoryview. They are iterable so qmark binding would silently "explode" into per-character binds — almost always a caller bug. Users should wrap in ``(value,)``. Exception wrapping to satisfy PEP 249 "all DB errors funnel through Error" contract: - ISSUE-84: DateFromTicks / TimeFromTicks / TimestampFromTicks wrap NaN/inf/out-of-range with DataError instead of leaking ValueError/OverflowError/OSError. - ISSUE-85: _call_client adds a trailing catch for DqliteError subclasses not enumerated above. A future client exception class cannot bypass PEP 249 wrapping. - ISSUE-102: _datetime_from_iso8601 wraps parse failures in DataError. - ISSUE-107: _datetime_from_unixtime wraps TypeError/OverflowError/ OSError/ValueError in DataError (malformed server payload). Interface completions: - ISSUE-80: add Cursor.rownumber / AsyncCursor.rownumber property (PEP 249 optional). Returns the 0-based index of the next row or None when no result set is active. - ISSUE-88: Time() and Timestamp() accept optional microsecond and tzinfo parameters for parity with stdlib datetime.time / datetime.datetime. Mixing driver Time() with stdlib time() no longer silently drops microsecond precision. Cleanups: - ISSUE-110: extract _is_row_returning(sql) helper at module level; both sync and async cursors import it. Previously duplicated inline in two places. - ISSUE-113: narrow the three ``except Exception: pass`` blocks in _cleanup_loop_thread to specific expected exceptions (RuntimeError). Wider suppression hid programmer bugs during finalization. Adds tests/test_cycle3_hardening.py with 32 regression tests across these issues. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 9974f33 commit 8df2c28

File tree

5 files changed

+372
-32
lines changed

5 files changed

+372
-32
lines changed

src/dqlitedbapi/aio/cursor.py

Lines changed: 17 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77
_call_client,
88
_convert_params,
99
_convert_row,
10-
_strip_leading_comments,
10+
_is_row_returning,
1111
)
1212
from dqlitedbapi.exceptions import InterfaceError, NotSupportedError, ProgrammingError
1313

@@ -64,6 +64,17 @@ def lastrowid(self) -> int | None:
6464
"""Row ID of the last inserted row."""
6565
return self._lastrowid
6666

67+
@property
68+
def rownumber(self) -> int | None:
69+
"""0-based index of the next row in the current result set.
70+
71+
PEP 249 optional extension. ``None`` when no result set is
72+
active (ISSUE-80).
73+
"""
74+
if self._description is None:
75+
return None
76+
return self._row_index
77+
6778
@property
6879
def arraysize(self) -> int:
6980
"""Number of rows to fetch at a time with fetchmany()."""
@@ -94,13 +105,7 @@ async def execute(
94105
conn = await self._connection._ensure_connection()
95106
params = _convert_params(parameters)
96107

97-
# Determine if this is a query that returns rows.
98-
# Note: WITH ... INSERT/UPDATE/DELETE (without RETURNING) will be
99-
# misrouted to query_raw_typed. This is a known limitation of the heuristic.
100-
normalized = _strip_leading_comments(operation).upper()
101-
is_query = normalized.startswith(("SELECT", "PRAGMA", "EXPLAIN", "WITH")) or (
102-
" RETURNING " in normalized or normalized.endswith(" RETURNING")
103-
)
108+
is_query = _is_row_returning(operation)
104109

105110
_, op_lock = self._connection._ensure_locks()
106111
async with op_lock:
@@ -191,6 +196,10 @@ async def fetchmany(self, size: int | None = None) -> list[tuple[Any, ...]]:
191196

192197
if size is None:
193198
size = self._arraysize
199+
if size < 0:
200+
# See sync Cursor.fetchmany: silently returning [] on a
201+
# negative size hides caller bugs (ISSUE-82).
202+
raise ProgrammingError(f"fetchmany size must be >= 0, got {size}")
194203

195204
result: list[tuple[Any, ...]] = []
196205
for _ in range(size):

src/dqlitedbapi/connection.py

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -40,17 +40,23 @@ def _cleanup_loop_thread(
4040
ResourceWarning,
4141
stacklevel=2,
4242
)
43+
# Narrow suppression to the specific exceptions loop/thread teardown
44+
# can legitimately raise during finalization (ISSUE-113). Wider
45+
# ``except Exception: pass`` would hide programmer bugs like a
46+
# missing attribute reference introduced during a refactor.
4347
try:
4448
if not loop.is_closed():
4549
loop.call_soon_threadsafe(loop.stop)
46-
except Exception:
50+
except RuntimeError:
51+
# Loop was closed between is_closed() and the threadsafe call.
4752
pass
48-
with contextlib.suppress(Exception):
53+
with contextlib.suppress(RuntimeError):
4954
thread.join(timeout=5)
5055
try:
5156
if not loop.is_closed():
5257
loop.close()
53-
except Exception:
58+
except RuntimeError:
59+
# Raised if the loop was somehow restarted mid-finalization.
5460
pass
5561

5662

src/dqlitedbapi/cursor.py

Lines changed: 65 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,14 @@ async def _call_client(coro: Coroutine[Any, Any, Any]) -> Any:
3535
client.ProtocolError → dbapi.InterfaceError
3636
client.DataError → dbapi.DataError
3737
client.InterfaceError → dbapi.InterfaceError
38+
any other DqliteError → dbapi.InterfaceError (ISSUE-85)
39+
40+
Every ``dqliteclient`` exception is a subclass of ``DqliteError``;
41+
the trailing catch-all ensures a new client exception class cannot
42+
bypass PEP 249 wrapping. PEP 249 requires all database-sourced
43+
errors to surface as ``Error`` subclasses — without the fallback, a
44+
future ``dqliteclient.CircuitOpenError`` or similar would leak past
45+
``except dqlitedbapi.Error`` boundaries.
3846
"""
3947
try:
4048
return await coro
@@ -50,6 +58,13 @@ async def _call_client(coro: Coroutine[Any, Any, Any]) -> Any:
5058
raise DataError(str(e)) from e
5159
except _client_exc.InterfaceError as e:
5260
raise _DbapiInterfaceError(str(e)) from e
61+
except _client_exc.DqliteError as e:
62+
# Catch-all for any future subclass of DqliteError not enumerated
63+
# above. Surface as InterfaceError rather than leaking to the
64+
# caller as a non-DBAPI exception (ISSUE-85).
65+
raise _DbapiInterfaceError(
66+
f"unrecognized client error ({type(e).__name__}): {e}"
67+
) from e
5368

5469

5570
if TYPE_CHECKING:
@@ -76,15 +91,24 @@ def _convert_row(row: Sequence[Any], column_types: Sequence[int]) -> tuple[Any,
7691

7792

7893
def _reject_non_sequence_params(params: Any) -> None:
79-
"""Reject mappings and unordered containers per PEP 249 qmark rules.
94+
"""Reject mappings, unordered containers, and str/bytes per PEP 249 qmark rules.
8095
8196
PEP 249: for ``qmark`` paramstyle "the sequence is mandatory and the
8297
driver will not accept mappings." We also reject ``set`` / ``frozenset``
8398
— they are sequences structurally but unordered, which silently
84-
scrambles positional bindings.
99+
scrambles positional bindings. And we reject ``str`` /
100+
``bytes`` / ``bytearray`` / ``memoryview`` — they are iterable, so
101+
they would silently "explode" into character/byte binds and the
102+
caller almost always meant ``(value,)`` instead (ISSUE-86).
85103
"""
86104
if params is None:
87105
return
106+
if isinstance(params, (str, bytes, bytearray, memoryview)):
107+
raise ProgrammingError(
108+
f"parameters must be a sequence of values, not "
109+
f"{type(params).__name__!r}; did you mean to pass a tuple "
110+
f"like (value,) with a single element?"
111+
)
88112
if isinstance(params, Mapping):
89113
raise ProgrammingError(
90114
"qmark paramstyle requires a sequence; got a mapping. "
@@ -124,6 +148,26 @@ def _strip_leading_comments(sql: str) -> str:
124148
return s
125149

126150

151+
_ROW_RETURNING_PREFIXES = ("SELECT", "PRAGMA", "EXPLAIN", "WITH")
152+
153+
154+
def _is_row_returning(sql: str) -> bool:
155+
"""Heuristic for "does this statement return a result set?"
156+
157+
Single source of truth for sync and async cursors (ISSUE-110).
158+
Matches leading SELECT/PRAGMA/EXPLAIN/WITH after stripping comments,
159+
and catches trailing/embedded RETURNING clauses on DML.
160+
161+
Note: ``WITH ... INSERT/UPDATE/DELETE`` (no RETURNING) will be
162+
misclassified as a query. This is a known limitation of a
163+
prefix-only check — a full SQL parser is out of scope.
164+
"""
165+
normalized = _strip_leading_comments(sql).upper()
166+
if normalized.startswith(_ROW_RETURNING_PREFIXES):
167+
return True
168+
return " RETURNING " in normalized or normalized.endswith(" RETURNING")
169+
170+
127171
class Cursor:
128172
"""PEP 249 compliant database cursor."""
129173

@@ -183,6 +227,19 @@ def lastrowid(self) -> int | None:
183227
"""Row ID of the last inserted row."""
184228
return self._lastrowid
185229

230+
@property
231+
def rownumber(self) -> int | None:
232+
"""0-based index of the next row in the current result set.
233+
234+
PEP 249 optional extension: returns ``None`` if no result set is
235+
active (no query executed, or last statement was DML without
236+
RETURNING); otherwise returns the index of the row that the next
237+
``fetchone()`` would produce (ISSUE-80).
238+
"""
239+
if self._description is None:
240+
return None
241+
return self._row_index
242+
186243
@property
187244
def arraysize(self) -> int:
188245
"""Number of rows to fetch at a time with fetchmany()."""
@@ -216,15 +273,7 @@ async def _execute_async(self, operation: str, parameters: Sequence[Any] | None
216273
conn = await self._connection._get_async_connection()
217274
params = _convert_params(parameters)
218275

219-
# Determine if this is a query that returns rows.
220-
# Note: WITH ... INSERT/UPDATE/DELETE (without RETURNING) will be
221-
# misrouted to query_raw_typed. This is a known limitation of the heuristic.
222-
normalized = _strip_leading_comments(operation).upper()
223-
is_query = normalized.startswith(("SELECT", "PRAGMA", "EXPLAIN", "WITH")) or (
224-
" RETURNING " in normalized or normalized.endswith(" RETURNING")
225-
)
226-
227-
if is_query:
276+
if _is_row_returning(operation):
228277
columns, column_types, rows = await _call_client(
229278
conn.query_raw_typed(operation, params)
230279
)
@@ -320,6 +369,11 @@ def fetchmany(self, size: int | None = None) -> list[tuple[Any, ...]]:
320369

321370
if size is None:
322371
size = self._arraysize
372+
if size < 0:
373+
# Previously ``range(-5)`` silently returned [] — hid caller
374+
# bugs (ISSUE-82). ``arraysize`` setter already validates
375+
# >= 1; mirror that here.
376+
raise ProgrammingError(f"fetchmany size must be >= 0, got {size}")
323377

324378
result: list[tuple[Any, ...]] = []
325379
for _ in range(size):

src/dqlitedbapi/types.py

Lines changed: 67 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,10 @@
11
"""PEP 249 type objects and constructors for dqlite."""
22

33
import datetime
4+
import math
45
from typing import Any
56

7+
from dqlitedbapi.exceptions import DataError
68
from dqlitewire.constants import ValueType
79

810

@@ -12,31 +14,74 @@ def Date(year: int, month: int, day: int) -> datetime.date: # noqa: N802
1214
return datetime.date(year, month, day)
1315

1416

15-
def Time(hour: int, minute: int, second: int) -> datetime.time: # noqa: N802
16-
"""Construct a time value."""
17-
return datetime.time(hour, minute, second)
17+
def Time( # noqa: N802
18+
hour: int,
19+
minute: int,
20+
second: int,
21+
microsecond: int = 0,
22+
tzinfo: datetime.tzinfo | None = None,
23+
) -> datetime.time:
24+
"""Construct a time value.
25+
26+
Accepts optional ``microsecond`` and ``tzinfo`` for parity with
27+
stdlib ``datetime.time`` (ISSUE-88). PEP 249 does not require this,
28+
but mixing the driver's ``Time()`` with ``datetime.time`` would
29+
otherwise drop sub-second precision silently.
30+
"""
31+
return datetime.time(hour, minute, second, microsecond, tzinfo=tzinfo)
1832

1933

2034
def Timestamp( # noqa: N802
21-
year: int, month: int, day: int, hour: int, minute: int, second: int
35+
year: int,
36+
month: int,
37+
day: int,
38+
hour: int,
39+
minute: int,
40+
second: int,
41+
microsecond: int = 0,
42+
tzinfo: datetime.tzinfo | None = None,
2243
) -> datetime.datetime:
2344
"""Construct a timestamp value."""
24-
return datetime.datetime(year, month, day, hour, minute, second)
45+
return datetime.datetime(year, month, day, hour, minute, second, microsecond, tzinfo=tzinfo)
46+
47+
48+
def _validate_ticks(ticks: float) -> None:
49+
"""Reject NaN/inf before handing to datetime.fromtimestamp.
50+
51+
``fromtimestamp`` raises different stdlib exceptions depending on
52+
the failure mode (``ValueError`` for NaN on some platforms,
53+
``OverflowError`` / ``OSError`` for out-of-range). Guard up front so
54+
the caller always sees a single DB-API ``DataError`` (ISSUE-84).
55+
"""
56+
if isinstance(ticks, float) and not math.isfinite(ticks):
57+
raise DataError(f"Invalid timestamp ticks: {ticks}")
2558

2659

2760
def DateFromTicks(ticks: float) -> datetime.date: # noqa: N802
2861
"""Construct a date from a Unix timestamp."""
29-
return datetime.date.fromtimestamp(ticks)
62+
_validate_ticks(ticks)
63+
try:
64+
return datetime.date.fromtimestamp(ticks)
65+
except (OverflowError, OSError, ValueError) as e:
66+
raise DataError(f"Invalid timestamp ticks {ticks}: {e}") from e
3067

3168

3269
def TimeFromTicks(ticks: float) -> datetime.time: # noqa: N802
3370
"""Construct a time from a Unix timestamp."""
34-
return datetime.datetime.fromtimestamp(ticks).time()
71+
_validate_ticks(ticks)
72+
try:
73+
return datetime.datetime.fromtimestamp(ticks).time()
74+
except (OverflowError, OSError, ValueError) as e:
75+
raise DataError(f"Invalid timestamp ticks {ticks}: {e}") from e
3576

3677

3778
def TimestampFromTicks(ticks: float) -> datetime.datetime: # noqa: N802
3879
"""Construct a timestamp from a Unix timestamp."""
39-
return datetime.datetime.fromtimestamp(ticks)
80+
_validate_ticks(ticks)
81+
try:
82+
return datetime.datetime.fromtimestamp(ticks)
83+
except (OverflowError, OSError, ValueError) as e:
84+
raise DataError(f"Invalid timestamp ticks {ticks}: {e}") from e
4085

4186

4287
def Binary(data: bytes) -> bytes: # noqa: N802
@@ -161,6 +206,11 @@ def _datetime_from_iso8601(text: str) -> datetime.datetime | None:
161206
PEP 249 NULL semantics.
162207
163208
Naive input round-trips as naive; aware input preserves the offset.
209+
210+
A malformed string from the server (bug, corruption, or MitM) would
211+
otherwise escape as a raw ``ValueError``; wrap as ``DataError`` to
212+
satisfy PEP 249's "all DB errors funnel through Error" contract
213+
(ISSUE-102).
164214
"""
165215
if not text:
166216
return None
@@ -172,7 +222,7 @@ def _datetime_from_iso8601(text: str) -> datetime.datetime | None:
172222
try:
173223
d = datetime.date.fromisoformat(s)
174224
except ValueError as exc:
175-
raise ValueError(f"Cannot parse ISO 8601 datetime: {text!r}") from exc
225+
raise DataError(f"Cannot parse ISO 8601 datetime from server: {text!r}") from exc
176226
return datetime.datetime(d.year, d.month, d.day)
177227

178228

@@ -181,8 +231,15 @@ def _datetime_from_unixtime(value: int) -> datetime.datetime:
181231
182232
UNIXTIME is unambiguously seconds-since-epoch in UTC, so returning a
183233
UTC-aware value is faithful. Callers that want local time can convert.
234+
235+
A corrupt server or MitM-modified bytes could deliver a non-integer
236+
or out-of-range value; wrap the resulting stdlib exceptions as
237+
``DataError`` (ISSUE-107).
184238
"""
185-
return datetime.datetime.fromtimestamp(value, tz=datetime.UTC)
239+
try:
240+
return datetime.datetime.fromtimestamp(value, tz=datetime.UTC)
241+
except (TypeError, OverflowError, OSError, ValueError) as e:
242+
raise DataError(f"Invalid UNIXTIME from server: {value!r}") from e
186243

187244

188245
def _convert_bind_param(value: Any) -> Any:

0 commit comments

Comments
 (0)