What happens?
DuckDBPyResult::Fetchall() and Fetchmany() release/re-acquire the GIL for every row.
This GIL churn has a significant, and easily avoidable, performance impact.
A one-line change to Fetchone() will improve Fetchall() and Fetchmany() performance by 10-30% in many circumstances*.
* except cases where expensive per-value conversion costs dominate the runtime
Problem
DuckDBPyResult::Fetchall() and Fetchmany() are implemented as iteration over Fetchone(), which releases and re-acquires the GIL for every row, mostly unnecessarily (duckdb-python/src/duckdb_py/pyresult.cpp, lines 160 to 170 at af60d44):
py::list DuckDBPyResult::Fetchall() {
	py::list res;
	while (true) {
		auto fres = Fetchone();
		if (fres.is_none()) {
			break;
		}
		res.append(fres);
	}
	return res;
}
Fetchone() releases and re-acquires the GIL even when the current chunk still has rows (duckdb-python/src/duckdb_py/pyresult.cpp, lines 117 to 128 at af60d44):
Optional<py::tuple> DuckDBPyResult::Fetchone() {
	{
		D_ASSERT(py::gil_check());
		py::gil_scoped_release release;
		if (!result) {
			throw InvalidInputException("result closed");
		}
		if (!current_chunk || chunk_offset >= current_chunk->size()) {
			current_chunk = FetchNext(*result);
			chunk_offset = 0;
		}
	}
Suggested fix
The GIL should only be released when actually calling FetchNext() to get new chunks from the database, not for every row access within an already-fetched chunk.
The fix (my branch): main...paultiq:duckdb-pythonf:fetchone_release
Move the py::gil_scoped_release inside the current_chunk check, just before the FetchNext() call, as sketched below.
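A minimal sketch of the reworked prologue, assuming the member names from the snippet above (the authoritative diff is in the branch linked above):

Optional<py::tuple> DuckDBPyResult::Fetchone() {
	D_ASSERT(py::gil_check());
	if (!result) {
		throw InvalidInputException("result closed");
	}
	if (!current_chunk || chunk_offset >= current_chunk->size()) {
		// Only FetchNext() touches the database and can block, so only
		// this call needs to run without the GIL held.
		py::gil_scoped_release release;
		current_chunk = FetchNext(*result);
		chunk_offset = 0;
	}
	// Fast path: a row is already available in current_chunk, and the
	// GIL is never released or re-acquired.
	...

With this change, a Fetchall() over an already-materialized result releases the GIL roughly once per chunk rather than once per row.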
Impact
My ad-hoc testing showed a significant improvement: 10-30% under many conditions*.
For the test case shown below:
- Without change: 2.9s
- With change: 2.1s (30% better)
* the improvement is smaller on the slower paths in PythonObject::FromValue, i.e. those where conversion goes through ToString first.
To Reproduce
Test case
import duckdb
from time import perf_counter

with duckdb.connect() as conn:
    conn.execute("CREATE TABLE numbers AS SELECT * FROM range(10000000)")
    conn.execute("select * from numbers")
    start = perf_counter()
    r = conn.fetchall()
    end = perf_counter()
    print(f"Fetchall took {end - start:.6f} seconds")
OS:
Linux
DuckDB Package Version:
1.4.0 / dev
Python Version:
3.13
Full Name:
Paul T
Affiliation:
Iqmo
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a source build
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
Did you include all relevant configuration to reproduce the issue?