What happens?
With SET old_implicit_casting = true;, a list value is unexpectedly cast to a string, so list_extract returns incorrect results. This appears to be a regression in 1.4.x, since DuckDB 1.3.x works correctly.
Reproduce:
import duckdb
import pyarrow as pa
data = {
    "str_col": ["1-2", "3-4", "a-z", None]
}
arrow_table = pa.table(data)
conn = duckdb.connect(":memory:")
conn.execute("SET old_implicit_casting = true;")
conn.register("input", arrow_table)
result1 = conn.execute("SELECT list_extract(string_split(str_col, '-'), 1) FROM input").fetch_arrow_table()
print(result1)
In DuckDB 1.4.0 and 1.4.1, the output is:
pyarrow.Table
list_extract(string_split(str_col, '-'), 1): string
----
list_extract(string_split(str_col, '-'), 1): [["[","[","[",null]] # the first character "[" of the stringified list is returned instead of the first list element
In DuckDB 1.3.2, the output is:
pyarrow.Table
list_extract(string_split(str_col, '-'), 1): string
----
list_extract(string_split(str_col, '-'), 1): [["1","3","a",null]]
To Reproduce
import duckdb
import pyarrow as pa
data = {
    "str_col": ["1-2", "3-4", "a-z", None]
}
arrow_table = pa.table(data)
conn = duckdb.connect(":memory:")
conn.execute("SET old_implicit_casting = true;")
conn.register("input", arrow_table)
result1 = conn.execute("SELECT list_extract(string_split(str_col, '-'), 1) FROM input").fetch_arrow_table()
print(result1)
OS:
linux
DuckDB Version:
1.4.1
DuckDB Client:
Python
Hardware:
No response
Full Name:
Hongyu Shi
Affiliation:
Benchling
Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?
Did you include all code required to reproduce the issue?
Did you include all relevant data sets for reproducing the issue?
Yes