Commit ec69ddf
fix: string_to_array('', delim) returns empty array for PostgreSQL compatibility (apache#21104)
## Problem
`string_to_array` returned incorrect results for empty-string input,
both when the delimiter is non-empty and when the delimiter is itself
an empty string. This diverges from PostgreSQL behavior.
| Query | DataFusion (before) | PostgreSQL (expected) |
|---|---|---|
| `string_to_array('', ',')` | `['']` | `{}` |
| `string_to_array('', '')` | `['']` | `{}` |
| `string_to_array('', ',', 'x')` | `['']` | `{}` |
| `string_to_array('', '', 'x')` | `['']` | `{}` |
Results from datafusion-cli
<img width="1435" height="104" alt="Screenshot 2026-03-23 at 9 14 08 AM"
src="https://github.com/user-attachments/assets/2eaae366-7f8a-4220-87d2-f0b311c26712"
/>
**Root cause:** Rust's `str::split()` on an empty string always yields
one empty-string element, so `"".split(",")` produces `[""]`.
Additionally, the empty-delimiter branch unconditionally appended the
(empty) string value. The bug is easy to miss because Arrow's text display
format appears not to quote strings, so `[""]` renders as `[]`, which is
indistinguishable from an actual empty array. Using `cardinality()`
shows that the length before the fix is 1, not 0.
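The `str::split()` behavior described above can be reproduced in isolation. The following is a minimal sketch; `split_pieces` is a hypothetical helper introduced only for illustration and is not part of DataFusion.

```rust
/// Collects the pieces that `str::split` yields for `input`.
/// `split_pieces` is a hypothetical helper, not a DataFusion function.
fn split_pieces<'a>(input: &'a str, delimiter: &str) -> Vec<&'a str> {
    // `str::split` always yields at least one item, even for an empty input,
    // so an empty `input` produces a single empty-string piece.
    input.split(delimiter).collect()
}
```

Calling `split_pieces("", ",")` returns a vector holding one empty string, so its length is 1 rather than 0, which matches the `cardinality()` observation.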
**PostgreSQL reference:**
[db-fiddle](https://www.db-fiddle.com/f/oCF8EPaZFkDNKSg28rVVWy/3)
## Fix
In `datafusion/functions-nested/src/string.rs`:
- **Non-empty delimiter** `(Some(string), Some(delimiter))`: added an `if
!string.is_empty()` guard to skip splitting when the input is empty.
- **Empty delimiter** `(Some(string), Some(""))`: added an `if
!string.is_empty()` guard so the string value is only appended when it is
non-empty.
Both the plain variant and the `null_value` variant are fixed (4 checks
total).
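As a rough illustration of the two guards, the sketch below mirrors the described logic with a plain `Vec<String>` standing in for the Arrow list builder. The function name `split_non_null` and the simplified signature are assumptions made for this example. The actual code in `string.rs` operates on Arrow arrays and also handles the `null_value` variant, which is omitted here.

```rust
/// Simplified stand-in for the fixed logic. `split_non_null` is a
/// hypothetical name; the real code appends to Arrow builders instead.
fn split_non_null(string: &str, delimiter: &str) -> Vec<String> {
    let mut values = Vec::new();
    if delimiter.is_empty() {
        // Empty delimiter: the whole string becomes a single element,
        // but only when the string itself is non-empty.
        if !string.is_empty() {
            values.push(string.to_string());
        }
    } else if !string.is_empty() {
        // Non-empty delimiter: skip splitting entirely for empty input,
        // since `"".split(..)` would yield one empty piece.
        values.extend(string.split(delimiter).map(str::to_string));
    }
    values
}
```

With these guards, an empty input yields an empty vector for either kind of delimiter, while non-empty inputs behave as before.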
## Tests
Added sqllogictest cases in
`datafusion/sqllogictest/test_files/array.slt` using `cardinality()` to
unambiguously verify the arrays are truly empty (not just displaying as
empty):
```sql
SELECT cardinality(string_to_array('', ',')); -- 0
SELECT cardinality(string_to_array('', '')); -- 0
SELECT cardinality(string_to_array('', ',', 'x')); -- 0
SELECT cardinality(string_to_array('', '', 'x')); -- 0
```
Each test covers one of the four `is_empty` guard checks. All four fail
without the fix (returning 1 instead of 0).

1 parent 39d7cff, commit ec69ddf
2 files changed, +59 -35