Commit 53c8541
fix: string_to_array('', delim) returns empty array for PostgreSQL compatibility (apache#21104)
## Problem
`string_to_array` returned a one-element array containing an empty
string for empty-string input, both when the delimiter is non-empty and
when the delimiter is itself an empty string. PostgreSQL returns an
empty array in both cases.
| Query | DataFusion (before) | PostgreSQL (expected) |
|---|---|---|
| `string_to_array('', ',')` | `['']` | `{}` |
| `string_to_array('', '')` | `['']` | `{}` |
| `string_to_array('', ',', 'x')` | `['']` | `{}` |
| `string_to_array('', '', 'x')` | `['']` | `{}` |
Results from datafusion-cli
<img width="1435" height="104" alt="Screenshot 2026-03-23 at 9 14 08 AM"
src="https://github.com/user-attachments/assets/2eaae366-7f8a-4220-87d2-f0b311c26712"
/>
**Root cause:** Rust's `str::split()` on an empty string always yields
one empty-string element, so `"".split(",")` produces `[""]`.
Additionally, the empty-delimiter branch unconditionally appended the
(empty) string value. The bug is easy to miss because Arrow's text
display format appears not to quote strings, so `[""]` renders as `[]`,
which is indistinguishable from an actual empty array. Using
`cardinality()` reveals that the length before the fix was 1, not 0.
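The `str::split()` behavior described above can be reproduced in isolation. The snippet below is a minimal standalone sketch that uses only the Rust standard library. The helper name `naive_split` is hypothetical and the code is not taken from the DataFusion source.

```rust
/// Hypothetical helper: splits `input` on `delim` with no special
/// handling for empty input, mirroring the pre-fix behavior.
fn naive_split(input: &str, delim: &str) -> Vec<String> {
    // `str::split` yields at least one item, even when `input` is empty.
    input.split(delim).map(str::to_string).collect()
}
```

Calling `naive_split("", ",")` returns a vector of length 1 whose only element is the empty string, which is the value that `cardinality()` exposed.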
**PostgreSQL reference:**
[db-fiddle](https://www.db-fiddle.com/f/oCF8EPaZFkDNKSg28rVVWy/3)
## Fix
In `datafusion/functions-nested/src/string.rs`:
- **Non-empty delimiter** `(Some(string), Some(delimiter))`: added an `if !string.is_empty()` guard to skip splitting when the input is empty.
- **Empty delimiter** `(Some(string), Some(""))`: added an `if !string.is_empty()` guard so the string value is only appended when it is non-empty.
Both the plain variant and the `null_value` variant are fixed (4 checks
total).
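The guard logic above can be sketched as a standalone function. This is a simplified illustration under stated assumptions, not the actual code in `string.rs`: the name `split_to_vec` is hypothetical, it returns a plain `Vec<Option<String>>` rather than writing into an Arrow list builder, and it collapses the four per-branch guards into one early return. For non-empty input it assumes PostgreSQL's documented semantics: an empty delimiter treats the whole string as a single field, and fields equal to `null_value` become NULL.

```rust
/// Hypothetical helper illustrating the empty-input guard.
/// `None` in the result stands in for a SQL NULL element.
fn split_to_vec(
    string: &str,
    delimiter: &str,
    null_value: Option<&str>,
) -> Vec<Option<String>> {
    let mut out = Vec::new();
    // Guard: empty input produces an empty array for every delimiter.
    if string.is_empty() {
        return out;
    }
    // Map a field to NULL when it matches `null_value`.
    let to_elem = |s: &str| {
        if Some(s) == null_value {
            None
        } else {
            Some(s.to_string())
        }
    };
    if delimiter.is_empty() {
        // Empty delimiter: the whole string is a single element.
        out.push(to_elem(string));
    } else {
        out.extend(string.split(delimiter).map(to_elem));
    }
    out
}
```

With this sketch, all four combinations from the table (`""` with `","` or `""`, with or without a `null_value`) return an empty vector, while non-empty input is split as before.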
## Tests
Added sqllogictest cases in
`datafusion/sqllogictest/test_files/array.slt` using `cardinality()` to
unambiguously verify the arrays are truly empty (not just displaying as
empty):
```sql
SELECT cardinality(string_to_array('', ',')); -- 0
SELECT cardinality(string_to_array('', '')); -- 0
SELECT cardinality(string_to_array('', ',', 'x')); -- 0
SELECT cardinality(string_to_array('', '', 'x')); -- 0
```
Each test covers one of the four `is_empty` guard checks. All four fail
without the fix (returning 1 instead of 0).
(cherry picked from commit cdaecf0)

1 parent 0423028
2 files changed, with 59 additions and 35 deletions.