Commit c2c0773
feat(unparser): Keep inner join…
## Which issue does this PR close?
Partially addresses #13156
(inner joins only; outer joins require additional work)
## Rationale for this change
When the DataFusion optimizer pushes filter predicates into `TableScan`
nodes (e.g. via `FilterPushdown`), the unparser's
`try_transform_to_simple_table_scan_with_filters` extracts those filters
and then always folds them into the `JOIN ON` clause. This is
problematic when the extracted filters contain subquery expressions
(scalar subqueries, `IN`, `EXISTS`), because some SQL backends — notably
BigQuery — reject subqueries inside `JOIN ON` predicates.
This currently breaks 5 TPC-H queries (Q2, Q16, Q17, Q18, Q21) when
unparsed SQL is sent to BigQuery.
An earlier attempt (#13496) fixed this by moving **all** filters to
`WHERE`, which broke `LEFT`/`RIGHT`/`FULL` join semantics: moving a
filter from `ON` to `WHERE` changes the result for outer joins, as
demonstrated in #13132.
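
The semantic difference described above can be reproduced with a toy nested-loop join. This is a minimal sketch, not DataFusion code: `Placement` and `join` are hypothetical names introduced only for illustration, rows are plain integers, and the join key is the value itself. A NULL-padded row is modeled as `None` and fails the `WHERE` predicate, as a comparison against NULL would in SQL.

```rust
/// Where the extra (non-key) predicate is evaluated. Hypothetical type.
#[derive(Clone, Copy, PartialEq)]
enum Placement {
    On,
    Where,
}

/// Toy nested-loop join on key equality (hypothetical helper).
/// With `keep_unmatched_left = true` it behaves like a LEFT JOIN,
/// otherwise like an INNER JOIN. `pred` is applied to the right-hand row.
fn join(
    left: &[i32],
    right: &[i32],
    keep_unmatched_left: bool,
    placement: Placement,
    pred: impl Fn(i32) -> bool,
) -> Vec<(i32, Option<i32>)> {
    let mut out = Vec::new();
    for &l in left {
        let mut matched = false;
        for &r in right {
            // ON condition: key equality, plus the predicate if it lives in ON.
            let on_ok = l == r && (placement == Placement::Where || pred(r));
            if on_ok {
                out.push((l, Some(r)));
                matched = true;
            }
        }
        // Outer join: keep the left row, padded with NULL (None).
        if keep_unmatched_left && !matched {
            out.push((l, None));
        }
    }
    if placement == Placement::Where {
        // WHERE runs after the join; NULL-padded rows do not satisfy it.
        out.retain(|&(_, r)| match r {
            Some(v) => pred(v),
            None => false,
        });
    }
    out
}
```

Calling `join` with `keep_unmatched_left = false` gives the same rows for both placements, while `keep_unmatched_left = true` keeps the NULL-padded row only when the predicate is in `ON`.
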
## What changes are included in this PR?
For **inner joins only**, `table_scan_filters` extracted by
`try_transform_to_simple_table_scan_with_filters` are now placed in the
`WHERE` clause instead of the `JOIN ON` clause. This is safe because
`ON` and `WHERE` are semantically equivalent for inner joins.
For non-inner joins (`LEFT`, `RIGHT`, `FULL`), the existing behavior is
preserved — filters remain in `JOIN ON` — since moving them to `WHERE`
would change query semantics.
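
The placement rule above can be sketched as follows. This is an illustration under stated assumptions, not the unparser's actual code: `JoinKind`, `JoinClauses`, `place_filters`, and `render` are hypothetical names, predicates are plain strings, and the rendered SQL shape is only indicative of what the unparser emits.

```rust
/// Join types relevant to the rule (hypothetical enum, not DataFusion's).
#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinKind {
    Inner,
    Left,
    Right,
    Full,
}

/// Predicates destined for `JOIN ON` and for `WHERE` (hypothetical struct).
#[derive(Debug, PartialEq)]
struct JoinClauses {
    on: Vec<String>,
    where_clause: Vec<String>,
}

/// Route filters extracted from a table scan:
/// inner joins send them to WHERE, all other joins keep them in ON.
fn place_filters(
    kind: JoinKind,
    join_on: Vec<String>,
    table_scan_filters: Vec<String>,
) -> JoinClauses {
    let mut clauses = JoinClauses {
        on: join_on,
        where_clause: Vec::new(),
    };
    match kind {
        JoinKind::Inner => clauses.where_clause.extend(table_scan_filters),
        JoinKind::Left | JoinKind::Right | JoinKind::Full => {
            clauses.on.extend(table_scan_filters)
        }
    }
    clauses
}

/// Render a simplified SELECT so the effect on the SQL text is visible.
fn render(left: &str, right: &str, kind: JoinKind, clauses: &JoinClauses) -> String {
    let keyword = match kind {
        JoinKind::Inner => "INNER JOIN",
        JoinKind::Left => "LEFT JOIN",
        JoinKind::Right => "RIGHT JOIN",
        JoinKind::Full => "FULL JOIN",
    };
    let mut sql = format!(
        "SELECT * FROM {} {} {} ON {}",
        left,
        keyword,
        right,
        clauses.on.join(" AND ")
    );
    if !clauses.where_clause.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.where_clause.join(" AND "));
    }
    sql
}
```

With a scalar-subquery filter on the right side, the inner join renders the subquery after `WHERE`, while a `LEFT` join still renders it inside `ON`.
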
## Are these changes tested?
Yes.
- Added a test case in `test_join_with_table_scan_filters` that
constructs an inner join where the right side has a
`table_scan_with_filters` containing a scalar subquery. It verifies that
the subquery predicate appears in `WHERE`, not `JOIN ON`.
- Updated existing snapshots in `test_join_with_table_scan_filters`
to reflect that `table_scan_filters` now appear in `WHERE` for inner
joins.
## Are there any user-facing changes?
SQL generated by the unparser for inner joins may now place `TableScan`
pushdown filters in the `WHERE` clause instead of the `JOIN ON` clause,
as the updated snapshots in `test_join_with_table_scan_filters` show.
## Alternatives considered
One alternative considered was a
`supports_subquery_in_join_predicate` dialect flag that moves only
subquery-containing filters to `WHERE` when the dialect opts in (e.g.
`BigQueryDialect`), preserving existing behavior for all other dialects.
Example implementation: spiceai#151
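
For comparison, the dialect-flag alternative might look roughly like the sketch below. Only the flag name `supports_subquery_in_join_predicate` comes from the description above; the trait, the two dialect structs, `FilterSketch`, and `split_filters` are hypothetical, and the assumption that the flag defaults to `true` with the BigQuery-like dialect returning `false` is mine. Join-type handling is left out, so this is not a description of spiceai#151.

```rust
/// Hypothetical stand-in for a dialect trait carrying the proposed flag.
trait DialectSketch {
    /// Assumed default: subqueries inside JOIN ON are accepted.
    fn supports_subquery_in_join_predicate(&self) -> bool {
        true
    }
}

struct DefaultDialectSketch;
impl DialectSketch for DefaultDialectSketch {}

/// Models a backend that rejects subqueries in JOIN ON.
struct BigQueryDialectSketch;
impl DialectSketch for BigQueryDialectSketch {
    fn supports_subquery_in_join_predicate(&self) -> bool {
        false
    }
}

/// A filter rendered to SQL text, tagged with whether it contains a subquery.
struct FilterSketch {
    sql: String,
    has_subquery: bool,
}

/// Returns `(on, where)`: only subquery-containing filters move to WHERE,
/// and only when the dialect cannot take them in JOIN ON.
fn split_filters(
    dialect: &dyn DialectSketch,
    filters: Vec<FilterSketch>,
) -> (Vec<String>, Vec<String>) {
    let move_subqueries = !dialect.supports_subquery_in_join_predicate();
    let (to_where, to_on): (Vec<FilterSketch>, Vec<FilterSketch>) = filters
        .into_iter()
        .partition(|f| move_subqueries && f.has_subquery);
    (
        to_on.into_iter().map(|f| f.sql).collect(),
        to_where.into_iter().map(|f| f.sql).collect(),
    )
}
```

Compared with the unconditional inner-join rule, this adds a per-dialect branch and a per-filter subquery check, which is the extra complexity the chosen approach avoids.
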
The current approach was chosen for:
- simplicity: no new dialect flag, fewer code paths, and a small change.
- performance (potentially): as noted in #13156, placing filters in
`WHERE` can trigger filter pushdown on the target backend, which is a
potential performance win.

…Filter → TableScan predicates to WHERE instead of moving to JOIN ON (#21694)
1 parent bbf67d9 commit c2c0773
2 files changed
Lines changed: 157 additions & 31 deletions