Merged
476d149 to 6955b72
Adds a new `arrow_try_cast(expr, 'DataType')` function that casts to Arrow data types specified as strings (like `arrow_cast`) but returns NULL on cast failure instead of erroring (like `try_cast`). The implementation reuses `arrow_cast`'s `data_type_from_args` helper and simplifies to `Expr::TryCast` during optimization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6955b72 to 3610364
jonahgao (Member) approved these changes on Mar 24, 2026 and left a comment:
LGTM, I think this is a very practical UDF.
A minor improvement might be to add more tests. The current test cases are all evaluated at compile time via constant folding. Maybe we need a test like
select arrow_try_cast(a, 'Int64') from values('100'), (NULL), ('foo') t(a);
This would evaluate the cast during physical execution.
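For readers unfamiliar with try-cast semantics, the three rows in the suggested query can be modelled with a minimal Rust sketch. This is only an illustration using the standard library, not DataFusion's cast kernel: the function name `try_cast_to_i64` is hypothetical, `None` stands in for SQL NULL, and Rust's `str::parse` is assumed to be a reasonable stand-in for string-to-Int64 conversion.

```rust
/// Toy model of try-casting a nullable string column to Int64.
/// `None` represents SQL NULL. Hypothetical helper, not DataFusion code.
fn try_cast_to_i64(column: &[Option<&str>]) -> Vec<Option<i64>> {
    column
        .iter()
        .map(|&value| {
            // A NULL input stays NULL; a value that cannot be parsed
            // becomes NULL instead of raising an error.
            value.and_then(|s| s.parse::<i64>().ok())
        })
        .collect()
}
```

Under these assumptions, calling `try_cast_to_i64(&[Some("100"), None, Some("foo")])` yields one converted value followed by two NULLs, which is the row-by-row behaviour a runtime test would exercise.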
Add tests using VALUES clauses so arrow_try_cast is evaluated at runtime rather than being constant-folded at compile time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor
Author
Thanks @jonahgao!
de-bgunter pushed a commit to de-bgunter/datafusion that referenced this pull request on Mar 24, 2026
## Which issue does this PR close?

N/A - new feature

## Rationale for this change

`arrow_cast(expr, 'DataType')` casts to Arrow data types specified as strings but errors on failure. `try_cast(expr AS type)` returns NULL on failure but only works with SQL types. There's currently no way to attempt a cast to a specific Arrow type and get NULL on failure instead of an error.

## What changes are included in this PR?

Adds a new `arrow_try_cast(expression, datatype)` scalar function that combines the behavior of `arrow_cast` and `try_cast`:

- Accepts Arrow data type strings (like `arrow_cast`)
- Returns NULL on cast failure instead of erroring (like `try_cast`)

Implementation details:

- Reuses `arrow_cast`'s `data_type_from_args` helper (made `pub(crate)`)
- Simplifies to `Expr::TryCast` during optimization (vs `Expr::Cast` for `arrow_cast`)
- Registered alongside existing core functions

## Are these changes tested?

Yes, new sqllogictest file `arrow_try_cast.slt` covering:

- Successful casts (Int64, Float64, LargeUtf8, Dictionary)
- Failed cast returning NULL
- Same-type passthrough
- NULL input
- Invalid type string errors
- Multiple casts in one query

## Are there any user-facing changes?

New `arrow_try_cast` SQL function available.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
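The simplification step described under "Implementation details" can be sketched with a toy expression tree. Everything here is hypothetical and for illustration only: `ToyExpr` and `simplify` are not DataFusion's `Expr` or its simplifier API, and the target type is kept as a plain string, whereas the PR states that the real code resolves it through the `data_type_from_args` helper.

```rust
/// Hypothetical, simplified expression tree (not DataFusion's real `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Column(String),
    Literal(String),
    /// Strict cast: the query errors when a value cannot be converted.
    Cast { expr: Box<ToyExpr>, data_type: String },
    /// Lenient cast: yields NULL when a value cannot be converted.
    TryCast { expr: Box<ToyExpr>, data_type: String },
    ScalarFunction { name: String, args: Vec<ToyExpr> },
}

/// Rewrite `arrow_cast` / `arrow_try_cast` calls into cast nodes,
/// leaving every other expression unchanged.
fn simplify(expr: ToyExpr) -> ToyExpr {
    match expr {
        ToyExpr::ScalarFunction { name, args } => {
            // Both functions take (expression, 'DataType' string literal).
            if let [inner, ToyExpr::Literal(data_type)] = args.as_slice() {
                let input = Box::new(inner.clone());
                let data_type = data_type.clone();
                match name.as_str() {
                    "arrow_cast" => {
                        return ToyExpr::Cast { expr: input, data_type };
                    }
                    "arrow_try_cast" => {
                        return ToyExpr::TryCast { expr: input, data_type };
                    }
                    _ => {}
                }
            }
            // Not a recognised cast call: keep the function call as it was.
            ToyExpr::ScalarFunction { name, args }
        }
        other => other,
    }
}
```

In this sketch the two functions differ only in which node they produce, which mirrors the PR's description: once the call is rewritten, the existing cast machinery decides whether a failed conversion errors or becomes NULL.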