fix: linearized operands in physical binaryexpr protobuf to avoid recursion limit#21031
fix: linearized operands in physical binaryexpr protobuf to avoid recursion limit#21031Jefffrey merged 12 commits intoapache:mainfrom
Conversation
| PhysicalExprNode l = 1; | ||
| PhysicalExprNode r = 2; |
There was a problem hiding this comment.
should we delete it or keep it for backward compatibility
There was a problem hiding this comment.
I don't think we have a stance on maintaining backward compatibility yet; so far we haven't bothered, though there is this issue raised to discuss the possibility:
I believe recently DataFusion is trying to cut down on breaking changes so perhaps it's worth to keep this backwards compatibility with that in mind
Jefffrey
left a comment
There was a problem hiding this comment.
Makes sense to me, if we've already applied this logic to logical side and are just bringing it to physical
| PhysicalExprNode l = 1; | ||
| PhysicalExprNode r = 2; |
There was a problem hiding this comment.
I don't think we have a stance on maintaining backward compatibility yet; so far we haven't bothered, though there is this issue raised to discuss the possibility:
I believe recently DataFusion is trying to cut down on breaking changes so perhaps it's worth to keep this backwards compatibility with that in mind
|
Thanks for your reviews @Jefffrey ❤️ |
|
Thanks @haohuaijin |
Which issue does this PR close?
Rationale for this change
When a SQL query contains many filter conditions (e.g., 40+
AND/ORclauses in aWHERE), serializing the physical plan to protobuf and deserializing it fails withDecodeError: recursion limit reached. This is because prost has a default recursion limit of 100, and eachBinaryExprnesting consumes ~2 levels of protobuf recursion depth, so a chain of ~50 AND conditions exceeds the limit.What changes are included in this PR?
Applied the same linearization approach that logical expressions already use that convert a left-deep tree to linearization list. Instead of encoding a chain of same-operator binary expressions as a deeply nested tree, we flatten it into a flat
operandslist:Before (nested, O(n) recursion depth):
After (flat, O(1) recursion depth for the chain):
Are these changes tested?
yes, add some test case
Are there any user-facing changes?