Commit f802ed1
Add protobuf serialization/deserialization support for EmptyTable scans (#20844)
## Which issue does this PR close?
I figured it would be easier to submit the PR right away, as the change
doesn't look controversial. I'm happy to create an issue and link it
here if you'd prefer.
## Rationale for this change
The short story: in another project we'd like to use DataFusion to
"build" operations on data and then submit the resulting logical plan
_somewhere_ for execution (likely not using DF to actually execute the
query). Since those plans are never meant to be executed by DF, we use
`EmptyTable` as a base to bring a schema into DF without any actual
data. Because `EmptyTable` scans are not serializable, we can't send
those plans to Python or over the wire. I believe this change makes
DataFusion's `LogicalPlan` more portable and more usable outside of
DataFusion's query executor.
The longer story: [VegaFusion](https://github.com/vega/vegafusion) does
server-side aggregation for Vega charts and is powered by DataFusion. We
recently added an option to [use custom query/plan
executors](vega/vegafusion#573), which lets the user pass a schema
(without data) to VegaFusion; VegaFusion then adds all necessary
aggregations (without executing them) and returns a logical plan to the
user. They can then hand this plan off to a custom query executor
(e.g. Spark). This is already implemented and works. However, since
VegaFusion is most commonly used through its Python bindings, we'd like
to expose this API to Python too (and additionally as part of the gRPC
API), which requires serializing the built plans to protobuf. Currently
we use `EmptyTable` to bring a schema without any data into DataFusion,
but since it can't be converted to protobuf, we're unable to expose this
API. We considered providing a custom decoder/encoder, but that would
work only for the gRPC case and not for Python, since datafusion-python
doesn't allow providing a custom decoder, as far as I understand.
## What changes are included in this PR?
* Moved `EmptyTable` from `datafusion-core` into `datafusion-catalog`
and added a backwards-compatible re-export (following the pattern used
for other table providers moved earlier)
* Added a new `EmptyTableScanNode` to the protobuf definitions
* Added encoding and decoding for the new node to the `AsLogicalPlan for
LogicalPlanNode` implementation
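
To illustrate the encode/decode pairing described in the last two bullets, here is a minimal, std-only sketch of a schema-only scan node that is written to bytes and read back. It is a toy model, not the code from this PR: `EmptyScanSketch`, its fields, the length-prefixed byte layout, and the `encode`/`decode` helpers are all hypothetical, and the actual change uses generated protobuf types rather than a hand-rolled format.

```rust
/// Hypothetical stand-in for a schema-only scan node. The field names are
/// illustrative and are not taken from DataFusion's protobuf definitions.
#[derive(Debug, Clone, PartialEq)]
pub struct EmptyScanSketch {
    pub table_name: String,
    /// (column name, type name) pairs standing in for a real schema.
    pub columns: Vec<(String, String)>,
}

// Append a string as a u32 little-endian length followed by its UTF-8 bytes.
fn put_str(buf: &mut Vec<u8>, s: &str) {
    buf.extend_from_slice(&(s.len() as u32).to_le_bytes());
    buf.extend_from_slice(s.as_bytes());
}

// Read a u32 (little-endian) at `*pos` and advance the cursor.
fn get_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let s = bytes.get(*pos..end)?;
    *pos = end;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

// Read a length-prefixed UTF-8 string at `*pos` and advance the cursor.
fn get_str(bytes: &[u8], pos: &mut usize) -> Option<String> {
    let len = get_u32(bytes, pos)? as usize;
    let end = pos.checked_add(len)?;
    let s = std::str::from_utf8(bytes.get(*pos..end)?).ok()?.to_owned();
    *pos = end;
    Some(s)
}

/// Encode the node: table name, column count, then each (name, type) pair.
pub fn encode(node: &EmptyScanSketch) -> Vec<u8> {
    let mut buf = Vec::new();
    put_str(&mut buf, &node.table_name);
    buf.extend_from_slice(&(node.columns.len() as u32).to_le_bytes());
    for (name, ty) in &node.columns {
        put_str(&mut buf, name);
        put_str(&mut buf, ty);
    }
    buf
}

/// Decode the node; returns `None` on truncated, malformed, or trailing input.
pub fn decode(bytes: &[u8]) -> Option<EmptyScanSketch> {
    let mut pos = 0;
    let table_name = get_str(bytes, &mut pos)?;
    let count = get_u32(bytes, &mut pos)? as usize;
    let mut columns = Vec::new();
    for _ in 0..count {
        let name = get_str(bytes, &mut pos)?;
        let ty = get_str(bytes, &mut pos)?;
        columns.push((name, ty));
    }
    if pos == bytes.len() {
        Some(EmptyScanSketch { table_name, columns })
    } else {
        None
    }
}
```

A roundtrip check in this model is simply that `decode(&encode(&node))` returns `Some(node)` for any node, which is the same property a roundtrip test for a new plan node is meant to pin down.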
## Are these changes tested?
I added two roundtrip tests for the new node.
## Are there any user-facing changes?
`EmptyTable` can now be imported from the `datafusion-catalog` crate,
but the old crate (`datafusion-core`) still re-exports it, so this
shouldn't be a breaking change.
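
As a rough illustration of why a re-export keeps old imports working, the sketch below uses two modules in place of the two crates. The module names `new_home` and `old_home` and the toy `EmptyTable` fields are hypothetical; the real type lives in the DataFusion crates and holds an Arrow schema.

```rust
// Stand-in for the crate that now owns the type (`datafusion-catalog` in this PR).
pub mod new_home {
    /// Toy schema-only table used only for this illustration.
    #[derive(Debug, Clone, PartialEq)]
    pub struct EmptyTable {
        pub column_names: Vec<String>,
    }

    impl EmptyTable {
        pub fn new(column_names: Vec<String>) -> Self {
            Self { column_names }
        }
    }
}

// Stand-in for the crate that used to own it (`datafusion-core` in this PR).
pub mod old_home {
    // The re-export keeps the old path valid; both paths name the same type.
    pub use super::new_home::EmptyTable;
}
```

Because `old_home::EmptyTable` is the same type as `new_home::EmptyTable` and not a copy, values created through either path are interchangeable, so existing callers compile unchanged.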
P.S. Just to be explicit, the code itself was written mostly by an LLM
(as I'm not that proficient in Rust yet). I did review and test it
though.

1 parent af7904f, commit f802ed1 (#20844)
8 files changed
Lines changed: 281 additions & 8 deletions
File tree
- datafusion
  - catalog/src
  - core/src/datasource
  - proto
    - proto
    - src
      - generated
      - logical_plan
    - tests/cases