Commit 794f30e
## Which issue does this PR close?
- Part of #21684
## Rationale for this change
Introduce three new string array builders with bulk null tracking:
- `StringArrayBuilder` (Utf8)
- `LargeStringArrayBuilder` (LargeUtf8)
- `StringViewArrayBuilder` (Utf8View)
Each builder has the following API:
- `append_value(&str)` -- add a non-NULL value (row)
- `append_placeholder()` -- add a NULL row placeholder
- `finish(Option<NullBuffer>)` -- finish the build, specify NULLs
These are the counterparts of Arrow's `GenericStringBuilder` /
`StringViewBuilder`, but they skip per-row NULL buffer maintenance,
which lets callers compute the NULL buffer in bulk when possible.
This PR also switches `case_conversion`, which is used to implement
`lower`, `upper`, and the Spark equivalents, to the new APIs. This
improves `lower` / `upper` performance by 3-15% on microbenchmarks.
More UDFs (~10) will be converted to use this API in future PRs.
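
To illustrate the idea, here is a minimal std-only sketch of a builder with the same three-method shape. It is not the code added in this PR: `ToyStringArray`, `ToyStringBuilder`, `new`, and `upper_all` are hypothetical names, and a plain `Option<Vec<bool>>` stands in for Arrow's `NullBuffer`. The point it tries to show is that the append path never touches a validity bitmap, and a function like case conversion, whose output is NULL exactly where its input is NULL, can hand over the input's NULL information in one step at `finish`.

```rust
/// Toy stand-in for a Utf8 array (hypothetical, not an Arrow type):
/// concatenated string data, row offsets, and optional validity.
#[derive(Debug, PartialEq)]
pub struct ToyStringArray {
    pub data: String,
    pub offsets: Vec<usize>,
    /// `None` means "no NULLs"; `Some(v)` has `v[i] == false` for NULL rows.
    pub validity: Option<Vec<bool>>,
}

impl ToyStringArray {
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Returns `None` for NULL rows, otherwise the row's string slice.
    pub fn get(&self, i: usize) -> Option<&str> {
        match &self.validity {
            Some(v) if !v[i] => None,
            _ => Some(&self.data[self.offsets[i]..self.offsets[i + 1]]),
        }
    }
}

/// Toy builder (hypothetical) that tracks only data and offsets while rows
/// are appended; NULL information is supplied once, in bulk, at `finish`.
pub struct ToyStringBuilder {
    data: String,
    offsets: Vec<usize>,
}

impl ToyStringBuilder {
    pub fn new() -> Self {
        Self { data: String::new(), offsets: vec![0] }
    }

    /// Add a non-NULL row.
    pub fn append_value(&mut self, value: &str) {
        self.data.push_str(value);
        self.offsets.push(self.data.len());
    }

    /// Add an empty slot for a row that the caller will mark NULL at `finish`.
    pub fn append_placeholder(&mut self) {
        self.offsets.push(self.data.len());
    }

    /// Finish the build; the caller provides validity for all rows at once.
    pub fn finish(self, validity: Option<Vec<bool>>) -> ToyStringArray {
        if let Some(v) = &validity {
            assert_eq!(v.len(), self.offsets.len() - 1, "validity length must match row count");
        }
        ToyStringArray { data: self.data, offsets: self.offsets, validity }
    }
}

/// Upper-case every row. The output is NULL exactly where the input is NULL,
/// so the input validity is reused as-is instead of being rebuilt row by row.
pub fn upper_all(input: &ToyStringArray) -> ToyStringArray {
    let mut builder = ToyStringBuilder::new();
    for i in 0..input.len() {
        match input.get(i) {
            Some(s) => builder.append_value(&s.to_uppercase()),
            None => builder.append_placeholder(),
        }
    }
    builder.finish(input.validity.clone())
}
```

In this sketch the per-row work for a NULL is a single offset push, and the validity vector is cloned once per array rather than updated once per row.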
## What changes are included in this PR?
* Add new builders
* Add unit tests
* Adopt builders in `case_conversion`
## Are these changes tested?
Yes.
## Are there any user-facing changes?
No.
1 parent 7d5ddca commit 794f30e
2 files changed
Lines changed: 491 additions & 28 deletions