Commit 3aefba7
fix: fix elapsed_compute metric in ParquetSink to report encoding time only (#21825)
## Which issue does this PR close?
- Closes #21797.
## Rationale for this change
`ParquetSink` registered an `elapsed_compute` metric using a single
wall-clock timer that spanned the entire write operation — upstream
batch wait, CPU Arrow→Parquet encoding, and object-store I/O all rolled
into one number. This
made the metric misleading: it inflated `elapsed_compute` with I/O
latency, which is inconsistent with how every other operator in
DataFusion reports this metric (CPU time only).
## What changes are included in this PR?
Two write paths are fixed independently:
**Sequential path** (`allow_single_file_parallelism = false` or CDC
enabled):
- A new `TimingWriter<W>` wrapper implements `AsyncFileWriter` and
records wall-clock time spent in I/O calls (`write` / `complete`).
- The total time inside `writer.write()` and `writer.close()` is
accumulated in `total_write_time`. After all tasks join,
`elapsed_compute` is set to `total_write_time − io_time`, isolating pure
Arrow→Parquet encoding time.
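
The subtraction used on the sequential path can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: it wraps the synchronous `std::io::Write` trait instead of `AsyncFileWriter`, and the `io_nanos` counter, the `into_inner` helper, the `encoding_time` function, and the saturating subtraction are all assumptions made for the example.

```rust
use std::io::{self, Write};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical synchronous stand-in for the PR's `TimingWriter<W>`.
/// It wraps any `Write` and adds the wall-clock time spent inside the
/// inner writer's I/O calls to a shared nanosecond counter.
pub struct TimingWriter<W: Write> {
    inner: W,
    io_nanos: Arc<AtomicU64>,
}

impl<W: Write> TimingWriter<W> {
    pub fn new(inner: W, io_nanos: Arc<AtomicU64>) -> Self {
        Self { inner, io_nanos }
    }

    /// Hand back the wrapped writer (illustrative helper).
    pub fn into_inner(self) -> W {
        self.inner
    }

    /// Add the time elapsed since `start` to the shared I/O counter.
    fn record(&self, start: Instant) {
        let nanos = start.elapsed().as_nanos() as u64;
        self.io_nanos.fetch_add(nanos, Ordering::Relaxed);
    }
}

impl<W: Write> Write for TimingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let start = Instant::now();
        let result = self.inner.write(buf);
        self.record(start);
        result
    }

    fn flush(&mut self) -> io::Result<()> {
        let start = Instant::now();
        let result = self.inner.flush();
        self.record(start);
        result
    }
}

/// Encoding time = total time inside the writer calls minus time spent in I/O.
/// `saturating_sub` (an assumption here) clamps the result to zero if timer
/// jitter ever makes the I/O figure exceed the total.
pub fn encoding_time(total_write_time: Duration, io_time: Duration) -> Duration {
    total_write_time.saturating_sub(io_time)
}
```

Because the I/O timer sits inside the writer, the caller only needs one outer timer around the encode-and-write calls; the difference between the two is what gets reported as `elapsed_compute`.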
**Parallel path** (`allow_single_file_parallelism = true`, default):
- `encoding_time: Time` (a clone of the registered `elapsed_compute`
metric) is threaded through the five-function call chain down to the two
leaf sites: `writer.write()` in `column_serializer_task` and
`writer.close()` in `spawn_rg_join_and_finalize_task`. Since `Time` wraps an
`Arc<AtomicUsize>`, every clone shares one counter, so all concurrent column
tasks accumulate directly into the registered metric.
- Note: on the parallel path, `append_to_row_group()` in
`concatenate_parallel_row_groups` is interleaved with I/O and cannot be
cleanly isolated. It is excluded from `elapsed_compute`. This is
acceptable since it operates on already-encoded data and represents a
small fraction of total encoding CPU time.
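
The shared-counter idea behind the parallel path can be sketched as follows. This is an illustration under stated assumptions, not DataFusion's implementation: `SharedTime` is a hypothetical stand-in for the `Time` metric, `encode_column` is placeholder work, plain threads replace the async column tasks, and the 1 ns floor in `add_duration` is an assumption of the sketch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a `Time`-style metric: a cheaply clonable
/// handle where every clone points at the same nanosecond counter.
#[derive(Clone, Default)]
pub struct SharedTime {
    nanos: Arc<AtomicUsize>,
}

impl SharedTime {
    /// Add a duration to the shared counter.
    /// Recording at least 1 ns is an assumption of this sketch, so that
    /// very fast work still shows up as non-zero.
    pub fn add_duration(&self, d: Duration) {
        let nanos = (d.as_nanos() as usize).max(1);
        self.nanos.fetch_add(nanos, Ordering::Relaxed);
    }

    /// Current accumulated value in nanoseconds.
    pub fn value(&self) -> usize {
        self.nanos.load(Ordering::Relaxed)
    }
}

/// Placeholder for per-column encoding work (not real Parquet encoding).
fn encode_column(values: &[u64]) -> u64 {
    values
        .iter()
        .fold(0u64, |acc, v| acc.wrapping_add(v.wrapping_mul(31)))
}

/// Spawn one task per column. Each task times only its own encoding call
/// and adds that time to the shared metric through its own clone.
pub fn encode_columns_in_parallel(
    columns: Vec<Vec<u64>>,
    encoding_time: &SharedTime,
) -> Vec<u64> {
    let handles: Vec<_> = columns
        .into_iter()
        .map(|col| {
            let time = encoding_time.clone();
            thread::spawn(move || {
                let start = Instant::now();
                let out = encode_column(&col);
                time.add_duration(start.elapsed());
                out
            })
        })
        .collect();

    // Join in spawn order so results line up with the input columns.
    handles
        .into_iter()
        .map(|h| h.join().expect("column task panicked"))
        .collect()
}
```

Since the tasks run concurrently, the accumulated value is the sum of per-task encoding time and can exceed the wall-clock duration of the write.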
## Are these changes tested?
Yes. Two tests are added/extended in
`datafusion/core/src/dataframe/parquet.rs`:
- `test_parquet_sink_metrics_sequential` (new): verifies
`elapsed_compute > 0` with `allow_single_file_parallelism = false`.
- `test_parquet_sink_metrics_parallel`: extended with an
`elapsed_compute > 0` assertion (was previously missing for the parallel
path).
The existing `test_parquet_sink_metrics` test (parallel path, default
config) already asserted `elapsed_compute > 0` and continues to pass.
## Are there any user-facing changes?
No API changes. The `elapsed_compute` metric was already surfaced — this
PR makes its value accurate rather than introducing a new metric.

1 parent 73ca6a5 commit 3aefba7
7 files changed
Lines changed: 241 additions & 51 deletions
File tree
- datafusion
- core/src/dataframe
- datasource-parquet/src
- physical-expr-common
- src/metrics