Commit 4d06c40

Improve ExternalSorter ResourcesExhausted Error Message (#20226)
## Which issue does this PR close?

- Closes #20225.

## Rationale for this change

## What changes are included in this PR?

When there is not enough memory to continue an external sort, the user needs to either increase the memory limit or decrease `sort_spill_reservation_bytes`. The error message can guide the user more clearly by naming the exact config keys, which also keeps it consistent with how these settings are applied:

```
SET datafusion.runtime.memory_limit = '10G'
SET datafusion.execution.sort_spill_reservation_bytes = 10485760
```

Current:

```
Not enough memory to continue external sort. Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes
```

New:

```
Not enough memory to continue external sort. Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.
```

## Are these changes tested?

A new unit test case has been added to cover this use case, and it passes locally.

**For updated snapshot files:** The following integration tests pass in the DataFusion test pipeline:

```
test test_cli_top_memory_consumers::case_1 ... ok
test test_cli_top_memory_consumers::case_2 ... ok
test test_cli_top_memory_consumers::case_3 ... ok
```

**Ref:** https://github.com/apache/datafusion/actions/runs/21811797863/job/62925363536?pr=20226

## Are there any user-facing changes?

Yes, this improves the legacy `ExternalSorter` `ResourcesExhausted` error message.
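The message change described above amounts to wrapping an out-of-memory error with a hint that names both config keys, while passing any other error through untouched. The following is a minimal, standalone sketch of that pattern. `SketchError`, `SORT_OOM_HINT`, and `add_sort_oom_hint` are hypothetical names invented for illustration; they only mimic the shape of `DataFusionError::ResourcesExhausted` and `context`, and are not DataFusion's actual types.

```rust
use std::fmt;

/// Hypothetical stand-in for an error type with an out-of-memory variant.
/// This is NOT DataFusion's `DataFusionError`; it only models what the sketch needs.
#[derive(Debug)]
enum SketchError {
    ResourcesExhausted(String),
    Other(String),
    /// A context message wrapped around an inner error.
    Context(String, Box<SketchError>),
}

impl SketchError {
    /// Wrap `self` with an explanatory message (assumed to behave like `context`).
    fn context(self, msg: impl Into<String>) -> Self {
        SketchError::Context(msg.into(), Box::new(self))
    }
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::ResourcesExhausted(m) => write!(f, "Resources exhausted: {m}"),
            SketchError::Other(m) => write!(f, "{m}"),
            SketchError::Context(c, inner) => write!(f, "{c}\ncaused by\n{inner}"),
        }
    }
}

/// The new hint text from this commit, naming both config keys.
const SORT_OOM_HINT: &str = "Not enough memory to continue external sort. \
    Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', \
    or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.";

/// Add the hint only to out-of-memory errors; return everything else as is.
fn add_sort_oom_hint(e: SketchError) -> SketchError {
    match e {
        SketchError::ResourcesExhausted(_) => e.context(SORT_OOM_HINT),
        other => other,
    }
}
```

Calling `add_sort_oom_hint(SketchError::ResourcesExhausted("Failed to allocate".to_string()))` yields an error whose rendered text begins with the hint, followed by `caused by` and the original message, which is the layout the updated snapshot files show.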
1 parent 73bce15 commit 4d06c40

5 files changed

Lines changed: 23 additions & 6 deletions

datafusion-cli/tests/snapshots/cli_top_memory_consumers@no_track.snap

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ success: false
 exit_code: 1
 ----- stdout -----
 [CLI_VERSION]
-Error: Not enough memory to continue external sort. Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes
+Error: Not enough memory to continue external sort. Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.
 caused by
 Resources exhausted: Failed to allocate

datafusion-cli/tests/snapshots/cli_top_memory_consumers@top2.snap

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ success: false
 exit_code: 1
 ----- stdout -----
 [CLI_VERSION]
-Error: Not enough memory to continue external sort. Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes
+Error: Not enough memory to continue external sort. Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.
 caused by
 Resources exhausted: Additional allocation failed for ExternalSorter[0] with top memory consumers (across reservations) as:
 Consumer(can spill: bool) consumed XB, peak XB,

datafusion-cli/tests/snapshots/cli_top_memory_consumers@top3_default.snap

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ success: false
 exit_code: 1
 ----- stdout -----
 [CLI_VERSION]
-Error: Not enough memory to continue external sort. Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes
+Error: Not enough memory to continue external sort. Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.
 caused by
 Resources exhausted: Additional allocation failed for ExternalSorter[0] with top memory consumers (across reservations) as:
 Consumer(can spill: bool) consumed XB, peak XB,

datafusion-examples/examples/execution_monitoring/memory_pool_tracking.rs

Lines changed: 2 additions & 1 deletion
@@ -110,7 +110,8 @@ async fn automatic_usage_example() -> Result<()> {
     println!("✓ Expected memory limit error during data processing:");
     println!("Error: {e}");
     /* Example error message:
-    Error: Not enough memory to continue external sort. Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes
+    Error: Not enough memory to continue external sort. Consider increasing the memory limit config: 'datafusion.runtime.memory_limit',
+    or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'.
     caused by
     Resources exhausted: Additional allocation failed with top memory consumers (across reservations) as:
     ExternalSorterMerge[3]#112(can spill: false) consumed 10.0 MB, peak 10.0 MB,

datafusion/physical-plan/src/sorts/sort.rs

Lines changed: 18 additions & 2 deletions
@@ -819,7 +819,8 @@ impl ExternalSorter {
     match e {
         DataFusionError::ResourcesExhausted(_) => e.context(
             "Not enough memory to continue external sort. \
-            Consider increasing the memory limit, or decreasing sort_spill_reservation_bytes"
+            Consider increasing the memory limit config: 'datafusion.runtime.memory_limit', \
+            or decreasing the config: 'datafusion.execution.sort_spill_reservation_bytes'."
         ),
         // This is not an OOM error, so just return it as is.
         _ => e,
@@ -1736,6 +1737,21 @@ mod tests {
             "Assertion failed: expected a ResourcesExhausted error, but got: {err:?}"
         );

+        // Verify external sorter error message when resource is exhausted
+        let config_vector = vec![
+            "datafusion.runtime.memory_limit",
+            "datafusion.execution.sort_spill_reservation_bytes",
+        ];
+        let error_message = err.message().to_string();
+        for config in config_vector.into_iter() {
+            assert!(
+                error_message.as_str().contains(config),
+                "Config: '{}' should be contained in error message: {}.",
+                config,
+                error_message.as_str()
+            );
+        }
+
         Ok(())
     }
@@ -1756,7 +1772,7 @@ mod tests {
         // The input has 200 partitions, each partition has a batch containing 100 rows.
         // Each row has a single Utf8 column, the Utf8 string values are roughly 42 bytes.
-        // The total size of the input is roughly 8.4 KB.
+        // The total size of the input is roughly 820 KB.
         let input = test::scan_partitioned_utf8(200);
         let schema = input.schema();
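The corrected size comment in the last hunk follows from the figures in the two comment lines above it. A small sketch of that arithmetic, assuming the comment counts 1 KB as 1024 bytes and ignoring any per-batch or per-array overhead (`approx_input_bytes` is a hypothetical helper, not part of the test module):

```rust
/// Rough payload size: partitions x rows per partition x bytes per row.
/// Ignores Arrow buffer overhead; the inputs are the approximate figures
/// quoted in the test comment.
fn approx_input_bytes(partitions: u64, rows_per_partition: u64, bytes_per_row: u64) -> u64 {
    partitions * rows_per_partition * bytes_per_row
}

/// Convert bytes to whole KB, assuming 1 KB = 1024 bytes.
fn to_kb(bytes: u64) -> u64 {
    bytes / 1024
}
```

With 200 partitions, 100 rows each, and about 42 bytes per row, this gives 840,000 bytes, or about 820 KB. The old figure of 8.4 KB corresponds to only 8,400 bytes, a factor of 100 too small, which matches leaving out the 100 rows per partition.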
