Commit 1b8f203

Update config related stuff
1 parent 1fe63eb commit 1b8f203

3 files changed

Lines changed: 7 additions & 3 deletions

datafusion/common/src/config.rs

Lines changed: 4 additions & 2 deletions
@@ -670,10 +670,12 @@ config_namespace! {
         /// `false` — ANSI SQL mode is disabled by default.
         pub enable_ansi_mode: bool, default = false
 
-        /// Prefix to use when generating file name in multi file output
+        /// Prefix to use when generating file name in multi file output.
         ///
         /// When prefix is non-empty string, this prefix will be used to generate file name as
-        /// `{partitioned_file_prefix_name}{datafusion generated suffix}`
+        /// `{partitioned_file_prefix_name}{datafusion generated suffix}`.
+        ///
+        /// Defaults to empty string.
         pub partitioned_file_prefix_name: String, default = String::new()
     }
 }
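
Note on the naming rule documented above: the doc comment says the output file name is the configured prefix followed by a suffix that DataFusion generates. The sketch below only illustrates that concatenation. `output_file_name` is a hypothetical helper, not a function from the DataFusion codebase, and the suffix value in the usage note is made up for illustration.

```rust
/// Hypothetical helper (not part of DataFusion): joins the configured prefix
/// with a generated suffix, following the documented pattern
/// `{partitioned_file_prefix_name}{datafusion generated suffix}`.
pub fn output_file_name(prefix: &str, generated_suffix: &str) -> String {
    // With the default (empty) prefix the result is just the generated suffix.
    format!("{}{}", prefix, generated_suffix)
}
```

Under this assumption, `output_file_name("daily_", "abc123_0.parquet")` yields `daily_abc123_0.parquet`, while an empty prefix leaves the generated name unchanged.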

datafusion/sqllogictest/test_files/information_schema.slt

Lines changed: 2 additions & 0 deletions
@@ -260,6 +260,7 @@ datafusion.execution.parquet.statistics_enabled page
 datafusion.execution.parquet.statistics_truncate_length 64
 datafusion.execution.parquet.write_batch_size 1024
 datafusion.execution.parquet.writer_version 1.0
+datafusion.execution.partitioned_file_prefix_name (empty)
 datafusion.execution.perfect_hash_join_min_key_density 0.15
 datafusion.execution.perfect_hash_join_small_build_threshold 1024
 datafusion.execution.planning_concurrency 13
@@ -397,6 +398,7 @@ datafusion.execution.parquet.statistics_enabled page (writing) Sets if statistic
 datafusion.execution.parquet.statistics_truncate_length 64 (writing) Sets statistics truncate length. If NULL, uses default parquet writer setting
 datafusion.execution.parquet.write_batch_size 1024 (writing) Sets write_batch_size in rows
 datafusion.execution.parquet.writer_version 1.0 (writing) Sets parquet writer version valid values are "1.0" and "2.0"
+datafusion.execution.partitioned_file_prefix_name (empty) Prefix to use when generating file name in multi file output. When prefix is non-empty string, this prefix will be used to generate file name as `{partitioned_file_prefix_name}{datafusion generated suffix}`. Defaults to empty string.
 datafusion.execution.perfect_hash_join_min_key_density 0.15 The minimum required density of join keys on the build side to consider a perfect hash join (see `HashJoinExec` for more details). Density is calculated as: `(number of rows) / (max_key - min_key + 1)`. A perfect hash join may be used if the actual key density > this value. Currently only supports cases where build_side.num_rows() < u32::MAX. Support for build_side.num_rows() >= u32::MAX will be added in the future.
 datafusion.execution.perfect_hash_join_small_build_threshold 1024 A perfect hash join (see `HashJoinExec` for more details) will be considered if the range of keys (max - min) on the build side is < this threshold. This provides a fast path for joins with very small key ranges, bypassing the density check. Currently only supports cases where build_side.num_rows() < u32::MAX. Support for build_side.num_rows() >= u32::MAX will be added in the future.
 datafusion.execution.planning_concurrency 13 Fan-out during initial physical planning. This is mostly use to plan `UNION` children in parallel. Defaults to the number of CPU cores on the system

docs/source/user-guide/configs.md

Lines changed: 1 addition & 1 deletion
@@ -133,7 +133,7 @@ The following configuration settings are available:
 | datafusion.execution.enforce_batch_size_in_joins | false | Should DataFusion enforce batch size in joins or not. By default, DataFusion will not enforce batch size in joins. Enforcing batch size in joins can reduce memory usage when joining large tables with a highly-selective join filter, but is also slightly slower. |
 | datafusion.execution.objectstore_writer_buffer_size | 10485760 | Size (bytes) of data buffer DataFusion uses when writing output files. This affects the size of the data chunks that are uploaded to remote object stores (e.g. AWS S3). If very large (>= 100 GiB) output files are being written, it may be necessary to increase this size to avoid errors from the remote end point. |
 | datafusion.execution.enable_ansi_mode | false | Whether to enable ANSI SQL mode. The flag is experimental and relevant only for DataFusion Spark built-in functions When `enable_ansi_mode` is set to `true`, the query engine follows ANSI SQL semantics for expressions, casting, and error handling. This means: - **Strict type coercion rules:** implicit casts between incompatible types are disallowed. - **Standard SQL arithmetic behavior:** operations such as division by zero, numeric overflow, or invalid casts raise runtime errors rather than returning `NULL` or adjusted values. - **Consistent ANSI behavior** for string concatenation, comparisons, and `NULL` handling. When `enable_ansi_mode` is `false` (the default), the engine uses a more permissive, non-ANSI mode designed for user convenience and backward compatibility. In this mode: - Implicit casts between types are allowed (e.g., string to integer when possible). - Arithmetic operations are more lenient — for example, `abs()` on the minimum representable integer value returns the input value instead of raising overflow. - Division by zero or invalid casts may return `NULL` instead of failing. # Default `false` — ANSI SQL mode is disabled by default. |
-| datafusion.execution.partitioned_file_prefix_name | | Prefix to use when generating file name in multi file output. Defaults to empty string - no prefix |
+| datafusion.execution.partitioned_file_prefix_name | | Prefix to use when generating file name in multi file output. When prefix is non-empty string, this prefix will be used to generate file name as `{partitioned_file_prefix_name}{datafusion generated suffix}`. Defaults to empty string. |
 | datafusion.optimizer.enable_distinct_aggregation_soft_limit | true | When set to true, the optimizer will push a limit operation into grouped aggregations which have no aggregate expressions, as a soft limit, emitting groups once the limit is reached, before all rows in the group are read. |
 | datafusion.optimizer.enable_round_robin_repartition | true | When set to true, the physical plan optimizer will try to add round robin repartitioning to increase parallelism to leverage more CPU cores |
 | datafusion.optimizer.enable_topk_aggregation | true | When set to true, the optimizer will attempt to perform limit operations during aggregations, if possible |
