Commit 878b879

alamb and 2010YOUY01 authored
docs: Improve getting started and testing guides for humans and agents (#20970)
## Which issue does this PR close?

- Closes #.

## Rationale for this change

This PR is a follow-up to #20939 from @Dandandan. The goal is to make it easier for both humans and agents to get started making changes in this repository and to create pull requests efficiently.

The repository already had the necessary contributor information, but it was spread across multiple documents and not easy to discover quickly.

## What changes are included in this PR?

This PR makes the most important setup, testing, and pre-PR checks easier to find from the contributor guide and from `AGENTS.md`:

- Add a quick-start setup section to the contributor guide with the shortest path to a working local environment.
- Add a testing quick-start section summarizing the most important tests to run before submitting a PR.
- Add a “Before Submitting a PR” section to centralize formatting and lint guidance.
- Update `AGENTS.md` to point to the canonical contributor guide sections instead of duplicating setup and testing instructions.

## Are these changes tested?

This PR updates documentation only.

## Are there any user-facing changes?

This improves contributor-facing documentation and makes setup / testing guidance easier to discover, but it does not change DataFusion runtime behavior or public APIs.

---------

Co-authored-by: Yongting You <2010youy01@gmail.com>
1 parent 448a395 commit 878b879

4 files changed: +94 -28 lines changed

AGENTS.md

Lines changed: 10 additions & 21 deletions
@@ -2,33 +2,22 @@
 
 ## Developer Documentation
 
+- [Quick Start Setup](docs/source/contributor-guide/development_environment.md#quick-start)
+- [Testing Quick Start](docs/source/contributor-guide/testing.md#testing-quick-start)
+- [Before Submitting a PR](docs/source/contributor-guide/index.md#before-submitting-a-pr)
 - [Contributor Guide](docs/source/contributor-guide/index.md)
 - [Architecture Guide](docs/source/contributor-guide/architecture.md)
 
 ## Before Committing
 
-Before committing any changes, you **must** run the following checks and fix any issues:
+Before committing any changes, you MUST follow the instructions in
+[Before Submitting a PR](docs/source/contributor-guide/index.md#before-submitting-a-pr)
+and ensure the required checks listed there pass. Do not commit code that
+fails any of those checks.
 
-```bash
-cargo fmt --all
-cargo clippy --all-targets --all-features -- -D warnings
-```
-
-- `cargo fmt` ensures consistent code formatting across the project.
-- `cargo clippy` catches common mistakes and enforces idiomatic Rust patterns. All warnings must be resolved (treated as errors via `-D warnings`).
-
-Do not commit code that fails either of these checks.
+When creating a PR, you MUST follow the [PR template](.github/pull_request_template.md).
 
 ## Testing
 
-Run relevant tests before submitting changes:
-
-```bash
-cargo test --all-features
-```
-
-For SQL logic tests:
-
-```bash
-cargo test -p datafusion-sqllogictest
-```
+See the [Testing Quick Start](docs/source/contributor-guide/testing.md#testing-quick-start)
+for the recommended pre-PR test commands.

docs/source/contributor-guide/development_environment.md

Lines changed: 36 additions & 5 deletions
@@ -21,7 +21,38 @@
 
 This section describes how you can get started at developing DataFusion.
 
-## Windows setup
+## Quick Start
+
+For the fastest path to a working local environment, follow these steps
+from the repository root:
+
+```shell
+# 1. Install Rust (https://rust-lang.org/tools/install/) and verify the active toolchain with
+rustup show
+
+# 2. Install protoc 3.15+ (see details below)
+protoc --version
+
+# 3. Download test data used by examples and many tests
+git submodule update --init --recursive
+
+# 4. Build the workspace
+cargo build
+
+# 5. Verify that Rust integration tests can be run
+cargo test -p datafusion --test parquet_integration
+
+# 6. Verify that sqllogictests can run
+cargo test --profile=ci --test sqllogictests
+```
+
+Notes:
+
+- The pinned Rust version is defined in `rust-toolchain.toml`.
+- `protoc` is required to compile DataFusion from source.
+- Some tests and examples rely on git submodule data being present locally.
+
+## Windows Setup
 
 ```shell
 wget https://az792536.vo.msecnd.net/vms/VMBuild_20190311/VirtualBox/MSEdge/MSEdge.Win10.VirtualBox.zip
@@ -34,19 +65,19 @@ cargo build
 
 DataFusion has support for [dev containers](https://containers.dev/) which may be used for
 developing DataFusion in an isolated environment either locally or remote if desired. Using dev containers for developing
-DataFusion is not a requirement by any means but is available for those where doing local development could be tricky
+DataFusion is not a requirement but is available where doing local development could be tricky
 such as with Windows and WSL2, those with older hardware, etc.
 
 For specific details on IDE support for dev containers see the documentation for [Visual Studio Code](https://code.visualstudio.com/docs/devcontainers/containers),
 [IntelliJ IDEA](https://www.jetbrains.com/help/idea/connect-to-devcontainer.html),
 [Rust Rover](https://www.jetbrains.com/help/rust/connect-to-devcontainer.html), and
 [GitHub Codespaces](https://docs.github.com/en/codespaces/setting-up-your-project-for-codespaces/adding-a-dev-container-configuration/introduction-to-dev-containers).
 
-## Protoc Installation
+## `protoc` Installation
 
 Compiling DataFusion from sources requires an installed version of the protobuf compiler, `protoc`.
 
-On most platforms this can be installed from your system's package manager
+On most platforms this can be installed from your system's package manager. For example:
 
 ```
 # Ubuntu
@@ -71,7 +102,7 @@ libprotoc 3.15.0
 
 Alternatively a binary release can be downloaded from the [Release Page](https://github.com/protocolbuffers/protobuf/releases) or [built from source](https://github.com/protocolbuffers/protobuf/blob/main/src/README.md).
 
-## Bootstrap environment
+## Bootstrap Environment
 
 DataFusion is written in Rust and it uses a standard rust toolkit:
 
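The Quick Start added in `development_environment.md` asks for `protoc` 3.15 or newer but leaves the comparison to the reader. As a minimal sketch of how that check could be scripted, the POSIX shell helpers below compare dotted version numbers. The function names `version_ge` and `protoc_version_ok` are hypothetical and not part of the repository; the sketch assumes that `protoc --version` prints a line of the form `libprotoc 3.15.0`, as shown in the guide, and it does not handle pre-release suffixes.

```shell
# version_ge HAVE WANT: succeed when dotted numeric version HAVE >= WANT.
# Hypothetical helper; uses global variables because POSIX sh has no `local`.
version_ge() {
    have=$1
    want=$2
    while [ -n "$have" ] || [ -n "$want" ]; do
        # Take the leading field of each version.
        h=${have%%.*}
        w=${want%%.*}
        # Drop the leading field; an exhausted version becomes empty.
        case $have in *.*) have=${have#*.} ;; *) have= ;; esac
        case $want in *.*) want=${want#*.} ;; *) want= ;; esac
        # Missing fields count as 0, so 3.15 equals 3.15.0.
        h=${h:-0}
        w=${w:-0}
        [ "$h" -gt "$w" ] && return 0
        [ "$h" -lt "$w" ] && return 1
    done
    return 0
}

# protoc_version_ok OUTPUT: check the text printed by `protoc --version`
# against the 3.15 minimum stated in the Quick Start.
protoc_version_ok() {
    version=${1#libprotoc }
    version_ge "$version" 3.15
}
```

One possible use is `protoc_version_ok "$(protoc --version)" || echo "protoc is older than 3.15" >&2`, run before `cargo build`.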
docs/source/contributor-guide/index.md

Lines changed: 17 additions & 2 deletions
@@ -32,8 +32,10 @@ community as well as get more familiar with Rust and the relevant codebases.
 
 ## Development Environment
 
-Setup your development environment [here](development_environment.md), and learn
-how to test the code [here](testing.md).
+Start with the [Development Environment Quick Start](development_environment.md#quick-start).
+
+For more detail, see the full [development environment guide](development_environment.md)
+and the [testing guide](testing.md).
 
 ## Finding and Creating Issues to Work On
 
@@ -99,6 +101,19 @@ If you are concerned that a larger design will be lost in a string of small PRs,
 
 Note all commits in a PR are squashed when merged to the `main` branch so there is one commit per PR after merge.
 
+## Before Submitting a PR
+
+Before submitting a PR, run the standard non-functional checks. PRs must pass
+before merge.
+
+```bash
+./dev/rust_lint.sh
+# use `--write` to automatically fix some formatting and lint errors
+# ./dev/rust_lint.sh --write --allow-dirty
+```
+
+You should also run any relevant commands from the [testing quick start](testing.md#testing-quick-start).
+
 ## Conventional Commits & Labeling PRs
 
 We generate change logs for each release using an automated process that will categorize PRs based on the title
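The "Before Submitting a PR" section added in `index.md` describes a sequence: run the lint script, then any relevant test commands. A minimal sketch of chaining such commands so that the first failure stops the run is shown below. The `run_checks` function is hypothetical and not part of the repository; the commands passed to it are expected to be ones the contributor guide already lists.

```shell
# run_checks CMD...: run each command string in order and stop at the
# first one that fails. Hypothetical helper, not provided by the repository.
run_checks() {
    for check in "$@"; do
        printf '==> %s\n' "$check"
        # Each argument is a full command line, so run it through `sh -c`.
        sh -c "$check" || {
            printf 'FAILED: %s\n' "$check" >&2
            return 1
        }
    done
}
```

For example, `run_checks './dev/rust_lint.sh' 'cargo test --profile=ci --test sqllogictests'` would run the lint script first and only start the sqllogictests if it passes.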

docs/source/contributor-guide/testing.md

Lines changed: 31 additions & 0 deletions
@@ -23,6 +23,37 @@ Tests are critical to ensure that DataFusion is working properly and
 is not accidentally broken during refactorings. All new features
 should have test coverage and the entire test suite is run as part of CI.
 
+## Testing Quick Start
+
+While developing a feature or bug fix, best practice is to run the smallest set
+of tests that gives confidence for your change, then expand as needed.
+
+Initially, run the tests in the crates you changed. For example, if you made changes
+to files in `datafusion-optimizer/src`, run the corresponding crate tests:
+
+```shell
+cargo test -p datafusion-optimizer
+```
+
+Then, run the `sqllogictest` suite, which provides a strong speed–coverage tradeoff for development: it runs quickly while offering broad regression coverage across most SQL behavior in DataFusion.
+
+```shell
+cargo test --profile=ci --test sqllogictests
+```
+
+Finally, before submitting a PR, run the tests for the core `datafusion` and
+`datafusion-cli` crates:
+
+```shell
+cargo test -p datafusion
+cargo test -p datafusion-cli
+```
+
+Some integration tests require optional external services such as Docker-backed
+containers and may skip when unavailable.
+
+## Testing Overview
+
 DataFusion has several levels of tests in its [Test Pyramid] and tries to follow
 the Rust standard [Testing Organization] described in [The Book].
 
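The Testing Quick Start added in `testing.md` recommends starting with the tests of the crates that were changed. As a rough sketch of how that first step could be derived from a list of changed files, the helper below prints one `cargo test -p` command per crate. The function name `crate_test_commands` is hypothetical, and the sketch assumes, following the `datafusion-optimizer/src` example in the guide, that the directory directly above `src/` is named after the crate. The real repository layout may differ, so treat the output as a suggestion to review rather than a definitive mapping.

```shell
# crate_test_commands: read changed file paths on stdin and print one
# `cargo test -p <crate>` line per distinct crate directory.
# Assumption: paths look like <crate>/src/..., as in the guide's example.
crate_test_commands() {
    # Keep only paths of the form <crate>/src/... and extract <crate>.
    sed -n 's|^\([^/][^/]*\)/src/.*|\1|p' |
        sort -u |
        while read -r crate; do
            printf 'cargo test -p %s\n' "$crate"
        done
}
```

One way to feed it is `git diff --name-only main | crate_test_commands`, which lists the files that differ from `main` and turns them into candidate test commands.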
