This file is for agents working on the datafusion-python project (developing,
testing, reviewing). If you need to use the DataFusion DataFrame API (write
queries, build expressions, understand available functions), see the user-facing
guide at python/datafusion/AGENTS.md.
This project uses AI agent skills stored in .ai/skills/. Each skill is a directory containing a SKILL.md file with instructions for performing a specific task.
Skills follow the Agent Skills open standard. Each skill directory contains:
- SKILL.md: The skill definition, with YAML frontmatter (name, description, argument-hint) and detailed instructions.
- Additional supporting files as needed.
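As a rough illustration of the layout described above, the sketch below reads the frontmatter of a skill file and reports which of the three listed keys are absent. The skill content, the helper name read_frontmatter, and the deliberately simple "key: value" parsing are assumptions made for this example; they are not part of the project or of the Agent Skills standard, and real frontmatter may need a full YAML parser.

```python
# Keys named in the guide above; whether each one is mandatory is an assumption.
EXPECTED_KEYS = ("name", "description", "argument-hint")

# Hypothetical skill file, used only for illustration.
EXAMPLE_SKILL = """\
---
name: example-skill
description: Hypothetical skill used only for illustration.
argument-hint: <path>
---

Detailed instructions for the agent go here.
"""


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse flat ``key: value`` pairs from a leading ``---`` delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    # No closing delimiter was found, so treat the block as invalid.
    return {}


fields = read_frontmatter(EXAMPLE_SKILL)
missing = [key for key in EXPECTED_KEYS if key not in fields]
print(fields["name"], missing)
```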
Every pull request must follow the template in
.github/pull_request_template.md. The description must include these sections:
- Which issue does this PR close? Link the issue with Closes #NNN.
- Rationale for this change: Why the change is needed (skip if the issue already explains it clearly).
- What changes are included in this PR? Summarize the individual changes.
- Are there any user-facing changes? Note any changes visible to users
(new APIs, changed behavior, new files shipped in the package, etc.). If
there are breaking changes to public APIs, add the api change label.
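One way to self-check a draft description against the four sections above is sketched below. The helper missing_sections and the draft text are hypothetical, and the sketch only looks for the section titles; it does not read .github/pull_request_template.md or validate what is written under each title.

```python
# Section titles taken from the list above.
REQUIRED_SECTIONS = (
    "Which issue does this PR close?",
    "Rationale for this change",
    "What changes are included in this PR?",
    "Are there any user-facing changes?",
)


def missing_sections(description: str) -> list[str]:
    """Return the required section titles that do not appear in the text."""
    lowered = description.lower()
    return [title for title in REQUIRED_SECTIONS if title.lower() not in lowered]


# Hypothetical draft that omits two of the required sections.
draft = """\
Which issue does this PR close?
Closes #NNN.

What changes are included in this PR?
Adds a hypothetical helper.
"""

print(missing_sections(draft))
```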
Always run pre-commit checks before committing. The hooks are defined in
.pre-commit-config.yaml and run automatically on git commit if pre-commit
is installed as a git hook. To run all hooks manually:

    pre-commit run --all-files

Fix any failures before committing.
Every Python function must include a docstring with usage examples.
- Examples are required: Each function needs at least one doctest-style example demonstrating basic usage.
- Optional parameters: If a function has optional parameters, include separate
examples that show usage both without and with the optional arguments. Pass
optional arguments using their keyword name (e.g., step=dfn.lit(3)) so readers
can immediately see which parameter is being demonstrated.
- Reuse input data: Use the same input data across examples wherever possible. The examples should demonstrate how different optional arguments change the output for the same input, making the effect of each option easy to understand.
- Alias functions: Functions that are simple aliases (e.g., list_sort aliasing array_sort) only need a one-line description and a See Also reference to the primary function. They do not need their own examples.
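The docstring rules above might look like the following in practice. This is a minimal sketch using plain Python lists rather than the DataFusion API: take_every and its alias stride are hypothetical functions invented for this example, and the Google-style "Examples:" and "See Also:" section headings are an assumed layout rather than a confirmed project convention.

```python
def take_every(values: list[int], step: int = 1) -> list[int]:
    """Return every ``step``-th element of ``values``, starting with the first.

    Examples:
        The same input is reused so the effect of ``step`` is easy to see.

        >>> values = [10, 20, 30, 40, 50, 60]

        Without the optional argument:

        >>> take_every(values)
        [10, 20, 30, 40, 50, 60]

        With the optional argument, passed by keyword:

        >>> take_every(values, step=3)
        [10, 40]
    """
    return values[::step]


def stride(values: list[int], step: int = 1) -> list[int]:
    """Alias for :func:`take_every`.

    See Also:
        take_every
    """
    return take_every(values, step=step)
```

Saved to a file, the examples can be exercised with Python's standard doctest runner (python -m doctest followed by the file name). The alias carries only a one-line description and a reference to the primary function.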
When adding or updating an aggregate or window function, ensure the corresponding site documentation is kept in sync:
- Aggregations: in docs/source/user-guide/common-operations/aggregations.rst, add new aggregate functions to the "Aggregate Functions" list and include usage examples if appropriate.
- Window functions: in docs/source/user-guide/common-operations/windows.rst, add new window functions to the "Available Functions" list and include usage examples if appropriate.