| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Build & Test Commands |
| 6 | + |
| 7 | +```bash |
| 8 | +# Build |
| 9 | +cargo build |
| 10 | +cargo build --all-features |
| 11 | + |
| 12 | +# Run all tests |
| 13 | +cargo test --all-features |
| 14 | + |
| 15 | +# Run a single test |
| 16 | +cargo test test_name --all-features |
| 17 | + |
| 18 | +# Run tests for a specific dialect |
| 19 | +cargo test sqlparser_postgres --all-features |
| 20 | + |
| 21 | +# Lint |
| 22 | +cargo clippy --all-targets --all-features -- -D warnings |
| 23 | + |
| 24 | +# Format |
| 25 | +cargo fmt --all |
| 26 | + |
| 27 | +# Check (faster than full build) |
| 28 | +cargo check --all-targets --all-features |
| 29 | + |
| 30 | +# Build docs |
| 31 | +cargo doc --document-private-items --no-deps --workspace --all-features |
| 32 | + |
| 33 | +# Run benchmarks (from sqlparser_bench directory) |
| 34 | +cd sqlparser_bench && cargo bench |
| 35 | +``` |
| 36 | + |
| 37 | +## Crate Features |
| 38 | + |
| 39 | +- `serde`: Adds Serialize/Deserialize for all AST nodes |
| 40 | +- `visitor`: Adds a Visitor for recursively walking the AST |
| 41 | +- `recursive-protection` (default): Stack overflow protection |
| 42 | +- `json_example`: For CLI example only |
| 43 | + |
| 44 | +## Architecture |
| 45 | + |
| 46 | +This is an extensible SQL lexer and parser that produces an Abstract Syntax Tree (AST). |
| 47 | + |
| 48 | +### Core Components |
| 49 | + |
| 50 | +- **`src/tokenizer.rs`**: Lexer that converts SQL text into tokens. `Tokenizer::new(&dialect, sql).tokenize()` returns `Result<Vec<Token>, TokenizerError>`; use `tokenize_with_location()` when each token's source location is needed. |
| 51 | + |
| 52 | +- **`src/parser/mod.rs`**: Recursive descent parser using Pratt parsing for expressions. Entry point is `Parser::parse_sql(&dialect, sql)`, returning `Result<Vec<Statement>, ParserError>`. |
| 53 | + |
| 54 | +- **`src/ast/mod.rs`**: AST type definitions. `Statement` is the top-level enum. Key types: `Query`, `Select`, `Expr`, `DataType`, `ObjectName`. |
| 55 | + |
| 56 | +- **`src/dialect/mod.rs`**: SQL dialect trait and implementations. Each dialect (PostgreSQL, MySQL, etc.) customizes parsing behavior. `GenericDialect` is the most permissive. |
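
To illustrate what "Pratt parsing for expressions" means in the parser bullet above, here is a minimal, self-contained sketch. It is a toy and not the crate's code: `Tok`, `ToyParser`, `infix`, and `parse_to_string` are hypothetical names, and the binding powers are arbitrary values chosen for the example. It prints the parse as a fully parenthesized string so the resulting tree shape is visible.

```rust
// Toy Pratt (operator-precedence) expression parser.
// Illustrative only: every name here is hypothetical, not taken from sqlparser.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Num(i64),
    Plus,
    Minus,
    Star,
    LParen,
    RParen,
}

/// Split the input into tokens. Whitespace is skipped; anything else
/// that is not a digit, operator, or parenthesis is an error.
fn lex(src: &str) -> Result<Vec<Tok>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut toks = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_ascii_digit() {
            // Accumulate a run of digits (no overflow handling in this toy).
            let mut n: i64 = 0;
            while i < chars.len() && chars[i].is_ascii_digit() {
                n = n * 10 + chars[i].to_digit(10).unwrap() as i64;
                i += 1;
            }
            toks.push(Tok::Num(n));
            continue;
        }
        match c {
            ' ' | '\t' | '\n' => {}
            '+' => toks.push(Tok::Plus),
            '-' => toks.push(Tok::Minus),
            '*' => toks.push(Tok::Star),
            '(' => toks.push(Tok::LParen),
            ')' => toks.push(Tok::RParen),
            other => return Err(format!("unexpected character {other:?}")),
        }
        i += 1;
    }
    Ok(toks)
}

/// Binding power and printable symbol of an infix operator,
/// or `None` if the token is not an infix operator.
fn infix(tok: Tok) -> Option<(u8, char)> {
    match tok {
        Tok::Plus => Some((10, '+')),
        Tok::Minus => Some((10, '-')),
        Tok::Star => Some((20, '*')),
        _ => None,
    }
}

struct ToyParser {
    toks: Vec<Tok>,
    pos: usize,
}

impl ToyParser {
    fn peek(&self) -> Option<Tok> {
        self.toks.get(self.pos).copied()
    }

    fn advance(&mut self) -> Option<Tok> {
        let tok = self.peek();
        if tok.is_some() {
            self.pos += 1;
        }
        tok
    }

    /// Parse an expression whose infix operators all bind tighter than `min_bp`.
    fn parse_expr(&mut self, min_bp: u8) -> Result<String, String> {
        // Prefix position: a number, unary minus, or a parenthesized expression.
        let mut lhs = match self.advance() {
            Some(Tok::Num(n)) => n.to_string(),
            Some(Tok::Minus) => format!("(-{})", self.parse_expr(30)?),
            Some(Tok::LParen) => {
                let inner = self.parse_expr(0)?;
                match self.advance() {
                    Some(Tok::RParen) => inner,
                    other => return Err(format!("expected ')', found {other:?}")),
                }
            }
            other => return Err(format!("unexpected token {other:?}")),
        };
        // Infix loop: keep absorbing operators that bind tighter than `min_bp`.
        // Recursing with `bp` (not `bp - 1`) makes operators left-associative.
        while let Some((bp, sym)) = self.peek().and_then(infix) {
            if bp <= min_bp {
                break;
            }
            self.advance();
            let rhs = self.parse_expr(bp)?;
            lhs = format!("({lhs} {sym} {rhs})");
        }
        Ok(lhs)
    }
}

/// Parse `src` and render the result fully parenthesized.
fn parse_to_string(src: &str) -> Result<String, String> {
    let mut parser = ToyParser { toks: lex(src)?, pos: 0 };
    let out = parser.parse_expr(0)?;
    match parser.peek() {
        None => Ok(out),
        Some(tok) => Err(format!("trailing token {tok:?}")),
    }
}
```

Calling `parse_to_string("1 + 2 * 3")` groups the multiplication first because `*` has the higher binding power. A real SQL parser builds AST nodes instead of strings, but the control flow (one prefix step, then a loop over infix operators gated by a minimum precedence) is the same idea.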
| 57 | + |
| 58 | +### Dialect System |
| 59 | + |
| 60 | +Dialects customize parsing via the `Dialect` trait. Methods control identifier quoting, keyword handling, and syntax variations. Dialect-specific features should work with both the specific dialect AND `GenericDialect`. |
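
As a rough illustration of how a trait with overridable methods lets each dialect customize quoting, consider the sketch below. It is an assumption-laden toy: `ToyDialect`, `AnsiLike`, `BacktickLike`, and `leading_quoted_ident` are hypothetical names invented for this example, and the real `Dialect` trait has many more methods with its own defaults.

```rust
// Toy model of a dialect trait: the lexer asks the dialect questions,
// and each dialect overrides only the answers that differ from the default.
// All names here are hypothetical, not sqlparser's API.

trait ToyDialect {
    /// Does `ch` open a quoted (delimited) identifier?
    /// Default for this toy: double quotes only.
    fn is_delimited_identifier_start(&self, ch: char) -> bool {
        ch == '"'
    }
}

/// Accepts the default behavior unchanged.
struct AnsiLike;
impl ToyDialect for AnsiLike {}

/// Overrides quoting to use backticks instead.
struct BacktickLike;
impl ToyDialect for BacktickLike {
    fn is_delimited_identifier_start(&self, ch: char) -> bool {
        ch == '`'
    }
}

/// If `input` starts with an identifier quoted in a style the dialect
/// accepts, return the text between the quotes; otherwise return `None`.
/// Escaped quotes inside the identifier are not handled in this toy.
fn leading_quoted_ident<'a>(dialect: &dyn ToyDialect, input: &'a str) -> Option<&'a str> {
    let quote = input.chars().next()?;
    if !dialect.is_delimited_identifier_start(quote) {
        return None;
    }
    let rest = &input[quote.len_utf8()..];
    let end = rest.find(quote)?;
    Some(&rest[..end])
}
```

The shared routine `leading_quoted_ident` never names a concrete dialect; it only consults the trait object, which is why adding a dialect does not require touching the lexing code in this sketch.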
| 61 | + |
| 62 | +### Testing Patterns |
| 63 | + |
| 64 | +Tests use `TestedDialects` from `src/test_utils.rs`: |
| 65 | + |
| 66 | +```rust |
| 67 | +use sqlparser::{dialect::{MySqlDialect, PostgreSqlDialect}, test_utils::*}; |
| 68 | + |
| 69 | +// Test across all dialects |
| 70 | +all_dialects().verified_stmt("SELECT 1"); |
| 71 | + |
| 72 | +// Test specific dialects |
| 73 | +TestedDialects::new(vec![Box::new(PostgreSqlDialect {})]).verified_stmt("..."); |
| 74 | + |
| 75 | +// Test all dialects except specific ones |
| 76 | +all_dialects_except(|d| d.is::<MySqlDialect>()).verified_stmt("..."); |
| 77 | +``` |
| 78 | + |
| 79 | +Key test helpers: |
| 80 | +- `verified_stmt(sql)`: Parse and verify round-trip serialization |
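| 81 | +- `verified_query(sql)`: Same check, but returns the parsed `Query` |
| 82 | +- `one_statement_parses_to(sql, canonical)`: Parse `sql` and verify that it serializes to `canonical`, for input whose canonical form differs from what was written |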
| 83 | + |
| 84 | +### Round-Trip Invariant |
| 85 | + |
| 86 | +AST nodes implement `Display` to reproduce the original SQL (minus comments/whitespace). Tests verify `parse(sql).to_string() == sql`. |
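
The round-trip invariant can be sketched with a toy AST. This is a minimal, stdlib-only example under stated assumptions: `ToyExpr`, `ToySelect`, `parse_select`, and `round_trips` are hypothetical names, and the "parser" handles only `SELECT` followed by a comma-separated list of integers and simple identifiers.

```rust
use std::fmt;

// Toy AST and parser used to demonstrate the round-trip property.
// All names are hypothetical; none of this is sqlparser's code.

#[derive(Debug, PartialEq)]
enum ToyExpr {
    Number(i64),
    Ident(String),
}

#[derive(Debug, PartialEq)]
struct ToySelect {
    projection: Vec<ToyExpr>,
}

impl fmt::Display for ToyExpr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToyExpr::Number(n) => write!(f, "{n}"),
            ToyExpr::Ident(name) => write!(f, "{name}"),
        }
    }
}

impl fmt::Display for ToySelect {
    /// Always emits the canonical form: upper-case keyword, ", " separators.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "SELECT ")?;
        for (i, expr) in self.projection.iter().enumerate() {
            if i > 0 {
                write!(f, ", ")?;
            }
            write!(f, "{expr}")?;
        }
        Ok(())
    }
}

/// Parse `SELECT item, item, ...` where each item is an integer or identifier.
fn parse_select(sql: &str) -> Result<ToySelect, String> {
    let mut parts = sql.trim().splitn(2, char::is_whitespace);
    let keyword = parts.next().unwrap_or("");
    if !keyword.eq_ignore_ascii_case("SELECT") {
        return Err(format!("expected SELECT, found {keyword:?}"));
    }
    let rest = parts
        .next()
        .ok_or_else(|| "missing projection".to_string())?;

    let is_ident = |s: &str| {
        s.starts_with(|c: char| c.is_ascii_alphabetic() || c == '_')
            && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
    };

    let mut projection = Vec::new();
    for item in rest.split(',') {
        let item = item.trim();
        if let Ok(n) = item.parse::<i64>() {
            projection.push(ToyExpr::Number(n));
        } else if is_ident(item) {
            projection.push(ToyExpr::Ident(item.to_string()));
        } else {
            return Err(format!("cannot parse projection item {item:?}"));
        }
    }
    Ok(ToySelect { projection })
}

/// True when parsing `sql` and printing the AST reproduces `sql` exactly.
fn round_trips(sql: &str) -> bool {
    parse_select(sql)
        .map(|stmt| stmt.to_string() == sql)
        .unwrap_or(false)
}
```

Input already in canonical form, such as `"SELECT a, 1"`, round-trips exactly. Input like `"select  a,1"` parses to the same AST but prints in canonical form, which is the situation a helper taking a separate canonical string is meant to cover.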
| 87 | + |
| 88 | +### Source Spans |
| 89 | + |
| 90 | +AST nodes include `Span` information for source locations. When constructing AST nodes manually, use `Span::empty()`. |