This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Note: This is a community fork of asg017/sqlite-vec, created to merge pending upstream PRs and provide continued support while the original author is unavailable.
sqlite-vec is a lightweight, fast vector search SQLite extension written in pure C with no dependencies. It's a pre-v1 project (current: v0.2.4-alpha) that provides vector similarity search capabilities for SQLite databases across all platforms where SQLite runs.
Key features:
- Supports float, int8, and binary vector types via `vec0` virtual tables
- Pure C implementation with optional SIMD optimizations (AVX on x86_64, NEON on ARM)
- Multi-language bindings (Python, Node.js, Ruby, Go, Rust, Lua)
- Runs anywhere: Linux/macOS/Windows, WASM, embedded devices
- Distance constraints for KNN queries (enables pagination and range queries)
- Optimize command for space reclamation after deletes
- ALTER TABLE RENAME support for vec0 tables
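As a conceptual illustration of how a distance constraint enables pagination over KNN results, the sketch below runs a brute-force search in plain Python and asks for the next page by requiring a distance greater than the last one already seen. It does not use the extension; `knn_page` and its parameters are hypothetical names, and Euclidean distance is an assumption chosen for the demo.

```python
import math

def knn_page(rows, query, k, after_distance=None):
    """Return up to k (distance, rowid) pairs, nearest first.

    rows maps rowid -> vector. If after_distance is given, only
    results strictly farther than that distance are returned.
    """
    scored = sorted((math.dist(vec, query), rowid) for rowid, vec in rows.items())
    if after_distance is not None:
        scored = [(d, r) for d, r in scored if d > after_distance]
    return scored[:k]

rows = {1: [0.0, 0.0], 2: [1.0, 0.0], 3: [0.0, 2.0], 4: [3.0, 0.0], 5: [0.0, 4.0]}
query = [0.0, 0.0]

page1 = knn_page(rows, query, k=2)
# Continue from the largest distance on the previous page.
page2 = knn_page(rows, query, k=2, after_distance=page1[-1][0])
```

A strict `>` comparison skips rows that tie with the last distance of the previous page, so a real pagination scheme would likely need a tie-breaker such as the rowid.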
Run ./scripts/vendor.sh first to download vendored dependencies (sqlite3.c, shell.c).
Core builds:
- `make loadable` - Build `dist/vec0.{so,dylib,dll}` loadable extension
- `make static` - Build `dist/libsqlite_vec0.a` static library and `dist/sqlite-vec.h` header
- `make cli` - Build `dist/sqlite3` CLI with sqlite-vec statically linked
- `make all` - Build all three targets above
- `make wasm` - Build WASM version (requires emcc)
Platform-specific compiler:
- Set `CC=` to use a different compiler (default: gcc)
- Set `AR=` to use a different archiver (default: ar)
SIMD control:
- SIMD is auto-enabled on Darwin x86_64 (AVX) and Darwin arm64 (NEON)
- Set `OMIT_SIMD=1` to disable SIMD optimizations
Python tests (primary test suite):
# Setup test environment with uv
uv sync --directory tests
# Run all Python tests
make test-loadable python=./tests/.venv/bin/python
# Run specific test
./tests/.venv/bin/python -m pytest tests/test-loadable.py::test_name -vv -s -x
# Update snapshots
make test-loadable-snapshot-update
# Watch mode
make test-loadable-watch
Other tests:
- `make test` - Run basic SQL tests via `test.sql`
- `make test-unit` - Compile and run C unit tests
- `sqlite3 :memory: '.read test.sql'` - Quick smoke test
Test structure:
- `tests/test-loadable.py` - Main comprehensive test suite
- `tests/test-metadata.py` - Metadata column tests
- `tests/test-auxiliary.py` - Auxiliary column tests
- `tests/test-partition-keys.py` - Partition key tests
- `tests/conftest.py` - pytest fixtures (loads extension from `dist/vec0`)
- `make format` - Format C code with clang-format and Python with black
- `make lint` - Check formatting without modifying files
The entire extension is in a single sqlite-vec.c file (~9000 lines). It implements a vec0 virtual table module using SQLite's virtual table API.
Key concepts:
- vec0 virtual table: Declared with `CREATE VIRTUAL TABLE x USING vec0(vector_column TYPE[N], ...)`
  - Vector column: Must specify type (float, int8, bit) and dimensions
  - Metadata columns: Additional indexed columns for filtering
  - Auxiliary columns: Non-indexed columns for associated data
  - Partition keys: Special columns for pre-filtering via `partition_key=column_name`
  - Chunk size: Configurable via `chunk_size=N` (default varies by type)
- Shadow tables: vec0 creates multiple hidden tables to store data:
  - `xyz_chunks` - Chunk metadata (size, validity bitmaps, rowids)
  - `xyz_rowids` - Rowid mapping to chunks
  - `xyz_vector_chunksNN` - Actual vector data for column NN
  - `xyz_auxiliary` - Auxiliary column values
  - `xyz_metadatachunksNN` / `xyz_metadatatextNN` - Metadata storage
- Query plans: Determined in `xBestIndex`, encoded in `idxStr`:
  - `VEC0_QUERY_PLAN_FULLSCAN` - Full table scan
  - `VEC0_QUERY_PLAN_POINT` - Single rowid lookup
  - `VEC0_QUERY_PLAN_KNN` - K-nearest neighbors vector search
See ARCHITECTURE.md for detailed idxStr encoding and shadow table schemas.
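When debugging, it can help to see which shadow tables exist for a given vec0 table. A minimal sketch, assuming shadow tables share the `<table>_` name prefix described above: filter `sqlite_master` by that prefix. The helper name `list_shadow_tables` is hypothetical, and the demo creates ordinary tables with shadow-style names (including an assumed `00` column suffix) so that it runs without the extension loaded.

```python
import sqlite3

def list_shadow_tables(db, table):
    """Return names of tables whose name starts with '<table>_', sorted."""
    rows = db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    prefix = table + "_"
    # Filter in Python: '_' is a wildcard in SQL LIKE patterns.
    return [name for (name,) in rows if name.startswith(prefix)]

db = sqlite3.connect(":memory:")
# Stand-ins for real shadow tables; names follow the pattern listed above.
for name in ("xyz_chunks", "xyz_rowids", "xyz_vector_chunks00", "unrelated"):
    db.execute(f'CREATE TABLE "{name}" (id INTEGER PRIMARY KEY)')

shadow = list_shadow_tables(db, "xyz")
```

Prefix matching can also pick up unrelated user tables that happen to start with the same prefix, so treat the result as a hint rather than an authoritative list.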
All bindings wrap the core C extension:
- Go CGO (`bindings/go/cgo/`): CGO-based bindings for mattn/go-sqlite3
  - `Auto()` function to register extension via `sqlite3_auto_extension()`
  - `SerializeFloat32()` and `SerializeInt8()` helper functions
  - Requires CGO enabled and links libm (`-lm`)
- Python: Installable via pip from GitHub
  - Package configuration in `pyproject.toml` and `setup.py`
  - Helper functions in `bindings/python/extra_init.py` for vector serialization
  - Requires Python built with `--enable-loadable-sqlite-extensions`
  - Recommend using `uv` for virtual environments (uses system Python with extension support)
- Node.js: Installable via npm from GitHub
  - Package configuration in `package.json`
  - CJS (`index.cjs`) and ESM (`index.mjs`) entry points
  - TypeScript definitions in `index.d.ts`
- Ruby: Installable via gem from GitHub
  - Gem specification in `sqlite-vec.gemspec`
  - Extension configuration in `extconf.rb`
  - Ruby library in `lib/sqlite_vec.rb`
- Rust (`bindings/rust/`): Static linking via `build.rs`
  - Crate configuration in `Cargo.toml`
  - Exports `sqlite3_vec_init()` in `src/lib.rs`
- Lua (`bindings/lua/`): Lua 5.1+ compatible binding
  - Requires `lsqlite3` module
  - `load()` function to load the extension
  - `serialize_f32()` for IEEE 754 binary format
  - `serialize_json()` for JSON format
  - Example in `examples/simple-lua/`
Built with VitePress (Vue-based static site generator):
- `npm --prefix site run dev` - Development server
- `npm --prefix site run build` - Production build
- Source: `site/` directory
- Deployed via GitHub Actions (`.github/workflows/site.yaml`)
- Edit `sqlite-vec.c` for core functionality
- Update `sqlite-vec.h.tmpl` if public API changes (regenerated via `make sqlite-vec.h`)
- Add tests to `tests/test-loadable.py` or other test files
- Run `make format` before committing
- Verify with `make test-loadable`
For this fork:
- Update `VERSION` file (format: `X.Y.Z` or `X.Y.Z-alpha.N`)
- Update `CHANGELOG.md` with changes
- Commit changes with descriptive message
- Create and push git tag: `git tag v0.X.Y-alpha`, then `git push origin v0.X.Y-alpha`
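Before tagging, a quick sanity check on the version string can catch typos. The sketch below is an assumption-laden helper, not part of the repository: `tag_for` and `VERSION_RE` are hypothetical names, and the pattern accepts `-alpha` with or without a numeric `.N` suffix because the tags mentioned in this file (for example `v0.2.4-alpha`) omit the suffix.

```python
import re

# Accepts X.Y.Z, X.Y.Z-alpha, and X.Y.Z-alpha.N (assumed set of valid forms).
VERSION_RE = re.compile(r"\d+\.\d+\.\d+(-alpha(\.\d+)?)?")

def tag_for(version):
    """Validate a version string and return the matching git tag name."""
    version = version.strip()  # a VERSION file usually ends with a newline
    if not VERSION_RE.fullmatch(version):
        raise ValueError(f"unexpected version format: {version!r}")
    return "v" + version

tag = tag_for("0.2.4-alpha\n")
```

The returned value can then be passed to `git tag` and `git push origin` as in the steps above.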
Note: This fork does not have CI/CD publishing to package registries (PyPI, npm, crates.io, RubyGems). Users install directly from GitHub using version tags.
Original release process (for reference only):
The original repository uses ./scripts/publish-release.sh and CI/CD (.github/workflows/release.yaml) to build and publish platform-specific extensions and language packages.
Python test fixtures:
- `@pytest.fixture() db()` in `conftest.py` provides SQLite connection with extension loaded
- Tests use `db.execute()` for queries
- Snapshot testing available for regression tests
Common test patterns:
def test_example(db):
    db.execute("CREATE VIRTUAL TABLE v USING vec0(embedding float[3])")
    db.execute("INSERT INTO v(rowid, embedding) VALUES (1, '[1,2,3]')")
    # vec0 KNN queries require a LIMIT or a `k = ?` constraint
    result = db.execute("SELECT distance FROM v WHERE embedding MATCH '[1,2,3]' AND k = 1").fetchone()
SIMD is conditionally compiled based on platform:
- `SQLITE_VEC_ENABLE_AVX` - x86_64 AVX instructions
- `SQLITE_VEC_ENABLE_NEON` - ARM NEON instructions
Code uses preprocessor directives to select implementations. Distance calculations have both scalar and SIMD variants.
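When checking that scalar and SIMD variants agree, a plain reference implementation can serve as a test oracle. The sketch below is written in Python rather than C and is not the extension's code: the function names are hypothetical, and it assumes L2 means Euclidean distance (with the square root), cosine distance means one minus cosine similarity, and binary vectors are compared by Hamming distance over packed bytes.

```python
import math

def _check_same_length(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

def l2_distance(a, b):
    """Euclidean distance between two float vectors."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity; undefined (division by zero) for zero vectors."""
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def hamming_distance(a, b):
    """Number of differing bits between two equal-length byte strings."""
    _check_same_length(a, b)
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

Comparing these results against the extension's output with a small tolerance avoids false failures from float32 rounding.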
- This is pre-v1 software - breaking changes are expected
- The single-file architecture means recompiling for any change
- Tests must run from repository root (assumes `dist/vec0` exists)
- All bindings depend on the core C extension being built first
- Vector format: JSON arrays `'[1,2,3]'` or raw bytes via helper functions
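To make the "raw bytes" format concrete, here is a minimal sketch of packing vectors with Python's `struct` module. The names `pack_f32`, `pack_int8`, and `unpack_f32` are illustrative and are not the bindings' actual helpers; little-endian float32 is an assumption that matches the native layout on common platforms.

```python
import struct

def pack_f32(vector):
    """Pack floats as consecutive little-endian IEEE 754 float32 values."""
    return struct.pack(f"<{len(vector)}f", *vector)

def pack_int8(vector):
    """Pack integers in [-128, 127] as consecutive signed bytes."""
    return struct.pack(f"{len(vector)}b", *vector)

def unpack_f32(blob):
    """Inverse of pack_f32: 4 bytes per element."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

blob = pack_f32([1.0, 2.0, 3.0])
```

A blob like this would be bound as a query parameter in place of the JSON text form; a 3-dimensional float vector occupies 12 bytes.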
Fork-specific notes:
- Version v0.2.4-alpha includes: Lua binding with IEEE 754 compliant float serialization (#237)
- Version v0.2.3-alpha includes: Android 16KB page support (#254), LDFLAGS support, documentation fixes (#208, #209)
- Version v0.2.2-alpha includes: GLOB operator for text metadata (#191), IS/IS NOT/IS NULL/IS NOT NULL operators (#190), all compilation warnings fixed (including critical logic bug)
- Version v0.2.1-alpha includes: LIKE operator for text metadata (#197), locale-independent JSON parsing (#241), musl libc compilation fix
- Version v0.2.0-alpha merged upstream PRs: #166 (distance constraints), #210 (optimize), #203 (ALTER TABLE RENAME), #212 (cosine distance for binary), #243 (delete memory leak fix), #228 (CI/CD updates)
- See CHANGELOG.md for complete list of changes from original v0.1.7-alpha.2
- Installation is via GitHub (git tags), not package registries
- Python users should use `uv` for virtual environments to ensure loadable extension support