.Net: Update SQL Server vector search to latest VECTOR_SEARCH() syntax#13863
.Net: Update SQL Server vector search to latest VECTOR_SEARCH() syntax#13863roji wants to merge 2 commits intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Updates the SQL Server vector store connector and tests to align with Azure SQL’s latest VECTOR_SEARCH() syntax and capabilities (e.g., TOP(N) WITH APPROXIMATE, iterative filtering), while adding runtime gating for DiskANN features on Azure SQL/Fabric only.
Changes:
- Updated generated SQL for approximate vector search to use
SELECT TOP(N) WITH APPROXIMATEand removed deprecatedTOP_N. - Added iterative filter support inside
VECTOR_SEARCH()(and hybrid semantic CTE) and implementedskipvia subquery wrapping. - Updated conformance/unit tests and added an attribute to require an Azure SQL connection string for DiskANN tests.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| dotnet/test/VectorData/SqlServer.ConformanceTests/Support/SqlServerConnectionStringRequiredAttribute.cs | Adds a test condition attribute to skip DiskANN/Azure-only tests when no external connection string is configured. |
| dotnet/test/VectorData/SqlServer.ConformanceTests/SqlServerIndexKindTests.cs | Reworks DiskANN conformance test to insert 100 rows before index creation and gates it to Azure SQL via the new attribute. |
| dotnet/test/VectorData/SqlServer.ConformanceTests/SqlServerDiskAnnVectorSearchTests.cs | Removes the old temporary workaround test class. |
| dotnet/test/VectorData/SqlServer.ConformanceTests/SqlServerCommandBuilderTests.cs | Updates expected SQL text for new VECTOR_SEARCH() syntax, skip-subquery behavior, and filter support. |
| dotnet/src/VectorData/SqlServer/SqlServerCommandBuilder.cs | Implements new VECTOR_SEARCH() SQL generation, supports filter translation, and handles skip with a subquery. |
| dotnet/src/VectorData/SqlServer/SqlServerCollection.cs | Adds Azure SQL/Fabric runtime detection for DiskANN/VECTOR_SEARCH usage and caches the result per collection instance. |
| .gitignore | Ignores *.lscache artifacts. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 90%
✓ Correctness
This PR updates the SQL Server vector data connector to use the latest Azure SQL VECTOR_SEARCH syntax (SELECT TOP(N) WITH APPROXIMATE instead of the deprecated TOP_N parameter), adds filter support inside VECTOR_SEARCH (previously it threw NotSupportedException), adds a runtime Azure SQL engine edition check for DiskAnn features, and updates tests accordingly. The SQL generation logic, filter translation, subquery wrapping for skip>0, and the Azure SQL validation are all implemented correctly. The connection lifecycle management (opening in EnsureAzureSqlForDiskAnnAsync, disposal on error for non-using paths) is sound. No correctness issues found.
✓ Security Reliability
This PR updates the SQL Server vector search implementation to use the latest Azure SQL VECTOR_SEARCH syntax (SELECT TOP(N) WITH APPROXIMATE instead of the deprecated TOP_N parameter), adds filter support for VECTOR_SEARCH queries, and introduces a runtime Azure SQL engine edition check for DiskAnn features. The changes are well-structured with proper parameterized query usage. Filter translation reuses the existing SqlServerFilterTranslator which properly escapes string constants and uses parameterized queries for query parameters. The _isAzureSql caching field has a benign race condition under concurrency but produces correct results. Connection ownership patterns in SearchAsync/HybridSearchAsync follow the pre-existing design (CA2000 suppressed, disposal handled by ReadVectorSearchResultsAsync). No blocking security or reliability issues found.
✓ Test Coverage
The PR updates SQL Server vector search to use the new 'SELECT TOP(N) WITH APPROXIMATE' syntax (replacing deprecated TOP_N), adds filter support to VECTOR_SEARCH (previously threw NotSupportedException), and adds Azure SQL engine edition validation. Test coverage for the command builder was updated accordingly: the filter-throws test now verifies filter SQL generation, and the skip/scoreThreshold tests were updated. However, there is a meaningful gap: the new conditional logic at SqlServerCommandBuilder.cs line 584 that emits 'AND' (when filter is present) vs 'WHERE' (when no filter) for the score threshold clause has no unit test covering the combined filter+scoreThreshold path. Additionally, the hybrid search command builder changes (filter support in VECTOR_SEARCH CTE) have no unit-level tests in SqlServerCommandBuilderTests.
✗ Design Approach
The PR upgrades VECTOR_SEARCH from the deprecated TOP_N parameter to SELECT TOP(N) WITH APPROXIMATE, lifts the filter restriction for DiskAnn indexes (now handled via iterative filtering), and adds a runtime Azure SQL engine-edition guard. The SQL generation changes look correct — filter injection, subquery-wrap for SKIP, ORDER BY inside the hybrid CTE, and score threshold conditional are all well-formed. One real design issue exists in EnsureAzureSqlForDiskAnnAsync: the cached-false case is never short-circuited, so every DiskAnn operation against a non-Azure SQL server makes a redundant database round-trip before re-throwing the same exception.
Suggestions
- Add a unit test for SelectVector with a DiskAnn index when both Filter and ScoreThreshold are specified. The conditional at SqlServerCommandBuilder.cs line 584 (
options.Filter is not null ? "AND " : "WHERE ") is only exercised when both are set, but no test covers this combination. - Consider adding unit tests for SelectHybrid with a DiskAnn vector property and a filter in SqlServerCommandBuilderTests. The diff adds filter translation inside the VECTOR_SEARCH CTE for hybrid search, but there are no command-builder-level tests for hybrid search SQL generation.
Automated review by roji's agents
- Replace deprecated TOP_N parameter with SELECT TOP(N) WITH APPROXIMATE - Enable iterative filtering (WHERE predicates during vector search) - Support skip via subquery wrapping (TOP and OFFSET/FETCH can't coexist) - Add Azure SQL runtime detection for DiskAnn (SERVERPROPERTY EngineEdition) - Remove read-only table workaround (SqlServerDiskAnnVectorSearchTests) - Update hybrid search CTE with new VECTOR_SEARCH syntax - Gate DiskAnn conformance test on Azure SQL connection string - Handle 100-row minimum requirement for DiskAnn vector index creation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
daa373e to
4c52882
Compare
- Fix AddParameter for filter params: pass property: null instead of vectorProperty - Cache both true and false Azure SQL detection results (avoid repeated roundtrips) - Narrow Azure SQL check to IndexKind.DiskAnn specifically (not any non-Flat) - Rename attribute to AzureSqlRequired (checks EngineEdition, not connection string) - Add unit test for filter+scoreThreshold combined path - Add unit test for hybrid search with DiskAnn and filter Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Updates the SQL Server vector store connector to support the latest Azure SQL VECTOR_SEARCH() features, matching what was done in EF Core (dotnet/efcore#38075).
Changes
VECTOR_SEARCH syntax update
TOP_Nparameter withSELECT TOP(N) WITH APPROXIMATEskipvia subquery wrapping (TOPandOFFSET/FETCHcannot coexist in the same SELECT)Hybrid search
Azure SQL runtime detection
SERVERPROPERTY('EngineEdition')at table creation and query timeNotSupportedExceptionwhen used against on-prem SQL ServerTest infrastructure
SqlServerDiskAnnVectorSearchTests.cs(read-only table workaround no longer tables with latest vector indexes now support full DML)neededSqlServerConnectionStringRequiredAttributeto gate DiskAnn tests on Azure SQL connection string availabilityKept
PREVIEW_FEATUREScommand inCreateTable(needed for future on-prem SQL Server support)