
Commit a0f9aed

Authored by Davide Mauri

Refresh and add mention for semantic re-ranking (#36172)

1 parent b4de5be commit a0f9aed

1 file changed

Lines changed: 23 additions & 19 deletions

File tree

docs/sql-server/ai/vectors-faq.md

@@ -4,7 +4,7 @@ description: Answers to common questions about vector search and vector indexes
author: WilliamDAssafMSFT
ms.author: wiassaf
ms.reviewer: damauri, mikeray, randolphwest
- ms.date: 11/18/2025
+ ms.date: 01/06/2026
ms.service: sql
ms.topic: faq
ms.collection:
@@ -29,64 +29,68 @@ This article contains frequently asked questions about vectors and embeddings in
## How do I keep embeddings up to date?

- Update embeddings every time the underlying data that they represent changes. This is especially important for scenarios where the data is dynamic, such as user-generated content or frequently updated databases. To find out more about several strategies to keep embeddings up to date, see [Database and AI: solutions for keeping embeddings updated](https://devblogs.microsoft.com/azure-sql/database-and-ai-solutions-for-keeping-embeddings-updated/).
+ Update embeddings every time the underlying data that they represent changes. This practice is especially important for scenarios where the data is dynamic, such as user-generated content or frequently updated databases. For more information about several strategies to keep embeddings up to date, see [Database and AI: solutions for keeping embeddings updated](https://devblogs.microsoft.com/azure-sql/database-and-ai-solutions-for-keeping-embeddings-updated/).

## What is the overhead storage and processing for vector search?

The overhead for vector search primarily involves the storage of the **vector** data type and the computational resources required for indexing and searching. The **vector** data type is designed to be efficient in terms of storage, but the exact overhead can vary based on the size - the number of dimensions - of the vectors stored.

For more information about how to choose the right vector size, review [Embedding models and dimensions: optimizing the performance-resource usage ratio](https://devblogs.microsoft.com/azure-sql/embedding-models-and-dimensions-optimizing-the-performance-resource-usage-ratio/).

- A SQL Server data page can hold up to 8,060 bytes, so the size of the vector affects how many vectors can be stored in a single page. For example, if you have a vector with 1,024 dimensions, and each dimension is a single precision **float** (4 bytes), the total size of the vector would be 4,104 bytes (4096 bytes payload + 8 bytes header). This limits the number of vectors that can fit in a single page to one.
+ A SQL Server data page can hold up to 8,060 bytes, so the size of the vector affects how many vectors can be stored in a single page. For example, if you have a vector with 1,024 dimensions, and each dimension is a single precision **float** (4 bytes), the total size of the vector is 4,104 bytes (4,096 bytes payload + 8 bytes header). This size limits the number of vectors that fit in a single page to one.
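
To sanity-check that arithmetic, here's a minimal Python sketch (not part of the commit) using only the figures from the paragraph above - an 8,060-byte data page, 4 bytes per dimension, and an 8-byte per-vector header:

```python
# Back-of-the-envelope check of the vector storage math described above.
# Assumed constants come straight from the paragraph: 4 bytes per
# single-precision dimension, an 8-byte header, an 8,060-byte data page.

PAGE_SIZE = 8060      # usable bytes in a SQL Server data page
HEADER = 8            # per-vector header, per the text above
BYTES_PER_DIM = 4     # single-precision float

def vector_bytes(dimensions: int) -> int:
    """Storage for one vector value with the given number of dimensions."""
    return dimensions * BYTES_PER_DIM + HEADER

size = vector_bytes(1024)
per_page = PAGE_SIZE // size
print(size, per_page)  # 4104 bytes per vector, 1 vector per page
```

The same function makes it easy to see how smaller embeddings change the picture: a 768-dimension vector takes 3,080 bytes, so two such vectors fit in one page.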

## What embedding model should I use, and when?

- There are many embedding models available, and the choice of which one to use depends on the specific use case and the type of data being processed. Some models support multiple languages, while others support multimodal data (text, images, etc.). Some are available only online, others can be run locally.
+ Many embedding models are available. The choice depends on your specific use case and the type of data you're processing. Some models support multiple languages, while others support multimodal data (text, images, and more). Some models are available only online, while others can run locally.

- In addition to the model itself, consider the size of the model and the number of dimensions it produces. Larger models might provide better accuracy but require more computational resources and storage space, but in many cases having more dimension doesn't really change the quality that much, for common use cases.
+ In addition to the model itself, consider the size of the model and the number of dimensions it produces. Larger models might provide better accuracy but require more computational resources and storage space. In many cases, having more dimensions doesn't significantly change the quality for common use cases.

For more information about how to choose the right embedding model, see [Embedding models and dimensions: optimizing the performance-resource usage ratio](https://devblogs.microsoft.com/azure-sql/embedding-models-and-dimensions-optimizing-the-performance-resource-usage-ratio/).

- ## How to decide when to use single-precision (4-byte) vs half-precision (2-byte) floating-point values for vectors?
+ ## How do I decide when to use single-precision (4-byte) versus half-precision (2-byte) floating-point values for vectors?

- When storing embedding vectors in a database, the choice between single-precision (`float32`) and half-precision (`float16`) floats often comes down to balancing storage efficiency with numerical fidelity.
+ When you store embedding vectors in a database, you often need to balance storage efficiency with numerical fidelity when choosing between single-precision (`float32`) and half-precision (`float16`) floats.

- Fortunately, embeddings are typically not highly sensitive to small changes in precision.
+ Fortunately, embeddings typically aren't highly sensitive to small changes in precision.

- Embeddings are dense vector representations used in tasks like semantic search, recommendation systems, and natural language processing. These vectors are often the output of neural networks, which are inherently tolerant to small numerical variations. As a result, reducing precision from `float32` to `float16` usually has minimal impact on the quality of similarity comparisons or downstream tasks, especially during inference.
+ Embeddings are dense vector representations used in tasks like semantic search, recommendation systems, and natural language processing. These vectors often come from neural networks, which are inherently tolerant to small numerical variations. As a result, reducing precision from `float32` to `float16` usually has minimal impact on the quality of similarity comparisons or downstream tasks, especially during inference.

Using `float16` can significantly reduce storage and memory usage, which is particularly beneficial when working with large-scale embedding datasets.
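
One quick way to see why the precision loss is usually tolerable is to round-trip a toy embedding through half precision. This standard-library Python sketch (illustrative values, not part of the commit) compares cosine similarity before and after quantization:

```python
# Sketch: quantize a small embedding from float32 to float16 (IEEE 754
# half precision) and measure how little the cosine similarity changes.
# 'e' is the struct format code for a 2-byte half-precision float.
import math
import struct

def to_float16(values):
    """Round-trip each value through a 2-byte half-precision float."""
    return [struct.unpack('<e', struct.pack('<e', v))[0] for v in values]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

embedding = [0.0123, -0.4567, 0.8910, -0.2345, 0.6789]
quantized = to_float16(embedding)
print(cosine(embedding, quantized))  # very close to 1.0
```

Half precision carries roughly three decimal digits of mantissa, so each component moves only slightly and the angle between the original and quantized vectors barely changes - which is what similarity search cares about.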

## What about sparse vectors?

- At this time, the **vector** data type in the SQL Database Engine is designed for dense vectors, which are arrays of floating-point numbers where most of the elements are non-zero. Sparse vectors, which contain a significant number of zero elements, aren't natively supported.
+ Currently, the **vector** data type in the SQL Database Engine is designed for dense vectors. These vectors are arrays of floating-point numbers where most of the elements are non-zero. Sparse vectors, which contain a significant number of zero elements, aren't natively supported.

## What are some performance benchmarks for SQL vector search?

- Performance can vary widely based on the specific use case, the size of the dataset, and the complexity of the queries. However, SQL Server's vector search capabilities are designed to be efficient and scalable, leveraging indexing techniques to optimize search performance.
+ Performance can vary widely based on the specific use case, the size of the dataset, and the complexity of the queries. However, SQL Server's vector search capabilities are efficient and scalable. They use indexing techniques to optimize search performance.

- ## What if I have more than one column that I would like to use for generating embeddings?
+ ## What if I have more than one column that I want to use for generating embeddings?

If you have multiple columns that you want to use for generating embeddings, you have two main options:

- Create one embedding for each column, or
- Concatenate the values of multiple columns into a single string and then generate a single embedding for that concatenated string.

- For more information about the two options and the related database design considerations, see [Efficiently and Elegantly Modeling Embeddings](https://devblogs.microsoft.com/azure-sql/efficiently-and-elegantly-modeling-embeddings-in-azure-sql-and-sql-server/).
+ For more information about these two options and the related database design considerations, see [Efficiently and Elegantly Modeling Embeddings](https://devblogs.microsoft.com/azure-sql/efficiently-and-elegantly-modeling-embeddings-in-azure-sql-and-sql-server/).
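
As an illustration of the concatenation option, a small Python sketch (the column names and separator are hypothetical; the resulting string is what you would then pass to your embedding model):

```python
# Sketch of the concatenation option: merge several columns into one
# string, then embed that single string once. The column names and the
# " | " separator are illustrative placeholders, not a prescribed API.

def build_embedding_input(row: dict) -> str:
    # Join the columns with a visible separator so field boundaries
    # remain distinguishable to the embedding model.
    parts = [row["name"], row["category"], row["description"]]
    return " | ".join(parts)

row = {
    "name": "Contoso Cafe",
    "category": "Coffee shop",
    "description": "Quiet cafe with outdoor seating.",
}
print(build_embedding_input(row))
# Contoso Cafe | Coffee shop | Quiet cafe with outdoor seating.
```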

## What about re-ranking?

- Re-ranking is a technique used to improve the relevance of search results by re-evaluating the initial results based on additional criteria or models. In the SQL Database Engine, you can implement re-ranking by combining vector search with full-text (which provides BM25 ranking) or additional SQL queries or machine learning models to refine the results based on specific business logic or user preferences.
+ Re-ranking improves the relevance of search results by re-evaluating the initial results based on extra criteria or models. In the SQL Database Engine, you can implement re-ranking by combining vector search with full-text search (which provides BM25 ranking), or by using additional SQL queries or machine learning models to refine the results based on specific business logic or user preferences.

- For more information, review [Enhancing Search Capabilities with Hybrid Search and RRF Re-Ranking](https://devblogs.microsoft.com/azure-sql/enhancing-search-capabilities-in-sql-server-and-azure-sql-with-hybrid-search-and-rrf-re-ranking/).
+ For more information, see [Enhancing Search Capabilities with Hybrid Search and RRF Re-Ranking](https://devblogs.microsoft.com/azure-sql/enhancing-search-capabilities-in-sql-server-and-azure-sql-with-hybrid-search-and-rrf-re-ranking/).
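
The Reciprocal Rank Fusion (RRF) technique named in that link can be sketched in a few lines of Python (illustrative only; `k = 60` is a commonly used constant, not something this article mandates):

```python
# Sketch of Reciprocal Rank Fusion (RRF): each result's fused score is
# the sum of 1/(k + rank) over every ranked list it appears in, so items
# ranked highly by both vector search and full-text search rise to the top.

def rrf(ranked_lists, k=60):
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["a", "b", "c"]    # e.g. ordered by vector distance
fulltext_hits = ["b", "d", "a"]  # e.g. ordered by BM25 score
print(rrf([vector_hits, fulltext_hits]))  # ['b', 'a', 'd', 'c']
```

Here `b` wins because it places well in both lists (ranks 2 and 1), while `a` (ranks 1 and 3) comes second - exactly the behavior that makes RRF a simple, tuning-free way to fuse heterogeneous rankings.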

- ## When to use AI Search (now AI Foundry) vs using SQL for vectors search scenarios?
+ A more refined re-ranking technique is called semantic re-ranking. Semantic re-ranking relies on a specialized model (often a cross-encoder or a late interaction approach) to compare each candidate passage against the query and assign a detailed relevance score. By reassessing these passages, rerankers ensure that the most precise, useful, and relevant results rise to the top.
+
+ For a sample of using a re-ranking model, see [Semantic Reranking with Azure SQL, SQL Server 2025 and Cohere Rerank models](https://devblogs.microsoft.com/azure-sql/semantic-reranking-with-azure-sql-sql-server-2025-and-cohere-rerank-models/).
+
+ ## When should I use AI Search (now AI Foundry) versus using SQL for vector search scenarios?

AI Search (now AI Foundry) is a specialized service designed for advanced search scenarios, including vector search, natural language processing, and AI-driven insights. It provides a comprehensive set of features for building intelligent search applications, such as built-in support for various AI models, advanced ranking algorithms, and integration with other AI services.

- The SQL Database Engine provides the ability to store any kind of data and run any kind of query: structured and unstructured, and to perform vector search on that data. It's a good choice for scenarios where you need to do search across all these data together, and you don't want to use a separate service for search that would complicate your architecture. The SQL Database Engine offers critical enterprise security features to make sure data is always protected, such as row-level security (RLS), dynamic data masking (DDM), Always Encrypted, immutable ledger tables, and transparent data encryption (TDE).
+ The SQL Database Engine provides the ability to store any kind of data and run any kind of query: structured and unstructured. You can perform vector search on that data. It's a good choice for scenarios where you need to search across all this data together, and you don't want to use a separate service for search that would complicate your architecture. The SQL Database Engine offers critical enterprise security features to make sure data is always protected, such as row-level security (RLS), dynamic data masking (DDM), Always Encrypted, immutable ledger tables, and transparent data encryption (TDE).
- Here's an example of a single query that can be run in Azure SQL or SQL Server that combines vector, geospatial, structured and unstructured data all at once. The sample query retrieves the top 50 most relevant restaurants based on the description of the restaurant, the location of the restaurant, and the user's preferences, using vector search for the description and geospatial search for the location, filtering also by star numbers, number of reviews, category and so on:
+ Here's an example of a single query that you can run in Azure SQL or SQL Server that combines vector, geospatial, structured, and unstructured data all at once. The sample query retrieves the top 50 most relevant restaurants based on the description of the restaurant, the location of the restaurant, and the user's preferences. It uses vector search for the description and geospatial search for the location, filtering also by star numbers, number of reviews, category, and other attributes.

```sql
DECLARE @p AS GEOGRAPHY = GEOGRAPHY::Point(47.6694141, -122.1238767, 4326);

@@ -113,7 +117,7 @@ WHERE b.city = 'Redmond'

ORDER BY semantic_distance DESC;
```

- In the previous sample, Exact Nearest Neighbor (ENN) search is used to find the most relevant reviews based on the semantic distance of the embeddings, while also filtering by geospatial distance and other business attributes. This query demonstrates the power of combining vector search with traditional SQL capabilities to create a rich and efficient search experience.
+ In the previous sample, Exact Nearest Neighbor (ENN) search finds the most relevant reviews based on the semantic distance of the embeddings, while also filtering by geospatial distance and other business attributes. This query demonstrates the power of combining vector search with traditional SQL capabilities to create a rich and efficient search experience.
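
Conceptually, ENN is an exhaustive scan over every candidate vector. A toy Python sketch (hypothetical document IDs and tiny 2-D vectors, purely illustrative) shows the idea:

```python
# Sketch of what Exact Nearest Neighbor (ENN) search does conceptually:
# scan every candidate, compute the distance to the query vector, and
# keep the top-k. Cosine distance is used, as in the SQL sample above.
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def enn(query, candidates, k=2):
    # Exhaustive scan: exact results, but O(n) per query - which is why
    # large tables benefit from an approximate (ANN) index instead.
    ranked = sorted(candidates, key=lambda item: cosine_distance(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]

docs = [("r1", [1.0, 0.0]), ("r2", [0.7, 0.7]), ("r3", [0.0, 1.0])]
print(enn([1.0, 0.1], docs))  # ['r1', 'r2']
```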

If you want to use Approximate Nearest Neighbor (ANN) search, you can create a vector index on the `reviews_embeddings` table and use the `VECTOR_SEARCH` function to perform the search.
