Vector Search and Semantic Caching #417
Changes from 19 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -388,3 +388,5 @@ FodyWeavers.xsd | |
# JetBrains Rider | ||
.idea/ | ||
*.sln.iml | ||
|
||
test/Redis.OM.Unit.Tests/appsettings.json.local |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -288,6 +288,104 @@ customers.Where(x => x.LastName == "Bond" && x.FirstName == "James"); | |
customers.Where(x=>x.NickNames.Contains("Jim")); | ||
``` | ||
|
||
### Vectors | ||
|
||
Redis OM .NET also supports storing and querying vectors in Redis. | ||
|
||
A `Vector<T>` is a representation of an object that can be transformed into a vector by a Vectorizer. | ||
|
||
A `VectorizerAttribute` is the abstract class you use to decorate your vector fields. It defines the logic that converts the object wrapped by a `Vector<T>` into its embedding. The `Redis.OM.Vectorizers` package provides vectorizers for HuggingFace, OpenAI, and Azure OpenAI so you can integrate them into your workflows. | ||
|
||
#### Define a Vector in Your Model | ||
|
||
To define a vector in your model, decorate a `Vector<T>` field with an `Indexed` attribute and a `Vectorizer` attribute (in this case we'll use OpenAI): | ||
|
||
```cs | ||
[Document(StorageType = StorageType.Json)] | ||
public class OpenAIQuery | ||
{ | ||
[RedisIdField] | ||
public string Id { get; set; } | ||
|
||
[Indexed(DistanceMetric = DistanceMetric.COSINE)] | ||
Review comment: Can you also show how a few other vector field attributes like index type (HNSW vs FLAT) and related args are set here?
Review comment: Added a couple of other parameters for the index definition, and explained it a bit better in the modeling section. |
||
[OpenAIVectorizer] | ||
public Vector<string> Prompt { get; set; } | ||
|
||
public string Response { get; set; } | ||
|
||
[Indexed] | ||
public string Language { get; set; } | ||
|
||
[Indexed] | ||
public DateTime TimeStamp { get; set; } | ||
} | ||
``` | ||
|
||
#### Insert Vectors into Redis | ||
|
||
With the vector defined in the model, create a `Vector<T>` of the matching generic type and insert it along with the rest of the model. With a `RedisCollection`, you do this by calling `Insert`: | ||
|
||
```cs | ||
var query = new OpenAIQuery | ||
Review comment: naming this
Review comment: Query is confusing in this context, is |
||
{ | ||
Language = "en_us", | ||
Prompt = Vector.Of("What is the Capital of France?"), | ||
Response = "Paris", | ||
TimeStamp = DateTime.Now - TimeSpan.FromHours(3) | ||
}; | ||
collection.Insert(query); | ||
``` | ||
|
||
The vectorizer generates the embedding for you; no extra step is needed. | ||
|
||
#### Query Vectors in Redis | ||
|
||
To query vector fields in Redis, use the `VectorRange` method on a vector within a normal LINQ query, use `NearestNeighbors`, or combine either with whatever other filters you need. Here are some examples: | ||
|
||
```cs | ||
var queryPrompt = Vector.Of("What really is the Capital of France?"); | ||
|
||
// simple vector range, find first within .15 | ||
var result = collection.First(x => x.Prompt.VectorRange(queryPrompt, .15)); | ||
|
||
// simple nearest neighbors query, finds first nearest neighbor | ||
result = collection.NearestNeighbors(x => x.Prompt, 1, queryPrompt).First(); | ||
|
||
// hybrid query, pre-filters result set for english responses, then runs a nearest neighbors search. | ||
result = collection.Where(x=>x.Language == "en_us").NearestNeighbors(x => x.Prompt, 1, queryPrompt).First(); | ||
|
||
// hybrid query, pre-filters responses newer than 4 hours, and finds first result within .15 | ||
var ts = DateTimeOffset.Now - TimeSpan.FromHours(4); | ||
result = collection.First(x=>x.TimeStamp > ts && x.Prompt.VectorRange(queryPrompt, .15)); | ||
``` | ||
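To make the `.15` radius in the range queries more concrete: the model above indexes `Prompt` with `DistanceMetric.COSINE`, and cosine distance is one minus the cosine similarity of two vectors. The following is a minimal standalone sketch, not Redis OM code. The `CosineDistance` helper and the toy three-element vectors are hypothetical and only illustrate how a distance compares against a radius such as `.15`; real embeddings have many more dimensions and are produced by the vectorizer.

```cs
using System;

public static class CosineDemo
{
    // Hypothetical helper (not part of Redis OM): cosine distance = 1 - cosine similarity.
    // Assumes both vectors are non-zero and have the same dimension.
    public static double CosineDistance(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same dimension.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        return 1.0 - (dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }

    public static void Main()
    {
        // Toy vectors standing in for embeddings.
        var stored = new float[] { 1f, 0f, 0f };
        var close = new float[] { 0.9f, 0.1f, 0f };
        var far = new float[] { 0f, 1f, 0f };

        // A range query with radius .15 would keep 'close' and drop 'far'.
        Console.WriteLine(CosineDistance(stored, close) <= 0.15); // True
        Console.WriteLine(CosineDistance(stored, far) <= 0.15);   // False
    }
}
```

Under this metric a smaller radius means a stricter match, so lowering `.15` returns fewer, more similar results.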
|
||
#### What Happens to the Embeddings? | ||
|
||
With Redis OM, the embeddings can be completely transparent to you: they are generated and bound to the `Vector<T>` when you query or insert your vectors. If you need an embedding after an insert or query, it is available at `Vector<T>.Embedding` and can be read as raw bytes, as an array of doubles, or as an array of floats (depending on your vectorizer). | ||
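As an illustration of how a raw-bytes view and a float-array view of the same embedding relate, here is a minimal standalone sketch. It assumes a 32-bit float embedding laid out as consecutive 4-byte values in the machine's native byte order. The `ToBytes` and `ToFloats` helpers are hypothetical and are not part of Redis OM, and the layout your vectorizer actually uses may differ.

```cs
using System;

public static class EmbeddingBytesDemo
{
    // Hypothetical helper: pack floats into a byte array, 4 bytes per value (native byte order).
    public static byte[] ToBytes(float[] values)
    {
        var bytes = new byte[values.Length * sizeof(float)];
        Buffer.BlockCopy(values, 0, bytes, 0, bytes.Length); // count is given in bytes
        return bytes;
    }

    // Hypothetical helper: unpack a byte array produced by ToBytes back into floats.
    public static float[] ToFloats(byte[] bytes)
    {
        if (bytes.Length % sizeof(float) != 0)
        {
            throw new ArgumentException("Byte length must be a multiple of 4.");
        }

        var values = new float[bytes.Length / sizeof(float)];
        Buffer.BlockCopy(bytes, 0, values, 0, bytes.Length);
        return values;
    }

    public static void Main()
    {
        var embedding = new float[] { 0.25f, -1.5f, 3f };
        var raw = ToBytes(embedding);
        var roundTripped = ToFloats(raw);

        Console.WriteLine(raw.Length);                      // 12 (3 floats * 4 bytes)
        Console.WriteLine(roundTripped[1] == embedding[1]); // True
    }
}
```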
|
||
#### Configuration | ||
Review comment: Add other vector field attribute level configuration details here too?
Review comment: Added details about index definition in the modeling section as that's more or less where it belongs (the configuration section is talking about configuring the vectorizers) |
||
|
||
The vectorizers provided by the `Redis.OM.Vectorizers` package read their configuration parameters from either your `appsettings.json` file or your environment variables (with your appsettings taking precedence). | ||
|
||
| Configuration Parameter            | Description                     | | ||
|------------------------------------|---------------------------------| | ||
| REDIS_OM_HF_TOKEN                  | HuggingFace authorization token | | ||
| REDIS_OM_OAI_TOKEN                 | OpenAI authorization token      | | ||
| REDIS_OM_OAI_API_URL               | OpenAI URL                      | | ||
| REDIS_OM_AZURE_OAI_TOKEN           | Azure OpenAI API key            | | ||
| REDIS_OM_AZURE_OAI_RESOURCE_NAME   | Azure resource name             | | ||
| REDIS_OM_AZURE_OAI_DEPLOYMENT_NAME | Azure deployment name           | | ||
|
||
### Semantic Caching | ||
slorello89 marked this conversation as resolved.
|
||
|
||
Redis OM also supports semantic caching, with providers for OpenAI, HuggingFace, and Azure OpenAI. To use a semantic cache, pull one out of the `RedisConnectionProvider`, then use `Store` to insert items and `GetSimilar` to retrieve them. For example: | ||
|
||
```cs | ||
var cache = _provider.OpenAISemanticCache(token); | ||
cache.Store("What is the capital of France?", "Paris"); | ||
var res = cache.GetSimilar("What really is the capital of France?").First(); | ||
``` | ||
|
||
### 🖩 Aggregations | ||
|
||
We can also run aggregations on the customer object, again using expressions in LINQ: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
- FROM mcr.microsoft.com/dotnet/sdk:6.0 | ||
+ FROM mcr.microsoft.com/dotnet/sdk:7.0 | ||
|
||
|
||
WORKDIR /app | ||
|
Review comment: the phrase "convert your Vectors into Embeddings" is a bit misleading as those two terms are relatively interchangeable. I think we're essentially talking about the definition of the various vector field attributes like distance metric, data type, dims, etc? Some of those subsumed by the choice of vectorizer for sure
Review comment: changed the verbiage a bit - hopefully this is better?