Vector Search and Semantic Caching #417
Some tests are broken at first (need to obtain keys for the integration tests with OpenAI/HuggingFace/Azure).
FYI @Spartee, @tylerhutcherson, & @banker
L-very-GTM!
Heck yeah. This is awesome, Steve. I left a few README suggestions/ideas, mainly focused on clarity and flow. Where will this be represented in the docs?
README.md
Outdated
A `Vector<T>` is a representation of an object that can be transformed into a vector by a Vectorizer.

A `VectorizerAttribute` is the abstract class you use to decorate your Vector fields; it is responsible for defining the logic to convert your Vectors into Embeddings. In the package `Redis.OM.Vectorizers` we provide vectorizers for HuggingFace, OpenAI, and AzureOpenAI to allow you to easily integrate them into your workflows.
The phrase "convert your Vectors into Embeddings" is a bit misleading, as those two terms are relatively interchangeable. I think we're essentially talking about the definition of the various vector field attributes like distance metric, data type, dims, etc.? Some of those are subsumed by the choice of vectorizer, for sure.
Changed the wording a bit. Hopefully this is better?
README.md
Outdated
[RedisIdField]
public string Id { get; set; }

[Indexed(DistanceMetric = DistanceMetric.COSINE)]
Can you also show how a few other vector field attributes like index type (HNSW vs FLAT) and related args are set here?
Added a couple of other parameters for the index definition, and explained it a bit better in the modeling section.
With Redis OM, the embeddings can be completely transparent to you: they are generated and bound to the `Vector<T>` when you query or insert your vectors. If, however, you need an embedding after the insertion or query, it is available at `Vector<T>.Embedding` and can be read as raw bytes, as an array of doubles, or as an array of floats (depending on your vectorizer).

#### Configuration
Add other vector field attribute level configuration details here too?
Added details about index definition in the modeling section as that's more or less where it belongs (the configuration section is talking about configuring the vectorizers)
README.md
Outdated
With the vector defined in our model, all we need to do is create Vectors of the generic type, and insert them with our model. Using our `RedisCollection`, you can do this by simply using `Insert`:

```cs
var query = new OpenAIQuery
```
Naming this `query` makes sense given the implied caching use case here. Maybe spell it out a bit so it's clear why we are inserting a "query" object into your vector database?
Query is confusing in this context. Are `OpenAICompletionResult` & `completionResult` better? (It's something that's not a query that you might actually do with these embeddings lol).
a79689c to d33e935
…ibution of the model files)
…otnet into feature/vectors
Introduces Vector Search and Semantic Caching to Redis OM .NET.
One breaking change: `string[]` is replaced with `object[]` in some key places (e.g. `Execute`/`ExecuteAsync`), because the byte arrays needed for vector search must be passed in raw. This should be transparent to anyone using the higher-level APIs within Redis OM, but anyone using those raw commands might need to make some adjustments. See the README for details on how to use the new API.
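
To illustrate why raw commands need `object[]` rather than `string[]`, here is a minimal sketch that is not taken from this PR. It assumes a float vector must reach Redis as its raw bytes, and it only builds the argument array; it does not call any Redis OM API. The `RawVectorArgs` class, its helper names, and the `Prompt` field name are hypothetical.

```cs
using System;
using System.Runtime.InteropServices;

public static class RawVectorArgs
{
    // Reinterprets the floats as their in-memory bytes (4 bytes per element).
    public static byte[] ToBytes(float[] vector) =>
        MemoryMarshal.AsBytes(vector.AsSpan()).ToArray();

    // Hypothetical helper: a KNN search argument list mixes string tokens
    // with a binary blob, which a string[] cannot carry without re-encoding.
    public static object[] BuildKnnArgs(string indexName, float[] vector) =>
        new object[]
        {
            indexName,
            "*=>[KNN 3 @Prompt $V]", // "Prompt" is an assumed field name
            "PARAMS", 2, "V", ToBytes(vector),
            "DIALECT", 2,
        };
}
```

Callers that previously built a `string[]` for a raw command would, under this assumption, switch to an `object[]` like the one returned by `BuildKnnArgs` so the `byte[]` element is sent unchanged.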