matching engine -> vector search #414

Merged · 2 commits · Feb 6, 2024
@@ -145,7 +145,7 @@
"# If we want to be specific, we can set the dtype (see below) at creation time\n",
"rank_2_tensor = tf.constant(\n",
" [[1, 2], [3, 4], [5, 6]],\n",
" dtype=None # TODO 1a\n",
" dtype=None, # TODO 1a\n",
" # TODO: Your code goes here.\n",
")\n",
"print(rank_2_tensor)"
@@ -5,20 +5,20 @@
"id": "92f0e28c-799f-4176-90db-5b4a53ebc6ab",
"metadata": {},
"source": [
"# Semantic Search with Matching Engine and PaLM Embeddings\n",
"# Semantic Search with Vertex Vector Search and PaLM Embeddings\n",
"\n",
"**Learning Objectives**\n",
" 1. Learn how to create text embeddings using the Vertex PaLM API\n",
" 1. Learn how to load embeddings in Vertex Matching Engine\n",
" 2. Learn how to query Vertex Matching Engine\n",
" 1. Learn how to load embeddings in Vertex Vector Search\n",
" 2. Learn how to query Vertex Vector Search\n",
" 1. Learn how to build an information retrieval system based on semantic match\n",
" \n",
" \n",
"In this notebook, we implement a simple (albeit fast and scalable) [semantic search](https://en.wikipedia.org/wiki/Semantic_search#:~:text=Semantic%20search%20seeks%20to%20improve,to%20generate%20more%20relevant%20results.) retrieval system using [Vertex Matching Engine](https://cloud.google.com/vertex-ai/docs/matching-engine/overview) and [Vertex PaLM Embeddings](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings). In a semantic search system, a number of documents are returned to a user query, ranked by their semantic match. This means that the returned documents should match the intent or meaning of the query rather than its actual exact keywords as opposed to a boolean or keyword-based retrieval system. Such a semantic search system has in general two components, namely:\n",
"In this notebook, we implement a simple (albeit fast and scalable) [semantic search](https://en.wikipedia.org/wiki/Semantic_search#:~:text=Semantic%20search%20seeks%20to%20improve,to%20generate%20more%20relevant%20results.) retrieval system using [Vertex Vector Search](https://cloud.google.com/vertex-ai/docs/vector-search/overview) and [Vertex PaLM Embeddings](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings). In a semantic search system, a number of documents are returned to a user query, ranked by their semantic match. This means that the returned documents should match the intent or meaning of the query rather than its actual exact keywords as opposed to a boolean or keyword-based retrieval system. Such a semantic search system has in general two components, namely:\n",
"\n",
"* A component that produces semantically meaningful vector representations of both the documents as well as the user queries; we will use the [Vertex PaLM Embeddings](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings) API to creates these embeddings, leveraging the power of the [PaLM](https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html) large language model developed at Google. \n",
"\n",
"* A component that allows users to store the document vector embeddings and retrieve the most relevant documents by returning the documents whose embeddings are the closest to the user-query embedding in the embedding space. We will use [Vertex Matching Engine](https://cloud.google.com/vertex-ai/docs/matching-engine/overview) which can scale up to billions of embeddings thanks to an [efficient approximate nearest neighbor strategy](https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html) to compare and retrieve the closest document vectors to a query vector based on a [recent paper from Google research](https://arxiv.org/abs/1908.10396).\n",
"* A component that allows users to store the document vector embeddings and retrieve the most relevant documents by returning the documents whose embeddings are the closest to the user-query embedding in the embedding space. We will use [Vertex Vector Search](https://cloud.google.com/vertex-ai/docs/vector-search/overview) which can scale up to billions of embeddings thanks to an [efficient approximate nearest neighbor strategy](https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html) to compare and retrieve the closest document vectors to a query vector based on a [recent paper from Google research](https://arxiv.org/abs/1908.10396).\n",
"\n",
"\n",
"\n",
@@ -181,15 +181,15 @@
"id": "f6109de9-2a7f-43aa-b70f-4496ad50be5c",
"metadata": {},
"source": [
"## Creating the matching engine input file"
"## Creating the Vector Search input file"
]
},
{
"cell_type": "markdown",
"id": "c80d8533-d98c-41a6-b5b3-36634c3525a1",
"metadata": {},
"source": [
"At this point, our 4000 abstract embeddings are stored in memory in the `vectors` list. To store these embeddings into [Vertex Matching Engine](https://cloud.google.com/vertex-ai/docs/matching-engine/overview), we need to serialize them into a JSON file with the [following format](https://cloud.google.com/vertex-ai/docs/matching-engine/match-eng-setup/format-structure):\n",
"At this point, our 4000 abstract embeddings are stored in memory in the `vectors` list. To store these embeddings into [Vertex Matching Engine](https://cloud.google.com/vertex-ai/docs/vector-search/overview), we need to serialize them into a JSON file with the [following format](https://cloud.google.com/vertex-ai/docs/vector-search/setup/format-structure):\n",
"\n",
"```python\n",
"{\"id\": <DOCUMENT_ID1>, \"embedding\": [0.1, ..., -0.7]}\n",
@@ -198,7 +198,7 @@
"```\n",
"where the value of the `id` field should be an identifier allowing us to retrieve the actual document from a separate source, and the value of `embedding` is the vector returned by the PaLM API. \n",
"\n",
"For the document `id` we simply use the row index in the `metadata` DataFrame, which will serve as our in-memory document store. This makes it particularly easy to retrieve the abstract, title and url from an `id` returned by the matching engine:\n",
"For the document `id` we simply use the row index in the `metadata` DataFrame, which will serve as our in-memory document store. This makes it particularly easy to retrieve the abstract, title and url from an `id` returned by Vector Search:\n",
"\n",
"```python\n",
"metadata.abstract[id]\n",
@@ -270,25 +270,25 @@
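The JSON-lines serialization described in the hunk above can be sketched locally. The `vectors` list below is a made-up stand-in for the notebook's list of (row index, embedding) pairs; real embeddings would come from the PaLM API:

```python
import json
import tempfile

# Hypothetical stand-in for the notebook's `vectors` list: pairs of
# (row index in the `metadata` DataFrame, embedding vector).
vectors = [
    (0, [0.1, -0.7, 0.3]),
    (1, [0.5, 0.2, -0.1]),
]

# One JSON object per line: {"id": ..., "embedding": [...]}.
path = tempfile.mktemp(suffix=".json")
with open(path, "w") as f:
    for doc_id, embedding in vectors:
        f.write(json.dumps({"id": str(doc_id), "embedding": embedding}) + "\n")

# Read the file back to verify the round trip.
with open(path) as f:
    records = [json.loads(line) for line in f]
print(records[0])
```

The `id` values are written as strings so they can later be used to index back into the document store.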
"id": "220b10e1-ad07-40e0-bfa3-38ed4ed8ce8c",
"metadata": {},
"source": [
"## Creating the matching engine index"
"## Creating Vector Search index"
]
},
{
"cell_type": "markdown",
"id": "34b8c2ea-8ddb-4e79-ad00-9726db109ec1",
"metadata": {},
"source": [
"We are now up to the task of setting up [Vertex Matching Engine](https://cloud.google.com/vertex-ai/docs/matching-engine/overview). The procedure requires two steps:\n",
"We are now up to the task of setting up [Vertex Vector Search](https://cloud.google.com/vertex-ai/docs/vector-search/overview). The procedure requires two steps:\n",
"\n",
"1. The [creation of an index](https://cloud.google.com/vertex-ai/docs/matching-engine/create-manage-index)\n",
"1. The [deployment of this index to an endpoint](https://cloud.google.com/vertex-ai/docs/matching-engine/deploy-index-public)\n",
"1. The [creation of an index](https://cloud.google.com/vertex-ai/docs/vector-search/create-manage-index)\n",
"1. The [deployment of this index to an endpoint](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-public)\n",
"\n",
"While creating the index, the embedding vectors are uploaded to the matching engine and a tree-like data structure (the index) is created allowing for fast but approximate retrieval of the `approximate_neighbors_count` nearest neighbors of a given vector. The index depends on a notion of distance between embedding vectors that we need to specify in the `distance_measure_type`. We choose here the `COSINE_DISTANCE` which essentially is a measure of the angle between the embedding vectors. Other possible choices are the square of the euclidean distance (`SQUARED_L2_DISTANCE`), the [Manhattan distance](https://en.wikipedia.org/wiki/Taxicab_geometry) (`L1_DISTANCE`), or the dot product distance (`DOT_PRODUCT_DISTANCE`). (Note that if the embeddings you are using have been trained to minimize the one of these distances between matching pairs, then you may get better results by selecting this particular distance, otherwise the `COSINE_DISTANCE` will do just fine.) \n",
"While creating the index, the embedding vectors are uploaded to the Vector Search and a tree-like data structure (the index) is created allowing for fast but approximate retrieval of the `approximate_neighbors_count` nearest neighbors of a given vector. The index depends on a notion of distance between embedding vectors that we need to specify in the `distance_measure_type`. We choose here the `COSINE_DISTANCE` which essentially is a measure of the angle between the embedding vectors. Other possible choices are the square of the euclidean distance (`SQUARED_L2_DISTANCE`), the [Manhattan distance](https://en.wikipedia.org/wiki/Taxicab_geometry) (`L1_DISTANCE`), or the dot product distance (`DOT_PRODUCT_DISTANCE`). (Note that if the embeddings you are using have been trained to minimize the one of these distances between matching pairs, then you may get better results by selecting this particular distance, otherwise the `COSINE_DISTANCE` will do just fine.) \n",
"\n",
"### Exercise\n",
"\n",
"Complete the next cell so that it creates the matching engine index from the embedding file. (Running it will take up about 1 hour.)\n",
"You can read about the options [here](https://cloud.google.com/vertex-ai/docs/matching-engine/configuring-indexes)."
"Complete the next cell so that it creates Vector Search index from the embedding file. (Running it will take up about 1 hour.)\n",
"You can read about the options [here](https://cloud.google.com/vertex-ai/docs/vector-search/configuring-indexes)."
]
},
{
@@ -349,7 +349,7 @@
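The `COSINE_DISTANCE` choice discussed above can be illustrated with a small local computation (this is only the mathematical definition, not Vector Search's internal implementation): cosine distance is 1 minus the cosine similarity, so parallel vectors score near 0 and orthogonal vectors score 1.

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity: ~0 for vectors pointing the same way,
    # 1 for orthogonal vectors, up to 2 for opposite directions.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

print(cosine_distance([1.0, 2.0], [2.0, 4.0]))  # parallel -> ~0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 1.0
```

Because cosine distance ignores vector magnitudes, it is a reasonable default when the embeddings have not been trained against a specific metric.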
"id": "bbd056fc-5502-4982-b943-c03bd77fc9c6",
"metadata": {},
"source": [
"Now that our index is up and running, we need to make it accessible to be able to query it. The first step is to create a public endpoint (for speedups, one can also create a [private endpoint in a VPC network](https://cloud.google.com/vertex-ai/docs/matching-engine/deploy-index-vpc)):"
"Now that our index is up and running, we need to make it accessible to be able to query it. The first step is to create a public endpoint (for speedups, one can also create a [private endpoint in a VPC network](https://cloud.google.com/vertex-ai/docs/vector-search/deploy-index-vpc)):"
]
},
{
@@ -359,7 +359,7 @@
"source": [
"### Exercise\n",
"\n",
"Complete the cell below to create an matching engine endpoint:"
"Complete the cell below to create a Vector Search endpoint:"
]
},
{
@@ -389,7 +389,7 @@
"source": [
"### Exercise\n",
"\n",
"Complete the cell below to deply the matching engine endpoint:"
"Complete the cell below to deploy the Vector Search endpoint:"
]
},
{
@@ -421,7 +421,7 @@
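The two endpoint steps (create, then deploy) can be sketched with the `google-cloud-aiplatform` SDK. This is only a sketch: the display name and `deployed_index_id` are made-up placeholders, the exact call signatures should be checked against the SDK reference, and the function is defined but not executed here since it requires a GCP project:

```python
def create_and_deploy_endpoint(index, display_name="embeddings-endpoint"):
    """Sketch of endpoint creation + index deployment (not executed here).

    Assumes the google-cloud-aiplatform SDK; `display_name` and
    `deployed_index_id` are illustrative placeholders, not values
    from the notebook.
    """
    from google.cloud import aiplatform

    # Step 1: create a public Vector Search endpoint.
    endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=display_name,
        public_endpoint_enabled=True,
    )
    # Step 2: deploy the index to the endpoint (this can take a while).
    endpoint.deploy_index(index=index, deployed_index_id="deployed_index")
    return endpoint
```

Note that in the SDK the classes still carry the older `MatchingEngine` prefix even though the product is now named Vector Search.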
"id": "394442cd-70c1-4a1a-8cf5-fa43e1973152",
"metadata": {},
"source": [
"We are now ready to issue queries to the matching engine! \n",
"We are now ready to issue queries to Vector Search! \n",
"\n",
"To begin with, we need to create a PaLM embedding from a user query: "
]
@@ -443,7 +443,7 @@
"id": "7feb8901-b2ff-4cc0-82f9-fcbc927fa553",
"metadata": {},
"source": [
"Then we can use the `find_neighbors` method from our deployed matching engine index. This method takes as input the embedding vector from the user query and returns the abstract id's of the `NUM_NEIGHBORS` nearest neighbors:"
"Then we can use the `find_neighbors` method from our deployed Vector Search index. This method takes as input the embedding vector from the user query and returns the abstract id's of the `NUM_NEIGHBORS` nearest neighbors:"
]
},
{
@@ -453,7 +453,7 @@
"source": [
"### Exercise\n",
"\n",
"Query the matching engine to retrieve the abstract ID's whose embeddings are closest to the vector representing the user query:"
"Query Vector Search to retrieve the abstract ID's whose embeddings are closest to the vector representing the user query:"
]
},
{
@@ -510,7 +510,7 @@
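To make the retrieval step concrete, here is a toy, exact-search stand-in for `find_neighbors`: it ranks documents by cosine distance to the query embedding. Vector Search itself uses an approximate (ScaNN-based) search at scale; the document IDs and embeddings below are made up for illustration:

```python
import math

def find_neighbors_exact(query_embedding, doc_embeddings, num_neighbors=2):
    # Brute-force nearest neighbors by cosine distance: a local stand-in
    # for the approximate `find_neighbors` call on a deployed index.
    def cosine_distance(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return 1.0 - dot / (
            math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        )

    scored = sorted(
        ((doc_id, cosine_distance(query_embedding, emb))
         for doc_id, emb in doc_embeddings.items()),
        key=lambda pair: pair[1],  # smallest distance first
    )
    return scored[:num_neighbors]

# Three toy 2-D "document embeddings" keyed by document id.
docs = {"0": [1.0, 0.0], "1": [0.0, 1.0], "2": [0.9, 0.1]}
neighbors = find_neighbors_exact([1.0, 0.05], docs, num_neighbors=2)
print([doc_id for doc_id, _ in neighbors])  # nearest ids first
```

The returned IDs would then be used to look up abstracts and titles in the `metadata` document store, as the notebook does.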
"id": "27f493d6-5e73-4dc1-9404-f093111e18e0",
"metadata": {},
"source": [
"Here is the matching engine response formatted as a simple list for convenience. You may see in the list of returned papers some in a different language than english even though the query was in english. This demonstrates the muli-language ability of the PaLM large language model and illustrates that the matches are done on the basis of meaning meaning rather than exact keywords match:"
"Here is Vector Search response formatted as a simple list for convenience. You may see in the list of returned papers some in a different language than english even though the query was in english. This demonstrates the muli-language ability of the PaLM large language model and illustrates that the matches are done on the basis of meaning meaning rather than exact keywords match:"
]
},
{
@@ -570,12 +570,12 @@
"metadata": {
"environment": {
"kernel": "python3",
"name": "tf2-gpu.2-11.m109",
"name": "tf2-gpu.2-12.m115",
"type": "gcloud",
"uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-11:m109"
"uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-12:m115"
},
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (Local)",
"language": "python",
"name": "python3"
},
@@ -589,7 +589,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
"version": "3.10.13"
}
},
"nbformat": 4,