
Commit

Merge branch 'dev' into server-docstrings
igiloh-pinecone committed Nov 5, 2023
2 parents 71c7fc6 + b90769d commit 3582401
Showing 10 changed files with 329 additions and 55 deletions.
95 changes: 50 additions & 45 deletions README.md
@@ -1,14 +1,14 @@
# Canopy

**Canopy** is an open-source Retrieval Augmented Generation (RAG) framework built on top of the Pinecone vector database. Canopy enables developers to quickly and easily experiment with and build applications using Retrieval Augmented Generation (RAG).
Canopy provides a configurable built-in server that allows users to effortlessly deploy a RAG-infused Chatbot web app using their own documents as a knowledge base.
For advanced use cases, the canopy core library enables building your own custom retrieval-powered AI applications.
**Canopy** is an open-source Retrieval Augmented Generation (RAG) framework and context engine built on top of the Pinecone vector database. Canopy enables you to quickly and easily experiment with and build applications using RAG. Start chatting with your documents or text data with a few simple commands.

Canopy provides a configurable built-in server so you can effortlessly deploy a RAG-powered chat application to your existing chat UI or interface. Or you can build your own, custom RAG application using the Canopy library.

Canopy is designed to be:
* **Easy to implement:** Bring your text data in Parquet or JSONL format, and Canopy will handle the rest. Canopy makes it easy to incorporate RAG into your OpenAI chat applications.
* **Reliable at scale:** Build fast, highly accurate GenAI applications that are production-ready and backed by Pinecone’s vector database. Seamlessly scale to billions of items with transparent, resource-based pricing.
* **Open and flexible:** Fully open-source, Canopy is both modular and extensible. You can configure Canopy to use the components you need, or extend any component with your own custom implementation. Easily incorporate it into existing OpenAI applications and connect Canopy to your preferred UI.
* **Interactive and iterative:** Evaluate your RAG workflow with a CLI-based chat tool. With a simple command in the Canopy CLI you can interactively chat with your text data and compare RAG vs. non-RAG workflows side-by-side to evaluate the augmented results before scaling to production.
* **Open and flexible:** Fully open-source, Canopy is both modular and extensible. You can configure Canopy to use the components you need, or extend any component with your own custom implementation.
* **Interactive and iterative:** Evaluate your RAG workflow with a CLI-based chat tool. With a simple command in the Canopy CLI you can interactively chat with your text data and compare RAG vs. non-RAG workflows side-by-side.

## RAG with Canopy

@@ -41,31 +41,31 @@ Learn how Canopy implements the full RAG workflow to prevent hallucinations and

<li> Canopy KnowledgeBase will encode each chunk using one or more embedding models</li>

<li> Canopy KnowledgeBase will upsert the encoded chunks into Pinecone Index</li>
<li> Canopy KnowledgeBase will upsert the encoded chunks into Pinecone index</li>

</ol>
</details>

## What's inside the box?

1. **Canopy Core Library** - Canopy has 3 API level components that are responsible for different parts of the RAG workflow:
* **ChatEngine** _`/chat/completions`_ - implements the full RAG workflow and exposes a chat interface to interact with your data. It acts as a wrapper around the Knowledge Base and Context Engine.
* **ContextEngine** - performs the “retrieval” part of RAG. The `ContextEngine` utilizes the underlying `KnowledgeBase` to retrieve the most relevant document chunks, then formulates a coherent textual context to be used as a prompt for the LLM.
1. **Canopy Core Library** - The library has 3 main classes that are responsible for different parts of the RAG workflow:
* **ChatEngine** - Exposes a chat interface to interact with your data. Given the history of chat messages, the `ChatEngine` formulates relevant queries to the `ContextEngine`, then uses the LLM to generate a knowledgeable response.
* **ContextEngine** - Performs the “retrieval” part of RAG. The `ContextEngine` utilizes the underlying `KnowledgeBase` to retrieve the most relevant documents, then formulates a coherent textual context to be used as a prompt for the LLM.
* **KnowledgeBase** - Manages your data for the RAG workflow. It automatically chunks and transforms your text data into text embeddings, storing them in a Pinecone vector database. Given a text query - the `KnowledgeBase` will retrieve the most relevant document chunks from the database.

* **KnowledgeBase** _`/context/{upsert, delete}`_ - prepares your data for the RAG workflow. It automatically chunks and transforms your text data into text embeddings before upserting them into the Pinecone vector database. It also handles Delete operations.

> More information about the Core Library usage can be found in the [Library Documentation](docs/library.md). A minimal usage sketch of these classes is shown after this list.
2. **Canopy Service** - a webservice that wraps the **Canopy Core** and exposes it as a REST API. The service is built on top of FastAPI, Uvicorn and Gunicorn and can be easily deployed in production. The service also comes with a built in Swagger UI for easy testing and documentation. After you [start the server](#3-start-the-canopy-service), you can access the Swagger UI at `http://host:port/docs` (default: `http://localhost:8000/docs`)
2. **Canopy Server** - This is a webservice that wraps the **Canopy Core** library and exposes it as a REST API. The server is built on top of FastAPI, Uvicorn and Gunicorn and can be easily deployed in production.
The server also comes with a built-in Swagger UI for easy testing and documentation. After you [start the server](#3-start-the-canopy-server), you can access the Swagger UI at `http://host:port/docs` (default: `http://localhost:8000/docs`)

3. **Canopy CLI** - A built-in development tool that allows users to swiftly set up their own Canopy server and test its configuration.
With just three CLI commands, you can create a new Canopy service, upload your documents to it, and then interact with the Chatbot using a built-in chat application directly from the terminal. The built-in chatbot also enables comparison of RAG-infused responses against a native LLM chatbot.
With just three CLI commands, you can create a new Canopy server, upload your documents to it, and then interact with the Chatbot using a built-in chat application directly from the terminal. The built-in chatbot also enables comparison of RAG-infused responses against a native LLM chatbot.
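
For a sense of how these pieces fit together outside the server, here is a minimal sketch that uses the core library directly. It follows the library quickstart, so the exact import paths and signatures may differ slightly between versions, and it assumes the Setup environment variables are set and a Canopy index already exists.

```python
from canopy.tokenizer import Tokenizer
from canopy.knowledge_base import KnowledgeBase
from canopy.context_engine import ContextEngine
from canopy.chat_engine import ChatEngine
from canopy.models.data_models import Document, UserMessage

# The tokenizer singleton must be initialized once per process.
Tokenizer.initialize()

# KnowledgeBase: chunks, embeds and upserts documents into the Pinecone index.
kb = KnowledgeBase(index_name="my-index")
kb.connect()  # assumes the index was already created (e.g. with `canopy new`)
kb.upsert([Document(id="1", text="Canopy is a RAG framework built on Pinecone.")])

# ContextEngine: retrieves the most relevant chunks and builds a prompt-ready context.
context_engine = ContextEngine(kb)

# ChatEngine: the full RAG chat interface on top of the context engine.
chat_engine = ChatEngine(context_engine)
response = chat_engine.chat(messages=[UserMessage(content="What is Canopy?")])
print(response.choices[0].message.content)
```

See the [Library Documentation](docs/library.md) for the full walkthrough.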

## Considerations

* Canopy is currently only compatible with OpenAI API endpoints for both the embedding model and the LLM. Rate limits and pricing set by OpenAI will apply.


## Setup

0. set up a virtual environment (optional)
@@ -77,7 +77,7 @@ more about virtual environments [here](https://docs.python.org/3/tutorial/venv.h

1. install the package
```bash
pip install pinecone-canopy
pip install canopy-sdk
```

2. Set up the environment variables
@@ -101,7 +101,7 @@ export INDEX_NAME=<INDEX_NAME>
| `PINECONE_ENVIRONMENT`| Determines the Pinecone service cloud environment of your index e.g `west1-gcp`, `us-east-1-aws`, etc | You can find the Pinecone environment next to the API key in [console](https://app.pinecone.io/) |
| `OPENAI_API_KEY` | API key for OpenAI. Used to authenticate to OpenAI's services for embedding and chat API | You can find your OpenAI API key [here](https://platform.openai.com/account/api-keys). You might need to login or register to OpenAI services |
| `INDEX_NAME` | Name of the underlying Pinecone index Canopy will work with | You can choose any name as long as it follows Pinecone's [restrictions](https://support.pinecone.io/hc/en-us/articles/11729246212637-Are-there-restrictions-on-index-names-#:~:text=There%20are%20two%20main%20restrictions,and%20emojis%20are%20not%20supported.) |
| `CANOPY_CONFIG_FILE` | The path of a configuration yaml file to be used by the Canopy service. | Optional - if not provided, the default configuration will be used |
| `CANOPY_CONFIG_FILE` | The path of a configuration yaml file to be used by the Canopy server. | Optional - if not provided, the default configuration will be used |
</details>
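
If you are setting these variables from a Python script or notebook rather than a shell, a small sanity check such as the following can help; the placeholder values are assumptions to replace with your own credentials, and `PINECONE_API_KEY` is assumed in addition to the rows shown above.

```python
import os

# Fill these in with your own values (see the table above).
os.environ.setdefault("PINECONE_API_KEY", "<PINECONE_API_KEY>")
os.environ.setdefault("PINECONE_ENVIRONMENT", "<PINECONE_ENVIRONMENT>")
os.environ.setdefault("OPENAI_API_KEY", "<OPENAI_API_KEY>")
os.environ.setdefault("INDEX_NAME", "<INDEX_NAME>")

# Fail fast if any placeholder was left unfilled.
required = ["PINECONE_API_KEY", "PINECONE_ENVIRONMENT", "OPENAI_API_KEY", "INDEX_NAME"]
missing = [name for name in required if os.environ[name].startswith("<")]
if missing:
    raise RuntimeError(f"Please set real values for: {', '.join(missing)}")
```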


@@ -125,19 +125,20 @@ In this quickstart, we will show you how to use **Canopy** to build a simple

### 1. Create a new **Canopy** Index

**Canopy** will create and configure a new Pinecone index on your behalf. Just run:
As a one-time setup, Canopy needs to create a new Pinecone index that is configured to work with Canopy. Just run:

```bash
canopy new
```

And follow the CLI instructions. The index that will be created will have a prefix `canopy--<INDEX_NAME>`. This will have to be done only once per index.
And follow the CLI instructions. The index that will be created will have a prefix `canopy--<INDEX_NAME>`.
You only have to do this process once for every Canopy index you want to create.

> To learn more about Pinecone Indexes and how to manage them, please refer to the following guide: [Understanding indexes](https://docs.pinecone.io/docs/indexes)
> To learn more about Pinecone indexes and how to manage them, please refer to the following guide: [Understanding indexes](https://docs.pinecone.io/docs/indexes)
### 2. Uploading data

You can load data into your **Canopy** Index by simply using the CLI:
You can load data into your Canopy index using the command:

```bash
canopy upsert /path/to/data_directory
```

@@ -149,7 +150,7 @@

```bash
canopy upsert /path/to/data_directory/file.parquet
canopy upsert /path/to/data_directory/file.jsonl
```

Canopy supports single or multiple files in jsonl or parquet format. The documents should have the following schema:
Canopy supports files in `jsonl` or `parquet` format. The documents should have the following schema:

```
+----------+--------------+--------------+---------------+
```

@@ -162,57 +163,59 @@ Canopy supports single or multiple files in jsonl or parquet format. The document

Follow the instructions in the CLI to upload your data.
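
As a concrete, hedged illustration of that schema, the snippet below writes a small `.jsonl` file using the field names from Canopy's document model (`id` and `text`, plus optional `source` and `metadata`); the example documents themselves are made up.

```python
import json

# Two example documents following the schema above:
# id (str), text (str), optional source (str), optional metadata (dict).
documents = [
    {
        "id": "doc-1",
        "text": "Canopy is an open-source RAG framework built on Pinecone.",
        "source": "https://github.com/pinecone-io/canopy",
        "metadata": {"topic": "rag"},
    },
    {
        "id": "doc-2",
        "text": "The Canopy server exposes a REST API for chat and context retrieval.",
        "source": "README.md",
        "metadata": {"topic": "server"},
    },
]

# Write one JSON object per line, as expected for a .jsonl upsert file.
with open("data.jsonl", "w") as f:
    for doc in documents:
        f.write(json.dumps(doc) + "\n")
```

The resulting file can then be uploaded with `canopy upsert data.jsonl`.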

### 3. Start the **Canopy** service
### 3. Start the Canopy server

**Canopy** service serves as a proxy between your application and Pinecone. It will also handle the RAG part of the application. To start the service, run:
The Canopy server exposes Canopy's functionality via a REST API. Namely, it allows you to upload documents, retrieve relevant docs for a given query, and chat with your data. In particular, it exposes a `/chat.completion` endpoint that can be easily integrated with any chat application.
To start the server, run:

```bash
canopy start
```

Now, you should be prompted with the following standard Uvicorn message:

```
...
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```
**That's it!** You can now start using the **Canopy** server with any chat application that supports a `/chat.completion` endpoint.
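
For a quick smoke test from Python, you can point the pre-1.0 `openai` client at the local server, as described under Advanced usage below. This is only a sketch: it assumes the server is running on the default `localhost:8000`, and the model name may be ignored or overridden by the server's own configuration.

```python
import openai

# Route OpenAI client calls to the local Canopy server instead of api.openai.com.
openai.api_base = "http://localhost:8000/"

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What do my documents say about Canopy?"}],
)
print(response.choices[0].message.content)
```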

> **_📝 NOTE:_**
> _📝 NOTE:_
>
> The canopy start command will keep the terminal occupied. To proceed with the next steps, please open a new terminal window.
> If you want to run the service in the background, you can use the following command - **```nohup canopy start &```**
> The canopy start command will keep the terminal occupied.
> If you want to run the server in the background, you can use the following command - **```nohup canopy start &```**
> However, this is not recommended.

### 4. Chat with your data
### Stopping the server
To stop the server, simply press `CTRL+C` in the terminal where you started it.

Now that you have data in your index, you can chat with it using the CLI:
If you have started the server in the background, you can stop it by running:

```bash
canopy chat
canopy stop
```

This will open a chat interface in your terminal. You can ask questions and **Canopy** will try to answer them using the data you uploaded.
## Evaluation chat tool

To compare the chat response with and without RAG use the `--baseline` flag
Canopy's CLI comes with a built-in chat app that allows you to interactively chat with your text data and compare RAG vs. non-RAG workflows side-by-side to evaluate the results.

```bash
canopy chat --baseline
```
In a new terminal window, set the [required environment variables](#setup) then run:

This will open a similar chat interface window, but will send your question directly to the LLM without the RAG pipeline.

### 5. Stop the **Canopy** service
```bash
canopy chat
```

To stop the service, simply press `CTRL+C` in the terminal where you started it.
This will open a chat interface in your terminal. You can ask questions and the RAG-infused chatbot will try to answer them using the data you uploaded.

If you have started the service in the background, you can stop it by running:
To compare the chat response with and without RAG use the `--no-rag` flag

```bash
canopy stop
canopy chat --no-rag
```

This will open a similar chat interface window, but will show both the RAG and non-RAG responses side-by-side.


## Advanced usage

### Migrating existing OpenAI application to **Canopy**
@@ -222,7 +225,7 @@ If you already have an application that uses the OpenAI API, you can migrate it
```python
import openai

openai.api_base = "http://host:port/context"
openai.api_base = "http://host:port/"

# now you can use the OpenAI API as usual
```
Expand All @@ -232,15 +235,17 @@ or without global state change:
```python
import openai

openai_response = openai.Completion.create(..., api_base="http://host:port/context")
openai_response = openai.Completion.create(..., api_base="http://host:port/")
```

### Running Canopy service in production
### Running Canopy server in production

Canopy uses FastAPI as the web framework and Uvicorn as the ASGI server. It is recommended to use Gunicorn as the production server, mainly because it supports multiple worker processes and can handle multiple requests in parallel. More details can be found [here](https://www.uvicorn.org/deployment/#using-a-process-manager).

To run the Canopy service for production, please run:
To run the Canopy server for production, please run:

```bash
gunicorn canopy_cli.app:app --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 --workers <number of desired worker processes>
```

The server interacts with services like Pinecone and OpenAI using your own authentication credentials. When deploying the server on a public web hosting provider, it is recommended to enable an authentication mechanism, so that your server only accepts requests from authenticated users.
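
One lightweight way to add such a mechanism, shown here only as a sketch rather than a built-in Canopy feature, is to wrap the FastAPI app exposed at `canopy_cli.app:app` with a simple API-key check and point Gunicorn at the wrapper. The module name `secured_app.py` and the `CANOPY_SERVER_API_KEY` variable are assumptions.

```python
# secured_app.py - hypothetical wrapper module, not part of Canopy itself.
import os

from fastapi import Request
from fastapi.responses import JSONResponse

from canopy_cli.app import app  # the same app object the gunicorn command above uses

API_KEY = os.environ["CANOPY_SERVER_API_KEY"]  # assumed server-side secret


@app.middleware("http")
async def require_api_key(request: Request, call_next):
    # Reject any request that does not carry the expected bearer token.
    if request.headers.get("Authorization") != f"Bearer {API_KEY}":
        return JSONResponse(status_code=401, content={"detail": "Unauthorized"})
    return await call_next(request)
```

Gunicorn can then serve `secured_app:app` with the same worker settings as above.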
2 changes: 1 addition & 1 deletion docs/library.md
@@ -30,7 +30,7 @@ more about virtual environments [here](https://docs.python.org/3/tutorial/venv.h

1. install the package
```bash
pip install pinecone-canopy
pip install canopy-sdk
```

2. Set up the environment variables
2 changes: 1 addition & 1 deletion examples/canopy-lib-quickstart.ipynb
@@ -38,7 +38,7 @@
}
],
"source": [
"!pip install -qU pinecone-canopy"
"!pip install -qU canopy-sdk"
]
},
{
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,5 +1,5 @@
[tool.poetry]
name = "pinecone-canopy"
name = "canopy-sdk"
version = "0.1.0"
description = "Canopy is an orchestration engine for intergating LLMs with Pinecone."
authors = ["Relevance Team <[email protected]>"]
2 changes: 1 addition & 1 deletion src/canopy/__init__.py
@@ -1,4 +1,4 @@
import importlib.metadata

# Taken from https://stackoverflow.com/a/67097076
__version__ = importlib.metadata.version("pinecone-canopy")
__version__ = importlib.metadata.version("canopy-sdk")
72 changes: 71 additions & 1 deletion src/canopy/llm/openai.py
@@ -16,7 +16,15 @@


class OpenAILLM(BaseLLM):

"""
OpenAI LLM wrapper built on top of the OpenAI Python client.
Note: OpenAI requires a valid API key to use this class.
You can set the "OPENAI_API_KEY" environment variable to your API key.
Or you can directly set it as follows:
>>> import openai
>>> openai.api_key = "YOUR_API_KEY"
"""
def __init__(self,
model_name: str = "gpt-3.5-turbo",
*,
@@ -42,6 +50,29 @@ def chat_completion(self,
max_tokens: Optional[int] = None,
model_params: Optional[ModelParams] = None,
) -> Union[ChatResponse, Iterable[StreamingChatChunk]]:
"""
Chat completion using the OpenAI API.
Note: this function is wrapped in a retry decorator to handle transient errors.
Args:
messages: Messages (chat history) to send to the model.
stream: Whether to stream the response or not.
max_tokens: Maximum number of tokens to generate. Defaults to None (generates until stop sequence or until hitting max context size).
model_params: Model parameters to use for this request. Defaults to None (uses the default model parameters).
see: https://platform.openai.com/docs/api-reference/chat/create
Returns:
ChatResponse or StreamingChatChunk
Usage:
>>> from canopy.llm import OpenAILLM
>>> from canopy.models.data_models import UserMessage
>>> llm = OpenAILLM()
>>> messages = [UserMessage(content="Hello! How are you?")]
>>> result = llm.chat_completion(messages)
>>> print(result.choices[0].message.content)
"I'm good, how are you?"
""" # noqa: E501

model_params_dict: Dict[str, Any] = {}
model_params_dict.update(
@@ -80,6 +111,45 @@ def enforced_function_call(self,
*,
max_tokens: Optional[int] = None,
model_params: Optional[ModelParams] = None) -> dict:
"""
This function enforces the model to respond with a specific function call.
To read more about this feature, see: https://platform.openai.com/docs/guides/gpt/function-calling
Note: this function is wrapped in a retry decorator to handle transient errors.
Args:
messages: Messages (chat history) to send to the model.
function: Function to call. See canopy.llm.models.Function for more details.
max_tokens: Maximum number of tokens to generate. Defaults to None (generates until stop sequence or until hitting max context size).
model_params: Model parameters to use for this request. Defaults to None (uses the default model parameters).
see: https://platform.openai.com/docs/api-reference/chat/create
Returns:
dict: Function call arguments as a dictionary.
Usage:
>>> from canopy.llm import OpenAILLM
>>> from canopy.llm.models import Function, FunctionParameters, FunctionArrayProperty
>>> from canopy.models.data_models import UserMessage
>>> llm = OpenAILLM()
>>> messages = [UserMessage(content="I was wondering what is the capital of France?")]
>>> function = Function(
... name="query_knowledgebase",
... description="Query search engine for relevant information",
... parameters=FunctionParameters(
... required_properties=[
... FunctionArrayProperty(
... name="queries",
... items_type="string",
... description='List of queries to send to the search engine.',
... ),
... ]
... )
... )
>>> llm.enforced_function_call(messages, function)
{'queries': ['capital of France']}
""" # noqa: E501
# this enforces the model to call the function
function_call = {"name": function.name}

