This guide provides more details for customizing the Chat App.
The Chat App is designed to work with any PDF documents. The sample data is provided to help you get started quickly, but you can easily replace it with your own data. You'll want to first remove all the existing data, then add your own. See the data ingestion guide for more details.
The frontend is built using React and Fluent UI components. The frontend components are stored in the app/frontend/src
folder. The typical components you'll want to customize are:
app/frontend/index.html
: To change the page titleapp/frontend/src/pages/layout/Layout.tsx
: To change the header text and logoapp/frontend/src/pages/chat/Chat.tsx
: To change the large headingapp/frontend/src/components/Example/ExampleList.tsx
: To change the example questions
The backend is built using Quart, a Python framework for asynchronous web applications. The backend code is stored in the app/backend
folder.
Typically, the primary backend code you'll want to customize is the app/backend/approaches
folder, which contains the classes powering the Chat and Ask tabs. Each class uses a different RAG (Retrieval Augmented Generation) approach, which include system messages that should be changed to match your data
The chat tab uses the approach programmed in chatreadretrieveread.py.
- It uses the OpenAI ChatCompletion API to turn the user question into a good search query.
- It queries Azure AI Search for search results for that query (optionally using the vector embeddings for that query).
- It then combines the search results and original user question, and asks OpenAI ChatCompletion API to answer the question based on the sources. It includes the last 4K of message history as well (or however many tokens are allowed by the deployed model).
The system_message_chat_conversation
variable is currently tailored to the sample data since it starts with "Assistant helps the company employees with their healthcare plan questions, and questions about the employee handbook." Change that to match your data.
The ask tab uses the approach programmed in retrievethenread.py.
- It queries Azure AI Search for search results for the user question (optionally using the vector embeddings for that question).
- It then combines the search results and user question, and asks OpenAI ChatCompletion API to answer the question based on the sources.
The system_chat_template
variable is currently tailored to the sample data since it starts with "You are an intelligent assistant helping Contoso Inc employees with their healthcare plan questions and employee handbook questions." Change that to match your data.
The UI provides a "Developer Settings" menu for customizing the approaches, like disabling semantic ranker or using vector search. Those settings are passed in the "context" field of the request to the backend, and are not saved permanently. However, if you find a setting that you do want to make permanent, there are two approaches:
- Change the defaults in the frontend. You'll find the defaults in
Chat.tsx
andOneShot.tsx
(for Ask). For example, this line of code sets the default retrieval mode to Hybrid:
const [retrievalMode, setRetrievalMode] = useState<RetrievalMode>(RetrievalMode.Hybrid);
You can change the default to Text by changing the code to:
const [retrievalMode, setRetrievalMode] = useState<RetrievalMode>(RetrievalMode.Text);
- Change the overrides in the backend. Each of the approaches has a
run
method that takes acontext
parameter, and the first line of code extracts the overrides from thatcontext
. That's where you can override any of the settings. For example, to change the retrieval mode to text:
overrides = context.get("overrides", {})
overrides["retrieval_mode"] = "text"
By changing the setting on the backend, you can safely remove the Developer Settings UI from the frontend, if you don't wish to expose that to your users.
Once you are running the chat app on your own data and with your own tailored system prompt, the next step is to test the app with questions and note the quality of the answers. If you notice any answers that aren't as good as you'd like, here's a process for improving them.
The first step is to identify where the problem is occurring. For example, if using the Chat tab, the problem could be:
- OpenAI ChatCompletion API is not generating a good search query based on the user question
- Azure AI Search is not returning good search results for the query
- OpenAI ChatCompletion API is not generating a good answer based on the search results and user question
You can look at the "Thought process" tab in the chat app to see each of those steps, and determine which one is the problem.
If the problem is with the ChatCompletion API calls (steps 1 or 3 above), you can try changing the relevant prompt.
Once you've changed the prompt, make sure you ask the same question multiple times to see if the overall quality has improved, and run an evaluation when you're satisfied with the changes. The ChatCompletion API can yield different results every time, even for a temperature of 0.0, but especially for a higher temperature than that (like our default of 0.7 for step 3).
You can also try changing the ChatCompletion parameters, like temperature, to see if that improves results for your domain.
If the problem is with Azure AI Search (step 2 above), the first step is to check what search parameters you're using. Generally, the best results are found with hybrid search (text + vectors) plus the additional semantic re-ranking step, and that's what we've enabled by default. There may be some domains where that combination isn't optimal, however.
You can change many of the search parameters in the "Developer settings" in the frontend and see if results improve for your queries. The most relevant options:
You may find it easier to experiment with search options with the index explorer in the Azure Portal. Open up the Azure AI Search resource, select the Indexes tab, and select the index there.
Then use the JSON view of the search explorer, and make sure you specify the same options you're using in the app. For example, this query represents a search with semantic ranker configured:
{
"search": "eye exams",
"queryType": "semantic",
"semanticConfiguration": "default",
"queryLanguage": "en-us",
"speller": "lexicon",
"top": 3
}
You can also use the highlight
parameter to see what text is being matched in the content
field in the search results.
{
"search": "eye exams",
"highlight": "content"
...
}
The search explorer works well for testing text, but is harder to use with vectors, since you'd also need to compute the vector embedding and send it in. It is probably easier to use the app frontend for testing vectors/hybrid search.
Once you've made changes to the prompts or settings, you'll want to rigorously evaluate the results to see if they've improved. You can use tools in the AI RAG Chat evaluator repository to run evaluations, review results, and compare answers across runs.