Skip to content

Commit 284a2a6

Browse files
Merge pull request #25 from akvo/poc/24-try-to-improve-rag-prompt
Poc/24 try to improve rag prompt
2 parents 9718d8c + 5c1ddac commit 284a2a6

File tree

3 files changed

+171
-15
lines changed

3 files changed

+171
-15
lines changed

backend/app/services/README.md

Lines changed: 119 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,119 @@
1+
# 🔍 RAG Prompt Comparison: Before vs After
2+
3+
This document outlines the key improvements made to the prompts used in our Retrieval-Augmented Generation (RAG) system, focusing on better context handling, answer quality, and user experience.
4+
5+
---
6+
7+
## 🧠 Contextualize Question Prompt
8+
9+
### 🟡 BEFORE
10+
```python
11+
contextualize_q_system_prompt = (
12+
"Given a chat history and the latest user question "
13+
"which might reference context in the chat history, "
14+
"formulate a standalone question which can be understood "
15+
"without the chat history. Do NOT answer the question, just "
16+
"reformulate it if needed and otherwise return it as is."
17+
)
18+
```
19+
20+
### 🟢 AFTER
21+
```python
22+
contextualize_q_system_prompt = (
23+
"You are given a chat history and the latest user question. Your task is to reformulate the user's question into a "
24+
"clear, standalone version that accurately captures the user's intent. The standalone question must be understandable "
25+
"without access to the previous messages.\n\n"
26+
"If the user refers to previous parts of the conversation (e.g., using phrases like 'what did we talk about earlier?', "
27+
"'summarize our chat', 'what was your last answer?', or 'can you remind me what I said before?'), then incorporate the relevant "
28+
"context from the chat history into the reformulated question. Do not omit or generalize key topics or facts.\n\n"
29+
"Examples:\n"
30+
"- User question: 'Can you summarize what we’ve discussed so far?'\n"
31+
" Reformulated: 'Summarize the conversation we’ve had so far about fine-tuning a language model.'\n"
32+
"- User question: 'What was the tool you mentioned before?'\n"
33+
" Reformulated: 'What was the name of the tool you mentioned earlier for data labeling in NLP pipelines?'\n"
34+
"- User question: 'What did I ask you in the beginning?'\n"
35+
" Reformulated: 'What was my first question regarding LangChain integration?'\n\n"
36+
"Preserve the user's original language and intent. Reformulate the question in a way that is suitable for searching relevant "
37+
"information from a knowledge base, especially in multi-turn conversations where the user's intent builds on earlier exchanges."
38+
)
39+
```
40+
41+
### ✅ Key Improvements:
42+
- Handles memory-related queries: Supports reformulation of questions like "what did we talk about before?"
43+
- Examples added: Demonstrates how to handle different kinds of historical references.
44+
- Preserves intent and language: Ensures user phrasing remains intact while boosting searchability.
45+
- Search-optimized structure: Produces standalone questions useful for embedding-based KB retrieval.
46+
47+
---
48+
49+
## 🤖 QA System Prompt
50+
51+
### 🟡 BEFORE
52+
```python
53+
qa_system_prompt = (
54+
"You are given a user question, and please write clean, concise and accurate answer to the question. "
55+
"You will be given a set of related contexts to the question, which are numbered sequentially starting from 1. "
56+
"Each context has an implicit reference number based on its position in the array (first context is 1, second is 2, etc.). "
57+
"Please use these contexts and cite them using the format [citation:x] at the end of each sentence where applicable. "
58+
"Your answer must be correct, accurate and written by an expert using an unbiased and professional tone. "
59+
"Please limit to 1024 tokens. Do not give any information that is not related to the question, and do not repeat. "
60+
"Say 'information is missing on' followed by the related topic, if the given context do not provide sufficient information. "
61+
"If a sentence draws from multiple contexts, please list all applicable citations, like [citation:1][citation:2]. "
62+
"Other than code and specific names and citations, your answer must be written in the same language as the question. "
63+
"Be concise.\n\nContext: {context}\n\n"
64+
"Remember: Cite contexts by their position number (1 for first context, 2 for second, etc.) and don't blindly "
65+
"repeat the contexts verbatim."
66+
)
67+
```
68+
69+
### 🟢 AFTER
70+
```python
71+
qa_strict_prompt = (
72+
"You are a highly knowledgeable and factual AI assistant. You must answer user questions using **only** the content provided in the context documents.\n\n"
73+
"### Strict Answering Rules:\n"
74+
"1. **Use Context Only**: Do not use external knowledge or assumptions. All parts of your answer must be supported by the given context.\n"
75+
"2. **Cite Precisely**: Cite the source of information using [citation:x], where x corresponds to the position of the document (1, 2, 3, etc.). "
76+
"Citations must be placed at the end of each sentence where the context is used.\n"
77+
"3. **If Information Is Missing**:\n"
78+
" - If key information needed to answer the question is missing, respond with: \n"
79+
" 'Information is missing on [specific topic] based on the provided context.'\n"
80+
" - If the context gives partial information, summarize what is known and clearly state what is missing.\n"
81+
"4. **Writing Style & Language**:\n"
82+
" - Respond in the same language used in the user’s question.\n"
83+
" - Be clear, concise, and professional.\n"
84+
" - Do not copy context verbatim—summarize or paraphrase it when necessary.\n"
85+
"5. **Multiple Sources**: If a statement is supported by more than one document, list all citations, e.g., [citation:1][citation:3].\n"
86+
"6. **Length Limit**: Keep the full answer under 1024 tokens. Be brief but complete.\n\n"
87+
"### Provided Context:\n{context}\n"
88+
)
89+
```
90+
91+
### 🔍 Improvements Analysis
92+
93+
#### 🎯 Problem: Overuse of "Missing Information" Warnings
94+
- Before: Too eager to declare "missing information"
95+
- After: Encourages partial yet helpful answers when context is incomplete
96+
97+
#### 🧩 Problem: Poor Context Synthesis
98+
- Before: No instruction on combining insights
99+
- After: Actively directs to synthesize across multiple documents
100+
101+
#### 🗣️ Problem: Robotic Tone
102+
- Before: Rigid expert tone
103+
- After: Professional but user-friendly tone with clearer structure.
104+
105+
#### 🌐 Problem: Hidden Language Requirements
106+
- Before: Language policy buried in a dense paragraph
107+
- After: Clearly defined under numbered instructions.
108+
109+
---
110+
111+
## 🚀 Expected Outcomes
112+
113+
| Outcome | Expected Improvement |
114+
| --------------------------------- | ------------------------ |
115+
| Fewer "missing information" cases | 60–80% reduction |
116+
| More context synthesis | +Better citations |
117+
| Enhanced readability | More natural replies |
118+
| Multilingual consistency | Higher user trust |
119+
| Better response quality | Higher user satisfaction |

backend/app/services/chat_service.py

Lines changed: 50 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,7 @@ async def generate_response(
3535
db: Session,
3636
max_history_length: Optional[int] = 10,
3737
generate_last_n_messages: Optional[bool] = False,
38+
strict_mode: Optional[bool] = True,
3839
) -> AsyncGenerator[str, None]:
3940
try:
4041
"""
@@ -70,14 +71,10 @@ async def generate_response(
7071
.all()
7172
)
7273
for message in all_history_messages:
73-
marker = "__LLM_RESPONSE__"
74-
content = message.content
75-
if content and marker in content:
76-
content = content.split(marker, 1)[1].strip()
7774
messages["messages"].append(
7875
{
7976
"role": message.role,
80-
"content": content,
77+
"content": message.content,
8178
}
8279
)
8380
# EOL generate last n message in backend
@@ -142,12 +139,26 @@ async def generate_response(
142139

143140
# Create contextualize question prompt
144141
contextualize_q_system_prompt = (
145-
"Given a chat history and the latest user question "
146-
"which might reference context in the chat history, "
147-
"formulate a standalone question which can be understood "
148-
"without the chat history. Do NOT answer the question, just "
149-
"reformulate it if needed and otherwise return it as is."
142+
"You are given a chat history and the user's latest question. Your task is to rewrite the user's input as a clear, "
143+
"standalone question that fully captures their intent. The reformulated question must be understandable on its own, "
144+
"without requiring access to earlier parts of the conversation.\n\n"
145+
"If the user refers to earlier messages or prior context (e.g., 'what did we talk about?', 'summarize our chat', "
146+
"'what was your last response?', or 'can you remind me what I said before?'), incorporate the relevant details from the "
147+
"chat history into the rewritten question. Be precise—do not omit specific topics, facts, or tools mentioned earlier.\n\n"
148+
"Your reformulated question should:\n"
149+
"1. Retain the user's original language and tone.\n"
150+
"2. Be specific and context-aware.\n"
151+
"3. Be suitable for use in retrieval or question-answering over a knowledge base.\n\n"
152+
"Examples:\n"
153+
"- User: 'Can you summarize what we’ve discussed so far?'\n"
154+
" Reformulated: 'Summarize our conversation so far about fine-tuning a language model.'\n"
155+
"- User: 'What was the tool you mentioned before?'\n"
156+
" Reformulated: 'What was the name of the tool you mentioned earlier for data labeling in NLP pipelines?'\n"
157+
"- User: 'What did I ask you in the beginning?'\n"
158+
" Reformulated: 'What was my first question regarding LangChain integration?'\n\n"
159+
"Focus on maintaining the intent while making the question precise and independently interpretable."
150160
)
161+
151162
contextualize_q_prompt = ChatPromptTemplate.from_messages(
152163
[
153164
("system", contextualize_q_system_prompt),
@@ -162,7 +173,7 @@ async def generate_response(
162173
)
163174

164175
# Create QA prompt
165-
qa_system_prompt = (
176+
qa_flexible_prompt = (
166177
"You are given a user question, and please write clean, concise and accurate answer to the question. "
167178
"You will be given a set of related contexts to the question, which are numbered sequentially starting from 1. "
168179
"Each context has an implicit reference number based on its position in the array (first context is 1, second is 2, etc.). "
@@ -176,6 +187,32 @@ async def generate_response(
176187
"Remember: Cite contexts by their position number (1 for first context, 2 for second, etc.) and don't blindly "
177188
"repeat the contexts verbatim."
178189
)
190+
qa_strict_prompt = (
191+
"You are a highly knowledgeable and factual AI assistant. You must answer user questions using **only** the content provided in the context documents.\n\n"
192+
"### Strict Answering Rules:\n"
193+
"1. **Use Context Only**: Do not use external knowledge or assumptions. All parts of your answer must be supported by the given context.\n"
194+
"2. **Cite Precisely**: Cite the source of information using [citation:x], where x corresponds to the position of the document (1, 2, 3, etc.). "
195+
"Citations must be placed at the end of each sentence where the context is used.\n"
196+
"3. **If Information Is Missing**:\n"
197+
" - If key information needed to answer the question is missing, respond with: \n"
198+
" 'Information is missing on [specific topic] based on the provided context.'\n"
199+
" - If the context gives partial information, summarize what is known and clearly state what is missing.\n"
200+
"4. **Writing Style & Language**:\n"
201+
" - Respond in the same language used in the user’s question.\n"
202+
" - Be clear, concise, and professional.\n"
203+
" - Do not copy context verbatim—summarize or paraphrase it when necessary.\n"
204+
"5. **Multiple Sources**: If a statement is supported by more than one document, list all citations, e.g., [citation:1][citation:3].\n"
205+
"6. **Length Limit**: Keep the full answer under 1024 tokens. Be brief but complete.\n\n"
206+
"### Provided Context:\n{context}\n"
207+
)
208+
209+
if strict_mode:
210+
qa_system_prompt = qa_strict_prompt
211+
else:
212+
qa_system_prompt = (
213+
qa_flexible_prompt # your original or a looser version
214+
)
215+
179216
qa_prompt = ChatPromptTemplate.from_messages(
180217
[
181218
("system", qa_system_prompt),
@@ -221,8 +258,9 @@ async def generate_response(
221258
{"input": query, "chat_history": chat_history}
222259
):
223260
if "context" in chunk:
261+
retrieved_docs = chunk["context"]
224262
serializable_context = []
225-
for context in chunk["context"]:
263+
for context in retrieved_docs:
226264
serializable_doc = {
227265
"page_content": context.page_content.replace(
228266
'"', '\\"'

demo-page/config.js

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -35,9 +35,8 @@ window.config_living_income = {
3535
wsURL: "wss://akvo-rag.akvotest.org/ws/chat",
3636
};
3737

38-
// LOCAL ENV
3938
window.config_local = {
40-
title: "Chat from Local",
41-
kb_id: 38,
39+
title: "TDT #3",
40+
kb_id: 43,
4241
wsURL: "ws://localhost:81/ws/chat",
4342
};

0 commit comments

Comments
 (0)