This project is a conversational AI application designed to answer user queries using a multilingual large language model (LLM) called Llama 3.2. The system leverages a combination of advanced natural language processing (NLP) tools and a vector database to provide contextual, accurate, and efficient responses. The AI can maintain a history of conversations and use this history alongside relevant external documents to generate meaningful answers.
- Contextual Query Handling:
  - Answers questions using both conversation history and a vector-based document retrieval system.
  - If a question is unrelated to the retrieved context or the history, the system gracefully responds with "pass" (a prompt sketch follows).
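A minimal sketch of such a prompt, assuming LangChain's `ChatPromptTemplate`; the exact wording the project uses is an assumption:

```python
# Hypothetical prompt illustrating the "pass" fallback; the project's
# actual template wording may differ.
from langchain.prompts import ChatPromptTemplate

PROMPT = ChatPromptTemplate.from_template(
    "Answer using only the conversation history and the retrieved context.\n"
    "If the question is unrelated to both, reply with exactly: pass\n\n"
    "History:\n{history}\n\nContext:\n{context}\n\nQuestion: {question}"
)
```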
- Multilingual LLM Integration:
  - Uses Llama 3.2, a state-of-the-art multilingual large language model, for text-in/text-out operations.
- Conversation Memory:
  - Maintains a history of the last 5 exchanges between the user and the model to enrich future responses (a minimal sketch follows).
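A minimal sketch of the 5-exchange memory, assuming a simple in-process store; the project may implement it differently:

```python
# Keep only the 5 most recent (user, assistant) exchanges; deque drops the
# oldest entry automatically once the limit is reached.
from collections import deque

history = deque(maxlen=5)

def remember(user_msg: str, ai_msg: str) -> None:
    history.append((user_msg, ai_msg))

def history_text() -> str:
    return "\n".join(f"User: {u}\nAI: {a}" for u, a in history)
```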
- Real-Time Web Interface:
  - Built with NiceGUI to provide a clean, user-friendly web application for interaction.
- PDF Document Processing:
  - Parses and splits PDF documents into manageable chunks for efficient retrieval (sketched below).
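A sketch of this ingestion step, assuming LangChain's community loaders; module paths vary across LangChain versions, and the chunk sizes are illustrative:

```python
# Load every PDF in the data directory and split pages into overlapping
# chunks; chunk_size/chunk_overlap are illustrative, not the project's values.
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

documents = PyPDFDirectoryLoader("data").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=80,
).split_documents(documents)
```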
- Efficient Database Management:
  - Uses the Chroma vector database to store and retrieve embeddings of textual data (see the sketch below).
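A sketch of persisting those chunks to Chroma with Ollama embeddings; the persist directory and model tag are assumptions:

```python
# Embed the chunks from the ingestion sketch above and persist them locally.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

db = Chroma.from_documents(
    chunks,                                  # from the splitting sketch above
    OllamaEmbeddings(model="llama3.2"),      # assumed Ollama model tag
    persist_directory="chroma",              # assumed storage path
)
```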
- Logging:
  - Logs user queries and AI responses for debugging and tracking (a logging sketch follows).
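A minimal sketch of this logging, assuming Python's standard `logging` module and a hypothetical log file name:

```python
# Timestamped, structured log of each exchange; "conversation.log" is a
# hypothetical file name.
import logging

logging.basicConfig(
    filename="conversation.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

def log_exchange(query: str, answer: str) -> None:
    logging.info("QUERY: %s", query)
    logging.info("RESPONSE: %s", answer)
```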
- Real-Time Conversation Contextualization:
  - Integrates contextual memory and external document retrieval in real time for more accurate responses.
- Adaptive Language Model Tuning:
  - Supports custom fine-tuning to adapt the model to specialized domains and particular use cases.
- Interactive User Feedback:
  - Users can provide feedback on answers, which helps improve future responses by adjusting memory (see the sketch below).
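A minimal sketch of recording that feedback, assuming a JSONL file as the store; the project may persist it differently:

```python
# Append one JSON record per rating; "feedback.jsonl" is a hypothetical path.
import json
from datetime import datetime

def record_feedback(question: str, answer: str, rating: int) -> None:
    entry = {
        "time": datetime.now().isoformat(),
        "question": question,
        "answer": answer,
        "rating": rating,  # e.g. 1 (poor) to 5 (excellent)
    }
    with open("feedback.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```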
- Llama 3.2:
  - A collection of pretrained and instruction-tuned generative models in 1B and 3B sizes.
  - Optimized for multilingual dialogue, agentic retrieval, and summarization tasks.
  - Outperforms many open-source and closed chat models on common industry benchmarks.
- LangChain:
  - Builds the conversational pipeline, including prompt templates and retrieval-based queries (an end-to-end sketch follows this list).
- Ollama Embeddings:
  - Converts text into numerical representations for similarity search.
- Chroma Vector Database:
  - Stores embeddings of text documents for fast semantic search.
- PyPDFDirectoryLoader:
  - Extracts text content from PDF files in a directory.
- Recursive Character Text Splitter:
  - Splits large texts into manageable chunks for better search and retrieval.
- External File Format Support:
  - Future support is planned for additional file types such as DOCX, TXT, and HTML.
- NiceGUI:
  - A Python-based framework for creating modern web interfaces.
  - Enables real-time user interaction with the AI.
- Real-Time Conversation Handling:
  - The web interface is optimized for live interactions, offering a seamless conversation flow.
- Argparse:
  - Handles command-line arguments for flexible usage.
- Logging:
  - Tracks user inputs and model outputs in a structured log file.
- Datetime and Asyncio:
  - Used for timestamping log entries and managing real-time asynchronous tasks.
- User Feedback Mechanism:
  - A feedback loop that lets users rate answers, improving response generation.
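A sketch of how these pieces could fit together end to end; module paths and the model tag are assumptions that vary across LangChain versions:

```python
# Retrieve the most relevant chunks for a query and let Llama 3.2 (via
# Ollama) answer from them. All names here are illustrative.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma
from langchain.prompts import ChatPromptTemplate

embeddings = OllamaEmbeddings(model="llama3.2")
db = Chroma(persist_directory="chroma", embedding_function=embeddings)
model = Ollama(model="llama3.2")

def answer(query: str) -> str:
    docs = db.similarity_search(query, k=5)
    context = "\n\n---\n\n".join(d.page_content for d in docs)
    prompt = ChatPromptTemplate.from_template(
        "Answer the question using only the context below.\n"
        "Context:\n{context}\n\nQuestion: {question}"
    ).format(context=context, question=query)
    return model.invoke(prompt)
```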
- PDF Processing:
  - Upload PDF documents to the `data` directory.
  - The system processes and splits the documents into chunks with metadata, storing them in the Chroma database.
- Querying:
  - Users input a query through the web interface or the command line.
  - The query is converted into an embedding, and the most relevant document chunks are retrieved (see the sketch below).
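A sketch of the retrieval step with similarity scores, reusing the assumed persist directory and model tag from the earlier sketches; `k` and the query are illustrative:

```python
# Retrieve the top-k chunks together with distance scores so weak matches
# can be filtered out before prompting the model.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

db = Chroma(
    persist_directory="chroma",  # assumed path, matching the earlier sketches
    embedding_function=OllamaEmbeddings(model="llama3.2"),
)
results = db.similarity_search_with_score("Your query here", k=5)
context = "\n\n".join(doc.page_content for doc, _score in results)
```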
- Conversational Memory:
  - The AI uses the last 5 exchanges to add depth and context to its answers.
- Response Generation:
  - Combines the retrieved documents and conversation history to craft a response with the Llama 3.2 model (sketched below).
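A sketch of this step that reuses the hypothetical helpers from earlier sketches (`PROMPT`, `history_text`, `remember`); the model tag remains an assumption:

```python
from langchain_community.llms import Ollama

model = Ollama(model="llama3.2")  # assumed Ollama model tag

def respond(question: str, context: str) -> str:
    text = PROMPT.format(history=history_text(), context=context, question=question)
    answer = model.invoke(text)
    remember(question, answer)  # keep the exchange in the 5-turn memory
    return answer
```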
- Logging:
  - All interactions are logged for later analysis.
- Interactive Feedback:
  - Users can rate responses or provide feedback, which can influence future model behavior and improve the system's adaptability.
The Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open-source and closed chat models on common industry benchmarks.
- Python 3.8 or later
- Required Python packages (install via `pip install -r requirements.txt`):
  - langchain
  - nicegui
  - chromadb
  - PyPDF2
- Populate the Database:
  - Place your PDF documents in the `data` directory.
  - Run `populate_database.py` to process and store the document embeddings.
- Start the Web Interface:
  - Run `ui_utils.py` to launch the NiceGUI-based web app (a minimal sketch follows).
  - Access the app at http://localhost:8080.
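A minimal NiceGUI sketch of such an interface; the real `ui_utils.py` is more elaborate, and `answer()` refers to the hypothetical retrieval function sketched earlier:

```python
from nicegui import ui

query_box = ui.input(label="Ask a question")
output = ui.markdown()

def on_ask() -> None:
    output.set_content(answer(query_box.value))  # answer() sketched earlier

ui.button("Ask", on_click=on_ask)
ui.run(port=8080)  # serves the app at http://localhost:8080
```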
- Command-Line Query:
  - Run `main.py` with a query argument: `python main.py "Your query here"` (a sketch of the entry point follows).
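A sketch of what the argparse entry point in `main.py` might look like; the exact argument names are assumptions:

```python
import argparse

parser = argparse.ArgumentParser(description="Query the PDF knowledge base.")
parser.add_argument("query", help="question to ask the model")
args = parser.parse_args()
print(answer(args.query))  # answer() as sketched in the retrieval example
```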
- User Feedback:
  - Provide ratings or comments on responses directly from the web interface.
- Contextual Question Answering:
  - The AI retrieves relevant content from PDFs to answer questions accurately.
- Multilingual Conversations:
  - Supports queries and responses in multiple languages.
- Agentic Retrieval:
  - Summarizes large documents and provides concise answers.
- Interactive Feedback Loop:
  - Users can help fine-tune future responses by providing feedback.
- Add support for additional file types (e.g., DOCX, TXT).
- Enhance multilingual capabilities with additional fine-tuned models.
- Use a larger LLM for a better user experience.
- Deploy the application on a live personal website.
For questions, suggestions, or contributions, feel free to reach out at [email protected].