Advanced AI Chatbot System for Kongju National University
Enterprise-grade RAG (Retrieval Augmented Generation) system with event-driven architecture
Intelligent Document Retrieval
- Korean-optimized semantic search using KoSimCSE embeddings
- Context-aware response generation with GPT-4o-mini
- Campus-specific content filtering (Singwan/Cheonan/Yesan)
Real-time Data Synchronization
- Event-driven architecture with Redis Pub/Sub messaging
- Automatic vector store updates on content changes
- Scalable microservices-ready infrastructure
Comprehensive Information Coverage
- Academic announcements and notifications
- Scholarship and administrative updates
- Library services and resources
- Student activities and club information
- Campus facilities and services
# 1. Clone repository
git clone https://github.com/your-username/like-knu-rag.git
cd like-knu-rag
# 2. Install dependencies (Python 3.11 recommended)
pip install -r requirements.txt
# 3. Configure environment
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
# 4. Run tests
python3.11 tests/test_rag.py
- Python 3.11+ - Core development language
- LangChain 0.3+ - RAG pipeline framework
- ChromaDB 0.5+ - Vector database (embedded mode)
- KoSimCSE - Korean embeddings (BM-K/KoSimCSE-roberta-multitask)
- OpenAI GPT-4o-mini - Response generation model
- FastAPI - REST API server (planned)
graph TD
A[User Interface] --> B[API Layer]
B --> C[Application Layer]
C --> D[Domain Layer]
C --> E[Infrastructure Layer]
E --> F[Vector Store<br/>ChromaDB]
E --> G[Embedding<br/>KoSimCSE]
E --> H[LLM<br/>GPT-4o-mini]
E --> I[Messaging<br/>Redis Pub/Sub]
style A fill:#e1f5fe
style F fill:#f3e5f5
style G fill:#e8f5e8
style H fill:#fff3e0
style I fill:#ffebee
like-knu-rag/
├── src/                              # Source code
│   ├── domain/                       # Domain layer
│   │   ├── models/                   # Entities & Value objects
│   │   │   ├── notice.py             # Notice model
│   │   │   ├── campus.py             # Campus enum
│   │   │   └── common.py             # Common types
│   │   ├── repositories/             # Repository interfaces
│   │   └── services/                 # Domain services
│   ├── application/                  # Application layer
│   │   ├── dto/                      # Data transfer objects
│   │   ├── processors/               # Document processing
│   │   │   ├── document_processor.py
│   │   │   └── text_splitter.py
│   │   └── services/                 # Application services
│   │       └── rag_service.py        # RAG system core
│   ├── infrastructure/               # Infrastructure layer
│   │   ├── embedding/                # Embedding models
│   │   │   └── korean_embeddings.py
│   │   ├── vector_store/             # Vector database
│   │   │   └── chroma_store.py
│   │   ├── messaging/                # Messaging system
│   │   │   ├── events.py             # Event models
│   │   │   ├── brokers/
│   │   │   │   └── redis_broker.py   # Redis Pub/Sub
│   │   │   └── handlers/
│   │   │       └── notice_handler.py
│   │   └── repositories/             # Implementations
│   ├── interfaces/                   # Interface layer
│   │   ├── api/                      # REST API (planned)
│   │   └── cli/                      # CLI (planned)
│   └── shared/                       # Shared layer
│       ├── exceptions/               # Exception handling
│       └── utils/                    # Utilities
│           └── filters.py
├── tests/                            # Test code
│   ├── test_basic.py                 # Basic functionality tests
│   └── test_rag.py                   # RAG integration tests
├── data/chroma_db/                   # Vector DB (auto-generated)
├── requirements.txt                  # Dependencies
├── .env                              # Environment variables
├── CLAUDE.md                         # Development context
├── demo.py                           # Demo script
└── README.md
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Set the following values in the .env file:
EMBEDDING_MODEL=BM-K/KoSimCSE-roberta-multitask
OPENAI_API_KEY=your_openai_api_key_here
# Basic functionality tests
python3.11 tests/test_basic.py
# Complete RAG system tests (requires API key)
python3.11 tests/test_rag.py
# Experience chatbot with sample data
python3.11 demo.py
from src.application.services.rag_service import create_rag_system_with_sample_data
# Create RAG system with sample data
rag_system = create_rag_system_with_sample_data()
# Q&A interaction
response = rag_system.chat("When is course registration?")
print(f"Answer: {response['answer']}")
print(f"Sources: {len(response['sources'])} documents")
# Campus-specific filtering
response = rag_system.chat("Tell me about library information", campus="CHEONAN")
print(f"Cheonan campus answer: {response['answer']}")
# Search results analysis
for i, source in enumerate(response['sources']):
    print(f"{i + 1}. {source['title']} ({source['campus']})")
from src.infrastructure.messaging.brokers.redis_broker import create_message_broker
from src.infrastructure.messaging.events import NoticeEvent, EventType
async def setup_messaging():
    # Create Redis message broker
    broker = create_message_broker("redis://localhost:6379")
    await broker.start()
    # Publish notice event
    event = NoticeEvent(
        event_id="test-123",
        event_type=EventType.NOTICE_CREATED,
        notice_id="notice-456",
        title="New announcement",
        # ... other fields
    )
    await broker.publish("notices.created", event)
    await broker.stop()
| Feature Area | Progress | Status |
|---|---|---|
| Architecture | ████████████████████ 100% | ✅ Complete |
| AI Integration | ████████████████████ 100% | ✅ Complete |
| Vector Search | ████████████████████ 100% | ✅ Complete |
| Event System | ██████████████████░░ 90% | 🚧 In Progress |
| API Server | ██████░░░░░░░░░░░░░░ 30% | 🚧 Planned |
| Web Interface | ██░░░░░░░░░░░░░░░░░░ 10% | 📋 Designed |
1. Clean Architecture Implementation
- Domain-Driven Design - Business logic isolation
- Dependency Inversion - Testable structure
- Layer Separation - Domain → Application → Infrastructure → Interface
- Modular Design - Independent feature development
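The dependency-inversion idea above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Notice` fields, `NoticeRepository`, `InMemoryNoticeRepository`, and `NoticeService` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Notice:
    """Domain entity (hypothetical fields, for illustration only)."""
    notice_id: str
    title: str
    campus: str


class NoticeRepository(Protocol):
    """Domain-layer interface; the infrastructure layer supplies the implementation."""

    def save(self, notice: Notice) -> None: ...

    def find_by_campus(self, campus: str) -> list[Notice]: ...


class InMemoryNoticeRepository:
    """Infrastructure-layer stand-in (hypothetical), convenient for unit tests."""

    def __init__(self) -> None:
        self._notices: dict[str, Notice] = {}

    def save(self, notice: Notice) -> None:
        self._notices[notice.notice_id] = notice

    def find_by_campus(self, campus: str) -> list[Notice]:
        return [n for n in self._notices.values() if n.campus == campus]


class NoticeService:
    """Application-layer service that depends only on the interface."""

    def __init__(self, repository: NoticeRepository) -> None:
        self._repository = repository

    def register(self, notice: Notice) -> None:
        self._repository.save(notice)

    def titles_for(self, campus: str) -> list[str]:
        return [n.title for n in self._repository.find_by_campus(campus)]


service = NoticeService(InMemoryNoticeRepository())
service.register(Notice("notice-1", "Library hours", "CHEONAN"))
service.register(Notice("notice-2", "Course registration", "SINGWAN"))
print(service.titles_for("CHEONAN"))
```

Because `NoticeService` only sees the protocol, a test can pass in the in-memory repository while production code passes in a database-backed one.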
2. Korean-Optimized AI System
- KoSimCSE Embeddings - Maximized Korean language understanding
- GPT-4o-mini Integration - Cost-effective response generation
- Document Chunking - Optimized context processing
- Similarity Search - Accurate relevant document extraction
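As a rough illustration of chunking and similarity search, the sketch below splits text into overlapping character windows and ranks documents by cosine similarity. It is a simplified stand-in: the hand-written toy vectors replace real KoSimCSE embeddings, and `chunk_text`, `cosine_similarity`, `top_k`, and their default parameters are assumptions made for the example, not the project's API.

```python
import math


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], docs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
    return ranked[:k]


# Toy 2-dimensional vectors standing in for real embeddings.
doc_vectors = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [1.0, 0.0]}
print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
print(top_k([1.0, 0.0], doc_vectors, k=2))
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.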
3. Vector Database
- ChromaDB Embedded - No separate server required
- Deduplication System - notice_id based data integrity
- Campus Filtering - Efficient search optimization
- Real-time Updates - Dynamic document management
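One way to picture notice_id-based deduplication and campus filtering is the in-memory sketch below. It is not the ChromaDB-backed implementation: `StoredChunk` and `DedupStore` are hypothetical names, and a real store would most likely rely on the database's own upsert and metadata filtering rather than a Python dict.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoredChunk:
    """One chunk of a notice plus the metadata used for filtering."""
    notice_id: str
    campus: str
    text: str


class DedupStore:
    """In-memory stand-in (hypothetical) for a vector store keyed by notice_id."""

    def __init__(self) -> None:
        self._by_notice: dict[str, list[StoredChunk]] = {}

    def upsert(self, notice_id: str, campus: str, chunks: list[str]) -> bool:
        """Replace any chunks stored for notice_id; return True if the id was new."""
        is_new = notice_id not in self._by_notice
        self._by_notice[notice_id] = [StoredChunk(notice_id, campus, c) for c in chunks]
        return is_new

    def search(self, campus: Optional[str] = None) -> list[StoredChunk]:
        """Return all chunks, optionally restricted to a single campus."""
        found = [c for chunks in self._by_notice.values() for c in chunks]
        if campus is not None:
            found = [c for c in found if c.campus == campus]
        return found


store = DedupStore()
store.upsert("notice-1", "CHEONAN", ["Library opens at 9.", "Closed on holidays."])
# Re-sending the same notice replaces its chunks instead of duplicating them.
store.upsert("notice-1", "CHEONAN", ["Library opens at 8."])
store.upsert("notice-2", "YESAN", ["Dormitory applications open."])
print(len(store.search()), len(store.search(campus="YESAN")))
```

Keying on `notice_id` means an edited notice overwrites its earlier chunks, so repeated updates do not leave stale copies behind.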
4. Event-Driven Messaging
- Redis Pub/Sub - Asynchronous messaging system
- Type-Safe Events - Pydantic model based
- Scalable Architecture - Microservices ready
- Real-time Synchronization - Data consistency guarantee
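The publish/subscribe flow behind these points might look roughly like the sketch below. It is deliberately simplified: a synchronous in-memory broker stands in for Redis Pub/Sub, and a standard-library dataclass stands in for the Pydantic model. `InMemoryBroker`, `on_notice_created`, and the enum value string are assumptions made for the example; only the field names and the `notices.created` channel come from the usage shown earlier.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class EventType(str, Enum):
    NOTICE_CREATED = "notice.created"  # value string is an assumption


@dataclass(frozen=True)
class NoticeEvent:
    """Typed, immutable event payload (dataclass stand-in for a Pydantic model)."""
    event_id: str
    event_type: EventType
    notice_id: str
    title: str


Handler = Callable[[NoticeEvent], None]


class InMemoryBroker:
    """Synchronous stand-in (hypothetical) for a Redis Pub/Sub broker."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, event: NoticeEvent) -> int:
        """Deliver the event to every subscriber; return how many received it."""
        handlers = self._subscribers.get(channel, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


indexed_titles: dict[str, str] = {}


def on_notice_created(event: NoticeEvent) -> None:
    """Handler that would trigger a vector store update; here it records the title."""
    indexed_titles[event.notice_id] = event.title


broker = InMemoryBroker()
broker.subscribe("notices.created", on_notice_created)
delivered = broker.publish(
    "notices.created",
    NoticeEvent("test-123", EventType.NOTICE_CREATED, "notice-456", "New announcement"),
)
print(delivered, indexed_titles)
```

The publisher never calls the handler directly, so new consumers can subscribe to the channel without any change to the code that emits events.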
gantt
title Kongju University Chatbot Development Timeline
dateFormat YYYY-MM-DD
section Phase 1: Core System
Architecture Design :done, arch, 2025-07-01, 2025-07-05
AI Model Integration :done, ai, 2025-07-06, 2025-07-10
Vector Search System :done, vector, 2025-07-08, 2025-07-11
section Phase 2: Extended System
Event System :active, event, 2025-07-11, 2025-07-15
API Server Setup :api, 2025-07-15, 2025-07-20
Live Data Integration :data, 2025-07-18, 2025-07-25
section Phase 3: User Interface
Web Interface :ui, 2025-07-22, 2025-07-30
Mobile Optimization :mobile, 2025-07-28, 2025-08-05
Deployment & Ops :deploy, 2025-08-01, 2025-08-10
| Domain | Technology | Rationale |
|---|---|---|
| AI/ML | • KoSimCSE (Korean embeddings) • OpenAI GPT-4o-mini • LangChain | • Korean performance optimization • Cost efficiency • Rich ecosystem |
| Data | • ChromaDB (vector) • Redis (messaging) • Pydantic (validation) | • Embedded mode support • High-performance Pub/Sub • Type safety |
| Architecture | • Clean Architecture • Event-driven • Microservices ready | • Testability • Scalability • Maintainability |
Building the Future of University Information Systems
All contributions are welcome - from issue reports to code contributions
- Fork the project
- Create a feature branch (git checkout -b feature/amazing-feature)
- Develop and test your feature
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Create a Pull Request
- Report bugs as issues
- Add meal plans,
- Improve README,
- Web interface
Bug Report
## Bug Description
Brief description of the bug
## Steps to Reproduce
1. Go to '...' page
2. Click '...' button
3. Error occurs
## Expected Behavior
What should happen normally
## Environment
- OS: [e.g., Windows 11]
- Python: [e.g., 3.11.5]
- Browser: [e.g., Chrome 115]
Feature Request
## Feature Description
Description of the feature you'd like to add
## Background
Why this feature is needed
## Proposed Solution
Specific implementation approach
MIT License
Copyright (c) 2025 공주대처럼 챗봇 프로젝트
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
If this project helps you, please give it a star!
Built for Kongju National University students