LikeKNU/LikeKNU-RAG

Kongju University RAG Chatbot

Python LangChain ChromaDB OpenAI Redis

Advanced AI Chatbot System for Kongju National University

Enterprise-grade RAG (Retrieval Augmented Generation) system with event-driven architecture


Key Features

Intelligent Document Retrieval

  • Korean-optimized semantic search using KoSimCSE embeddings
  • Context-aware response generation with GPT-4o-mini
  • Campus-specific content filtering (Singwan/Cheonan/Yesan)
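The retrieval step above can be sketched in plain Python: restrict candidates to one campus, then rank them by cosine similarity to the query embedding. This is only an illustration under stated assumptions. The `Campus` values, the document dictionaries, the two-dimensional toy vectors, and the `search` function are hypothetical; the project itself uses KoSimCSE embeddings stored in ChromaDB.

```python
import math
from enum import Enum


class Campus(str, Enum):
    # Hypothetical values; the real enum lives in src/domain/models/campus.py
    SINGWAN = "SINGWAN"
    CHEONAN = "CHEONAN"
    YESAN = "YESAN"


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, docs, campus=None, top_k=3):
    """Rank docs by similarity to the query, optionally limited to one campus."""
    candidates = [d for d in docs if campus is None or d["campus"] == campus]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(query_vec, d["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 2-dimensional embeddings stand in for real KoSimCSE vectors.
docs = [
    {"title": "Library hours", "campus": Campus.CHEONAN, "embedding": [0.9, 0.1]},
    {"title": "Course registration", "campus": Campus.SINGWAN, "embedding": [0.1, 0.9]},
    {"title": "Library closure", "campus": Campus.SINGWAN, "embedding": [0.8, 0.2]},
]
results = search([1.0, 0.0], docs, campus=Campus.SINGWAN)
print([d["title"] for d in results])
```

Filtering before ranking keeps documents from other campuses out of the results entirely, rather than merely pushing them down the list.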

Real-time Data Synchronization

  • Event-driven architecture with Redis Pub/Sub messaging
  • Automatic vector store updates on content changes
  • Scalable microservices-ready infrastructure
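To make the event-driven update flow concrete, here is a minimal in-memory sketch: a handler subscribes to a channel and updates a store whenever a notice event is published. `InMemoryBroker`, `on_notice_created`, and the dictionary standing in for the vector store are hypothetical; the project uses Redis Pub/Sub for this role.

```python
from collections import defaultdict


class InMemoryBroker:
    """Stand-in for a Pub/Sub broker: delivers each message to current subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, handler):
        self._subscribers[channel].append(handler)

    def publish(self, channel, message):
        """Call every handler on the channel; return how many received the message."""
        handlers = self._subscribers[channel]
        for handler in handlers:
            handler(message)
        return len(handlers)


# notice_id -> title, standing in for the real vector store
vector_store = {}


def on_notice_created(message):
    """Index the new notice as soon as its event arrives."""
    vector_store[message["notice_id"]] = message["title"]


broker = InMemoryBroker()
broker.subscribe("notices.created", on_notice_created)
delivered = broker.publish(
    "notices.created",
    {"notice_id": "notice-456", "title": "New announcement"},
)
print(delivered, vector_store)
```

As with Redis Pub/Sub, a message published to a channel with no subscribers is simply dropped, so a consumer that is offline at publish time needs another way to catch up.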

Comprehensive Information Coverage

  • Academic announcements and notifications
  • Scholarship and administrative updates
  • Library services and resources
  • Student activities and club information
  • Campus facilities and services

Quick Start

# 1. Clone repository
git clone https://github.com/your-username/like-knu-rag.git
cd like-knu-rag

# 2. Install dependencies (Python 3.11 recommended)
pip install -r requirements.txt

# 3. Configure environment
echo "OPENAI_API_KEY=your_openai_api_key_here" > .env

# 4. Run tests
python3.11 tests/test_rag.py

Technology Stack

  • Python 3.11+ - Core development language
  • LangChain 0.3+ - RAG pipeline framework
  • ChromaDB 0.5+ - Vector database (embedded mode)
  • KoSimCSE - Korean embeddings (BM-K/KoSimCSE-roberta-multitask)
  • OpenAI GPT-4o-mini - Response generation model
  • FastAPI - REST API server (planned)

System Architecture

graph TD
    A[User Interface] --> B[API Layer]
    B --> C[Application Layer]
    C --> D[Domain Layer]
    C --> E[Infrastructure Layer]
    
    E --> F[Vector Store<br/>ChromaDB]
    E --> G[Embedding<br/>KoSimCSE]
    E --> H[LLM<br/>GPT-4o-mini]
    E --> I[Messaging<br/>Redis Pub/Sub]
    
    style A fill:#e1f5fe
    style F fill:#f3e5f5
    style G fill:#e8f5e8
    style H fill:#fff3e0
    style I fill:#ffebee

Project Structure (Clean Architecture)

like-knu-rag/
├── src/                          # Source code
│   ├── domain/                   # Domain layer
│   │   ├── models/               # Entities & Value objects
│   │   │   ├── notice.py         # Notice model
│   │   │   ├── campus.py         # Campus enum
│   │   │   └── common.py         # Common types
│   │   ├── repositories/         # Repository interfaces
│   │   └── services/             # Domain services
│   ├── application/              # Application layer
│   │   ├── dto/                  # Data transfer objects
│   │   ├── processors/           # Document processing
│   │   │   ├── document_processor.py
│   │   │   └── text_splitter.py
│   │   └── services/             # Application services
│   │       └── rag_service.py    # RAG system core
│   ├── infrastructure/           # Infrastructure layer
│   │   ├── embedding/            # Embedding models
│   │   │   └── korean_embeddings.py
│   │   ├── vector_store/         # Vector database
│   │   │   └── chroma_store.py
│   │   ├── messaging/            # Messaging system
│   │   │   ├── events.py         # Event models
│   │   │   ├── brokers/
│   │   │   │   └── redis_broker.py # Redis Pub/Sub
│   │   │   └── handlers/
│   │   │       └── notice_handler.py
│   │   └── repositories/         # Implementations
│   ├── interfaces/               # Interface layer
│   │   ├── api/                  # REST API (planned)
│   │   └── cli/                  # CLI (planned)
│   └── shared/                   # Shared layer
│       ├── exceptions/           # Exception handling
│       └── utils/                # Utilities
│           └── filters.py
├── tests/                        # Test code
│   ├── test_basic.py             # Basic functionality tests
│   └── test_rag.py               # RAG integration tests
├── data/chroma_db/               # Vector DB (auto-generated)
├── requirements.txt              # Dependencies
├── .env                          # Environment variables
├── CLAUDE.md                     # Development context
├── demo.py                       # Demo script
└── README.md
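The layering above relies on the application layer depending on repository interfaces rather than on concrete storage. A minimal sketch of that idea follows, assuming hypothetical names throughout: `Notice`, `NoticeRepository`, `InMemoryNoticeRepository`, and `register_notice` are illustrations and are not taken from the project's source.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Notice:
    """Domain entity (hypothetical fields)."""
    notice_id: str
    title: str
    campus: str


class NoticeRepository(Protocol):
    """Interface the domain layer would own, as in src/domain/repositories/."""

    def save(self, notice: Notice) -> None: ...

    def find_by_id(self, notice_id: str) -> Optional[Notice]: ...


class InMemoryNoticeRepository:
    """Infrastructure-side implementation; a vector-store-backed one could replace it."""

    def __init__(self) -> None:
        self._items: dict[str, Notice] = {}

    def save(self, notice: Notice) -> None:
        self._items[notice.notice_id] = notice

    def find_by_id(self, notice_id: str) -> Optional[Notice]:
        return self._items.get(notice_id)


def register_notice(repo: NoticeRepository, notice: Notice) -> bool:
    """Application-level use case: store a notice unless it already exists."""
    if repo.find_by_id(notice.notice_id) is not None:
        return False
    repo.save(notice)
    return True


repo = InMemoryNoticeRepository()
notice = Notice("notice-456", "New announcement", "CHEONAN")
print(register_notice(repo, notice), register_notice(repo, notice))
```

Because `register_notice` only sees the interface, tests can pass the in-memory implementation while production code passes a real store.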

Installation and Setup

1. Environment Setup

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

2. Environment Configuration

Set the following values in the .env file:

EMBEDDING_MODEL=BM-K/KoSimCSE-roberta-multitask
OPENAI_API_KEY=your_openai_api_key_here

3. Run Tests

# Basic functionality tests
python3.11 tests/test_basic.py

# Complete RAG system tests (requires API key)
python3.11 tests/test_rag.py

Usage Examples

Demo Execution

# Try the chatbot with sample data
python3.11 demo.py

Code Usage

from src.application.services.rag_service import create_rag_system_with_sample_data

# Create RAG system with sample data
rag_system = create_rag_system_with_sample_data()

# Q&A interaction
response = rag_system.chat("When is course registration?")
print(f"Answer: {response['answer']}")
print(f"Sources: {len(response['sources'])} documents")

# Campus-specific filtering
response = rag_system.chat("Tell me about library information", campus="CHEONAN")
print(f"Cheonan campus answer: {response['answer']}")

# Search results analysis
for i, source in enumerate(response['sources']):
    print(f"{i + 1}. {source['title']} ({source['campus']})")

Event System (Advanced)

import asyncio

from src.infrastructure.messaging.brokers.redis_broker import create_message_broker
from src.infrastructure.messaging.events import NoticeEvent, EventType


async def setup_messaging():
    # Create the Redis message broker and open the connection
    broker = create_message_broker("redis://localhost:6379")
    await broker.start()

    try:
        # Build a notice event
        event = NoticeEvent(
            event_id="test-123",
            event_type=EventType.NOTICE_CREATED,
            notice_id="notice-456",
            title="New announcement",
            # ... other fields
        )

        # Publish it on the "notices.created" channel
        await broker.publish("notices.created", event)
    finally:
        # Always close the connection, even if publishing fails
        await broker.stop()


if __name__ == "__main__":
    asyncio.run(setup_messaging())
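Before an event can travel over a Pub/Sub channel it has to be serialized, and the receiver has to rebuild a typed object from it. The sketch below shows that round trip with standard-library dataclasses so it runs without extra dependencies. The project describes its events as Pydantic models, so this is a stand-in and not the actual implementation; the field set is limited to the four fields shown above, and the `"notice.created"` wire value is an assumption.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class EventType(str, Enum):
    NOTICE_CREATED = "notice.created"  # hypothetical wire value


@dataclass
class NoticeEvent:
    event_id: str
    event_type: EventType
    notice_id: str
    title: str


def encode(event):
    """Serialize an event to a JSON string suitable for a Pub/Sub channel."""
    payload = asdict(event)
    payload["event_type"] = event.event_type.value
    return json.dumps(payload, ensure_ascii=False)


def decode(raw):
    """Rebuild a typed event; an unknown event type raises ValueError."""
    data = json.loads(raw)
    data["event_type"] = EventType(data["event_type"])
    return NoticeEvent(**data)


event = NoticeEvent("test-123", EventType.NOTICE_CREATED, "notice-456", "New announcement")
wire = encode(event)
print(wire)
print(decode(wire) == event)
```

Passing `ensure_ascii=False` keeps Korean titles readable in the serialized message instead of escaping them.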

Development Status

Implementation Progress

Feature Area     Progress                     Status
Architecture     ████████████████████ 100%    ✅ Complete
AI Integration   ████████████████████ 100%    ✅ Complete
Vector Search    ████████████████████ 100%    ✅ Complete
Event System     ██████████████████░░ 90%     🚧 In Progress
API Server       ██████░░░░░░░░░░░░░░ 30%     🚧 Planned
Web Interface    ██░░░░░░░░░░░░░░░░░░ 10%     📋 Designed

Completed Features

1. Clean Architecture Implementation
  • Domain-Driven Design - Business logic isolation
  • Dependency Inversion - Testable structure
  • Layer Separation - Domain → Application → Infrastructure → Interface
  • Modular Design - Independent feature development
2. Korean-Optimized AI System
  • KoSimCSE Embeddings - Sentence embeddings trained for Korean text
  • GPT-4o-mini Integration - Cost-effective response generation
  • Document Chunking - Optimized context processing
  • Similarity Search - Accurate relevant document extraction
3. Vector Database
  • ChromaDB Embedded - No separate server required
  • Deduplication System - notice_id based data integrity
  • Campus Filtering - Efficient search optimization
  • Real-time Updates - Dynamic document management
4. Event-Driven Messaging
  • Redis Pub/Sub - Asynchronous messaging system
  • Type-Safe Events - Pydantic model based
  • Scalable Architecture - Microservices ready
  • Real-time Synchronization - Vector store updated as notice events arrive
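Two of the items above, document chunking and notice_id based deduplication, can be illustrated together. The sketch below splits text into overlapping character windows and skips notices that are already indexed. `split_text`, `index_notices`, the chunk sizes, and the dictionary used as a store are all hypothetical; the project's text_splitter.py and chroma_store.py may work differently.

```python
def split_text(text, chunk_size=200, overlap=40):
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def index_notices(store, notices, chunk_size=200, overlap=40):
    """Add chunks for notices not yet in the store; return how many were added."""
    added = 0
    for notice in notices:
        if notice["notice_id"] in store:
            continue  # already indexed, so skip the duplicate
        store[notice["notice_id"]] = split_text(notice["content"], chunk_size, overlap)
        added += 1
    return added


store = {}
notices = [
    {"notice_id": "n1", "content": "abcdefghij"},
    {"notice_id": "n1", "content": "a duplicate of the same notice"},
]
added = index_notices(store, notices, chunk_size=4, overlap=2)
print(added, store["n1"])
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some text twice.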

Development Roadmap

gantt
    title Kongju University Chatbot Development Timeline
    dateFormat  YYYY-MM-DD
    section Phase 1: Core System
    Architecture Design   :done, arch, 2025-07-01, 2025-07-05
    AI Model Integration  :done, ai, 2025-07-06, 2025-07-10
    Vector Search System  :done, vector, 2025-07-08, 2025-07-11
    
    section Phase 2: Extended System
    Event System         :active, event, 2025-07-11, 2025-07-15
    API Server Setup     :api, 2025-07-15, 2025-07-20
    Live Data Integration :data, 2025-07-18, 2025-07-25
    
    section Phase 3: User Interface
    Web Interface        :ui, 2025-07-22, 2025-07-30
    Mobile Optimization  :mobile, 2025-07-28, 2025-08-05
    Deployment & Ops     :deploy, 2025-08-01, 2025-08-10

Core Technology Stack

Domain         Technology                      Rationale
AI/ML          KoSimCSE (Korean embeddings)    Korean performance optimization
               OpenAI GPT-4o-mini              Cost efficiency
               LangChain                       Rich ecosystem
Data           ChromaDB (vector)               Embedded mode support
               Redis (messaging)               High-performance Pub/Sub
               Pydantic (validation)           Type safety
Architecture   Clean Architecture              Ease of testing
               Event-driven                    Scalability
               Microservices ready             Maintainability

Contributing

Building the Future of University Information Systems

All contributions are welcome - from issue reports to code contributions

How to Contribute

  1. Fork the project
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Develop and test your feature
  4. Commit your changes (git commit -m 'Add amazing feature')
  5. Push to the branch (git push origin feature/amazing-feature)
  6. Create a Pull Request

Contribution Areas

Bug Fixes

Report bugs as issues or fix them directly

New Features

Add meal plans, shuttle bus info, and other data types

Documentation

Improve README, comments, and development guides

UI/UX

Web interface design and usability improvements

Issue Templates

Bug Report

## Bug Description

Brief description of the bug

## Steps to Reproduce

1. Go to '...' page
2. Click '...' button
3. Error occurs

## Expected Behavior

What should happen normally

## Environment

- OS: [e.g., Windows 11]
- Python: [e.g., 3.11.5]
- Browser: [e.g., Chrome 115]

Feature Request

## Feature Description

Description of the feature you'd like to add

## Background

Why this feature is needed

## Proposed Solution

Specific implementation approach

License

MIT License

Copyright (c) 2025 공주대처럼 챗봇 프로젝트

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

If this project helps you, please give it a star!

Built for Kongju National University students

University Homepage · Contact Us · Report Issues

About

An AI chatbot built with AI.. lol
