Database Systems Final Project, Group 14

An intelligent learning assistant powered by RAG (Retrieval-Augmented Generation) to help students better understand their course materials.
TanyaJawab is a web-based learning assistant that combines document management, question answering over uploaded materials, and study organization tools in one application.
- 📄 Document Intelligence: Upload lecture notes, assignments, and syllabi in PDF format
- 🔍 Smart Q&A: Ask questions about your documents and get contextually relevant answers
- 📅 Course Scheduling: Manage your weekly class schedule
- ✅ Assignment Tracking: Keep track of all your tasks and deadlines
- 🔐 Secure Authentication: Log in via GitHub OAuth or local credentials
TanyaJawab uses Retrieval-Augmented Generation (RAG) to answer questions with context drawn from your course materials:
- Document Processing:
  - Upload your PDF documents
  - Our system extracts text and images using Gemini Vision API
  - Content is split into manageable chunks and converted to vector embeddings
  - These embeddings are stored in Qdrant, a vector database optimized for semantic search
- Intelligent Q&A:
  - Ask questions about any of your uploaded documents
  - The system finds the most relevant content chunks in your documents
  - It augments the question with this context and passes it to a Large Language Model
  - You receive accurate answers based specifically on your materials
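The pipeline described above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the function names are hypothetical, `toy_embed` is a bag-of-words stand-in for the real embedding model, and an in-memory list stands in for Qdrant. Chunking here is by characters, using the sample values `CHUNK_SIZE=512` and `CHUNK_OVERLAP=50` as defaults; the actual units (characters or tokens) are not stated in this document.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def split_into_chunks(text: str, chunk_size: int = 512, overlap: int = 50) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def toy_embed(text: str, vocabulary: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for an embedding model: term counts over a fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(question_vec: Vector, indexed: Sequence[Tuple[str, Vector]], k: int = 3) -> List[str]:
    """Return the k chunks whose vectors are most similar to the question vector."""
    ranked = sorted(indexed, key=lambda item: cosine_similarity(question_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: Sequence[str]) -> str:
    """Augment the question with retrieved context before sending it to an LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

In a real deployment the similarity search would be delegated to the vector database rather than computed in a Python loop; the sketch only shows how the chunk, retrieve, and augment steps fit together.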
TanyaJawab is built on a modern tech stack designed for performance, reliability, and scalability:
- Frontend: React.js with Tailwind CSS
- Backend: Python (FastAPI) with processing workers
- Databases:
  - PostgreSQL: Core relational data
  - Qdrant: Vector embeddings for semantic search
  - Redis: Caching for performance optimization
- AI Integration:
  - Gemini Vision API: Document text extraction
  - Embedding Models: Vector generation
  - LLM API: Answer generation
- Python 3.9+
- Node.js 18+
- Docker and Docker Compose
- PostgreSQL 14+
- Redis 6+
- Clone this repository:
  git clone https://github.com/coolcmyk/TanyaJawab.git
  cd TanyaJawab
- Set up environment variables:
  cp .env.example .env
  # Edit .env with your configuration
- Start the databases using Docker:
  docker-compose up -d postgres redis qdrant
- Install backend dependencies:
  cd backend
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
  pip install -r requirements.txt
- Run database migrations:
  alembic upgrade head
- Install frontend dependencies:
  cd ../frontend
  npm install
- Start the development servers:
  # In one terminal (backend)
  cd backend
  uvicorn app.main:app --reload
  # In another terminal (frontend)
  cd frontend
  npm run dev
- Visit http://localhost:3000 in your browser
# Database Configuration
DATABASE_URL=postgresql://user:password@localhost:5432/TanyaJawab
REDIS_URL=redis://localhost:6379/0
QDRANT_URL=http://localhost:6333
# Authentication
SECRET_KEY=your_secret_key
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret
# AI Services
GEMINI_API_KEY=your_gemini_api_key
LLM_API_KEY=your_llm_api_key
EMBEDDING_MODEL=text-embedding-ada-002
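As an illustration of how the backend might consume these variables, here is a minimal sketch that reads them into a typed settings object and fails early when a required value is missing. The `Settings` class and `load_settings` function are hypothetical names, not taken from the project's code, and the choice of which variables are required is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env."""
    database_url: str
    redis_url: str
    qdrant_url: str
    secret_key: str
    embedding_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, raising if a required one is absent."""
    required = ["DATABASE_URL", "REDIS_URL", "QDRANT_URL", "SECRET_KEY"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        qdrant_url=env["QDRANT_URL"],
        secret_key=env["SECRET_KEY"],
        # Falls back to the model name used in the sample configuration.
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-ada-002"),
    )
```

Failing at startup with the list of missing names tends to be easier to debug than a connection error surfacing later in a request handler.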
# Run backend tests
cd backend
pytest
# Run frontend tests
cd frontend
npm test

API documentation is available at /docs or /redoc when running the backend server.
TanyaJawab/
├── backend/
│ ├── app/
│ │ ├── api/
│ │ ├── core/
│ │ ├── db/
│ │ ├── models/
│ │ ├── schemas/
│ │ ├── services/
│ │ │ ├── rag/
│ │ │ ├── document/
│ │ │ └── ...
│ │ └── main.py
│ ├── alembic/
│ └── requirements.txt
├── frontend/
│ ├── public/
│ ├── src/
│ │ ├── components/
│ │ ├── pages/
│ │ ├── hooks/
│ │ ├── contexts/
│ │ └── ...
│ └── package.json
├── docker-compose.yml
└── README.md
The system uses multiple data stores:
- users: User account information
- courses: Course and schedule information
- assignments: Task tracking with due dates
- documents: Document metadata
- parsed_pages: Extracted content from document pages
- doc_chunks: Document chunks with embeddings for semantic search
- Redis cache: stores extraction results and RAG query responses
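Caching RAG query responses requires a stable key per question. The sketch below shows one way such a key could be derived; the `rag_cache_key` function, the `rag:answer:` prefix, and the decision to scope keys by user and document are assumptions made for illustration, not the project's actual scheme.

```python
import hashlib


def rag_cache_key(user_id: str, document_id: str, question: str) -> str:
    """Derive a deterministic cache key for one user's question about one document."""
    # Normalize case and whitespace so trivially different phrasings share an entry.
    normalized = " ".join(question.lower().split())
    raw = f"{user_id}\x00{document_id}\x00{normalized}".encode("utf-8")
    return "rag:answer:" + hashlib.sha256(raw).hexdigest()
```

Including the user and document identifiers in the hashed input keeps one user's cached answers from being served to another, which matches the per-user isolation described in this document.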
- All user data is isolated and secured
- Authentication via GitHub OAuth or secure local authentication
- Document access is restricted to the uploading user
- All sensitive environment variables are kept secure
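The rule that document access is restricted to the uploading user can be expressed as a single ownership check performed before any document is read or queried. The following is a sketch under assumed names (`Document`, `owner_id`, `require_owner`); the project's real models and error handling may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical document metadata record."""
    id: int
    owner_id: int
    filename: str


class DocumentAccessError(PermissionError):
    """Raised when a user requests a document they did not upload."""


def require_owner(document: Document, current_user_id: int) -> Document:
    """Return the document only if the current user uploaded it."""
    if document.owner_id != current_user_id:
        raise DocumentAccessError(f"user {current_user_id} cannot access document {document.id}")
    return document
```

Routing every document lookup through one function like this makes the restriction harder to forget in a new endpoint.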
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Mobile app version
- Support for more document types (DOCX, PPTX)
- Collaborative study sessions
- Spaced repetition flashcards generated from documents
- Advanced analytics on study habits
Built with ❤️ for students by students
This guide provides comprehensive installation instructions for setting up the TanyaJawab system on your local development environment.
Before starting the installation, make sure you have the following prerequisites installed:
- Python: Version 3.9 or higher
- Node.js: Version 18 or higher
- Docker & Docker Compose: Latest stable version
- Git: Latest version
- PostgreSQL (optional if using Docker): Version 14 or higher
- Redis (optional if using Docker): Version 6 or higher
git clone https://github.com/yourusername/TanyaJawab.git
cd TanyaJawab

Create environment files for both backend and frontend:
# Copy the example environment files
cp backend/.env.example backend/.env
cp frontend/.env.example frontend/.env

Edit the backend/.env file with the following configuration:
# Application Settings
APP_NAME=TanyaJawab
APP_ENV=development
DEBUG=True
LOG_LEVEL=INFO
# Server Settings
HOST=0.0.0.0
PORT=8000
CORS_ORIGINS=http://localhost:3000
# Database Configuration
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/TanyaJawab
REDIS_URL=redis://localhost:6379/0
QDRANT_URL=http://localhost:6333
# Authentication
SECRET_KEY=your_very_secure_secret_key
AUTH_TOKEN_EXPIRE_MINUTES=60
REFRESH_TOKEN_EXPIRE_DAYS=7
# OAuth Settings (GitHub)
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret
GITHUB_CALLBACK_URL=http://localhost:8000/api/auth/github/callback
# AI Services
GEMINI_API_KEY=your_gemini_api_key
LLM_API_KEY=your_llm_api_key
EMBEDDING_MODEL=text-embedding-ada-002
EMBEDDING_DIMENSION=1536
CHUNK_SIZE=512
CHUNK_OVERLAP=50

Edit the frontend/.env file:
VITE_API_BASE_URL=http://localhost:8000/api
VITE_GITHUB_AUTH_URL=http://localhost:8000/api/auth/github

This setup runs only the databases in Docker while the application runs locally:
# Start the required database services
docker-compose up -d postgres redis qdrant

The docker-compose.yml file should contain:
version: '3.8'
services:
  postgres:
    image: postgres:14-alpine
    ports:
      - "5432:5432"
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: TanyaJawab
    volumes:
      - postgres_data:/var/lib/postgresql/data
  redis:
    image: redis:6-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - qdrant_data:/qdrant/storage
volumes:
  postgres_data:
  redis_data:
  qdrant_data:

For a full containerized setup, you can use:
# Build and start all services
docker-compose -f docker-compose.full.yml up -d

# Navigate to backend directory
cd backend
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

# Apply database migrations
alembic upgrade head

# Run the Qdrant initialization script
python scripts/init_qdrant.py

# Start the FastAPI server with hot reloading
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

# Navigate to frontend directory
cd ../frontend
# Install dependencies
npm install
# Start the development server
npm run dev

The frontend will be available at http://localhost:3000.
# Run the admin user creation script
cd ../backend
python scripts/create_admin.py

- Open your browser and navigate to http://localhost:3000
- Register a new account or sign in with GitHub
- The API documentation is available at http://localhost:8000/docs
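The manual verification steps above can also be scripted. The sketch below probes the three local URLs mentioned in this guide and reports which ones respond with HTTP 200. The `check_services` and `http_status` helpers are hypothetical, and treating only status 200 as healthy is an assumption.

```python
from typing import Callable, Dict, Mapping
from urllib.error import URLError
from urllib.request import urlopen

# URLs taken from this guide; adjust them if your ports differ.
ENDPOINTS = {
    "frontend": "http://localhost:3000",
    "api docs": "http://localhost:8000/docs",
    "qdrant": "http://localhost:6333/collections",
}


def http_status(url: str, timeout: float = 3.0) -> int:
    """Fetch a URL and return its HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


def check_services(
    endpoints: Mapping[str, str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each service name to True if its URL answered with status 200."""
    results: Dict[str, bool] = {}
    for name, url in endpoints.items():
        try:
            results[name] = fetch(url) == 200
        except (URLError, OSError):
            # Connection refused, timeout, or an HTTP error status.
            results[name] = False
    return results
```

Calling `check_services(ENDPOINTS)` after starting all services should return a dictionary with `True` for each entry; a `False` value points to the service worth checking in the troubleshooting steps.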
If you encounter database connection errors:
# Check if PostgreSQL container is running
docker ps | grep postgres
# Check PostgreSQL logs
docker logs TanyaJawab-postgres-1

If the RAG system isn't working properly:
# Verify Qdrant is running
curl http://localhost:6333/collections
# Reinitialize the collections if needed
python scripts/init_qdrant.py --force

If you encounter authentication problems:
- Verify your SECRET_KEY in the .env file
- Check GitHub OAuth settings match your application settings
- Clear browser cookies and try again
# Backend tests
cd backend
pytest
# Frontend tests
cd frontend
npm test

When changing models:
# Generate migration
alembic revision --autogenerate -m "Description of changes"
# Apply migration
alembic upgrade head

# Backend
pip install new-package
pip freeze > requirements.txt
# Frontend
npm install new-package --save

For production deployment, additional steps are recommended:
- Use proper SSL/TLS certificates
- Configure proper web server (Nginx/Apache)
- Set up monitoring (Prometheus/Grafana)
- Configure backups for all databases
- Set APP_ENV=production and DEBUG=False
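A small pre-deployment check can catch a configuration that still carries development values. This is a sketch only: the `production_config_problems` function is hypothetical, and the placeholder strings it looks for are the sample `SECRET_KEY` values shown earlier in this document.

```python
from typing import List, Mapping

# Sample values from the example .env files that must not reach production.
PLACEHOLDER_SECRETS = {"", "your_secret_key", "your_very_secure_secret_key"}


def production_config_problems(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []
    if env.get("APP_ENV") != "production":
        problems.append("APP_ENV should be 'production'")
    if env.get("DEBUG", "").strip().lower() not in ("false", "0"):
        problems.append("DEBUG should be False")
    if env.get("SECRET_KEY", "") in PLACEHOLDER_SECRETS:
        problems.append("SECRET_KEY still has a placeholder value")
    return problems
```

Running this against `os.environ` in a deployment script and aborting when the list is non-empty is one way to enforce the last item in the checklist.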


