A vector-based search system for Amazon product data using Qdrant for similarity search and sentence-transformers for embedding generation.
Academic project: developed for the course "Recuperación de Información y Recomendaciones en la Web" (Information Retrieval and Recommendations on the Web) at Facultad de Ingeniería, Universidad de la República. Course information: https://eva.fing.edu.uy/course/view.php?id=821
- Semantic search for Amazon products
- Vector embeddings with sentence-transformers
- Fast similarity search with Qdrant
- Type-safe Python codebase
- Command-line interface
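
To make the semantic search feature concrete, here is a minimal, dependency-free sketch of the idea: embed the query and each product, then rank products by cosine similarity. It is not the project's implementation. The `embed` function is a toy bag-of-words stand-in for a sentence-transformers model, the in-memory ranking stands in for Qdrant, and all names (`embed`, `cosine`, `search`, `PRODUCTS`) are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, products: list[str], top_k: int = 3) -> list[tuple[str, float]]:
    """Rank products by similarity to the query, best match first."""
    query_vector = embed(query)
    scored = [(product, cosine(query_vector, embed(product))) for product in products]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical product titles, for illustration only.
PRODUCTS = [
    "wireless bluetooth headphones",
    "stainless steel water bottle",
    "wired gaming headphones with microphone",
]

results = search("wireless headphones", PRODUCTS, top_k=2)
for title, score in results:
    print(f"{score:.3f}  {title}")
```

Word counts only match exact terms. A learned embedding model would also place related phrases such as "earbuds" close to "headphones", which is the reason for using sentence-transformers, and a vector database keeps the ranking step fast on a large catalog.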
- Python 3.12+
- Docker
- Set up the Python environment with pyenv:
    # Install Python 3.12
    pyenv install 3.12.2
    # Set the local Python version
    pyenv local 3.12.2
- Install dependencies with uv:
    # Install uv if you don't have it
    pip install uv
    # Create a virtual environment
    uv venv
    # Activate the virtual environment
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    # Sync dependencies with specific extras
    uv sync --extra dev --extra backend
    # Or sync all optional extras at once
    uv sync --all-extras
- Start the Qdrant database:
    # Use Docker Compose (recommended)
    docker-compose up -d
- Configure the environment:
    cp .env.example .env
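
The copied `.env` file holds the settings the application reads at startup. The contents of `.env.example` are not shown here, so the sketch below only illustrates how such a file can be parsed with the standard library. The variable names `QDRANT_URL` and `COLLECTION_NAME` are assumptions, not the project's actual keys; 6333 is Qdrant's default HTTP port.

```python
import os


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical file contents; the real keys live in .env.example.
SAMPLE = """
# Qdrant connection (assumed variable names)
QDRANT_URL=http://localhost:6333
COLLECTION_NAME="amazon_products"
"""

settings = parse_env(SAMPLE)

# A variable already set in the process environment takes precedence over the file.
qdrant_url = os.environ.get("QDRANT_URL", settings["QDRANT_URL"])
print(qdrant_url)
```

Letting the process environment win over the file is a common convention, since it allows a deployment to override a value without editing `.env`.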
# Create a collection
amazon-copilot create-collection amazon_products
# Load product data from CSV
amazon-copilot load-products data/Amazon-Products.csv amazon_products
# Search using the CLI
amazon-copilot search-products "wireless headphones" --collection-name amazon_products

Run the FastAPI development server:
uvicorn amazon_copilot.api.main:app --reload

Access the API documentation at http://localhost:8000/docs
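
Loading a large CSV into a vector database is usually done in batches, so that embedding and upload happen a chunk at a time instead of holding the whole file in memory. The sketch below shows that pattern with the standard library only. It is an assumption about how a loader like `load-products` could work, not the project's code: `read_batches` and the `name` and `price` columns are hypothetical.

```python
import csv
import io
from collections.abc import Iterable, Iterator
from itertools import islice


def read_batches(handle: Iterable[str], batch_size: int) -> Iterator[list[dict[str, str]]]:
    """Yield CSV rows as lists of dicts, at most batch_size rows per list."""
    reader = csv.DictReader(handle)
    # islice pulls the next batch_size rows; an empty list ends the loop.
    while batch := list(islice(reader, batch_size)):
        yield batch


# Hypothetical sample data; the real file is data/Amazon-Products.csv.
SAMPLE_CSV = io.StringIO(
    "name,price\n"
    "Wireless Headphones,59.99\n"
    "Water Bottle,12.50\n"
    "Gaming Mouse,24.00\n"
)

batches = list(read_batches(SAMPLE_CSV, batch_size=2))
for number, batch in enumerate(batches, start=1):
    # In a real loader, each batch would be embedded and uploaded here.
    print(number, [row["name"] for row in batch])
```

With a real file, the handle would come from `open(path, newline="", encoding="utf-8")`, as the `csv` module documentation recommends.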
For detailed guides, refer to:
- Detailed Setup Guide - Full installation instructions
- Data Loading Guide - How to load and manage product data
- Search Guide - Advanced search options and filtering
- Development Guide - For contributors and developers
- Qdrant Setup Guide - Vector database configuration
amazon-copilot/
├── src/amazon_copilot/
│ ├── api/ # FastAPI implementation
│ │ ├── main.py # API entry point
│ │ └── routers/ # API route definitions
│ ├── services/ # Business logic services
│ ├── qdrant_client.py # Qdrant vector DB client
│ ├── cli.py # Command-line interface
│ ├── schemas.py # Data schemas
│ └── utils.py # Utility functions
├── data/ # Data directory
├── docs/ # Documentation
├── pyproject.toml # Project configuration
├── docker-compose.yml # Docker configuration
└── README.md # This file