This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
# Quick start using shell script
chmod +x run.sh
./run.sh
# Manual start
cd backend && uv run uvicorn app:app --reload --port 8000# Install dependencies
uv sync
# Add new dependencies (use uv instead of pip)
uv add package_name
# Environment variables required
# Create .env file with:
ANTHROPIC_API_KEY=your_anthropic_api_key_hereAlways use uv for running Python files and commands:
# Run Python scripts
uv run python script.py
# Run any Python command
uv run command_nameInstall development dependencies before using code quality tools:
uv sync --group devFormat Script (Modifies Files)
./scripts/format.shUse this script when you want to automatically fix code style issues. It will:
- Sort imports with isort
- Format code with Black
- Run flake8 linting (reports remaining issues)
- Run mypy type checking
Lint Script (Read-Only Checks)
./scripts/lint.shUse this script to verify code quality without modifying files. Perfect for:
- Pre-commit checks
- CI/CD pipelines
- Verifying code before submitting PRs
Exit code 0 = all checks pass, non-zero = issues found.
If scripts aren't executable: chmod +x scripts/*.sh
- Web Interface: http://localhost:8000
- API Documentation: http://localhost:8000/docs
This is a Retrieval-Augmented Generation (RAG) system for course materials with a FastAPI backend and vanilla JavaScript frontend.
RAGSystem (backend/rag_system.py): Main orchestrator that coordinates all components
- Manages document processing, vector storage, AI generation, and search tools
- Handles course document ingestion from the docs/ directory
- Processes queries using tool-based search approach
VectorStore (backend/vector_store.py): ChromaDB-based vector storage with dual collections
course_catalog: Stores course titles for name resolution- Metadata: title, instructor, course_link, lesson_count, lessons_json (list of lessons with lesson_number, lesson_title, lesson_link)
course_content: Stores text chunks for semantic search- Metadata: course_title, lesson_number, chunk_index
- Supports filtered search by course name and lesson number
AIGenerator (backend/ai_generator.py): Anthropic Claude API integration
- Uses claude-sonnet-4-20250514 model
- Implements tool calling for search functionality
- Maintains conversation history via SessionManager
Search Tools (backend/search_tools.py): Tool-based search system
- CourseSearchTool: Semantic search across course content with intelligent course name resolution
- ToolManager: Manages tool registration and execution for AI model
- Course documents (PDF, DOCX, TXT) are loaded from docs/ directory on startup
- DocumentProcessor chunks content and extracts course metadata
- VectorStore stores both metadata and content in separate ChromaDB collections
- User queries trigger AI generation with access to search tools
- AI uses CourseSearchTool to find relevant content and generates responses
- Frontend displays responses with source attribution
- Chunk size: 800 characters with 100 character overlap
- Embedding model: all-MiniLM-L6-v2 (SentenceTransformers)
- Max search results: 5 per query
- Conversation history: 2 message pairs
- Single-page application with vanilla JavaScript
- Real-time course statistics display
- Markdown rendering support for AI responses
- Responsive design with sidebar for course info and suggested queries
- The system automatically loads documents from docs/ on startup
- ChromaDB data persists in backend/chroma_db/
- FastAPI serves both API endpoints (/api/*) and static frontend files
- CORS is configured for development with broad permissions
- No-cache headers are set for static files during development