CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Development Commands

Running the Application

# Quick start using shell script
chmod +x run.sh
./run.sh

# Manual start
cd backend && uv run uvicorn app:app --reload --port 8000

Environment Setup

# Install dependencies
uv sync

# Add new dependencies (use uv instead of pip)
uv add package_name

# Required environment variables
# Create a .env file with:
ANTHROPIC_API_KEY=your_anthropic_api_key_here
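A minimal sketch of how code might check for this key at startup, assuming the variable has already been exported into the process environment. The file does not say how the project loads .env, and the function name `load_api_key` is hypothetical.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Anthropic API key, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    source = os.environ if env is None else env
    key = source.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; add it to the .env file")
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error surfacing on the first API call.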

Python Execution

Always use uv for running Python files and commands:

# Run Python scripts
uv run python script.py

# Run any Python command
uv run command_name

Code Quality Tools

Prerequisites

Install development dependencies before using code quality tools:

uv sync --group dev

Available Scripts

Format Script (Modifies Files)

./scripts/format.sh

Use this script when you want to automatically fix code style issues. It will:

Sort imports with isort
Format code with Black
Run flake8 linting (reports remaining issues)
Run mypy type checking

Lint Script (Read-Only Checks)

./scripts/lint.sh

Use this script to verify code quality without modifying files. It is suited to:

Pre-commit checks
CI/CD pipelines
Verifying code before submitting PRs

Exit code 0 = all checks pass, non-zero = issues found.
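The exit-code convention above can be consumed from Python as well as from a shell. The following is a sketch only; the helper name `checks_pass` is hypothetical and not part of the repository.

```python
import subprocess
from typing import Sequence


def checks_pass(command: Sequence[str]) -> bool:
    """Run a check command and report whether it exited with status 0."""
    # check=False: a non-zero exit is an expected outcome here, not an exception.
    result = subprocess.run(list(command), check=False)
    return result.returncode == 0
```

For example, `checks_pass(["./scripts/lint.sh"])` would return True only when every check passes, assuming it is run from the repository root.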

Troubleshooting

If the scripts aren't executable, run: chmod +x scripts/*.sh

Application Access

Web Interface: http://localhost:8000
API Documentation: http://localhost:8000/docs

Architecture Overview

This is a Retrieval-Augmented Generation (RAG) system for course materials with a FastAPI backend and vanilla JavaScript frontend.

Core Components

RAGSystem (backend/rag_system.py): Main orchestrator that coordinates all components

Manages document processing, vector storage, AI generation, and search tools
Handles course document ingestion from the docs/ directory
Processes queries using a tool-based search approach

VectorStore (backend/vector_store.py): ChromaDB-based vector storage with dual collections

course_catalog: Stores course titles for name resolution
- Metadata: title, instructor, course_link, lesson_count, lessons_json (list of lessons with lesson_number, lesson_title, lesson_link)
course_content: Stores text chunks for semantic search
- Metadata: course_title, lesson_number, chunk_index
Supports filtered search by course name and lesson number
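To illustrate filtered search, here is a sketch of how a ChromaDB `where` filter could be assembled from the course_content metadata fields listed above. The function name `build_filter` is an assumption, and the real VectorStore may build its filters differently.

```python
from typing import Any, Dict, List, Optional


def build_filter(course_title: Optional[str] = None,
                 lesson_number: Optional[int] = None) -> Optional[Dict[str, Any]]:
    """Build a ChromaDB-style `where` filter from optional constraints."""
    conditions: List[Dict[str, Any]] = []
    if course_title is not None:
        conditions.append({"course_title": course_title})
    if lesson_number is not None:
        conditions.append({"lesson_number": lesson_number})
    if not conditions:
        return None  # no constraints: search the whole collection
    if len(conditions) == 1:
        return conditions[0]  # a single condition needs no $and wrapper
    return {"$and": conditions}
```

The result could then be passed as the `where` argument of a collection query, for example `collection.query(query_texts=[question], n_results=5, where=build_filter(title, lesson))`, where `collection`, `question`, `title`, and `lesson` are placeholders.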

AIGenerator (backend/ai_generator.py): Anthropic Claude API integration

Uses claude-sonnet-4-20250514 model
Implements tool calling for search functionality
Maintains conversation history via SessionManager
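As a rough illustration of bounded conversation history, the sketch below keeps only the most recent exchanges. It assumes history is a flat list of alternating user and assistant messages. The name `trim_history` is hypothetical, and SessionManager may store history in another form.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_pairs: int = 2) -> List[Message]:
    """Keep only the most recent `max_pairs` user/assistant exchanges."""
    if max_pairs <= 0:
        return []
    # Each exchange is two messages, so keep the last 2 * max_pairs entries.
    return messages[-2 * max_pairs:]
```

Capping history this way keeps prompt size predictable at the cost of forgetting older turns.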

Search Tools (backend/search_tools.py): Tool-based search system

CourseSearchTool: Semantic search across course content with intelligent course name resolution
ToolManager: Manages tool registration and execution for the AI model
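The registration-and-execution pattern described above might look roughly like the sketch below. `SimpleToolManager` and its method names are hypothetical stand-ins, not the actual classes in backend/search_tools.py. The definition dictionaries follow the `name` / `description` / `input_schema` shape that the Anthropic Messages API uses for tools.

```python
from typing import Any, Callable, Dict, List


class SimpleToolManager:
    """Hypothetical registry mapping tool names to definitions and handlers."""

    def __init__(self) -> None:
        self._definitions: Dict[str, Dict[str, Any]] = {}
        self._handlers: Dict[str, Callable[..., str]] = {}

    def register(self, definition: Dict[str, Any], handler: Callable[..., str]) -> None:
        """Store a tool definition and the callable that implements it."""
        name = definition["name"]
        self._definitions[name] = definition
        self._handlers[name] = handler

    def definitions(self) -> List[Dict[str, Any]]:
        """Return all definitions, e.g. to pass as the `tools` request parameter."""
        return list(self._definitions.values())

    def execute(self, name: str, **kwargs: Any) -> str:
        """Run the named tool with the arguments the model supplied."""
        handler = self._handlers.get(name)
        if handler is None:
            return f"Tool '{name}' not found"
        return handler(**kwargs)
```

Returning an error string for an unknown tool, rather than raising, lets the message be sent back to the model as a tool result so it can recover.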

Data Flow

Course documents (PDF, DOCX, TXT) are loaded from the docs/ directory on startup
DocumentProcessor chunks content and extracts course metadata
VectorStore stores both metadata and content in separate ChromaDB collections
User queries trigger AI generation with access to search tools
AI uses CourseSearchTool to find relevant content and generates responses
Frontend displays responses with source attribution

Key Configuration (backend/config.py)

Chunk size: 800 characters with 100 character overlap
Embedding model: all-MiniLM-L6-v2 (SentenceTransformers)
Max search results: 5 per query
Conversation history: 2 message pairs
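To make the chunk size and overlap settings concrete, here is a sketch of plain character-window chunking with the values above as defaults. It is an assumption-laden illustration: the real DocumentProcessor may split on sentence boundaries instead, and `chunk_text` is a hypothetical name.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> List[str]:
    """Split text into windows of `chunk_size` characters sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap  # how far each new window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With these defaults each window starts 700 characters after the previous one, so the last 100 characters of one chunk reappear at the start of the next.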

Frontend Architecture

Single-page application with vanilla JavaScript
Real-time course statistics display
Markdown rendering support for AI responses
Responsive design with sidebar for course info and suggested queries

Development Notes

The system automatically loads documents from docs/ on startup
ChromaDB data persists in backend/chroma_db/
FastAPI serves both API endpoints (/api/*) and static frontend files
CORS is configured for development with broad permissions
No-cache headers are set for static files during development
