From 490850d7a6a6ed783996f7149a4bf36b11edaa6e Mon Sep 17 00:00:00 2001
From: 0xWheatyz
Date: Thu, 12 Mar 2026 23:51:32 -0400
Subject: [PATCH] docs: reorganize documentation into docs/ directory
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Move CONTAINER_REGISTRY.md and DATABASE_MODE.md to docs/
- Add comprehensive DEPLOYMENT.md with full deployment instructions
- Update README.md with documentation section linking to docs/
- Keep README.md at root for GitHub visibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude
---
 README.md                                     |   8 +
 .../CONTAINER_REGISTRY.md                     |   0
 DATABASE_MODE.md => docs/DATABASE_MODE.md     |   0
 docs/DEPLOYMENT.md                            | 438 ++++++++++++++++++
 4 files changed, 446 insertions(+)
 rename CONTAINER_REGISTRY.md => docs/CONTAINER_REGISTRY.md (100%)
 rename DATABASE_MODE.md => docs/DATABASE_MODE.md (100%)
 create mode 100644 docs/DEPLOYMENT.md

diff --git a/README.md b/README.md
index 4a91ac4..deb45da 100644
--- a/README.md
+++ b/README.md
@@ -246,6 +246,14 @@ pytest tests/ --cov=SPARC --cov-report=term-missing
 
 Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`
 
+## Documentation
+
+Additional documentation is available in the `docs/` directory:
+
+- **[Deployment Guide](docs/DEPLOYMENT.md)** - Complete deployment instructions for Docker, database setup, and production configuration
+- **[Database Mode](docs/DATABASE_MODE.md)** - Database storage for prompts, responses, and analytics
+- **[Container Registry](docs/CONTAINER_REGISTRY.md)** - CI/CD and container registry setup with Gitea Actions
+
 ## License
 
 For open source projects, say how it is licensed.
diff --git a/CONTAINER_REGISTRY.md b/docs/CONTAINER_REGISTRY.md
similarity index 100%
rename from CONTAINER_REGISTRY.md
rename to docs/CONTAINER_REGISTRY.md
diff --git a/DATABASE_MODE.md b/docs/DATABASE_MODE.md
similarity index 100%
rename from DATABASE_MODE.md
rename to docs/DATABASE_MODE.md
diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md
new file mode 100644
index 0000000..d98f403
--- /dev/null
+++ b/docs/DEPLOYMENT.md
@@ -0,0 +1,438 @@
+# SPARC Complete Deployment Guide
+
+This guide provides step-by-step instructions for deploying the SPARC (Semiconductor Patent & Analytics Report Core) application with all features enabled, including SERP API patent retrieval, LLM analysis, database storage, and the web UI.
+
+## Table of Contents
+
+- [Prerequisites](#prerequisites)
+- [Step 1: Clone and Configure](#step-1-clone-and-configure)
+- [Step 2: Start Services with Docker Compose](#step-2-start-services-with-docker-compose)
+- [Step 3: Initialize the Database](#step-3-initialize-the-database)
+- [Step 4: Run the Services](#step-4-run-the-services)
+- [Step 5: Verify Deployment](#step-5-verify-deployment)
+- [Step 6: Using the Application](#step-6-using-the-application)
+- [Step 7: View Stored Data](#step-7-view-stored-data)
+- [Architecture Overview](#architecture-overview)
+- [Environment Variables Reference](#environment-variables-reference)
+- [Production Docker Compose](#production-docker-compose)
+- [Troubleshooting](#troubleshooting)
+
+---
+
+## Prerequisites
+
+1. **Docker & Docker Compose** installed
+2. **API Keys** (you'll need to obtain these):
+   - **SerpAPI Key**: Sign up at https://serpapi.com/ (free tier: 100 searches/month)
+   - **OpenRouter API Key**: Sign up at https://openrouter.ai/ (pay-as-you-go)
+
+---
+
+## Step 1: Clone and Configure
+
+```bash
+git clone
+cd SPARC
+
+# Create environment file
+cp .env.example .env
+```
+
+Edit `.env` with your API keys:
+
+```env
+# Required API Keys
+API_KEY=your_serpapi_key_here
+OPENROUTER_API_KEY=your_openrouter_key_here
+
+# Database Configuration (matches docker-compose.yml)
+DATABASE_URL=postgresql://postgres:postgres@localhost:5432/sparc
+USE_DATABASE=true
+```
+
+---
+
+## Step 2: Start Services with Docker Compose
+
+```bash
+# Start PostgreSQL database
+docker-compose up -d postgres
+
+# Wait for postgres to be healthy (check with)
+docker-compose ps
+
+# You should see sparc-postgres with status "healthy"
+```
+
+---
+
+## Step 3: Initialize the Database
+
+```bash
+# Option A: If running locally with Python
+python scripts/init_database.py
+
+# Option B: If using Docker, run inside container
+docker-compose run --rm sparc-app python scripts/init_database.py
+```
+
+This creates the `llm_messages` table with the following schema:
+
+| Column | Type | Purpose |
+|--------|------|---------|
+| `id` | SERIAL | Primary key |
+| `timestamp` | TIMESTAMP | Message creation time |
+| `company_name` | VARCHAR(255) | Company being analyzed |
+| `analysis_type` | VARCHAR(50) | 'single_patent' or 'portfolio' |
+| `model` | VARCHAR(100) | LLM model identifier |
+| `prompt` | TEXT | Full prompt sent to LLM |
+| `response` | TEXT | LLM response |
+| `metadata` | JSONB | Patent IDs, content lengths |
+| `token_usage` | JSONB | prompt/completion/total tokens |
+| `created_at` | TIMESTAMP | Record timestamp |
+
+---
+
+## Step 4: Run the Services
+
+### Option A: Run Locally (Development)
+
+```bash
+# Terminal 1: Start FastAPI backend
+uvicorn SPARC.api:app --host 0.0.0.0 --port 8000 --reload
+
+# Terminal 2: Start Streamlit dashboard
+streamlit run dashboard.py --server.port 8501 --server.address 0.0.0.0
+```
+
+### Option B: Run with Docker (Production)
+
+See [Production Docker Compose](#production-docker-compose) section below for a complete `docker-compose.prod.yml` configuration.
+
+```bash
+docker-compose -f docker-compose.prod.yml up -d
+```
+
+---
+
+## Step 5: Verify Deployment
+
+```bash
+# Check API health
+curl http://localhost:8000/health
+
+# Expected response:
+# {"status":"healthy","version":"0.1.0","timestamp":"..."}
+```
+
+Access the services:
+
+| Service | URL |
+|---------|-----|
+| REST API | http://localhost:8000 |
+| API Documentation (Swagger) | http://localhost:8000/docs |
+| Dashboard (Web UI) | http://localhost:8501 |
+
+---
+
+## Step 6: Using the Application
+
+### Via Dashboard (Web UI)
+
+1. Open http://localhost:8501
+2. Select **"Company Analysis"** from the sidebar
+3. Enter a company name (e.g., "Intel")
+4. Click **"Analyze"**
+
+This will:
+- Query SerpAPI for recent patents
+- Download and parse patent PDFs
+- Send patent content to Claude for analysis
+- Store prompt/response in PostgreSQL
+- Display results in the dashboard
+
+### Via REST API
+
+```bash
+# Analyze single company
+curl http://localhost:8000/analyze/Intel
+
+# Batch analyze multiple companies (synchronous)
+curl -X POST http://localhost:8000/analyze/batch \
+  -H "Content-Type: application/json" \
+  -d '{"companies": ["Intel", "AMD", "NVIDIA"], "max_workers": 3}'
+
+# Async batch (for large jobs)
+curl -X POST http://localhost:8000/analyze/batch/async \
+  -H "Content-Type: application/json" \
+  -d '{"companies": ["Intel", "AMD"]}'
+
+# Check job status
+curl http://localhost:8000/jobs/{job_id}
+
+# List all jobs
+curl http://localhost:8000/jobs
+```
+
+### Via Python
+
+```python
+from SPARC.analyzer import CompanyAnalyzer
+
+analyzer = CompanyAnalyzer()
+result = analyzer.analyze("Intel")
+print(result.analysis)
+```
+
+---
+
+## Step 7: View Stored Data
+
+```bash
+# View analytics (aggregated usage)
+python scripts/view_analytics.py
+
+# View stored messages
+python scripts/view_messages.py
+
+# Query database directly
+docker exec -it sparc-postgres psql -U postgres -d sparc -c \
+  "SELECT company_name, analysis_type, token_usage FROM llm_messages ORDER BY timestamp DESC LIMIT 10;"
+```
+
+---
+
+## Architecture Overview
+
+```
+┌──────────────┐    ┌──────────────┐    ┌──────────────┐
+│  Dashboard   │───▶│   FastAPI    │───▶│   Analyzer   │
+│   (8501)     │    │   (8000)     │    │              │
+└──────────────┘    └──────────────┘    └──────┬───────┘
+                                               │
+                    ┌──────────────────────────┼──────────────────────────┐
+                    │                          │                          │
+                    ▼                          ▼                          ▼
+             ┌──────────────┐           ┌──────────────┐           ┌──────────────┐
+             │   SerpAPI    │           │  OpenRouter  │           │  PostgreSQL  │
+             │  (Patents)   │           │   (Claude)   │           │  (Storage)   │
+             └──────────────┘           └──────────────┘           └──────────────┘
+```
+
+### Component Responsibilities
+
+| Component | Purpose |
+|-----------|---------|
+| **Dashboard** | Streamlit web UI for interactive analysis |
+| **FastAPI** | REST API for programmatic access |
+| **Analyzer** | Orchestrates patent retrieval and LLM analysis |
+| **SerpAPI** | Retrieves patent data from Google Patents |
+| **OpenRouter** | Routes requests to Claude for AI analysis |
+| **PostgreSQL** | Stores prompts, responses, and analytics |
+
+---
+
+## Environment Variables Reference
+
+| Variable | Required | Default | Description |
+|----------|----------|---------|-------------|
+| `API_KEY` | Yes | - | SerpAPI key for patent search |
+| `OPENROUTER_API_KEY` | Yes | - | OpenRouter API key for Claude access |
+| `DATABASE_URL` | Yes* | - | PostgreSQL connection string |
+| `USE_DATABASE` | No | `false` | Set to `true` to enable database storage |
+
+*Required when `USE_DATABASE=true`
+
+### Database URL Format
+
+```
+postgresql://[user]:[password]@[host]:[port]/[database]
+```
+
+Example:
+```
+postgresql://postgres:postgres@localhost:5432/sparc
+```
+
+---
+
+## Production Docker Compose
+
+Create a `docker-compose.prod.yml` file for full production deployment:
+
+```yaml
+version: '3.8'
+
+services:
+  postgres:
+    image: postgres:16-alpine
+    container_name: sparc-postgres
+    environment:
+      POSTGRES_USER: postgres
+      POSTGRES_PASSWORD: postgres
+      POSTGRES_DB: sparc
+    volumes:
+      - postgres_data:/var/lib/postgresql/data
+    ports:
+      - "5432:5432"
+    healthcheck:
+      test: ["CMD-SHELL", "pg_isready -U postgres"]
+      interval: 5s
+      timeout: 5s
+      retries: 5
+    restart: unless-stopped
+
+  api:
+    build: .
+    container_name: sparc-api
+    command: uvicorn SPARC.api:app --host 0.0.0.0 --port 8000
+    environment:
+      - API_KEY=${API_KEY}
+      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
+      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
+      - USE_DATABASE=true
+    ports:
+      - "8000:8000"
+    depends_on:
+      postgres:
+        condition: service_healthy
+    volumes:
+      - ./patents:/app/patents
+    restart: unless-stopped
+
+  dashboard:
+    build: .
+    container_name: sparc-dashboard
+    command: streamlit run dashboard.py --server.port 8501 --server.address 0.0.0.0
+    environment:
+      - API_KEY=${API_KEY}
+      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
+      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
+      - USE_DATABASE=true
+    ports:
+      - "8501:8501"
+    depends_on:
+      - api
+    volumes:
+      - ./patents:/app/patents
+    restart: unless-stopped
+
+  init-db:
+    build: .
+    container_name: sparc-init-db
+    command: python scripts/init_database.py
+    environment:
+      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
+      - USE_DATABASE=true
+    depends_on:
+      postgres:
+        condition: service_healthy
+    restart: "no"
+
+volumes:
+  postgres_data:
+```
+
+### Deploy with Production Compose
+
+```bash
+# Start all services
+docker-compose -f docker-compose.prod.yml up -d
+
+# View logs
+docker-compose -f docker-compose.prod.yml logs -f
+
+# Stop all services
+docker-compose -f docker-compose.prod.yml down
+
+# Stop and remove volumes (WARNING: deletes data)
+docker-compose -f docker-compose.prod.yml down -v
+```
+
+---
+
+## Troubleshooting
+
+### Database Connection Issues
+
+```bash
+# Check if postgres is running
+docker-compose ps
+
+# Check postgres logs
+docker-compose logs postgres
+
+# Test database connection
+docker exec -it sparc-postgres psql -U postgres -d sparc -c "SELECT 1;"
+```
+
+### API Key Issues
+
+```bash
+# Verify environment variables are set
+echo $API_KEY
+echo $OPENROUTER_API_KEY
+
+# Test SerpAPI directly
+curl "https://serpapi.com/search?engine=google_patents&q=Intel&api_key=$API_KEY"
+```
+
+### Port Conflicts
+
+If ports 8000, 8501, or 5432 are in use:
+
+```bash
+# Find what's using the port
+lsof -i :8000
+
+# Or change ports in docker-compose.yml
+ports:
+  - "8080:8000"  # Use 8080 instead of 8000
+```
+
+### Container Issues
+
+```bash
+# Rebuild containers after code changes
+docker-compose build --no-cache
+
+# Remove all containers and start fresh
+docker-compose down
+docker-compose up -d --build
+```
+
+### Viewing Application Logs
+
+```bash
+# All services
+docker-compose logs -f
+
+# Specific service
+docker-compose logs -f api
+docker-compose logs -f dashboard
+```
+
+---
+
+## Quick Reference
+
+```bash
+# Development setup
+cp .env.example .env
+# Edit .env with API keys
+docker-compose up -d postgres
+python scripts/init_database.py
+uvicorn SPARC.api:app --reload &
+streamlit run dashboard.py
+
+# Production setup
+docker-compose -f docker-compose.prod.yml up -d
+
+# Check status
+curl http://localhost:8000/health
+open http://localhost:8501
+
+# View data
+python scripts/view_analytics.py
+python scripts/view_messages.py
+```
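
Note on the Environment Variables Reference added in `docs/DEPLOYMENT.md`: the rules in that table (two keys always required, `DATABASE_URL` required only when `USE_DATABASE=true`, and the `postgresql://[user]:[password]@[host]:[port]/[database]` format) could be checked before starting the services. The following is a minimal Python sketch and is not part of this patch or of SPARC itself. `check_sparc_env` is a hypothetical helper name, and reading `USE_DATABASE` case-insensitively is an assumption about how the application interprets it.

```python
import os
from urllib.parse import urlsplit


def check_sparc_env(env=None):
    """Return a list of problems found in the SPARC environment settings."""
    # Hypothetical helper: defaults to the process environment, but accepts
    # any mapping so it can be exercised without touching os.environ.
    env = os.environ if env is None else env
    problems = []

    # API_KEY (SerpAPI) and OPENROUTER_API_KEY are listed as always required.
    for name in ("API_KEY", "OPENROUTER_API_KEY"):
        if not env.get(name):
            problems.append(f"{name} is not set")

    # DATABASE_URL is only required when USE_DATABASE=true (default: false).
    # Assumption: the flag is compared case-insensitively.
    if env.get("USE_DATABASE", "false").strip().lower() == "true":
        url = env.get("DATABASE_URL", "")
        if not url:
            problems.append("DATABASE_URL is not set but USE_DATABASE=true")
        else:
            parts = urlsplit(url)
            if parts.scheme != "postgresql":
                problems.append("DATABASE_URL must start with postgresql://")
            if not parts.hostname:
                problems.append("DATABASE_URL has no host")
            if not parts.path.lstrip("/"):
                problems.append("DATABASE_URL has no database name")
    return problems


if __name__ == "__main__":
    for problem in check_sparc_env():
        print(f"config problem: {problem}")
```

Run before `uvicorn` or `streamlit`, a check like this would surface a missing key or malformed connection string before the first request fails; an empty list means the documented requirements are met.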