SPARC/docs/DEPLOYMENT.md
agent-company 97048917f2 docs: document patent PDF volume mount for containerized deployments
Switch docker-compose.yml from bind mount to a named volume (patent_data)
so downloaded PDFs survive container recreation. Add a "Patent PDF Storage"
section to DEPLOYMENT.md covering Docker Compose, Kubernetes PVC, and S3
alternatives.

Closes leeworks-agents/SPARC#1360

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:08:02 +00:00


SPARC Complete Deployment Guide

This guide gives step-by-step instructions for deploying the SPARC (Semiconductor Patent & Analytics Report Core) application with all features enabled: SerpAPI patent retrieval, LLM analysis, database storage, and the web UI.


Prerequisites

  1. Docker & Docker Compose installed
  2. API keys for SerpAPI and OpenRouter (you'll need to obtain these)

Step 1: Clone and Configure

git clone <repository-url>
cd SPARC

# Create environment file
cp .env.example .env

Edit .env with your API keys:

# Required API Keys
API_KEY=your_serpapi_key_here
OPENROUTER_API_KEY=your_openrouter_key_here

# Database Configuration (matches docker-compose.yml)
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/sparc
USE_DATABASE=true

Step 2: Start Services with Docker Compose

# Start all services (PostgreSQL, API, and Dashboard)
docker-compose up -d

# Check status
docker-compose ps

# You should see:
# - sparc-postgres (healthy)
# - sparc-api (running on port 8000)
# - sparc-dashboard (running on port 8080)

The database is automatically initialized by the init-db service.


Step 3: Database Schema

The init-db service automatically creates the llm_messages table with the following schema:

| Column | Type | Purpose |
|--------|------|---------|
| id | SERIAL | Primary key |
| timestamp | TIMESTAMP | Message creation time |
| company_name | VARCHAR(255) | Company being analyzed |
| analysis_type | VARCHAR(50) | 'single_patent' or 'portfolio' |
| model | VARCHAR(100) | LLM model identifier |
| prompt | TEXT | Full prompt sent to LLM |
| response | TEXT | LLM response |
| metadata | JSONB | Patent IDs, content lengths |
| token_usage | JSONB | prompt/completion/total tokens |
| created_at | TIMESTAMP | Record timestamp |
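
For orientation, the documented columns correspond roughly to the DDL generated below. This is a sketch assembled only from the table in this guide, not the contents of scripts/init_database.py; any constraints or defaults beyond the primary key are not documented here and are left out, and `build_create_table` is a hypothetical helper.

```python
# Columns exactly as documented in this guide. Constraints other than the
# primary key (NOT NULL, defaults, indexes) are unknown and omitted.
LLM_MESSAGES_COLUMNS = [
    ("id", "SERIAL PRIMARY KEY"),
    ("timestamp", "TIMESTAMP"),
    ("company_name", "VARCHAR(255)"),
    ("analysis_type", "VARCHAR(50)"),
    ("model", "VARCHAR(100)"),
    ("prompt", "TEXT"),
    ("response", "TEXT"),
    ("metadata", "JSONB"),
    ("token_usage", "JSONB"),
    ("created_at", "TIMESTAMP"),
]


def build_create_table(table: str, columns: list) -> str:
    """Render a CREATE TABLE statement from (name, type) pairs."""
    body = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} (\n{body}\n);"


print(build_create_table("llm_messages", LLM_MESSAGES_COLUMNS))
```

Comparing this output with `\d llm_messages` in psql is a quick way to see whether the running schema has drifted from the documentation.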

Step 4: Run the Services

Option A: Run with Docker Compose (Recommended)

All services are already running after docker-compose up -d in Step 2.

# View logs
docker-compose logs -f

# View specific service logs
docker-compose logs -f api
docker-compose logs -f dashboard

Option B: Run Locally (Development)

If you prefer running services locally without Docker:

# Start PostgreSQL with Docker
docker-compose up -d postgres

# Wait for database to be healthy, then initialize
python scripts/init_database.py

# Start FastAPI backend
uvicorn SPARC.api:app --host 0.0.0.0 --port 8000 --reload

# For the React frontend (separate terminal)
cd frontend
npm install
npm run dev

Step 5: Verify Deployment

# Check API health
curl http://localhost:8000/health

# Expected response:
# {"status":"healthy","version":"0.1.0","timestamp":"..."}
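
The same check can be scripted, for example in a deploy pipeline. The sketch below uses only the Python standard library and assumes the response shape shown above (a JSON object with a "status" field); `is_healthy` and `check_api` are hypothetical helper names, not part of SPARC.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if a /health response body reports status 'healthy'."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_api(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Fetch /health and report whether the API looks healthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except OSError:
        # Covers connection refused, timeouts, and HTTP errors.
        return False
```

Calling `check_api()` returns False rather than raising when the container is still starting, so it can be retried in a loop.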

Access the services:

| Service | URL |
|---------|-----|
| REST API | http://localhost:8000 |
| API Documentation (Swagger) | http://localhost:8000/docs |
| Dashboard (Web UI) | http://localhost:8080 |

Step 6: Using the Application

Via Dashboard (Web UI)

  1. Open http://localhost:8080
  2. Register a new account or log in (default admin: admin / admin)
  3. Navigate to "Analysis" from the sidebar
  4. Enter a company name (e.g., "Intel")
  5. Click "Analyze"

This will:

  • Query SerpAPI for recent patents
  • Download and parse patent PDFs
  • Send patent content to Claude for analysis
  • Store prompt/response in PostgreSQL (with caching)
  • Display results in the dashboard
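
The caching step in the list above follows a check-then-compute pattern. The sketch below illustrates that pattern only: SPARC keeps its cache in PostgreSQL and its cache key is not documented here, so the in-memory dict, the normalized company-name key, and the `analyze_with_cache` function are all assumptions made for illustration.

```python
from typing import Callable, Dict


def analyze_with_cache(
    company: str,
    cache: Dict[str, str],
    run_analysis: Callable[[str], str],
) -> str:
    """Return a cached analysis if present; otherwise compute and store it."""
    key = company.strip().lower()  # assumed key; the real one may differ
    if key in cache:
        return cache[key]
    result = run_analysis(company)  # stands in for the SerpAPI + LLM calls
    cache[key] = result
    return result


calls = []


def fake_analysis(name: str) -> str:
    """Stand-in for the expensive pipeline; records each invocation."""
    calls.append(name)
    return f"analysis of {name}"


store: Dict[str, str] = {}
analyze_with_cache("Intel", store, fake_analysis)
analyze_with_cache("intel ", store, fake_analysis)  # served from the cache
print(len(calls))
```

The second call never reaches `fake_analysis`, which is the effect the cache has on paid SerpAPI and LLM requests.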

Via REST API

# Analyze single company
curl http://localhost:8000/analyze/Intel

# Batch analyze multiple companies (synchronous)
curl -X POST http://localhost:8000/analyze/batch \
  -H "Content-Type: application/json" \
  -d '{"companies": ["Intel", "AMD", "NVIDIA"], "max_workers": 3}'

# Async batch (for large jobs)
curl -X POST http://localhost:8000/analyze/batch/async \
  -H "Content-Type: application/json" \
  -d '{"companies": ["Intel", "AMD"]}'

# Check job status
curl http://localhost:8000/jobs/{job_id}

# List all jobs
curl http://localhost:8000/jobs
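
The synchronous batch call can also be issued from Python without extra dependencies. This sketch mirrors the curl example above using the standard library; `build_batch_request` and `send` are hypothetical helpers. The curl examples show no Authorization header, so none is added here; if your deployment enforces JWT on these routes, you would need to supply one.

```python
import json
import urllib.request


def build_batch_request(companies, max_workers=3, base_url="http://localhost:8000"):
    """Build a POST request for /analyze/batch with the documented payload."""
    payload = {"companies": list(companies), "max_workers": max_workers}
    return urllib.request.Request(
        f"{base_url}/analyze/batch",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=300.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as `send(build_batch_request(["Intel", "AMD", "NVIDIA"]))` blocks until every company has been analyzed, so the timeout is deliberately generous.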

Via Python

from SPARC.analyzer import CompanyAnalyzer

analyzer = CompanyAnalyzer()
result = analyzer.analyze("Intel")
print(result.analysis)

Step 7: View Stored Data

# View analytics (aggregated usage)
python scripts/view_analytics.py

# View stored messages
python scripts/view_messages.py

# Query database directly
docker exec -it sparc-postgres psql -U postgres -d sparc -c \
  "SELECT company_name, analysis_type, token_usage FROM llm_messages ORDER BY timestamp DESC LIMIT 10;"

Architecture Overview

┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  Dashboard   │───▶│   FastAPI    │───▶│   Analyzer   │
│  (8080)      │    │   (8000)     │    │              │
└──────────────┘    └──────────────┘    └──────┬───────┘
                                               │
                    ┌──────────────────────────┼──────────────────────────┐
                    │                          │                          │
                    ▼                          ▼                          ▼
           ┌──────────────┐           ┌──────────────┐           ┌──────────────┐
           │   SerpAPI    │           │  OpenRouter  │           │  PostgreSQL  │
           │ (Patents)    │           │  (Claude)    │           │  (Storage)   │
           └──────────────┘           └──────────────┘           └──────────────┘

Component Responsibilities

| Component | Purpose |
|-----------|---------|
| Dashboard | React TypeScript web UI with authentication |
| FastAPI | REST API with JWT authentication |
| Analyzer | Orchestrates patent retrieval and LLM analysis |
| SerpAPI | Retrieves patent data from Google Patents |
| OpenRouter | Routes requests to Claude for AI analysis |
| PostgreSQL | Stores prompts, responses, users, and cached results |

Environment Variables Reference

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| API_KEY | Yes | - | SerpAPI key for patent search |
| OPENROUTER_API_KEY | Yes | - | OpenRouter API key for Claude access |
| DATABASE_URL | Yes | - | PostgreSQL connection string |
| USE_CACHE | No | true | Check database for cached responses before API calls |
| JWT_SECRET | Yes | - | Secret key for JWT authentication (change in production!) |
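
A quick pre-flight check can catch a missing or placeholder value before the containers start. The sketch below treats the four variables marked "Yes" above as required and flags values that still look like the `your_..._here` placeholders from the sample .env; the parser is a simplified stand-in for real dotenv parsing (no multi-line values or `export` prefixes), and both function names are hypothetical.

```python
REQUIRED_VARS = ("API_KEY", "OPENROUTER_API_KEY", "DATABASE_URL", "JWT_SECRET")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env: dict, required=REQUIRED_VARS) -> list:
    """List required variables that are absent, empty, or still placeholders."""
    return [
        name
        for name in required
        if not env.get(name) or env[name].startswith("your_")
    ]


sample = """
# Required API Keys
API_KEY=your_serpapi_key_here
OPENROUTER_API_KEY=abc123
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/sparc
"""
print(missing_vars(parse_env(sample)))
```

Running the same functions on the contents of your real .env file should print an empty list before you continue.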

Database URL Format

postgresql://[user]:[password]@[host]:[port]/[database]

Example:

postgresql://postgres:postgres@localhost:5432/sparc
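
To confirm that a connection string matches this format, it can be split into its parts with the standard library. This is a minimal sketch; `parse_database_url` is a hypothetical helper, and falling back to port 5432 when none is given is an assumption based on PostgreSQL's default.

```python
from urllib.parse import urlsplit


def parse_database_url(url: str) -> dict:
    """Split a PostgreSQL URL into user, password, host, port, and database."""
    parts = urlsplit(url)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,  # assumed default when omitted
        "database": parts.path.lstrip("/"),
    }


print(parse_database_url("postgresql://postgres:postgres@localhost:5432/sparc"))
```

Passwords containing characters such as `@` or `/` must be percent-encoded in the URL, otherwise the host portion will be parsed incorrectly.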

Docker Compose Services

The docker-compose.yml includes all services needed for production:

| Service | Container | Port | Description |
|---------|-----------|------|-------------|
| postgres | sparc-postgres | 5432 | PostgreSQL database |
| init-db | sparc-init-db | - | One-time database initialization (seeds admin user) |
| api | sparc-api | 8000 | FastAPI REST API with JWT auth (patent PDFs stored in patent_data volume) |
| dashboard | sparc-dashboard | 8080 | React TypeScript web UI |

Common Docker Compose Commands

# Start all services
docker-compose up -d

# Start with rebuild (after code changes)
docker-compose up -d --build

# View logs
docker-compose logs -f

# View specific service logs
docker-compose logs -f api
docker-compose logs -f dashboard

# Stop all services
docker-compose down

# Stop and remove volumes (WARNING: deletes data)
docker-compose down -v

# Restart a specific service
docker-compose restart api

Patent PDF Storage

The SPARC API downloads patent PDFs during analysis and stores them at /app/patents inside the container. These files are used for subsequent single-patent analysis requests and as a local cache to avoid re-downloading. If this directory is not persisted, all downloaded PDFs are lost when the container is recreated.

Docker Compose (default)

The default docker-compose.yml declares a named volume called patent_data that is mounted at /app/patents:

# In the api service:
volumes:
  - patent_data:/app/patents

# At the top-level volumes section:
volumes:
  patent_data:

This means PDFs survive docker compose down and docker compose up cycles. To remove patent data intentionally, run:

docker compose down -v   # WARNING: also removes postgres_data
# or selectively:
docker volume rm sparc_patent_data

If you prefer a bind mount (e.g., for easy host-side access during development), replace the volume with:

volumes:
  - ./patents:/app/patents

Kubernetes

For Kubernetes deployments, create a PersistentVolumeClaim and mount it into the API pod:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sparc-patent-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sparc-api
spec:
  template:
    spec:
      containers:
        - name: api
          volumeMounts:
            - name: patent-data
              mountPath: /app/patents
      volumes:
        - name: patent-data
          persistentVolumeClaim:
            claimName: sparc-patent-data

Adjust the storage size based on expected patent volume. Each patent PDF is typically 1-5 MB.
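
As a rough sizing aid, the arithmetic behind that guidance can be written out. The sketch below treats the 1-5 MB range as MiB for simplicity and ignores filesystem overhead, so the results are approximate; `pdf_capacity` is a hypothetical helper.

```python
def pdf_capacity(volume_gib: float, pdf_mib: float) -> int:
    """Approximate number of PDFs of a given size that fit in a volume."""
    return int(volume_gib * 1024 // pdf_mib)


# The 5Gi claim from the manifest above, at both ends of the size range.
print(pdf_capacity(5, 5))  # large PDFs: 1024
print(pdf_capacity(5, 1))  # small PDFs: 5120
```

By this estimate the 5Gi request holds roughly one thousand to five thousand patents, so deployments tracking many companies should size the claim accordingly.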

S3 Object Storage (alternative)

For production deployments that need shared or highly durable storage, set STORAGE_BACKEND=s3 in your .env file. This stores patent PDFs in an S3-compatible bucket (AWS S3 or MinIO) instead of the local filesystem, eliminating the need for a persistent volume. See the S3/MinIO section in .env.example for configuration details.


Troubleshooting

Database Connection Issues

# Check if postgres is running
docker-compose ps

# Check postgres logs
docker-compose logs postgres

# Test database connection
docker exec -it sparc-postgres psql -U postgres -d sparc -c "SELECT 1;"

API Key Issues

# Verify environment variables are set
echo $API_KEY
echo $OPENROUTER_API_KEY

# Test SerpAPI directly
curl "https://serpapi.com/search?engine=google_patents&q=Intel&api_key=$API_KEY"
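
Company names containing spaces or ampersands need URL encoding, which the curl one-liner above does not handle. This sketch builds the same SerpAPI test URL with the parameters taken from that command; `serpapi_patents_url` is a hypothetical helper, not part of SPARC.

```python
from urllib.parse import urlencode


def serpapi_patents_url(query: str, api_key: str) -> str:
    """Build the Google Patents test URL with properly encoded parameters."""
    params = {"engine": "google_patents", "q": query, "api_key": api_key}
    return "https://serpapi.com/search?" + urlencode(params)


print(serpapi_patents_url("Texas Instruments", "KEY"))
```

Pass the value of API_KEY from your environment instead of the literal "KEY" when testing against the live service.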

Port Conflicts

If ports 8000, 8080, or 5432 are in use:

# Find what's using the port
lsof -i :8000

# Or change the host port in docker-compose.yml
ports:
  - "8001:8000"  # Expose the API on host port 8001 instead of 8000

Container Issues

# Rebuild containers after code changes
docker-compose build --no-cache

# Remove all containers and start fresh
docker-compose down
docker-compose up -d --build

Viewing Application Logs

# All services
docker-compose logs -f

# Specific service
docker-compose logs -f api
docker-compose logs -f dashboard

Quick Reference

# Docker setup (recommended)
cp .env.example .env
# Edit .env with API keys
docker-compose up -d

# Local development setup
cp .env.example .env
# Edit .env with API keys
docker-compose up -d postgres
python scripts/init_database.py
uvicorn SPARC.api:app --reload &
cd frontend && npm install && npm run dev &

# Check status
curl http://localhost:8000/health
open http://localhost:8080

# View data
python scripts/view_analytics.py
python scripts/view_messages.py