forked from 0xWheatyz/SPARC
docs: reorganize documentation into docs/ directory
- Move CONTAINER_REGISTRY.md and DATABASE_MODE.md to docs/
- Add comprehensive DEPLOYMENT.md with full deployment instructions
- Update README.md with documentation section linking to docs/
- Keep README.md at root for GitHub visibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -246,6 +246,14 @@ pytest tests/ --cov=SPARC --cov-report=term-missing

Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`

## Documentation

Additional documentation is available in the `docs/` directory:

- **[Deployment Guide](docs/DEPLOYMENT.md)** - Complete deployment instructions for Docker, database setup, and production configuration
- **[Database Mode](docs/DATABASE_MODE.md)** - Database storage for prompts, responses, and analytics
- **[Container Registry](docs/CONTAINER_REGISTRY.md)** - CI/CD and container registry setup with Gitea Actions

## License

For open source projects, say how it is licensed.

@@ -0,0 +1,438 @@
# SPARC Complete Deployment Guide

This guide provides step-by-step instructions for deploying the SPARC (Semiconductor Patent & Analytics Report Core) application with all features enabled, including SERP API patent retrieval, LLM analysis, database storage, and the web UI.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Step 1: Clone and Configure](#step-1-clone-and-configure)
- [Step 2: Start Services with Docker Compose](#step-2-start-services-with-docker-compose)
- [Step 3: Initialize the Database](#step-3-initialize-the-database)
- [Step 4: Run the Services](#step-4-run-the-services)
- [Step 5: Verify Deployment](#step-5-verify-deployment)
- [Step 6: Using the Application](#step-6-using-the-application)
- [Step 7: View Stored Data](#step-7-view-stored-data)
- [Architecture Overview](#architecture-overview)
- [Environment Variables Reference](#environment-variables-reference)
- [Production Docker Compose](#production-docker-compose)
- [Troubleshooting](#troubleshooting)

---

## Prerequisites

1. **Docker & Docker Compose** installed
2. **API Keys** (you'll need to obtain these):
   - **SerpAPI Key**: Sign up at https://serpapi.com/ (free tier: 100 searches/month)
   - **OpenRouter API Key**: Sign up at https://openrouter.ai/ (pay-as-you-go)

---

## Step 1: Clone and Configure

```bash
git clone <repository-url>
cd SPARC

# Create environment file
cp .env.example .env
```

Edit `.env` with your API keys:

```env
# Required API Keys
API_KEY=your_serpapi_key_here
OPENROUTER_API_KEY=your_openrouter_key_here

# Database Configuration (matches docker-compose.yml)
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/sparc
USE_DATABASE=true
```

---

## Step 2: Start Services with Docker Compose

```bash
# Start PostgreSQL database
docker-compose up -d postgres

# Wait for postgres to become healthy; check its status with:
docker-compose ps

# You should see sparc-postgres with status "healthy"
```

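When scripting this step, the manual `docker-compose ps` check can be approximated by polling the database port. The sketch below is an illustration, not part of SPARC: `wait_for_port` is a hypothetical helper name, and a port that accepts TCP connections is a weaker signal than the container's "healthy" status, so treat it as a first-pass check only. A typical call, assuming the default port mapping, would be `wait_for_port("localhost", 5432)`.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; it only proves that something is
    listening on the port, not that PostgreSQL is ready to serve queries.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```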
---

## Step 3: Initialize the Database

```bash
# Option A: If running locally with Python
python scripts/init_database.py

# Option B: If using Docker, run inside container
docker-compose run --rm sparc-app python scripts/init_database.py
```

This creates the `llm_messages` table with the following schema:

| Column | Type | Purpose |
|--------|------|---------|
| `id` | SERIAL | Primary key |
| `timestamp` | TIMESTAMP | Message creation time |
| `company_name` | VARCHAR(255) | Company being analyzed |
| `analysis_type` | VARCHAR(50) | 'single_patent' or 'portfolio' |
| `model` | VARCHAR(100) | LLM model identifier |
| `prompt` | TEXT | Full prompt sent to LLM |
| `response` | TEXT | LLM response |
| `metadata` | JSONB | Patent IDs, content lengths |
| `token_usage` | JSONB | prompt/completion/total tokens |
| `created_at` | TIMESTAMP | Record timestamp |

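For readers who want to see roughly what that schema looks like as SQL, the sketch below renders a `CREATE TABLE` statement from the column list in the table above. It is a reconstruction for illustration only: the actual statement in `scripts/init_database.py` is not shown in this guide and may differ, and `LLM_MESSAGES_COLUMNS` and `render_create_table` are hypothetical names. Nullability, defaults, and indexes are not documented here, so none are assumed.

```python
# Column definitions transcribed from the schema table in this guide.
# Only the primary key constraint is documented; nothing else is assumed.
LLM_MESSAGES_COLUMNS = [
    ("id", "SERIAL PRIMARY KEY"),
    ("timestamp", "TIMESTAMP"),
    ("company_name", "VARCHAR(255)"),
    ("analysis_type", "VARCHAR(50)"),
    ("model", "VARCHAR(100)"),
    ("prompt", "TEXT"),
    ("response", "TEXT"),
    ("metadata", "JSONB"),
    ("token_usage", "JSONB"),
    ("created_at", "TIMESTAMP"),
]


def render_create_table(table: str, columns: list[tuple[str, str]]) -> str:
    """Render a CREATE TABLE IF NOT EXISTS statement from (name, type) pairs."""
    # Column names are double-quoted so that names such as "timestamp"
    # cannot be confused with SQL type keywords.
    body = ",\n".join(f'    "{name}" {sql_type}' for name, sql_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} (\n{body}\n);"


ddl = render_create_table("llm_messages", LLM_MESSAGES_COLUMNS)
print(ddl)
```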
---

## Step 4: Run the Services

### Option A: Run Locally (Development)

```bash
# Terminal 1: Start FastAPI backend
uvicorn SPARC.api:app --host 0.0.0.0 --port 8000 --reload

# Terminal 2: Start Streamlit dashboard
streamlit run dashboard.py --server.port 8501 --server.address 0.0.0.0
```

### Option B: Run with Docker (Production)

See the [Production Docker Compose](#production-docker-compose) section below for a complete `docker-compose.prod.yml` configuration.

```bash
docker-compose -f docker-compose.prod.yml up -d
```

---

## Step 5: Verify Deployment

```bash
# Check API health
curl http://localhost:8000/health

# Expected response:
# {"status":"healthy","version":"0.1.0","timestamp":"..."}
```

Access the services:

| Service | URL |
|---------|-----|
| REST API | http://localhost:8000 |
| API Documentation (Swagger) | http://localhost:8000/docs |
| Dashboard (Web UI) | http://localhost:8501 |

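The same health check can be automated, for example in a deploy script. The following is a minimal sketch using only the Python standard library. It assumes the `/health` endpoint returns the JSON shape shown in the expected response above; `is_healthy` and `check_health` are hypothetical helper names, not functions provided by SPARC. Calling `check_health()` with no arguments targets `http://localhost:8000`.

```python
import json
from urllib.request import urlopen


def is_healthy(body: bytes) -> bool:
    """Return True if a /health response body reports status "healthy"."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Fetch <base_url>/health; any network or HTTP error counts as unhealthy."""
    try:
        with urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return is_healthy(resp.read())
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```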
---

## Step 6: Using the Application

### Via Dashboard (Web UI)

1. Open http://localhost:8501
2. Select **"Company Analysis"** from the sidebar
3. Enter a company name (e.g., "Intel")
4. Click **"Analyze"**

This will:
- Query SerpAPI for recent patents
- Download and parse patent PDFs
- Send patent content to Claude for analysis
- Store prompt/response in PostgreSQL
- Display results in the dashboard

### Via REST API

```bash
# Analyze single company
curl http://localhost:8000/analyze/Intel

# Batch analyze multiple companies (synchronous)
curl -X POST http://localhost:8000/analyze/batch \
  -H "Content-Type: application/json" \
  -d '{"companies": ["Intel", "AMD", "NVIDIA"], "max_workers": 3}'

# Async batch (for large jobs)
curl -X POST http://localhost:8000/analyze/batch/async \
  -H "Content-Type: application/json" \
  -d '{"companies": ["Intel", "AMD"]}'

# Check job status
curl http://localhost:8000/jobs/{job_id}

# List all jobs
curl http://localhost:8000/jobs
```

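The batch calls above can also be prepared from Python without extra dependencies. This sketch builds the same POST requests as the `curl` examples, using only the endpoints and JSON fields shown there. `build_batch_request` is a hypothetical helper, and the response format is not documented in this guide, so the sketch stops at constructing the request.

```python
import json
from urllib.request import Request


def build_batch_request(companies, max_workers=None,
                        base_url="http://localhost:8000", use_async=False):
    """Build (but do not send) a POST request for the batch analysis endpoints."""
    payload = {"companies": list(companies)}
    # max_workers appears only in the synchronous curl example, so it is optional here.
    if max_workers is not None:
        payload["max_workers"] = max_workers
    path = "/analyze/batch/async" if use_async else "/analyze/batch"
    return Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_batch_request(["Intel", "AMD", "NVIDIA"], max_workers=3)
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen(req)` sends it; inspect the returned body before relying on any particular field names.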
### Via Python

```python
from SPARC.analyzer import CompanyAnalyzer

analyzer = CompanyAnalyzer()
result = analyzer.analyze("Intel")
print(result.analysis)
```

---

## Step 7: View Stored Data

```bash
# View analytics (aggregated usage)
python scripts/view_analytics.py

# View stored messages
python scripts/view_messages.py

# Query database directly
docker exec -it sparc-postgres psql -U postgres -d sparc -c \
  "SELECT company_name, analysis_type, token_usage FROM llm_messages ORDER BY timestamp DESC LIMIT 10;"
```

---

## Architecture Overview

```
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  Dashboard   │───▶│   FastAPI    │───▶│   Analyzer   │
│    (8501)    │    │    (8000)    │    │              │
└──────────────┘    └──────────────┘    └──────┬───────┘
                                               │
                    ┌──────────────────────────┼──────────────────────────┐
                    │                          │                          │
                    ▼                          ▼                          ▼
             ┌──────────────┐           ┌──────────────┐           ┌──────────────┐
             │   SerpAPI    │           │  OpenRouter  │           │  PostgreSQL  │
             │  (Patents)   │           │   (Claude)   │           │  (Storage)   │
             └──────────────┘           └──────────────┘           └──────────────┘
```

### Component Responsibilities

| Component | Purpose |
|-----------|---------|
| **Dashboard** | Streamlit web UI for interactive analysis |
| **FastAPI** | REST API for programmatic access |
| **Analyzer** | Orchestrates patent retrieval and LLM analysis |
| **SerpAPI** | Retrieves patent data from Google Patents |
| **OpenRouter** | Routes requests to Claude for AI analysis |
| **PostgreSQL** | Stores prompts, responses, and analytics |

---

## Environment Variables Reference

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `API_KEY` | Yes | - | SerpAPI key for patent search |
| `OPENROUTER_API_KEY` | Yes | - | OpenRouter API key for Claude access |
| `DATABASE_URL` | Yes* | - | PostgreSQL connection string |
| `USE_DATABASE` | No | `false` | Set to `true` to enable database storage |

*Required when `USE_DATABASE=true`

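The rules in this table, including the conditional requirement on `DATABASE_URL`, can be checked before starting the services. The sketch below is one way to do that; `missing_settings` is a hypothetical helper, and treating `USE_DATABASE` case-insensitively is an assumption, since this guide does not describe how SPARC itself parses the value.

```python
def missing_settings(env: dict) -> list:
    """Return the names of required variables that are unset or empty."""
    required = ["API_KEY", "OPENROUTER_API_KEY"]
    # DATABASE_URL is only required when database storage is enabled.
    # Assumption: any casing of "true" enables it; the default is "false".
    if env.get("USE_DATABASE", "false").strip().lower() == "true":
        required.append("DATABASE_URL")
    return [name for name in required if not env.get(name, "").strip()]


example = {"API_KEY": "key", "OPENROUTER_API_KEY": "key", "USE_DATABASE": "true"}
print(missing_settings(example))
```

In a real check, pass `dict(os.environ)` instead of the example dictionary.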
### Database URL Format

```
postgresql://[user]:[password]@[host]:[port]/[database]
```

Example:
```
postgresql://postgres:postgres@localhost:5432/sparc
```

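To confirm that a connection string follows this format, it can be split into its parts with the standard library. The sketch below uses `urllib.parse.urlsplit`; `parse_database_url` is a hypothetical helper name, and rejecting any scheme other than `postgresql` is a choice made for this example, not a documented SPARC behavior.

```python
from urllib.parse import urlsplit


def parse_database_url(url: str) -> dict:
    """Split a postgresql:// URL into user, password, host, port, and database."""
    parts = urlsplit(url)
    if parts.scheme != "postgresql":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,  # int, or None when the URL omits the port
        "database": parts.path.lstrip("/"),
    }


info = parse_database_url("postgresql://postgres:postgres@localhost:5432/sparc")
print(info["host"], info["port"], info["database"])
```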
---

## Production Docker Compose

Create a `docker-compose.prod.yml` file for full production deployment:

```yaml
version: '3.8'

services:
  postgres:
    image: postgres:16-alpine
    container_name: sparc-postgres
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: sparc
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  api:
    build: .
    container_name: sparc-api
    command: uvicorn SPARC.api:app --host 0.0.0.0 --port 8000
    environment:
      - API_KEY=${API_KEY}
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
      - USE_DATABASE=true
    ports:
      - "8000:8000"
    depends_on:
      postgres:
        condition: service_healthy
    volumes:
      - ./patents:/app/patents
    restart: unless-stopped

  dashboard:
    build: .
    container_name: sparc-dashboard
    command: streamlit run dashboard.py --server.port 8501 --server.address 0.0.0.0
    environment:
      - API_KEY=${API_KEY}
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
      - USE_DATABASE=true
    ports:
      - "8501:8501"
    depends_on:
      - api
    volumes:
      - ./patents:/app/patents
    restart: unless-stopped

  init-db:
    build: .
    container_name: sparc-init-db
    command: python scripts/init_database.py
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
      - USE_DATABASE=true
    depends_on:
      postgres:
        condition: service_healthy
    restart: "no"

volumes:
  postgres_data:
```

### Deploy with Production Compose

```bash
# Start all services
docker-compose -f docker-compose.prod.yml up -d

# View logs
docker-compose -f docker-compose.prod.yml logs -f

# Stop all services
docker-compose -f docker-compose.prod.yml down

# Stop and remove volumes (WARNING: deletes data)
docker-compose -f docker-compose.prod.yml down -v
```

---

## Troubleshooting

### Database Connection Issues

```bash
# Check if postgres is running
docker-compose ps

# Check postgres logs
docker-compose logs postgres

# Test database connection
docker exec -it sparc-postgres psql -U postgres -d sparc -c "SELECT 1;"
```

### API Key Issues

```bash
# Verify environment variables are set
echo $API_KEY
echo $OPENROUTER_API_KEY

# Test SerpAPI directly
curl "https://serpapi.com/search?engine=google_patents&q=Intel&api_key=$API_KEY"
```

### Port Conflicts

If ports 8000, 8501, or 5432 are in use:

```bash
# Find what's using the port
lsof -i :8000

# Or change the host side of the port mapping in docker-compose.yml, for example:
#   ports:
#     - "8080:8000"  # Use 8080 instead of 8000
```

### Container Issues

```bash
# Rebuild containers after code changes
docker-compose build --no-cache

# Remove all containers and start fresh
docker-compose down
docker-compose up -d --build
```

### Viewing Application Logs

```bash
# All services
docker-compose logs -f

# Specific service
docker-compose logs -f api
docker-compose logs -f dashboard
```

---

## Quick Reference

```bash
# Development setup
cp .env.example .env
# Edit .env with API keys
docker-compose up -d postgres
python scripts/init_database.py
uvicorn SPARC.api:app --reload &
streamlit run dashboard.py

# Production setup
docker-compose -f docker-compose.prod.yml up -d

# Check status
curl http://localhost:8000/health
open http://localhost:8501

# View data
python scripts/view_analytics.py
python scripts/view_messages.py
```