forked from 0xWheatyz/SPARC
docs: reorganize documentation into docs/ directory
- Move CONTAINER_REGISTRY.md and DATABASE_MODE.md to docs/
- Add comprehensive DEPLOYMENT.md with full deployment instructions
- Update README.md with documentation section linking to docs/
- Keep README.md at root for GitHub visibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -0,0 +1,188 @@
|
||||
# Container Registry and CI/CD Setup
|
||||
|
||||
This document explains how to build and push Docker images using Gitea Actions and the Gitea Container Registry.
|
||||
|
||||
## Overview
|
||||
|
||||
The SPARC project uses Gitea Actions (GitHub Actions-compatible) to automatically build and push Docker images to the Gitea Container Registry whenever code is pushed to the repository.
|
||||
|
||||
## Workflow Configuration
|
||||
|
||||
The workflow is defined in `.gitea/workflows/build.yaml` and automatically:
|
||||
- Builds the Docker image from the `Dockerfile`
|
||||
- Tags the image appropriately based on the git ref (branch/tag)
|
||||
- Pushes to the Gitea Container Registry at `10.0.1.10`
|
||||
|
||||
### Triggers
|
||||
|
||||
The workflow runs on:
|
||||
- **Push to main branch**: Builds and tags with commit SHA + `latest`
|
||||
- **Push of tags**: Builds and tags with the tag name + `latest`
|
||||
- **Manual dispatch**: Can be triggered manually from Gitea UI
|
||||
|
||||
### Image Naming
|
||||
|
||||
Images are pushed to: `10.0.1.10/0xwheatyz/sparc:<tag>`
|
||||
|
||||
- Main branch commits: `10.0.1.10/0xwheatyz/sparc:<sha>` and `10.0.1.10/0xwheatyz/sparc:latest`
|
||||
- Tags: `10.0.1.10/0xwheatyz/sparc:<tag-name>` and `10.0.1.10/0xwheatyz/sparc:latest`
|
||||
- Other branches: `10.0.1.10/0xwheatyz/sparc:<branch-name>`
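
The naming rules above can be summarized as a small function. This is an illustrative Python sketch of the documented rules, not the workflow's actual tagging step. The function name `image_tags`, the use of full `refs/...` strings as input, the 7-character SHA truncation, and the replacement of `/` in branch names are all assumptions.

```python
REGISTRY = "10.0.1.10"
IMAGE = "0xwheatyz/sparc"


def image_tags(ref: str, sha: str) -> list:
    """Return the image references the documented naming rules would produce."""
    base = f"{REGISTRY}/{IMAGE}"
    if ref == "refs/heads/main":
        # Main branch: commit SHA (shortened to 7 characters, an assumption) plus latest.
        return [f"{base}:{sha[:7]}", f"{base}:latest"]
    if ref.startswith("refs/tags/"):
        # Git tag: the tag name plus latest.
        tag = ref[len("refs/tags/"):]
        return [f"{base}:{tag}", f"{base}:latest"]
    if ref.startswith("refs/heads/"):
        # Other branches: branch name only. Docker tags cannot contain "/",
        # so it is replaced with "-" here (assumption).
        branch = ref[len("refs/heads/"):].replace("/", "-")
        return [f"{base}:{branch}"]
    raise ValueError(f"unsupported ref: {ref}")
```

For example, `image_tags("refs/tags/v1.0.0", "abc1234def")` yields the `v1.0.0` and `latest` references.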
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### 1. Enable Container Registry in Gitea
|
||||
|
||||
The Gitea instance must have the Container Registry (Packages) feature enabled:
|
||||
|
||||
1. Access Gitea as administrator
|
||||
2. Go to Site Administration > Configuration
|
||||
3. Find "Packages" section
|
||||
4. Ensure packages/container registry is enabled
|
||||
|
||||
### 2. Create Personal Access Token
|
||||
|
||||
The workflow needs a personal access token with package write permissions:
|
||||
|
||||
1. In Gitea UI, click your profile → Settings
|
||||
2. Go to Applications → Manage Access Tokens
|
||||
3. Click "Generate New Token"
|
||||
4. Give it a descriptive name (e.g., "Actions Container Registry")
|
||||
5. Select scopes:
|
||||
- `write:package` (required)
|
||||
- `read:package` (required)
|
||||
6. Click "Generate Token"
|
||||
7. **Copy the token immediately** (you won't see it again)
|
||||
|
||||
### 3. Add Token as Repository Secret
|
||||
|
||||
1. Go to your repository in Gitea
|
||||
2. Click Settings → Secrets
|
||||
3. Click "Add Secret"
|
||||
4. Name: `GITEA_TOKEN`
|
||||
5. Value: Paste the personal access token
|
||||
6. Click "Add Secret"
|
||||
|
||||
## Usage
|
||||
|
||||
### Automatic Builds
|
||||
|
||||
Once configured, the workflow runs automatically:
|
||||
|
||||
```bash
|
||||
# Push to main branch - triggers build
|
||||
git add .
|
||||
git commit -m "feat: add new feature"
|
||||
git push origin main
|
||||
|
||||
# Create and push a tag - triggers build with tag
|
||||
git tag v1.0.0
|
||||
git push origin v1.0.0
|
||||
```
|
||||
|
||||
### Manual Builds
|
||||
|
||||
You can also trigger builds manually:
|
||||
|
||||
1. Go to repository → Actions
|
||||
2. Click on "Build and Push Docker Image" workflow
|
||||
3. Click "Run workflow"
|
||||
4. Select the branch
|
||||
5. Click "Run workflow"
|
||||
|
||||
### Monitor Build Progress
|
||||
|
||||
1. Go to repository → Actions
|
||||
2. Click on the running workflow
|
||||
3. View logs for each step
|
||||
|
||||
## Pulling Images
|
||||
|
||||
Once built, images can be pulled from the registry:
|
||||
|
||||
```bash
|
||||
# Log in to registry
|
||||
docker login 10.0.1.10 -u your-username
|
||||
|
||||
# Pull the latest image
|
||||
docker pull 10.0.1.10/0xwheatyz/sparc:latest
|
||||
|
||||
# Pull a specific tag
|
||||
docker pull 10.0.1.10/0xwheatyz/sparc:v1.0.0
|
||||
|
||||
# Pull a specific commit
|
||||
docker pull 10.0.1.10/0xwheatyz/sparc:abc1234
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Workflow Fails at Login Step
|
||||
|
||||
**Error**: `Error response from daemon: login attempt to http://10.0.1.10/v2/ failed with status: 404 Not Found`
|
||||
|
||||
**Solution**: The container registry is not enabled in Gitea. Ask an administrator to enable the Packages feature.
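
To tell a disabled registry apart from a credentials problem, it can help to probe the `/v2/` endpoint directly. The sketch below relies on the Docker Registry HTTP API convention that `200` or `401` on `/v2/` means the API is present, while `404` means it is not. The function names and the default base URL (taken from the error message above) are assumptions; adjust them to your instance.

```python
import urllib.error
import urllib.request


def interpret_v2_status(status: int) -> str:
    """Map the HTTP status of GET /v2/ to a short diagnosis."""
    if status in (200, 401):
        # 401 is expected for an unauthenticated request to a working registry.
        return "registry API reachable"
    if status == 404:
        return "registry API not found (Packages may be disabled)"
    return f"unexpected status {status}"


def probe_registry(base_url: str = "http://10.0.1.10", timeout: float = 5.0) -> str:
    """Request /v2/ and interpret the result; connection errors propagate."""
    try:
        with urllib.request.urlopen(f"{base_url}/v2/", timeout=timeout) as resp:
            return interpret_v2_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is still informative.
        return interpret_v2_status(err.code)
```

Calling `probe_registry()` from a machine that can reach the Gitea host returns one of the three diagnoses.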
|
||||
|
||||
### Workflow Fails with 401 Unauthorized
|
||||
|
||||
**Error**: `unauthorized: authentication required`
|
||||
|
||||
**Solutions**:
|
||||
1. Verify `GITEA_TOKEN` secret exists and is correct
|
||||
2. Verify token has `write:package` and `read:package` scopes
|
||||
3. Regenerate token if it has expired
|
||||
|
||||
### Workflow Fails at Push Step
|
||||
|
||||
**Error**: `denied: permission denied`
|
||||
|
||||
**Solutions**:
|
||||
1. Ensure your user account has write access to the repository
|
||||
2. Verify the token has the correct permissions
|
||||
3. Check if the repository owner matches the registry path
|
||||
|
||||
### Image Not Appearing in Packages
|
||||
|
||||
**Check**:
|
||||
1. Go to repository → Packages tab
|
||||
2. If no packages appear, check workflow logs for errors
|
||||
3. Verify the image was successfully pushed (check workflow output)
|
||||
|
||||
## Advanced Configuration
|
||||
|
||||
### Using a Different Registry
|
||||
|
||||
To push to a different container registry (e.g., Docker Hub, GHCR):
|
||||
|
||||
1. Update the `REGISTRY` variable in `.gitea/workflows/build.yaml`
|
||||
2. Update the login step with appropriate credentials
|
||||
3. Add registry credentials as secrets
|
||||
|
||||
### Building Multi-platform Images
|
||||
|
||||
To build for multiple architectures:
|
||||
|
||||
```yaml
- name: Build Docker image
  run: |
    docker buildx build \
      --platform linux/amd64,linux/arm64 \
      -t ${{ steps.tags.outputs.IMAGE_TAG }} \
      --push .
```
|
||||
|
||||
### Adding Build Arguments
|
||||
|
||||
To pass build arguments:
|
||||
|
||||
```yaml
- name: Build Docker image
  run: |
    # gitea.sha is the full commit SHA
    docker build \
      --build-arg VERSION=${{ gitea.sha }} \
      -t ${{ steps.tags.outputs.IMAGE_TAG }} .
```
|
||||
|
||||
## References
|
||||
|
||||
- [Gitea Actions Documentation](https://docs.gitea.com/usage/actions/overview)
|
||||
- [Gitea Packages Documentation](https://docs.gitea.com/usage/packages/overview)
|
||||
- [GitHub Actions Syntax](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions) (Gitea Actions compatible)
|
||||
@@ -0,0 +1,318 @@
|
||||
# Database Mode for Testing and Analytics
|
||||
|
||||
This document explains how to use SPARC's database mode for storing LLM messages for testing and analytics purposes.
|
||||
|
||||
## Overview
|
||||
|
||||
SPARC supports two modes of operation:
|
||||
|
||||
1. **API Mode** (default): Messages are sent to OpenRouter's API and you receive real LLM responses
|
||||
2. **Database Mode**: Messages are stored in a PostgreSQL database without making API calls, useful for:
|
||||
- Testing the application without consuming API credits
|
||||
- Collecting analytics on message patterns and usage
|
||||
- Development and debugging
|
||||
|
||||
## Setup
|
||||
|
||||
### 1. Start the Database
|
||||
|
||||
Use docker-compose to start the PostgreSQL database:
|
||||
|
||||
```bash
|
||||
docker-compose up -d postgres
|
||||
```
|
||||
|
||||
This will start a PostgreSQL instance accessible at `localhost:5432`.
|
||||
|
||||
### 2. Initialize the Database Schema
|
||||
|
||||
Run the initialization script to create the necessary tables:
|
||||
|
||||
```bash
|
||||
python scripts/init_database.py
|
||||
```
|
||||
|
||||
This creates the `llm_messages` table and indexes for efficient querying.
|
||||
|
||||
### 3. Configure Environment Variables
|
||||
|
||||
Create a `.env` file (or copy from `.env.example`):
|
||||
|
||||
```bash
|
||||
cp .env.example .env
|
||||
```
|
||||
|
||||
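Edit `.env` and set the variables for the mode you want. The two groups below are alternatives; keep only one `USE_DATABASE` line: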
|
||||
|
||||
```env
|
||||
# For database mode (testing/analytics)
|
||||
USE_DATABASE=true
|
||||
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/sparc
|
||||
|
||||
# For API mode (production)
|
||||
USE_DATABASE=false
|
||||
OPENROUTER_API_KEY=your_openrouter_key_here
|
||||
```
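
The relationship between these variables can be checked at startup. The following is a minimal sketch of such a check, not SPARC's actual configuration code. The function name `select_mode` and the set of accepted truthy strings are assumptions.

```python
import os
from typing import Mapping

# Strings treated as "true" (assumption; SPARC's own parsing may differ).
TRUE_VALUES = {"1", "true", "yes", "on"}


def select_mode(env: Mapping[str, str] = os.environ) -> str:
    """Return "database" or "api" and verify the variable that mode needs."""
    use_db = env.get("USE_DATABASE", "false").strip().lower() in TRUE_VALUES
    if use_db:
        if not env.get("DATABASE_URL"):
            raise ValueError("USE_DATABASE=true requires DATABASE_URL")
        return "database"
    if not env.get("OPENROUTER_API_KEY"):
        raise ValueError("API mode requires OPENROUTER_API_KEY")
    return "api"
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests.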
|
||||
|
||||
## Usage
|
||||
|
||||
### Running in Database Mode
|
||||
|
||||
Set `USE_DATABASE=true` in your `.env` file, then run the application normally:
|
||||
|
||||
```bash
|
||||
python main.py
|
||||
```
|
||||
|
||||
Instead of sending messages to OpenRouter, the application will:
|
||||
- Store all prompts in the database
|
||||
- Return a placeholder response
|
||||
- Log metadata (company name, analysis type, timestamps)
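
The behavior listed above can be pictured as a simple branch. This is a hedged sketch, not the real `LLMAnalyzer` code: `run_analysis`, `InMemoryStore`, and the placeholder text are hypothetical, and only the `store_message` keyword arguments follow the `DatabaseClient` usage documented in this file.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical placeholder text; the real application may return something else.
PLACEHOLDER_RESPONSE = "[database mode] prompt stored, no LLM call made"


@dataclass
class InMemoryStore:
    """Stand-in for the database client; keeps rows in a list."""
    rows: List[Dict[str, str]] = field(default_factory=list)

    def store_message(self, **row: str) -> None:
        self.rows.append(row)


def run_analysis(prompt: str, company_name: str, analysis_type: str,
                 use_database: bool, store: InMemoryStore,
                 call_llm: Callable[[str], str]) -> str:
    """Store the prompt and return a placeholder, or call the LLM."""
    if use_database:
        # Database mode: record the prompt and metadata, skip the API call.
        store.store_message(prompt=prompt, response=PLACEHOLDER_RESPONSE,
                            company_name=company_name, analysis_type=analysis_type)
        return PLACEHOLDER_RESPONSE
    # API mode: forward the prompt to the real LLM client.
    return call_llm(prompt)
```

Injecting `call_llm` as a parameter keeps the sketch free of network access and makes the two paths easy to compare.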
|
||||
|
||||
### Running in API Mode
|
||||
|
||||
Set `USE_DATABASE=false` in your `.env` file, then run the application normally:
|
||||
|
||||
```bash
|
||||
python main.py
|
||||
```
|
||||
|
||||
The application will send messages to OpenRouter and return real LLM responses.
|
||||
|
||||
### Hybrid Mode (Optional)
|
||||
|
||||
You can also enable database logging while still using the API by initializing the database client in your code. The `LLMAnalyzer` will automatically log all API calls to the database if a database client is available.
|
||||
|
||||
## Viewing Analytics
|
||||
|
||||
### View Message Statistics
|
||||
|
||||
```bash
|
||||
python scripts/view_analytics.py
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--days N`: Analyze messages from the last N days (default: 30)
|
||||
|
||||
Example output:
|
||||
```
|
||||
SPARC Analytics - Last 30 days
|
||||
======================================================================
|
||||
|
||||
Total Messages: 45
|
||||
|
||||
Messages by Company:
|
||||
nvidia: 25
|
||||
intel: 12
|
||||
amd: 8
|
||||
|
||||
Messages by Analysis Type:
|
||||
portfolio: 30
|
||||
single_patent: 15
|
||||
|
||||
======================================================================
|
||||
```
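
The counts in this report are grouped aggregates over a time window. The sketch below reproduces that kind of query against an in-memory SQLite table as a stand-in for PostgreSQL; it is not the code of `view_analytics.py`. The helper names, the reduced column set, and the ISO-string timestamps are assumptions made to keep the example self-contained.

```python
import sqlite3
from datetime import datetime, timedelta


def make_demo_db(now: datetime) -> sqlite3.Connection:
    """Create an in-memory table with a few sample rows relative to `now`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE llm_messages (company_name TEXT, analysis_type TEXT, timestamp TEXT)"
    )
    rows = [
        ("nvidia", "portfolio", now - timedelta(days=1)),
        ("nvidia", "single_patent", now - timedelta(days=2)),
        ("intel", "portfolio", now - timedelta(days=3)),
        ("amd", "portfolio", now - timedelta(days=90)),  # outside a 30-day window
    ]
    conn.executemany(
        "INSERT INTO llm_messages VALUES (?, ?, ?)",
        [(company, kind, ts.isoformat()) for company, kind, ts in rows],
    )
    return conn


def count_by(conn: sqlite3.Connection, column: str, now: datetime, days: int = 30) -> dict:
    """Count messages per value of `column` within the last `days` days."""
    if column not in ("company_name", "analysis_type"):
        raise ValueError(f"unsupported column: {column}")
    cutoff = (now - timedelta(days=days)).isoformat()
    # The column name is checked against a whitelist above before interpolation.
    query = (
        f"SELECT {column}, COUNT(*) FROM llm_messages "
        f"WHERE timestamp >= ? GROUP BY {column} ORDER BY COUNT(*) DESC, {column}"
    )
    return dict(conn.execute(query, (cutoff,)).fetchall())
```

With the sample rows, `count_by(conn, "company_name", now)` counts only the messages from the last 30 days, so the 90-day-old `amd` row is excluded.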
|
||||
|
||||
### View Stored Messages
|
||||
|
||||
```bash
|
||||
python scripts/view_messages.py
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--company COMPANY`: Filter by company name
|
||||
- `--type TYPE`: Filter by analysis type (single_patent or portfolio)
|
||||
- `--limit N`: Maximum number of messages to display (default: 10)
|
||||
|
||||
Examples:
|
||||
```bash
|
||||
# View last 10 messages
|
||||
python scripts/view_messages.py
|
||||
|
||||
# View all messages for nvidia
|
||||
python scripts/view_messages.py --company nvidia --limit 100
|
||||
|
||||
# View portfolio analyses only
|
||||
python scripts/view_messages.py --type portfolio
|
||||
```
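
For readers who want to extend the script, the documented options map naturally onto `argparse`. This is a sketch of an equivalent parser, not the actual contents of `scripts/view_messages.py`; the function name `build_parser` and the `analysis_type` destination name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the options documented above."""
    parser = argparse.ArgumentParser(description="View stored LLM messages")
    parser.add_argument("--company", help="Filter by company name")
    parser.add_argument(
        "--type",
        dest="analysis_type",
        choices=["single_patent", "portfolio"],
        help="Filter by analysis type",
    )
    parser.add_argument(
        "--limit", type=int, default=10,
        help="Maximum number of messages to display (default: 10)",
    )
    return parser
```

For instance, `build_parser().parse_args(["--company", "nvidia", "--limit", "100"])` produces a namespace with `company="nvidia"` and `limit=100`.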
|
||||
|
||||
## Database Schema
|
||||
|
||||
### llm_messages Table
|
||||
|
||||
| Column | Type | Description |
|--------|------|-------------|
| id | SERIAL | Primary key |
| timestamp | TIMESTAMP | When the message was created |
| company_name | VARCHAR(255) | Company being analyzed |
| analysis_type | VARCHAR(50) | Type of analysis (single_patent, portfolio) |
| model | VARCHAR(100) | LLM model identifier |
| prompt | TEXT | The full prompt sent to the LLM |
| response | TEXT | The response from the LLM |
| metadata | JSONB | Additional metadata (patent IDs, content length, etc.) |
| token_usage | JSONB | Token usage statistics (when available) |
| created_at | TIMESTAMP | Record creation timestamp |
|
||||
|
||||
### Indexes
|
||||
|
||||
- `idx_messages_timestamp`: Speeds up time-based queries
|
||||
- `idx_messages_company`: Speeds up company-specific queries
|
||||
|
||||
## Docker Compose
|
||||
|
||||
The included `docker-compose.yml` provides:
|
||||
|
||||
1. **PostgreSQL Database**:
|
||||
- Image: `postgres:16-alpine`
|
||||
- Port: `5432`
|
||||
- Credentials: postgres/postgres
|
||||
- Database: sparc
|
||||
- Persistent storage via volume
|
||||
|
||||
2. **Application Container** (optional):
|
||||
- Builds from Dockerfile
|
||||
- Connects to PostgreSQL
|
||||
- Mounts current directory
|
||||
|
||||
### Start Services
|
||||
|
||||
```bash
|
||||
# Start just the database
|
||||
docker-compose up -d postgres
|
||||
|
||||
# Start everything
|
||||
docker-compose up -d
|
||||
|
||||
# View logs
|
||||
docker-compose logs -f
|
||||
|
||||
# Stop services
|
||||
docker-compose down
|
||||
|
||||
# Stop and remove volumes (WARNING: deletes data)
|
||||
docker-compose down -v
|
||||
```
|
||||
|
||||
## Toggling Between Modes
|
||||
|
||||
You can easily switch between modes by changing the `USE_DATABASE` environment variable:
|
||||
|
||||
### Quick Toggle (temporary, for testing)
|
||||
|
||||
```bash
|
||||
# Run in database mode
|
||||
USE_DATABASE=true python main.py
|
||||
|
||||
# Run in API mode
|
||||
USE_DATABASE=false python main.py
|
||||
```
|
||||
|
||||
### Persistent Toggle
|
||||
|
||||
Edit your `.env` file:
|
||||
|
||||
```env
|
||||
# For testing/analytics
|
||||
USE_DATABASE=true
|
||||
|
||||
# For production use
|
||||
USE_DATABASE=false
|
||||
```
|
||||
|
||||
## Use Cases
|
||||
|
||||
### Testing Without API Costs
|
||||
|
||||
During development, enable database mode to test the full application flow without consuming API credits:
|
||||
|
||||
```bash
|
||||
USE_DATABASE=true python main.py
|
||||
```
|
||||
|
||||
### Collecting Usage Analytics
|
||||
|
||||
Enable database mode in a test environment to collect analytics on:
|
||||
- Which companies are analyzed most frequently
|
||||
- Types of analyses performed
|
||||
- Prompt patterns and lengths
|
||||
- Usage over time
|
||||
|
||||
### Development and Debugging
|
||||
|
||||
Database mode is useful for:
|
||||
- Testing patent parsing logic without API calls
|
||||
- Debugging the full pipeline end-to-end
|
||||
- Collecting sample prompts for optimization
|
||||
- Understanding token usage patterns (when in API mode with logging)
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Connection Refused
|
||||
|
||||
If you get "connection refused" errors:
|
||||
|
||||
1. Ensure PostgreSQL is running: `docker-compose ps`
|
||||
2. Check the DATABASE_URL in your `.env` file
|
||||
3. Wait for the database to be healthy: `docker-compose logs postgres`
|
||||
|
||||
### Schema Not Found
|
||||
|
||||
If you get "relation does not exist" errors:
|
||||
|
||||
1. Run the initialization script: `python scripts/init_database.py`
|
||||
2. Verify tables were created: `docker-compose exec postgres psql -U postgres -d sparc -c "\dt"`
|
||||
|
||||
### Permission Denied
|
||||
|
||||
If you get permission errors:
|
||||
|
||||
1. Check your DATABASE_URL credentials match docker-compose.yml
|
||||
2. Ensure the database container is running: `docker-compose up -d postgres`
|
||||
|
||||
## Advanced Usage
|
||||
|
||||
### Direct Database Access
|
||||
|
||||
You can access the database directly using psql:
|
||||
|
||||
```bash
|
||||
docker-compose exec postgres psql -U postgres -d sparc
|
||||
```
|
||||
|
||||
Example queries:
|
||||
|
||||
```sql
|
||||
-- View all messages
|
||||
SELECT id, company_name, analysis_type, timestamp FROM llm_messages ORDER BY timestamp DESC LIMIT 10;
|
||||
|
||||
-- Count messages by company
|
||||
SELECT company_name, COUNT(*) FROM llm_messages GROUP BY company_name;
|
||||
|
||||
-- View recent prompts
|
||||
SELECT prompt FROM llm_messages ORDER BY timestamp DESC LIMIT 5;
|
||||
```
|
||||
|
||||
### Programmatic Access
|
||||
|
||||
You can use the `DatabaseClient` directly in your code:
|
||||
|
||||
```python
from SPARC.database import DatabaseClient
from SPARC import config

db = DatabaseClient(config.database_url)

# Get messages
messages = db.get_messages(company_name="nvidia", limit=10)

# Get analytics
analytics = db.get_analytics(days=7)

# Store a custom message
db.store_message(
    prompt="test prompt",
    response="test response",
    company_name="test",
    analysis_type="custom"
)
```
|
||||
@@ -0,0 +1,438 @@
|
||||
# SPARC Complete Deployment Guide
|
||||
|
||||
This guide provides step-by-step instructions for deploying the SPARC (Semiconductor Patent & Analytics Report Core) application with all features enabled, including SERP API patent retrieval, LLM analysis, database storage, and the web UI.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Prerequisites](#prerequisites)
|
||||
- [Step 1: Clone and Configure](#step-1-clone-and-configure)
|
||||
- [Step 2: Start Services with Docker Compose](#step-2-start-services-with-docker-compose)
|
||||
- [Step 3: Initialize the Database](#step-3-initialize-the-database)
|
||||
- [Step 4: Run the Services](#step-4-run-the-services)
|
||||
- [Step 5: Verify Deployment](#step-5-verify-deployment)
|
||||
- [Step 6: Using the Application](#step-6-using-the-application)
|
||||
- [Step 7: View Stored Data](#step-7-view-stored-data)
|
||||
- [Architecture Overview](#architecture-overview)
|
||||
- [Environment Variables Reference](#environment-variables-reference)
|
||||
- [Production Docker Compose](#production-docker-compose)
|
||||
- [Troubleshooting](#troubleshooting)
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
1. **Docker & Docker Compose** installed
|
||||
2. **API Keys** (you'll need to obtain these):
|
||||
- **SerpAPI Key**: Sign up at https://serpapi.com/ (free tier: 100 searches/month)
|
||||
- **OpenRouter API Key**: Sign up at https://openrouter.ai/ (pay-as-you-go)
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Clone and Configure
|
||||
|
||||
```bash
|
||||
git clone <repository-url>
|
||||
cd SPARC
|
||||
|
||||
# Create environment file
|
||||
cp .env.example .env
|
||||
```
|
||||
|
||||
Edit `.env` with your API keys:
|
||||
|
||||
```env
|
||||
# Required API Keys
|
||||
API_KEY=your_serpapi_key_here
|
||||
OPENROUTER_API_KEY=your_openrouter_key_here
|
||||
|
||||
# Database Configuration (matches docker-compose.yml)
|
||||
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/sparc
|
||||
USE_DATABASE=true
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Start Services with Docker Compose
|
||||
|
||||
```bash
|
||||
# Start PostgreSQL database
|
||||
docker-compose up -d postgres
|
||||
|
||||
# Wait for postgres to become healthy; check its status with:
|
||||
docker-compose ps
|
||||
|
||||
# You should see sparc-postgres with status "healthy"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Initialize the Database
|
||||
|
||||
```bash
|
||||
# Option A: If running locally with Python
|
||||
python scripts/init_database.py
|
||||
|
||||
# Option B: If using Docker, run inside container
|
||||
docker-compose run --rm sparc-app python scripts/init_database.py
|
||||
```
|
||||
|
||||
This creates the `llm_messages` table with the following schema:
|
||||
|
||||
| Column | Type | Purpose |
|--------|------|---------|
| `id` | SERIAL | Primary key |
| `timestamp` | TIMESTAMP | Message creation time |
| `company_name` | VARCHAR(255) | Company being analyzed |
| `analysis_type` | VARCHAR(50) | 'single_patent' or 'portfolio' |
| `model` | VARCHAR(100) | LLM model identifier |
| `prompt` | TEXT | Full prompt sent to LLM |
| `response` | TEXT | LLM response |
| `metadata` | JSONB | Patent IDs, content lengths |
| `token_usage` | JSONB | prompt/completion/total tokens |
| `created_at` | TIMESTAMP | Record timestamp |
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Run the Services
|
||||
|
||||
### Option A: Run Locally (Development)
|
||||
|
||||
```bash
|
||||
# Terminal 1: Start FastAPI backend
|
||||
uvicorn SPARC.api:app --host 0.0.0.0 --port 8000 --reload
|
||||
|
||||
# Terminal 2: Start Streamlit dashboard
|
||||
streamlit run dashboard.py --server.port 8501 --server.address 0.0.0.0
|
||||
```
|
||||
|
||||
### Option B: Run with Docker (Production)
|
||||
|
||||
See the [Production Docker Compose](#production-docker-compose) section below for a complete `docker-compose.prod.yml` configuration.
|
||||
|
||||
```bash
|
||||
docker-compose -f docker-compose.prod.yml up -d
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Verify Deployment
|
||||
|
||||
```bash
|
||||
# Check API health
|
||||
curl http://localhost:8000/health
|
||||
|
||||
# Expected response:
|
||||
# {"status":"healthy","version":"0.1.0","timestamp":"..."}
|
||||
```
|
||||
|
||||
Access the services:
|
||||
|
||||
| Service | URL |
|---------|-----|
| REST API | http://localhost:8000 |
| API Documentation (Swagger) | http://localhost:8000/docs |
| Dashboard (Web UI) | http://localhost:8501 |
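
Right after startup the API may need a few seconds before `/health` responds, so a scripted check usually retries. The sketch below polls the endpoint until the `status` field equals `healthy`, matching the expected response shown above. The function names, retry counts, and the injectable `fetch` and `sleep` parameters are assumptions added for illustration and testability.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_text(url: str) -> str:
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def is_healthy(body: str) -> bool:
    """Return True if the body is a JSON object with status == "healthy"."""
    try:
        return json.loads(body).get("status") == "healthy"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return False


def wait_for_health(url: str = "http://localhost:8000/health",
                    attempts: int = 10, delay: float = 1.0,
                    fetch: Callable[[str], str] = fetch_text,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the health endpoint until it reports healthy or attempts run out."""
    for _ in range(attempts):
        try:
            if is_healthy(fetch(url)):
                return True
        except OSError:
            # Connection refused or similar (URLError is an OSError); retry.
            pass
        sleep(delay)
    return False
```

A deployment script could call `wait_for_health()` and exit with a non-zero status when it returns `False`.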
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Using the Application
|
||||
|
||||
### Via Dashboard (Web UI)
|
||||
|
||||
1. Open http://localhost:8501
|
||||
2. Select **"Company Analysis"** from the sidebar
|
||||
3. Enter a company name (e.g., "Intel")
|
||||
4. Click **"Analyze"**
|
||||
|
||||
This will:
|
||||
- Query SerpAPI for recent patents
|
||||
- Download and parse patent PDFs
|
||||
- Send patent content to Claude for analysis
|
||||
- Store prompt/response in PostgreSQL
|
||||
- Display results in the dashboard
|
||||
|
||||
### Via REST API
|
||||
|
||||
```bash
|
||||
# Analyze single company
|
||||
curl http://localhost:8000/analyze/Intel
|
||||
|
||||
# Batch analyze multiple companies (synchronous)
|
||||
curl -X POST http://localhost:8000/analyze/batch \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"companies": ["Intel", "AMD", "NVIDIA"], "max_workers": 3}'
|
||||
|
||||
# Async batch (for large jobs)
|
||||
curl -X POST http://localhost:8000/analyze/batch/async \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"companies": ["Intel", "AMD"]}'
|
||||
|
||||
# Check job status
|
||||
curl http://localhost:8000/jobs/{job_id}
|
||||
|
||||
# List all jobs
|
||||
curl http://localhost:8000/jobs
|
||||
```
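
The batch calls above can also be issued from Python with only the standard library. This sketch builds the same requests as the `curl` examples. The helper name `build_batch_request` and the `BASE_URL` constant are assumptions, and because the response schema is not documented here, the sketch stops at constructing the request.

```python
import json
import urllib.request
from typing import List, Optional

BASE_URL = "http://localhost:8000"  # assumed local deployment


def build_batch_request(companies: List[str],
                        max_workers: Optional[int] = None,
                        asynchronous: bool = False) -> urllib.request.Request:
    """Build the POST request for /analyze/batch or /analyze/batch/async."""
    payload = {"companies": companies}
    if max_workers is not None:
        payload["max_workers"] = max_workers
    path = "/analyze/batch/async" if asynchronous else "/analyze/batch"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending is then a matter of `urllib.request.urlopen(build_batch_request(["Intel", "AMD"]))` and decoding the JSON body that the API returns.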
|
||||
|
||||
### Via Python
|
||||
|
||||
```python
|
||||
from SPARC.analyzer import CompanyAnalyzer
|
||||
|
||||
analyzer = CompanyAnalyzer()
|
||||
result = analyzer.analyze("Intel")
|
||||
print(result.analysis)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 7: View Stored Data
|
||||
|
||||
```bash
|
||||
# View analytics (aggregated usage)
|
||||
python scripts/view_analytics.py
|
||||
|
||||
# View stored messages
|
||||
python scripts/view_messages.py
|
||||
|
||||
# Query database directly
|
||||
docker exec -it sparc-postgres psql -U postgres -d sparc -c \
|
||||
"SELECT company_name, analysis_type, token_usage FROM llm_messages ORDER BY timestamp DESC LIMIT 10;"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
```
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  Dashboard   │───▶│   FastAPI    │───▶│   Analyzer   │
│    (8501)    │    │    (8000)    │    │              │
└──────────────┘    └──────────────┘    └──────┬───────┘
                                               │
                    ┌──────────────────────────┼──────────────────────────┐
                    │                          │                          │
                    ▼                          ▼                          ▼
             ┌──────────────┐           ┌──────────────┐           ┌──────────────┐
             │   SerpAPI    │           │  OpenRouter  │           │  PostgreSQL  │
             │  (Patents)   │           │   (Claude)   │           │  (Storage)   │
             └──────────────┘           └──────────────┘           └──────────────┘
```
|
||||
|
||||
### Component Responsibilities
|
||||
|
||||
| Component | Purpose |
|-----------|---------|
| **Dashboard** | Streamlit web UI for interactive analysis |
| **FastAPI** | REST API for programmatic access |
| **Analyzer** | Orchestrates patent retrieval and LLM analysis |
| **SerpAPI** | Retrieves patent data from Google Patents |
| **OpenRouter** | Routes requests to Claude for AI analysis |
| **PostgreSQL** | Stores prompts, responses, and analytics |
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables Reference
|
||||
|
||||
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `API_KEY` | Yes | - | SerpAPI key for patent search |
| `OPENROUTER_API_KEY` | Yes | - | OpenRouter API key for Claude access |
| `DATABASE_URL` | Yes* | - | PostgreSQL connection string |
| `USE_DATABASE` | No | `false` | Set to `true` to enable database storage |
|
||||
|
||||
*Required when `USE_DATABASE=true`
|
||||
|
||||
### Database URL Format
|
||||
|
||||
```
|
||||
postgresql://[user]:[password]@[host]:[port]/[database]
|
||||
```
|
||||
|
||||
Example:
|
||||
```
|
||||
postgresql://postgres:postgres@localhost:5432/sparc
|
||||
```
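
To verify that a `DATABASE_URL` matches this format, the standard library can split it into its parts. The following is a small sketch; the function name `parse_database_url`, the accepted schemes, and the fallback to port 5432 when none is given are assumptions.

```python
from urllib.parse import urlsplit


def parse_database_url(url: str) -> dict:
    """Split a PostgreSQL connection URL into its components."""
    parts = urlsplit(url)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,  # default PostgreSQL port if omitted
        "database": parts.path.lstrip("/"),
    }
```

Inside the Compose network the host part is the service name `postgres` rather than `localhost`, as in the production configuration in this guide.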
|
||||
|
||||
---
|
||||
|
||||
## Production Docker Compose
|
||||
|
||||
Create a `docker-compose.prod.yml` file for full production deployment:
|
||||
|
||||
```yaml
version: '3.8'

services:
  postgres:
    image: postgres:16-alpine
    container_name: sparc-postgres
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: sparc
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  api:
    build: .
    container_name: sparc-api
    command: uvicorn SPARC.api:app --host 0.0.0.0 --port 8000
    environment:
      - API_KEY=${API_KEY}
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
      - USE_DATABASE=true
    ports:
      - "8000:8000"
    depends_on:
      postgres:
        condition: service_healthy
    volumes:
      - ./patents:/app/patents
    restart: unless-stopped

  dashboard:
    build: .
    container_name: sparc-dashboard
    command: streamlit run dashboard.py --server.port 8501 --server.address 0.0.0.0
    environment:
      - API_KEY=${API_KEY}
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
      - USE_DATABASE=true
    ports:
      - "8501:8501"
    depends_on:
      - api
    volumes:
      - ./patents:/app/patents
    restart: unless-stopped

  init-db:
    build: .
    container_name: sparc-init-db
    command: python scripts/init_database.py
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@postgres:5432/sparc
      - USE_DATABASE=true
    depends_on:
      postgres:
        condition: service_healthy
    restart: "no"

volumes:
  postgres_data:
```
|
||||
|
||||
### Deploy with Production Compose
|
||||
|
||||
```bash
|
||||
# Start all services
|
||||
docker-compose -f docker-compose.prod.yml up -d
|
||||
|
||||
# View logs
|
||||
docker-compose -f docker-compose.prod.yml logs -f
|
||||
|
||||
# Stop all services
|
||||
docker-compose -f docker-compose.prod.yml down
|
||||
|
||||
# Stop and remove volumes (WARNING: deletes data)
|
||||
docker-compose -f docker-compose.prod.yml down -v
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Database Connection Issues
|
||||
|
||||
```bash
|
||||
# Check if postgres is running
|
||||
docker-compose ps
|
||||
|
||||
# Check postgres logs
|
||||
docker-compose logs postgres
|
||||
|
||||
# Test database connection
|
||||
docker exec -it sparc-postgres psql -U postgres -d sparc -c "SELECT 1;"
|
||||
```
|
||||
|
||||
### API Key Issues
|
||||
|
||||
```bash
|
||||
# Verify environment variables are set
|
||||
echo $API_KEY
|
||||
echo $OPENROUTER_API_KEY
|
||||
|
||||
# Test SerpAPI directly
|
||||
curl "https://serpapi.com/search?engine=google_patents&q=Intel&api_key=$API_KEY"
|
||||
```
|
||||
|
||||
### Port Conflicts
|
||||
|
||||
If ports 8000, 8501, or 5432 are in use:
|
||||
|
||||
```bash
|
||||
# Find what's using the port
|
||||
lsof -i :8000
|
||||
|
||||
# Or change the host port in the service's ports mapping in docker-compose.yml, e.g.
#   ports:
#     - "8080:8000"  # Use 8080 instead of 8000
|
||||
```
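
Where `lsof` is not available, a short Python check gives the same yes/no answer for each port. This is a minimal sketch that assumes the services listen on the local machine; the function name `port_in_use` is hypothetical.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Running it over `(8000, 8501, 5432)` before starting the stack shows which of the default ports are already taken.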
|
||||
|
||||
### Container Issues
|
||||
|
||||
```bash
|
||||
# Rebuild containers after code changes
|
||||
docker-compose build --no-cache
|
||||
|
||||
# Remove all containers and start fresh
|
||||
docker-compose down
|
||||
docker-compose up -d --build
|
||||
```
|
||||
|
||||
### Viewing Application Logs
|
||||
|
||||
```bash
|
||||
# All services
|
||||
docker-compose logs -f
|
||||
|
||||
# Specific service
|
||||
docker-compose logs -f api
|
||||
docker-compose logs -f dashboard
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference
|
||||
|
||||
```bash
|
||||
# Development setup
|
||||
cp .env.example .env
|
||||
# Edit .env with API keys
|
||||
docker-compose up -d postgres
|
||||
python scripts/init_database.py
|
||||
uvicorn SPARC.api:app --reload &
|
||||
streamlit run dashboard.py
|
||||
|
||||
# Production setup
|
||||
docker-compose -f docker-compose.prod.yml up -d
|
||||
|
||||
# Check status
|
||||
curl http://localhost:8000/health
|
||||
open http://localhost:8501  # macOS; on Linux use xdg-open
|
||||
|
||||
# View data
|
||||
python scripts/view_analytics.py
|
||||
python scripts/view_messages.py
|
||||
```
|
||||