SPARC/docs/DATABASE_MODE.md
0xWheatyz 490850d7a6
2026-03-12 23:51:32 -04:00

Database Mode for Testing and Analytics

This document explains how to use SPARC's database mode to store LLM messages for testing and analytics.

Overview

SPARC supports two modes of operation:

  1. API Mode (default): Messages are sent to OpenRouter's API and you receive real LLM responses
  2. Database Mode: Messages are stored in a PostgreSQL database without making API calls, useful for:
    • Testing the application without consuming API credits
    • Collecting analytics on message patterns and usage
    • Development and debugging

Setup

1. Start the Database

Use docker-compose to start the PostgreSQL database:

docker-compose up -d postgres

This will start a PostgreSQL instance accessible at localhost:5432.

2. Initialize the Database Schema

Run the initialization script to create the necessary tables:

python scripts/init_database.py

This creates the llm_messages table and indexes for efficient querying.

3. Configure Environment Variables

Create a .env file (or copy from .env.example):

cp .env.example .env

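Edit .env and set the values for one of the two modes (do not keep both USE_DATABASE lines):

# Option 1: database mode (testing/analytics)
USE_DATABASE=true
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/sparc

# Option 2: API mode (production); use instead of the lines above
USE_DATABASE=false
OPENROUTER_API_KEY=your_openrouter_key_here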
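
As an illustration of how these variables might be interpreted, here is a minimal sketch in Python. The function name load_settings and the returned dictionary keys are hypothetical and are not taken from SPARC's code. Only the variable names USE_DATABASE, DATABASE_URL, and OPENROUTER_API_KEY come from this document, and the set of accepted "true" spellings is an assumption.

```python
import os
from urllib.parse import urlsplit

# Assumption: which spellings count as "true" is a guess, not SPARC's rule.
TRUTHY = {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Return mode settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # API mode is the documented default, so a missing variable means "false".
    use_database = env.get("USE_DATABASE", "false").strip().lower() in TRUTHY
    settings = {"use_database": use_database}
    if use_database:
        # Split the connection URL into the parts that are useful for debugging.
        url = urlsplit(env["DATABASE_URL"])
        settings["db_host"] = url.hostname
        settings["db_port"] = url.port
        settings["db_name"] = url.path.lstrip("/")
    else:
        # Only report whether a key is set; never echo the key itself.
        settings["api_key_present"] = bool(env.get("OPENROUTER_API_KEY"))
    return settings


if __name__ == "__main__":
    print(load_settings())
```

Printing the parsed host, port, and database name is a quick way to confirm that DATABASE_URL points where you expect before starting the application.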

Usage

Running in Database Mode

Set USE_DATABASE=true in your .env file, then run the application normally:

python main.py

Instead of sending messages to OpenRouter, the application will:

  • Store all prompts in the database
  • Return a placeholder response
  • Log metadata (company name, analysis type, timestamps)
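
The flow above can be sketched roughly as follows. This is not SPARC's implementation: InMemoryStore, analyze, and the placeholder text are hypothetical stand-ins so the example runs without PostgreSQL. The store_message keyword arguments mirror the DatabaseClient call shown under Programmatic Access.

```python
from datetime import datetime, timezone

# Assumption: the real placeholder text used by SPARC is not documented here.
PLACEHOLDER_RESPONSE = "[database mode] prompt stored; no API call made"


class InMemoryStore:
    """Stand-in for a PostgreSQL-backed client so the sketch runs anywhere."""

    def __init__(self):
        self.rows = []

    def store_message(self, prompt, response, company_name, analysis_type):
        self.rows.append({
            "prompt": prompt,
            "response": response,
            "company_name": company_name,
            "analysis_type": analysis_type,
            "timestamp": datetime.now(timezone.utc),
        })


def analyze(prompt, company_name, analysis_type, store):
    """Store the prompt with its metadata and return a placeholder instead of calling the API."""
    store.store_message(
        prompt=prompt,
        response=PLACEHOLDER_RESPONSE,
        company_name=company_name,
        analysis_type=analysis_type,
    )
    return PLACEHOLDER_RESPONSE


if __name__ == "__main__":
    store = InMemoryStore()
    print(analyze("Summarize patent X", "nvidia", "single_patent", store))
    print(len(store.rows), "message(s) stored")
```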

Running in API Mode

Set USE_DATABASE=false in your .env file, then run the application normally:

python main.py

The application will send messages to OpenRouter and return real LLM responses.

Hybrid Mode (Optional)

To log API calls to the database while still receiving real responses, initialize the database client in your code. When a database client is available, the LLMAnalyzer logs every API call to the database automatically.
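
The idea of optional logging can be illustrated with a small sketch. SketchAnalyzer and RecordingDB are hypothetical names, and the real LLMAnalyzer constructor and method signatures are not documented here, so treat this only as one way such a pattern could look. The API call is replaced by a plain callable so the example runs offline.

```python
from typing import Callable


class RecordingDB:
    """Hypothetical stand-in for a database client; records calls in a list."""

    def __init__(self):
        self.calls = []

    def store_message(self, prompt, response, company_name, analysis_type):
        self.calls.append((prompt, response, company_name, analysis_type))


class SketchAnalyzer:
    """Calls the API and, if a database client was supplied, logs the exchange."""

    def __init__(self, call_api: Callable[[str], str], db=None):
        self.call_api = call_api
        self.db = db  # None means "API only, no logging"

    def analyze(self, prompt, company_name, analysis_type):
        response = self.call_api(prompt)
        if self.db is not None:
            self.db.store_message(
                prompt=prompt,
                response=response,
                company_name=company_name,
                analysis_type=analysis_type,
            )
        return response


if __name__ == "__main__":
    fake_api = lambda prompt: f"echo: {prompt}"  # replaces the real API call
    db = RecordingDB()
    analyzer = SketchAnalyzer(fake_api, db=db)
    print(analyzer.analyze("Compare portfolios", "intel", "portfolio"))
    print(db.calls)
```

Making the database client an optional constructor argument keeps the API path unchanged when no logging is wanted.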

Viewing Analytics

View Message Statistics

python scripts/view_analytics.py

Options:

  • --days N: Analyze messages from the last N days (default: 30)

Example output:

SPARC Analytics - Last 30 days
======================================================================

Total Messages: 45

Messages by Company:
  nvidia: 25
  intel: 12
  amd: 8

Messages by Analysis Type:
  portfolio: 30
  single_patent: 15

======================================================================
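
A report like the one above boils down to filtering by date and counting by two fields. The sketch below shows that aggregation on plain dictionaries. The function name summarize and the record layout are assumptions for illustration, and the real script may compute these counts in SQL instead.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def summarize(messages, days=30, now=None):
    """Count messages from the last `days` days by company and by analysis type."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    recent = [m for m in messages if m["timestamp"] >= cutoff]
    return {
        "total": len(recent),
        "by_company": Counter(m["company_name"] for m in recent),
        "by_type": Counter(m["analysis_type"] for m in recent),
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        {"company_name": "nvidia", "analysis_type": "portfolio", "timestamp": now},
        {"company_name": "intel", "analysis_type": "single_patent",
         "timestamp": now - timedelta(days=45)},  # outside the default window
    ]
    report = summarize(sample)
    print("Total Messages:", report["total"])
    for company, count in report["by_company"].most_common():
        print(f"  {company}: {count}")
```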

View Stored Messages

python scripts/view_messages.py

Options:

  • --company COMPANY: Filter by company name
  • --type TYPE: Filter by analysis type (single_patent or portfolio)
  • --limit N: Maximum number of messages to display (default: 10)

Examples:

# View last 10 messages
python scripts/view_messages.py

# View up to 100 messages for nvidia
python scripts/view_messages.py --company nvidia --limit 100

# View portfolio analyses only
python scripts/view_messages.py --type portfolio
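
For readers who want to see how options like these are typically declared, here is a minimal argparse sketch. It mirrors the three documented flags and the default limit of 10; the function name build_parser and the attribute name analysis_type are assumptions, and the actual script may be structured differently.

```python
import argparse


def build_parser():
    """Build a parser with the three options documented for view_messages.py."""
    parser = argparse.ArgumentParser(description="View stored LLM messages")
    parser.add_argument("--company", help="Filter by company name")
    # Stored as analysis_type so the attribute does not shadow the builtin `type`.
    parser.add_argument("--type", dest="analysis_type",
                        help="Filter by analysis type (single_patent or portfolio)")
    parser.add_argument("--limit", type=int, default=10,
                        help="Maximum number of messages to display")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.company, args.analysis_type, args.limit)
```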

Database Schema

llm_messages Table

| Column | Type | Description |
|---|---|---|
| id | SERIAL | Primary key |
| timestamp | TIMESTAMP | When the message was created |
| company_name | VARCHAR(255) | Company being analyzed |
| analysis_type | VARCHAR(50) | Type of analysis (single_patent, portfolio) |
| model | VARCHAR(100) | LLM model identifier |
| prompt | TEXT | The full prompt sent to the LLM |
| response | TEXT | The response from the LLM |
| metadata | JSONB | Additional metadata (patent IDs, content length, etc.) |
| token_usage | JSONB | Token usage statistics (when available) |
| created_at | TIMESTAMP | Record creation timestamp |

Indexes

  • idx_messages_timestamp: Speeds up time-based queries
  • idx_messages_company: Speeds up company-specific queries
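
The table and indexes described above could be created with statements along the following lines. This is a reconstruction from the column list, not the contents of scripts/init_database.py: constraints such as NOT NULL and defaults are omitted because they are not documented, and the indexed columns are inferred from the index names. The helper apply_schema is hypothetical and accepts any DB-API cursor.

```python
# Reconstructed from the documented columns; the real script may differ.
SCHEMA_STATEMENTS = [
    """
    CREATE TABLE IF NOT EXISTS llm_messages (
        id SERIAL PRIMARY KEY,
        timestamp TIMESTAMP,
        company_name VARCHAR(255),
        analysis_type VARCHAR(50),
        model VARCHAR(100),
        prompt TEXT,
        response TEXT,
        metadata JSONB,
        token_usage JSONB,
        created_at TIMESTAMP
    )
    """,
    # Assumption: each index covers the single column its name suggests.
    "CREATE INDEX IF NOT EXISTS idx_messages_timestamp ON llm_messages (timestamp)",
    "CREATE INDEX IF NOT EXISTS idx_messages_company ON llm_messages (company_name)",
]


def apply_schema(cursor):
    """Run each statement on a DB-API cursor (for example, one from a PostgreSQL driver)."""
    for statement in SCHEMA_STATEMENTS:
        cursor.execute(statement)


if __name__ == "__main__":
    # Print the statements instead of executing them, so no database is needed.
    for statement in SCHEMA_STATEMENTS:
        print(statement.strip(), end=";\n\n")
```

Using IF NOT EXISTS makes the statements safe to run more than once.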

Docker Compose

The included docker-compose.yml provides:

  1. PostgreSQL Database:
    • Image: postgres:16-alpine
    • Port: 5432
    • Credentials: postgres/postgres
    • Database: sparc
    • Persistent storage via volume
  2. Application Container (optional):
    • Builds from Dockerfile
    • Connects to PostgreSQL
    • Mounts current directory

Start Services

# Start just the database
docker-compose up -d postgres

# Start everything
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

# Stop and remove volumes (WARNING: deletes data)
docker-compose down -v

Toggling Between Modes

Switch between modes by changing the USE_DATABASE environment variable:

Quick Toggle (temporary, for testing)

# Run in database mode
USE_DATABASE=true python main.py

# Run in API mode
USE_DATABASE=false python main.py

Persistent Toggle

Edit your .env file and set USE_DATABASE to one of the following values:

# For testing/analytics
USE_DATABASE=true

# For production use (instead of the line above)
USE_DATABASE=false

Use Cases

Testing Without API Costs

During development, enable database mode to test the full application flow without consuming API credits:

USE_DATABASE=true python main.py

Collecting Usage Analytics

Enable database mode in a test environment to collect analytics on:

  • Which companies are analyzed most frequently
  • Types of analyses performed
  • Prompt patterns and lengths
  • Usage over time

Development and Debugging

Database mode is useful for:

  • Testing patent parsing logic without API calls
  • Debugging the full pipeline end-to-end
  • Collecting sample prompts for optimization
  • Understanding token usage patterns (when in API mode with logging)

Troubleshooting

Connection Refused

If you get "connection refused" errors:

  1. Ensure PostgreSQL is running: docker-compose ps
  2. Check the DATABASE_URL in your .env file
  3. Check the logs to confirm the database has finished starting: docker-compose logs postgres
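
If the error appears only right after starting the container, the application may simply be connecting too early. The sketch below polls the port until it accepts TCP connections. The function name wait_for_port and its defaults are illustrative, and an open port shows only that something is listening, not that PostgreSQL is ready to serve queries.

```python
import socket
import time


def wait_for_port(host="localhost", port=5432, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, or False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused) on failure.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    if wait_for_port():
        print("Port 5432 is accepting connections")
    else:
        print("Timed out waiting for localhost:5432")
```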

Schema Not Found

If you get "relation does not exist" errors:

  1. Run the initialization script: python scripts/init_database.py
  2. Verify tables were created: docker-compose exec postgres psql -U postgres -d sparc -c "\dt"

Permission Denied

If you get permission errors:

  1. Check your DATABASE_URL credentials match docker-compose.yml
  2. Ensure the database container is running: docker-compose up -d postgres

Advanced Usage

Direct Database Access

You can access the database directly using psql:

docker-compose exec postgres psql -U postgres -d sparc

Example queries:

-- View the 10 most recent messages
SELECT id, company_name, analysis_type, timestamp FROM llm_messages ORDER BY timestamp DESC LIMIT 10;

-- Count messages by company
SELECT company_name, COUNT(*) FROM llm_messages GROUP BY company_name;

-- View recent prompts
SELECT prompt FROM llm_messages ORDER BY timestamp DESC LIMIT 5;

Programmatic Access

You can use the DatabaseClient directly in your code:

from SPARC.database import DatabaseClient
from SPARC import config

db = DatabaseClient(config.database_url)

# Get messages
messages = db.get_messages(company_name="nvidia", limit=10)

# Get analytics
analytics = db.get_analytics(days=7)

# Store a custom message
db.store_message(
    prompt="test prompt",
    response="test response",
    company_name="test",
    analysis_type="custom"
)