
SPARC

Semiconductor Patent & Analytics Report Core

A patent analysis system that estimates a company's performance by analyzing its patent portfolio with an LLM.

Overview

SPARC automatically collects, parses, and analyzes patents from companies to provide performance estimations. It uses Claude AI to evaluate innovation quality, strategic direction, and competitive positioning based on patent content.

Features

Patent Retrieval: Automated collection via SerpAPI's Google Patents engine
Intelligent Parsing: Extracts key sections (abstract, claims, summary) from patent PDFs
Content Minimization: Removes verbose descriptions to reduce LLM token usage
AI Analysis: Uses Claude 3.5 Sonnet via OpenRouter to analyze innovation quality and market potential
Portfolio Analysis: Evaluates multiple patents holistically for comprehensive insights
Batch Processing: Analyze multiple companies concurrently with progress tracking
REST API: FastAPI web service with async job support
Dashboard: React TypeScript web dashboard with authentication
Robust Testing: 40 tests covering all major functionality

Architecture

SPARC/
├── serp_api.py       # Patent retrieval and PDF parsing
├── llm.py            # Claude AI integration via OpenRouter
├── analyzer.py       # High-level orchestration
├── api.py            # FastAPI web service with auth endpoints
├── auth.py           # JWT authentication module
├── database.py       # PostgreSQL storage with caching
├── types.py          # Data models
└── config.py         # Environment configuration

Installation

Docker (Recommended)

# Clone and configure
git clone <repository-url>
cd SPARC
cp .env.example .env
# Edit .env with your API keys

# Start all services (API, Dashboard, PostgreSQL)
docker-compose up -d

# Access the services
# - API: http://localhost:8000
# - Dashboard: http://localhost:8080
# - API Docs: http://localhost:8000/docs

Patent PDF Storage

The API stores downloaded patent PDFs in a patents/ directory. In Docker, this directory is bind-mounted (./patents:/app/patents) so that PDFs persist across container restarts.

If you deploy to a different environment, ensure the patents/ directory is a persistent volume. Without it, PDFs will be re-downloaded on every analysis.

# docker-compose.yml excerpt
volumes:
  - ./patents:/app/patents
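
The re-download behaviour described above comes down to whether a PDF is already present in patents/. A minimal sketch of such a check follows. The helper name `ensure_pdf`, its signature, the `<patent_id>.pdf` file layout, and the caller-supplied `download` function are all assumptions for illustration, not SPARC's actual implementation.

```python
from pathlib import Path
from typing import Callable


def ensure_pdf(patent_id: str, download: Callable[[str], bytes],
               root: Path = Path("patents")) -> Path:
    """Return the local PDF path, downloading only when the file is missing.

    Hypothetical helper: name, signature, and file layout are assumptions.
    """
    root.mkdir(parents=True, exist_ok=True)
    pdf_path = root / f"{patent_id}.pdf"
    if not pdf_path.exists():
        # Cache miss: fetch the bytes and persist them for later runs.
        pdf_path.write_bytes(download(patent_id))
    return pdf_path
```

With a persistent patents/ directory, the second call for the same patent ID finds the file and skips the download. With an ephemeral directory, every fresh container starts empty and fetches again.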

NixOS

nix develop

This automatically creates a virtual environment and installs all dependencies.

Manual Installation

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Configuration

Create a .env file in the project root:

# SerpAPI key for patent search
API_KEY=your_serpapi_key_here

# OpenRouter API key for Claude AI analysis
OPENROUTER_API_KEY=your_openrouter_key_here

Get your API keys:

SerpAPI: https://serpapi.com/
OpenRouter: https://openrouter.ai/
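
Both keys are required, so it can help to fail fast when one is missing. The sketch below assumes the values from .env have already been exported into the process environment (for example by Docker Compose or a dotenv loader). The `load_settings` function is hypothetical and is not the API of SPARC's config.py.

```python
import os
from typing import Mapping

# The two variable names come from the .env example above.
REQUIRED_KEYS = ("API_KEY", "OPENROUTER_API_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the required keys, failing fast when one is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}
```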

Usage

Basic Usage

from SPARC.analyzer import CompanyAnalyzer

# Initialize the analyzer
analyzer = CompanyAnalyzer()

# Analyze a company's patent portfolio
analysis = analyzer.analyze_company("nvidia")
print(analysis)

Run the Example

python main.py

This will:

Retrieve recent NVIDIA patents
Parse and minimize content
Analyze with Claude AI
Print comprehensive performance assessment

Single Patent Analysis

from SPARC.analyzer import CompanyAnalyzer

analyzer = CompanyAnalyzer()

# Analyze a specific patent
result = analyzer.analyze_single_patent(
    patent_id="US11322171B1",
    company_name="nvidia"
)

Multi-Company Batch Analysis

from SPARC.analyzer import CompanyAnalyzer

analyzer = CompanyAnalyzer()

# Analyze multiple companies concurrently (default 3 workers)
batch_result = analyzer.analyze_companies(
    ["nvidia", "amd", "intel", "qualcomm"],
    max_workers=3
)

# Access results
print(f"Analyzed: {batch_result.total_companies}")
print(f"Successful: {batch_result.successful}")
print(f"Failed: {batch_result.failed}")

for result in batch_result.results:
    if result.success:
        print(f"{result.company_name}: {result.patent_count} patents")
        print(result.analysis)

# Or use sequential processing (safer for rate limits)
batch_result = analyzer.analyze_companies_sequential(["nvidia", "amd"])
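
For readers curious how a worker-pool batch like this can be structured, here is a rough sketch using the standard library. `CompanyOutcome` and `run_batch` are hypothetical names invented for this example. SPARC's real `analyze_companies` may be implemented differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CompanyOutcome:
    """Hypothetical per-company result record (not SPARC's data model)."""
    company_name: str
    success: bool
    analysis: Optional[str] = None
    error: Optional[str] = None


def run_batch(companies: list[str], analyze: Callable[[str], str],
              max_workers: int = 3) -> list[CompanyOutcome]:
    """Analyze companies concurrently; one failure does not abort the rest."""
    outcomes: list[CompanyOutcome] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze, name): name for name in companies}
        for future in as_completed(futures):
            name = futures[future]
            try:
                outcomes.append(CompanyOutcome(name, True, analysis=future.result()))
            except Exception as exc:  # record the failure and keep going
                outcomes.append(CompanyOutcome(name, False, error=str(exc)))
    return outcomes
```

Results arrive in completion order, not input order. A small `max_workers` keeps the number of simultaneous upstream requests low, which is the same trade-off the sequential variant takes to its limit.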

REST API

Start the FastAPI server:

uvicorn SPARC.api:app --reload

API endpoints:

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/analyze/{company}` | GET | Analyze single company |
| `/analyze/batch` | POST | Analyze multiple companies |
| `/analyze/batch/async` | POST | Start async batch job |
| `/jobs/{job_id}` | GET | Get job status |
| `/jobs` | GET | List all jobs |

Interactive docs are available at http://localhost:8000/docs while the server is running.

Example API usage:

# Single company
curl http://localhost:8000/analyze/nvidia

# Batch analysis
curl -X POST http://localhost:8000/analyze/batch \
  -H "Content-Type: application/json" \
  -d '{"companies": ["nvidia", "amd", "intel"]}'

# Async batch (for long-running jobs)
curl -X POST http://localhost:8000/analyze/batch/async \
  -H "Content-Type: application/json" \
  -d '{"companies": ["nvidia", "amd", "intel", "qualcomm"]}'
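
The same batch calls can be made from Python with only the standard library. The sketch below builds the request and leaves sending to the caller. `build_batch_request` is a hypothetical helper, and the base URL assumes the local server shown above.

```python
import json
from urllib.request import Request


def build_batch_request(companies: list[str], asynchronous: bool = False,
                        base_url: str = "http://localhost:8000") -> Request:
    """Build the POST request for /analyze/batch or /analyze/batch/async."""
    path = "/analyze/batch/async" if asynchronous else "/analyze/batch"
    body = json.dumps({"companies": companies}).encode("utf-8")
    return Request(base_url + path, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")
```

Send the request with `urllib.request.urlopen(request)` once the server is up. The response schema is not documented in this README, so inspect the returned JSON instead of assuming field names.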

Web Dashboard

The React dashboard is included in Docker Compose:

docker-compose up -d

Dashboard features:

Authentication: User registration, login, and JWT-based sessions
Company Analysis: Analyze individual companies with real-time results
Batch Analysis: Process multiple companies with progress tracking
Analytics: View historical analysis data and trends
Admin Panel: User management for administrators

The dashboard runs at http://localhost:8080 when using Docker Compose.

Running Tests

# Run all tests
pytest tests/ -v

# Run specific test modules
pytest tests/test_analyzer.py -v
pytest tests/test_llm.py -v
pytest tests/test_serp_api.py -v

# Run with coverage
pytest tests/ --cov=SPARC --cov-report=term-missing

How It Works

Patent Collection: Queries SerpAPI for company patents
PDF Download: Retrieves patent PDF files
Section Extraction: Parses abstract, claims, summary, and description
Content Minimization: Keeps essential sections, removes bloated descriptions
LLM Analysis: Sends minimized content to Claude for analysis
Performance Estimation: Returns insights on innovation quality and outlook
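
The content minimization step can be pictured as a filter over the extracted sections. This is a minimal sketch, assuming the parser yields a dict keyed by section name. `minimize_sections` and the output layout are illustrative and do not reflect the actual code in serp_api.py.

```python
# Sections kept for the LLM prompt; the description is deliberately left out.
KEPT_SECTIONS = ("abstract", "claims", "summary")


def minimize_sections(sections: dict[str, str]) -> str:
    """Join only the kept sections, dropping the description and anything else."""
    parts = []
    for name in KEPT_SECTIONS:
        text = sections.get(name, "").strip()
        if text:
            parts.append(f"{name.upper()}:\n{text}")
    return "\n\n".join(parts)
```

Because the description is usually the longest part of a patent, leaving it out is where most of the token savings would come from.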

Roadmap

Retrieve publicationID from SerpAPI
Parse patents from PDFs (no need for Google Patent API)
Extract and minimize patent content
LLM integration for analysis
Company performance estimation
Multi-company batch processing
FastAPI web service wrapper
Docker containerization
Results persistence (database)
Visualization dashboard

Development

Code Style

Type hints throughout
Comprehensive docstrings
Small, testable functions
Conventional commits

Testing Philosophy

Unit tests for core logic
Integration tests for orchestration
Mock external APIs
Aim for high coverage
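
As an illustration of mocking an external API, the test below swaps a search client for a `unittest.mock.Mock`. `count_patents` and the client's `search` method are toy stand-ins invented for this example, not functions from the SPARC code base.

```python
from unittest.mock import Mock


def count_patents(client, company: str) -> int:
    """Toy function under test: counts the results a search client returns."""
    return len(client.search(company))


def test_count_patents_uses_mocked_client():
    # The mock stands in for the network-backed client, so no API key is needed.
    client = Mock()
    client.search.return_value = [{"id": "A"}, {"id": "B"}]
    assert count_patents(client, "nvidia") == 2
    client.search.assert_called_once_with("nvidia")
```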

Making Changes

Write tests first
Implement feature
Verify all tests pass
Commit with conventional format: type: description

Types: feat, fix, docs, test, refactor, chore

Documentation

Additional documentation is available in the docs/ directory:

Deployment Guide - Complete deployment instructions for Docker, database setup, and production configuration
Database Mode - Database storage for prompts, responses, and analytics
Container Registry - CI/CD and container registry setup with Gitea Actions

License

For open source projects, say how it is licensed.

Project Status

Core functionality is complete, and the system is ready for production use once API keys are configured.

All major features implemented: REST API, React dashboard with authentication, Docker containerization, database storage with caching, and multi-company batch processing.
