0xWheatyz 8971ebc913 chore: removed extra files		2026-02-19 22:46:53 -05:00
SPARC	tests: testing modes have been added in an attempt to tune without wasting tokens.	2026-02-19 22:46:15 -05:00
tests	feat: implement company performance estimation orchestration	2026-02-19 18:57:10 -05:00
.gitignore	tests: testing modes have been added in an attempt to tune without wasting tokens.	2026-02-19 22:46:15 -05:00
flake.lock	feat: patent retrival and semi-processed	2025-12-08 19:33:02 -05:00
flake.nix	feat: patent retrival and semi-processed	2025-12-08 19:33:02 -05:00
main.py	feat: implement company performance estimation orchestration	2026-02-19 18:57:10 -05:00
README.md	docs: comprehensive README update	2026-02-19 18:57:57 -05:00
requirements.txt	feat: add LLM integration for patent analysis	2026-02-19 18:55:35 -05:00

README.md

SPARC

Semiconductor Patent & Analytics Report Core

A patent analysis system that estimates a company's performance by analyzing its patent portfolio with an LLM.

Overview

SPARC automatically collects, parses, and analyzes patents from companies to provide performance estimations. It uses Claude AI to evaluate innovation quality, strategic direction, and competitive positioning based on patent content.

Features

Patent Retrieval: Automated collection via SerpAPI's Google Patents engine
Intelligent Parsing: Extracts key sections (abstract, claims, summary) from patent PDFs
Content Minimization: Removes verbose descriptions to reduce LLM token usage
AI Analysis: Uses Claude 3.5 Sonnet to analyze innovation quality and market potential
Portfolio Analysis: Evaluates multiple patents holistically for comprehensive insights
Robust Testing: 26 tests covering all major functionality
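As a rough illustration of the parsing feature, the sketch below splits already-extracted patent text into sections by heading. It assumes the PDF text is available as a plain string and that each section starts with an uppercase heading on its own line. The name `split_sections` and the heading list are hypothetical and are not taken from SPARC's source.

```python
import re

# Hypothetical heading names; real patents vary and SPARC's parser may differ.
HEADINGS = ("ABSTRACT", "SUMMARY", "DESCRIPTION", "CLAIMS")

# Match a heading that sits alone on its own line.
_HEADING_RE = re.compile(
    r"^[ \t]*(" + "|".join(HEADINGS) + r")[ \t]*$", re.MULTILINE
)


def split_sections(text: str) -> dict[str, str]:
    """Map each lowercase heading name to the text that follows it."""
    sections: dict[str, str] = {}
    matches = list(_HEADING_RE.finditer(text))
    for i, match in enumerate(matches):
        start = match.end()
        # A section runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[match.group(1).lower()] = text[start:end].strip()
    return sections
```

Text that appears before the first heading is ignored in this sketch, which may or may not match how the real parser treats front matter.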

Architecture

SPARC/
├── serp_api.py       # Patent retrieval and PDF parsing
├── llm.py            # Claude AI integration for analysis
├── analyzer.py       # High-level orchestration
├── types.py          # Data models
└── config.py         # Environment configuration

Installation

Nix (Recommended)

nix develop

This enters the development shell defined in flake.nix, which creates a virtual environment and installs all dependencies.

Manual Installation

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Configuration

Create a .env file in the project root:

# SerpAPI key for patent search
API_KEY=your_serpapi_key_here

# Anthropic API key for Claude AI analysis
ANTHROPIC_API_KEY=your_anthropic_key_here

Get your API keys:

SerpAPI: https://serpapi.com/
Anthropic: https://console.anthropic.com/
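For readers wondering how these two variables might be consumed, here is a minimal sketch that reads them from the process environment and fails early if either is missing. It assumes the values from .env have already been exported into the environment (for example by a dotenv loader, which is an assumption). `Settings` and `load_settings` are hypothetical names, not the contents of config.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two keys named in the .env file."""
    serpapi_key: str
    anthropic_key: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read API_KEY and ANTHROPIC_API_KEY, raising if either is missing or empty."""
    source = os.environ if env is None else env
    required = ("API_KEY", "ANTHROPIC_API_KEY")
    missing = [name for name in required if not source.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        serpapi_key=source["API_KEY"],
        anthropic_key=source["ANTHROPIC_API_KEY"],
    )
```

Passing a plain dictionary as `env` makes the function easy to exercise in tests without touching the real environment.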

Usage

Basic Usage

from SPARC.analyzer import CompanyAnalyzer

# Initialize the analyzer
analyzer = CompanyAnalyzer()

# Analyze a company's patent portfolio
analysis = analyzer.analyze_company("nvidia")
print(analysis)

Run the Example

python main.py

This will:

Retrieve recent NVIDIA patents
Parse and minimize content
Analyze with Claude AI
Print a comprehensive performance assessment
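The steps above can be pictured as one function that wires three stages together. The sketch below is an assumption about the overall shape, not the code in main.py: `run_pipeline` and its callable parameters are hypothetical, and the stand-in lambdas replace the real SerpAPI and Claude calls so the example runs offline.

```python
from typing import Callable, Iterable


def run_pipeline(
    company: str,
    fetch: Callable[[str], Iterable[str]],
    minimize: Callable[[str], str],
    analyze: Callable[[str, list[str]], str],
) -> str:
    """Fetch patent texts, minimize each one, then analyze them together."""
    minimized = [minimize(text) for text in fetch(company)]
    if not minimized:
        raise ValueError(f"No patents found for {company!r}")
    return analyze(company, minimized)


# Stand-in callables so the sketch runs without network access or API keys.
report = run_pipeline(
    "nvidia",
    fetch=lambda name: ["  patent one  ", "  patent two  "],
    minimize=str.strip,
    analyze=lambda name, docs: f"{name}: analyzed {len(docs)} patents",
)
print(report)
```

Injecting the stages as callables keeps the orchestration testable with fakes, which fits the project's stated preference for mocking external APIs.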

Single Patent Analysis

from SPARC.analyzer import CompanyAnalyzer

analyzer = CompanyAnalyzer()

# Analyze a specific patent
result = analyzer.analyze_single_patent(
    patent_id="US11322171B1",
    company_name="nvidia"
)

Running Tests

# Run all tests
pytest tests/ -v

# Run specific test modules
pytest tests/test_analyzer.py -v
pytest tests/test_llm.py -v
pytest tests/test_serp_api.py -v

# Run with coverage
pytest tests/ --cov=SPARC --cov-report=term-missing

How It Works

Patent Collection: Queries SerpAPI for company patents
PDF Download: Retrieves patent PDF files
Section Extraction: Parses abstract, claims, summary, and description
Content Minimization: Keeps essential sections, removes bloated descriptions
LLM Analysis: Sends minimized content to Claude for analysis
Performance Estimation: Returns insights on innovation quality and outlook
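The content minimization step is the one that most directly controls token usage, so a small sketch may help. It assumes sections arrive as a dictionary keyed by lowercase section name. The function `minimize`, the `KEEP` tuple, and the `max_chars` limit are all hypothetical and are not taken from SPARC's source.

```python
# Sections assumed to carry most of the signal. The choice mirrors the
# steps listed above but is not copied from SPARC's implementation.
KEEP = ("abstract", "claims", "summary")


def minimize(sections: dict[str, str], max_chars: int = 4000) -> str:
    """Join the kept sections, truncating each one to max_chars characters."""
    parts = []
    for name in KEEP:
        body = sections.get(name, "").strip()
        if body:
            parts.append(f"{name.upper()}:\n{body[:max_chars]}")
    # The description is never included, however long it is.
    return "\n\n".join(parts)
```

Truncating by character count is a crude proxy for tokens. A real implementation might count tokens with the model's tokenizer instead.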

Roadmap

Retrieve publicationID from SerpAPI
Parse patents from PDFs (no need for the Google Patents API)
Extract and minimize patent content
LLM integration for analysis
Company performance estimation
Multi-company batch processing
FastAPI web service wrapper
Docker containerization
Results persistence (database)
Visualization dashboard

Development

Code Style

Type hints throughout
Comprehensive docstrings
Small, testable functions
Conventional commits

Testing Philosophy

Unit tests for core logic
Integration tests for orchestration
Mock external APIs
Aim for high coverage
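To make "mock external APIs" concrete, here is a minimal pytest-style sketch using `unittest.mock.Mock` from the standard library. The helper `count_patents`, the `client.search` method, and the `organic_results` response key are assumptions made for illustration and may not match the real tests or client.

```python
from unittest.mock import Mock


def count_patents(client, company: str) -> int:
    """Hypothetical helper: count results returned by a search client."""
    response = client.search(company)
    return len(response.get("organic_results", []))


def test_count_patents_uses_mocked_client():
    # The mock replaces the real network client, so no API key is needed.
    client = Mock()
    client.search.return_value = {"organic_results": [{"id": 1}, {"id": 2}]}

    assert count_patents(client, "nvidia") == 2
    client.search.assert_called_once_with("nvidia")
```

Because the client is passed in as an argument, the test never touches the network and does not consume API quota or tokens.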

Making Changes

Write tests first
Implement feature
Verify all tests pass
Commit with conventional format: type: description

Types: feat, fix, docs, test, refactor, chore
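The commit format can be checked mechanically. The sketch below is one possible check, assuming only the six types listed here and a single space after the colon. `is_conventional` is a hypothetical helper, and optional scopes such as `feat(parser):` are deliberately not accepted.

```python
import re

# Types listed in this README. Scopes like "feat(parser):" are not accepted.
COMMIT_RE = re.compile(r"^(feat|fix|docs|test|refactor|chore): \S.*$")


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject matches 'type: description'."""
    return COMMIT_RE.match(subject) is not None
```

A check like this could run from a commit-msg hook, though the project does not say whether it uses one.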

License

No license has been specified for this project yet.

Project Status

Core functionality complete. Ready for production use with API keys configured.

Next steps: API wrapper, containerization, and multi-company support.