Compare commits


No commits in common. "c6843ac115384142e325e82896fb59067cc5c28d" and "d7cf80f02fe8160815103af6eb9b1c3be3800383" have entirely different histories.

12 changed files with 73 additions and 621 deletions

.gitignore
View File

@@ -2,5 +2,4 @@
.pyenv
__pycache__
.venv
patents
tmp/
patents

View File

@@ -1,33 +0,0 @@
stages:
- build
variables:
DOCKER_DRIVER: overlay2
DOCKER_TLS_CERTDIR: "/certs"
IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
LATEST_TAG: $CI_REGISTRY_IMAGE:latest
build-and-push:
stage: build
image: docker:24-cli
services:
- docker:24-dind
before_script:
- echo "Logging into GitLab Container Registry..."
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- echo "Building Docker image..."
- docker build -t $IMAGE_TAG -t $LATEST_TAG .
- echo "Pushing Docker image to registry..."
- docker push $IMAGE_TAG
- docker push $LATEST_TAG
- echo "Build and push completed successfully!"
- echo "Image available at $IMAGE_TAG"
rules:
- if: $CI_COMMIT_BRANCH == "main"
when: always
- if: $CI_COMMIT_TAG
when: always
- when: manual
tags:
- docker

View File

@@ -1,16 +0,0 @@
FROM python:3.14
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
RUN useradd app
USER app
CMD ["python3", "main.py"]

README.md
View File

@@ -1,172 +1,28 @@
# SPARC
**Semiconductor Patent & Analytics Report Core**
## Name
Semiconductor Patent & Analytics Report Core
A patent analysis system that estimates a company's performance by analyzing its patent portfolio using LLM-powered insights.
## Overview
SPARC automatically collects, parses, and analyzes patents from companies to provide performance estimations. It uses Claude AI to evaluate innovation quality, strategic direction, and competitive positioning based on patent content.
## Features
- **Patent Retrieval**: Automated collection via SerpAPI's Google Patents engine
- **Intelligent Parsing**: Extracts key sections (abstract, claims, summary) from patent PDFs
- **Content Minimization**: Removes verbose descriptions to reduce LLM token usage
- **AI Analysis**: Uses Claude 3.5 Sonnet via OpenRouter to analyze innovation quality and market potential
- **Portfolio Analysis**: Evaluates multiple patents holistically for comprehensive insights
- **Robust Testing**: 26 tests covering all major functionality
## Architecture
```
SPARC/
├── serp_api.py # Patent retrieval and PDF parsing
├── llm.py # Claude AI integration via OpenRouter
├── analyzer.py # High-level orchestration
├── types.py # Data models
└── config.py # Environment configuration
```
## Description
## Installation
### NixOS (Recommended)
```bash
nix develop
```
This automatically creates a virtual environment and installs all dependencies.
### Manual Installation
```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
## Configuration
Create a `.env` file in the project root:
```bash
# SerpAPI key for patent search
API_KEY=your_serpapi_key_here
# OpenRouter API key for Claude AI analysis
OPENROUTER_API_KEY=your_openrouter_key_here
```
Get your API keys:
- SerpAPI: https://serpapi.com/
- OpenRouter: https://openrouter.ai/
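At runtime these keys are read from the environment (the `config.py` diff further down shows `load_dotenv()` plus `os.getenv`). The following is a stdlib-only sketch of that loading step, with a hypothetical `load_env` helper standing in for python-dotenv:

```python
import os

def load_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines from .env-style text (stand-in for python-dotenv).

    Existing environment variables are not overridden, mirroring
    load_dotenv()'s default behaviour.
    """
    parsed = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and `#` comments, keep only KEY=value pairs.
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            parsed[key.strip()] = value.strip()
            os.environ.setdefault(key.strip(), value.strip())
    return parsed

keys = load_env(
    "# SerpAPI key for patent search\n"
    "API_KEY=your_serpapi_key_here\n"
    "OPENROUTER_API_KEY=your_openrouter_key_here\n"
)
print(sorted(keys))  # ['API_KEY', 'OPENROUTER_API_KEY']
```

The real project should keep using `python-dotenv`; this sketch only shows what the loading step does with the `.env` file above.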
### NixOS Installation
Run `nix develop` to build and configure the Nix dev environment.
## Usage
### Basic Usage
```python
from SPARC.analyzer import CompanyAnalyzer
# Initialize the analyzer
analyzer = CompanyAnalyzer()
# Analyze a company's patent portfolio
analysis = analyzer.analyze_company("nvidia")
print(analysis)
```
### Run the Example
```bash
python main.py
docker compose up -d
```
This will:
1. Retrieve recent NVIDIA patents
2. Parse and minimize content
3. Analyze with Claude AI
4. Print comprehensive performance assessment
### Single Patent Analysis
```python
# Analyze a specific patent
result = analyzer.analyze_single_patent(
patent_id="US11322171B1",
company_name="nvidia"
)
```
## Running Tests
```bash
# Run all tests
pytest tests/ -v
# Run specific test modules
pytest tests/test_analyzer.py -v
pytest tests/test_llm.py -v
pytest tests/test_serp_api.py -v
# Run with coverage
pytest tests/ --cov=SPARC --cov-report=term-missing
```
## How It Works
1. **Patent Collection**: Queries SerpAPI for company patents
2. **PDF Download**: Retrieves patent PDF files
3. **Section Extraction**: Parses abstract, claims, summary, and description
4. **Content Minimization**: Keeps essential sections, removes bloated descriptions
5. **LLM Analysis**: Sends minimized content to Claude for analysis
6. **Performance Estimation**: Returns insights on innovation quality and outlook
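The six steps above can be sketched end to end. The stubs below only illustrate the data flow; the real implementations (`SERP.query`, `SERP.parse_patent_pdf`, `SERP.minimize_patent_for_llm`, `LLMAnalyzer.analyze_patent_portfolio`) call SerpAPI and Claude:

```python
# Minimal sketch of the SPARC pipeline with stubbed I/O.
# Function bodies are placeholders; only the orchestration is real.

def query_patents(company: str) -> list[dict]:
    # Stub for SERP.query: would hit SerpAPI's Google Patents engine.
    return [{"patent_id": "US123", "pdf_link": "http://example.com/US123.pdf"}]

def parse_pdf(pdf_link: str) -> dict:
    # Stub for SERP.parse_patent_pdf: would download and extract sections.
    return {"abstract": "A new cache design.",
            "claims": "1. A GPU comprising...",
            "description": "Very long boilerplate text..."}

def minimize(sections: dict) -> str:
    # Stub for SERP.minimize_patent_for_llm: drop the verbose description.
    keep = ("abstract", "claims", "summary")
    return "\n".join(f"{k}: {v}" for k, v in sections.items() if k in keep)

def analyze(company: str, patents: list[dict]) -> str:
    # Stub for LLMAnalyzer.analyze_patent_portfolio: one LLM call per portfolio.
    return f"{company}: analyzed {len(patents)} patent(s)"

def run(company: str) -> str:
    processed = []
    for meta in query_patents(company):
        sections = parse_pdf(meta["pdf_link"])
        processed.append({"patent_id": meta["patent_id"],
                          "content": minimize(sections)})
    return analyze(company, processed)

print(run("nvidia"))  # nvidia: analyzed 1 patent(s)
```

Note how the verbose `description` section never reaches the analysis step; that is the token-saving minimization described above.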
## Roadmap
- [X] Retrieve `publicationID` from SERP API
- [ ] Retrieve data from Google's patent API based on those `publicationID`s
- This may not be needed; looking to parse the patents based solely on the PDF retrieved from SERP
- [ ] Wrap this into a Python FastAPI service, then bundle with Docker
- [X] Retrieve `publicationID` from SERP API
- [X] Parse patents from PDFs (no need for Google Patent API)
- [X] Extract and minimize patent content
- [X] LLM integration for analysis
- [X] Company performance estimation
- [ ] Multi-company batch processing
- [ ] FastAPI web service wrapper
- [ ] Docker containerization
- [ ] Results persistence (database)
- [ ] Visualization dashboard
## Development
### Code Style
- Type hints throughout
- Comprehensive docstrings
- Small, testable functions
- Conventional commits
### Testing Philosophy
- Unit tests for core logic
- Integration tests for orchestration
- Mock external APIs
- Aim for high coverage
### Making Changes
1. Write tests first
2. Implement feature
3. Verify all tests pass
4. Commit with conventional format: `type: description`
Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`
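A minimal check for that commit-message format (a sketch; the regex also accepts an optional `(scope)` segment, which the convention allows but the list above doesn't mention):

```python
import re

# Commit types listed in the Development section.
TYPES = ("feat", "fix", "docs", "test", "refactor", "chore")
PATTERN = re.compile(rf"^({'|'.join(TYPES)})(\([\w-]+\))?: .+")

def is_conventional(message: str) -> bool:
    # True when the first line matches `type: description`
    # (or `type(scope): description`).
    return bool(PATTERN.match(message.splitlines()[0]))

print(is_conventional("feat: add portfolio analysis"))  # True
print(is_conventional("added some stuff"))              # False
```

Such a check could be wired into a pre-commit hook, though this repo does not currently define one.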
## License
For open source projects, say how it is licensed.
## Project Status
Core functionality complete. Ready for production use with API keys configured.
Next steps: API wrapper, containerization, and multi-company support.
## Project status
Under heavy development in the limited time available to me.

View File

@@ -1,112 +0,0 @@
"""High-level patent analysis orchestration.
This module ties together patent retrieval, parsing, and LLM analysis
to provide company performance estimation based on patent portfolios.
"""
from SPARC.serp_api import SERP
from SPARC.llm import LLMAnalyzer
from SPARC.types import Patent
from typing import List
class CompanyAnalyzer:
"""Orchestrates end-to-end company performance analysis via patents."""
def __init__(self, openrouter_api_key: str | None = None):
"""Initialize the company analyzer.
Args:
openrouter_api_key: Optional OpenRouter API key. If None, loads from config.
"""
self.llm_analyzer = LLMAnalyzer(api_key=openrouter_api_key)
def analyze_company(self, company_name: str) -> str:
"""Analyze a company's performance based on their patent portfolio.
This is the main entry point that orchestrates the full pipeline:
1. Retrieve patents from SERP API
2. Download and parse each patent PDF
3. Minimize patent content (remove bloat)
4. Analyze portfolio with LLM
5. Return performance estimation
Args:
company_name: Name of the company to analyze
Returns:
Comprehensive analysis of company's innovation and performance outlook
"""
print(f"Retrieving patents for {company_name}...")
patents = SERP.query(company_name)
if not patents.patents:
return f"No patents found for {company_name}"
print(f"Found {len(patents.patents)} patents. Processing...")
# Download and parse each patent
processed_patents = []
for idx, patent in enumerate(patents.patents, 1):
print(f"Processing patent {idx}/{len(patents.patents)}: {patent.patent_id}")
try:
# Download PDF
patent = SERP.save_patents(patent)
# Parse sections from PDF
sections = SERP.parse_patent_pdf(patent.pdf_path)
# Minimize for LLM (remove bloat)
minimized_content = SERP.minimize_patent_for_llm(sections)
processed_patents.append(
{"patent_id": patent.patent_id, "content": minimized_content}
)
except Exception as e:
print(f"Warning: Failed to process {patent.patent_id}: {e}")
continue
if not processed_patents:
return f"Failed to process any patents for {company_name}"
print(f"Analyzing portfolio with LLM...")
# Analyze the full portfolio with LLM
analysis = self.llm_analyzer.analyze_patent_portfolio(
patents_data=processed_patents, company_name=company_name
)
return analysis
def analyze_single_patent(self, patent_id: str, company_name: str) -> str:
"""Analyze a single patent by ID.
Useful for focused analysis of specific innovations.
Args:
patent_id: Publication ID of the patent
company_name: Name of the company (for context)
Returns:
Analysis of the specific patent's innovation quality
"""
# Note: This simplified version assumes the patent PDF is already downloaded
# A more complete implementation would support direct patent ID lookup
print(f"Analyzing patent {patent_id} for {company_name}...")
patent_path = f"patents/{patent_id}.pdf"
try:
sections = SERP.parse_patent_pdf(patent_path)
minimized_content = SERP.minimize_patent_for_llm(sections)
analysis = self.llm_analyzer.analyze_patent_content(
patent_content=minimized_content, company_name=company_name
)
return analysis
except Exception as e:
return f"Failed to analyze patent {patent_id}: {e}"

View File

@@ -10,5 +10,5 @@ load_dotenv()
# SerpAPI key for patent search
api_key = os.getenv("API_KEY")
# OpenRouter API key for LLM analysis
openrouter_api_key = os.getenv("OPENROUTER_API_KEY")
# Anthropic API key for LLM analysis
anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")

View File

@@ -1,6 +1,6 @@
"""LLM integration for patent analysis using OpenRouter."""
"""LLM integration for patent analysis using Anthropic's Claude."""
from openai import OpenAI
from anthropic import Anthropic
from SPARC import config
from typing import Dict
@@ -8,23 +8,14 @@ from typing import Dict
class LLMAnalyzer:
"""Handles LLM-based analysis of patent content."""
def __init__(self, api_key: str | None = None, test_mode: bool = False):
def __init__(self, api_key: str | None = None):
"""Initialize the LLM analyzer.
Args:
api_key: OpenRouter API key. If None, will attempt to load from config.
test_mode: If True, print prompts instead of making API calls
api_key: Anthropic API key. If None, will attempt to load from config.
"""
self.test_mode = test_mode
if (api_key or config.openrouter_api_key) and not test_mode:
self.client = OpenAI(
api_key=api_key or config.openrouter_api_key,
base_url="https://openrouter.ai/api/v1"
)
self.model = "anthropic/claude-3.5-sonnet"
else:
self.client = None
self.client = Anthropic(api_key=api_key or config.anthropic_api_key)
self.model = "claude-3-5-sonnet-20241022"
def analyze_patent_content(self, patent_content: str, company_name: str) -> str:
"""Analyze patent content to estimate company innovation and performance.
@@ -49,22 +40,14 @@ Patent Content:
Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals about the company's technical direction and competitive advantage."""
if self.test_mode:
print("=" * 80)
print("TEST MODE - Prompt that would be sent to LLM:")
print("=" * 80)
print(prompt)
print("=" * 80)
return "[TEST MODE - No API call made]"
message = self.client.messages.create(
model=self.model,
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return message.content[0].text
if self.client:
response = self.client.chat.completions.create(
model=self.model,
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return response.choices[0].message.content
def analyze_patent_portfolio(
self, patents_data: list[Dict[str, str]], company_name: str
) -> str:
@@ -101,18 +84,10 @@ Patent Portfolio:
Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the company's innovation strength and performance outlook."""
if self.test_mode:
print(prompt)
return "[TEST MODE]"
message = self.client.messages.create(
model=self.model,
max_tokens=2048,
messages=[{"role": "user", "content": prompt}],
)
try:
response = self.client.chat.completions.create(
model=self.model,
max_tokens=2048,
messages=[{"role": "user", "content": prompt}],
)
return response.choices[0].message.content
except AttributeError:
return prompt
return message.content[0].text

View File

@@ -48,8 +48,8 @@
fi
# Prompt tweak so you can see when venv is active
export NIX_PROJECT_SHELL="SPARC"
export PS1="(SPARC-venv) $PS1"
'';
};
});
}
}

main.py
View File

@@ -1,43 +1,10 @@
"""SPARC - Semiconductor Patent & Analytics Report Core
from SPARC.serp_api import SERP
Example usage of the company performance analyzer.
patents = SERP.query("nvidia")
Before running:
1. Create a .env file with:
API_KEY=your_serpapi_key
OPENROUTER_API_KEY=your_openrouter_key
for patent in patents.patents:
patent = SERP.save_patents(patent)
patent.summary = SERP.parse_patent_pdf(patent.pdf_path)
print(patent.summary)
2. Run: python main.py
"""
from SPARC.analyzer import CompanyAnalyzer
def main():
"""Analyze a company's performance based on their patent portfolio."""
# Initialize the analyzer (loads API keys from .env)
analyzer = CompanyAnalyzer()
# Analyze a company - this will:
# 1. Retrieve patents from SERP API
# 2. Download and parse patent PDFs
# 3. Minimize content (remove bloat)
# 4. Analyze with Claude to estimate performance
company_name = "nvidia"
print(f"\n{'=' * 70}")
print(f"SPARC Patent Analysis - {company_name.upper()}")
print(f"{'=' * 70}\n")
analysis = analyzer.analyze_company(company_name)
print(f"\n{'=' * 70}")
print("ANALYSIS RESULTS")
print(f"{'=' * 70}\n")
print(analysis)
print(f"\n{'=' * 70}\n")
if __name__ == "__main__":
main()
print(patents)

View File

@@ -4,4 +4,4 @@ pdfplumber
requests
pytest
pytest-mock
openai
anthropic

View File

@@ -1,178 +0,0 @@
"""Tests for the high-level company analyzer orchestration."""
import pytest
from unittest.mock import Mock, patch
from SPARC.analyzer import CompanyAnalyzer
from SPARC.types import Patent, Patents
class TestCompanyAnalyzer:
"""Test the CompanyAnalyzer orchestration logic."""
def test_analyzer_initialization(self, mocker):
"""Test analyzer initialization with API key."""
mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")
analyzer = CompanyAnalyzer(openrouter_api_key="test-key")
mock_llm.assert_called_once_with(api_key="test-key")
def test_analyze_company_full_pipeline(self, mocker):
"""Test complete company analysis pipeline."""
# Mock all the dependencies
mock_query = mocker.patch("SPARC.analyzer.SERP.query")
mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")
# Setup mock return values
test_patent = Patent(
patent_id="US123", pdf_link="http://example.com/test.pdf"
)
mock_query.return_value = Patents(patents=[test_patent])
test_patent.pdf_path = "patents/US123.pdf"
mock_save.return_value = test_patent
mock_parse.return_value = {
"abstract": "Test abstract",
"claims": "Test claims",
}
mock_minimize.return_value = "Minimized content"
mock_llm_instance = Mock()
mock_llm_instance.analyze_patent_portfolio.return_value = (
"Strong innovation portfolio"
)
mock_llm.return_value = mock_llm_instance
# Run the analysis
analyzer = CompanyAnalyzer()
result = analyzer.analyze_company("TestCorp")
# Verify the pipeline executed correctly
assert result == "Strong innovation portfolio"
mock_query.assert_called_once_with("TestCorp")
mock_save.assert_called_once()
mock_parse.assert_called_once_with("patents/US123.pdf")
mock_minimize.assert_called_once()
mock_llm_instance.analyze_patent_portfolio.assert_called_once()
# Verify the data passed to LLM
llm_call_args = mock_llm_instance.analyze_patent_portfolio.call_args
patents_data = llm_call_args[1]["patents_data"]
assert len(patents_data) == 1
assert patents_data[0]["patent_id"] == "US123"
assert patents_data[0]["content"] == "Minimized content"
def test_analyze_company_no_patents_found(self, mocker):
"""Test handling when no patents are found for a company."""
mock_query = mocker.patch("SPARC.analyzer.SERP.query")
mock_query.return_value = Patents(patents=[])
mocker.patch("SPARC.analyzer.LLMAnalyzer")
analyzer = CompanyAnalyzer()
result = analyzer.analyze_company("UnknownCorp")
assert result == "No patents found for UnknownCorp"
def test_analyze_company_handles_processing_errors(self, mocker):
"""Test that analysis continues even if some patents fail to process."""
mock_query = mocker.patch("SPARC.analyzer.SERP.query")
mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")
# Create two test patents
patent1 = Patent(patent_id="US123", pdf_link="http://example.com/1.pdf")
patent2 = Patent(patent_id="US456", pdf_link="http://example.com/2.pdf")
mock_query.return_value = Patents(patents=[patent1, patent2])
# First patent processes successfully
patent1.pdf_path = "patents/US123.pdf"
# Second patent raises an error
def save_side_effect(p):
if p.patent_id == "US123":
p.pdf_path = "patents/US123.pdf"
return p
else:
raise Exception("Download failed")
mock_save.side_effect = save_side_effect
mock_parse.return_value = {"abstract": "Test"}
mock_minimize.return_value = "Content"
mock_llm_instance = Mock()
mock_llm_instance.analyze_patent_portfolio.return_value = "Analysis result"
mock_llm.return_value = mock_llm_instance
analyzer = CompanyAnalyzer()
result = analyzer.analyze_company("TestCorp")
# Should still succeed with the one patent that worked
assert result == "Analysis result"
# Verify only one patent was analyzed
llm_call_args = mock_llm_instance.analyze_patent_portfolio.call_args
patents_data = llm_call_args[1]["patents_data"]
assert len(patents_data) == 1
assert patents_data[0]["patent_id"] == "US123"
def test_analyze_company_all_patents_fail(self, mocker):
"""Test handling when all patents fail to process."""
mock_query = mocker.patch("SPARC.analyzer.SERP.query")
mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
mocker.patch("SPARC.analyzer.LLMAnalyzer")
patent = Patent(patent_id="US123", pdf_link="http://example.com/1.pdf")
mock_query.return_value = Patents(patents=[patent])
# Make processing fail
mock_save.side_effect = Exception("Processing error")
analyzer = CompanyAnalyzer()
result = analyzer.analyze_company("TestCorp")
assert result == "Failed to process any patents for TestCorp"
def test_analyze_single_patent(self, mocker):
"""Test single patent analysis."""
mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")
mock_parse.return_value = {"abstract": "Test abstract"}
mock_minimize.return_value = "Minimized content"
mock_llm_instance = Mock()
mock_llm_instance.analyze_patent_content.return_value = (
"Innovative patent analysis"
)
mock_llm.return_value = mock_llm_instance
analyzer = CompanyAnalyzer()
result = analyzer.analyze_single_patent("US123", "TestCorp")
assert result == "Innovative patent analysis"
mock_parse.assert_called_once_with("patents/US123.pdf")
mock_llm_instance.analyze_patent_content.assert_called_once_with(
patent_content="Minimized content", company_name="TestCorp"
)
def test_analyze_single_patent_error_handling(self, mocker):
"""Test single patent analysis with processing error."""
mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
mocker.patch("SPARC.analyzer.LLMAnalyzer")
mock_parse.side_effect = FileNotFoundError("PDF not found")
analyzer = CompanyAnalyzer()
result = analyzer.analyze_single_patent("US999", "TestCorp")
assert "Failed to analyze patent US999" in result
assert "PDF not found" in result

View File

@@ -10,39 +10,33 @@ class TestLLMAnalyzer:
def test_analyzer_initialization_with_api_key(self, mocker):
"""Test that analyzer initializes with provided API key."""
mock_openai = mocker.patch("SPARC.llm.OpenAI")
mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
analyzer = LLMAnalyzer(api_key="test-key-123")
mock_openai.assert_called_once_with(
api_key="test-key-123",
base_url="https://openrouter.ai/api/v1"
)
assert analyzer.model == "anthropic/claude-3.5-sonnet"
mock_anthropic.assert_called_once_with(api_key="test-key-123")
assert analyzer.model == "claude-3-5-sonnet-20241022"
def test_analyzer_initialization_from_config(self, mocker):
"""Test that analyzer loads API key from config when not provided."""
mock_openai = mocker.patch("SPARC.llm.OpenAI")
mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
mock_config = mocker.patch("SPARC.llm.config")
mock_config.openrouter_api_key = "config-key-456"
mock_config.anthropic_api_key = "config-key-456"
analyzer = LLMAnalyzer()
mock_openai.assert_called_once_with(
api_key="config-key-456",
base_url="https://openrouter.ai/api/v1"
)
mock_anthropic.assert_called_once_with(api_key="config-key-456")
def test_analyze_patent_content(self, mocker):
"""Test single patent content analysis."""
mock_openai = mocker.patch("SPARC.llm.OpenAI")
mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
mock_client = Mock()
mock_openai.return_value = mock_client
mock_anthropic.return_value = mock_client
# Mock the API response
mock_response = Mock()
mock_response.choices = [Mock(message=Mock(content="Innovative GPU architecture."))]
mock_client.chat.completions.create.return_value = mock_response
mock_response.content = [Mock(text="Innovative GPU architecture.")]
mock_client.messages.create.return_value = mock_response
analyzer = LLMAnalyzer(api_key="test-key")
result = analyzer.analyze_patent_content(
@@ -51,26 +45,26 @@ class TestLLMAnalyzer:
)
assert result == "Innovative GPU architecture."
mock_client.chat.completions.create.assert_called_once()
mock_client.messages.create.assert_called_once()
# Verify the prompt includes company name and content
call_args = mock_client.chat.completions.create.call_args
call_args = mock_client.messages.create.call_args
prompt_text = call_args[1]["messages"][0]["content"]
assert "NVIDIA" in prompt_text
assert "GPU with new cache design" in prompt_text
def test_analyze_patent_portfolio(self, mocker):
"""Test portfolio analysis with multiple patents."""
mock_openai = mocker.patch("SPARC.llm.OpenAI")
mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
mock_client = Mock()
mock_openai.return_value = mock_client
mock_anthropic.return_value = mock_client
# Mock the API response
mock_response = Mock()
mock_response.choices = [
Mock(message=Mock(content="Strong portfolio in AI and graphics."))
mock_response.content = [
Mock(text="Strong portfolio in AI and graphics.")
]
mock_client.chat.completions.create.return_value = mock_response
mock_client.messages.create.return_value = mock_response
analyzer = LLMAnalyzer(api_key="test-key")
patents_data = [
@@ -83,10 +77,10 @@ class TestLLMAnalyzer:
)
assert result == "Strong portfolio in AI and graphics."
mock_client.chat.completions.create.assert_called_once()
mock_client.messages.create.assert_called_once()
# Verify the prompt includes all patents
call_args = mock_client.chat.completions.create.call_args
call_args = mock_client.messages.create.call_args
prompt_text = call_args[1]["messages"][0]["content"]
assert "US123" in prompt_text
assert "US456" in prompt_text
@@ -95,36 +89,36 @@
def test_analyze_patent_portfolio_with_correct_token_limit(self, mocker):
"""Test that portfolio analysis uses higher token limit."""
mock_openai = mocker.patch("SPARC.llm.OpenAI")
mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
mock_client = Mock()
mock_openai.return_value = mock_client
mock_anthropic.return_value = mock_client
mock_response = Mock()
mock_response.choices = [Mock(message=Mock(content="Analysis result."))]
mock_client.chat.completions.create.return_value = mock_response
mock_response.content = [Mock(text="Analysis result.")]
mock_client.messages.create.return_value = mock_response
analyzer = LLMAnalyzer(api_key="test-key")
patents_data = [{"patent_id": "US123", "content": "Test content"}]
analyzer.analyze_patent_portfolio(patents_data, "TestCo")
call_args = mock_client.chat.completions.create.call_args
call_args = mock_client.messages.create.call_args
# Portfolio analysis should use 2048 tokens
assert call_args[1]["max_tokens"] == 2048
def test_analyze_single_patent_with_correct_token_limit(self, mocker):
"""Test that single patent analysis uses lower token limit."""
mock_openai = mocker.patch("SPARC.llm.OpenAI")
mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
mock_client = Mock()
mock_openai.return_value = mock_client
mock_anthropic.return_value = mock_client
mock_response = Mock()
mock_response.choices = [Mock(message=Mock(content="Analysis result."))]
mock_client.chat.completions.create.return_value = mock_response
mock_response.content = [Mock(text="Analysis result.")]
mock_client.messages.create.return_value = mock_response
analyzer = LLMAnalyzer(api_key="test-key")
analyzer.analyze_patent_content("Test content", "TestCo")
call_args = mock_client.chat.completions.create.call_args
call_args = mock_client.messages.create.call_args
# Single patent should use 1024 tokens
assert call_args[1]["max_tokens"] == 1024