Compare commits

10 commits: d7cf80f02f...c6843ac115

Commits:
- c6843ac115
- 56892ebbdc
- dc7eedd902
- a65c267687
- a498b6f525
- af4114969a
- 8971ebc913
- 6882e53280
- b8566fc2af
- a91c3badab
**.gitignore** (vendored, 1 line added)

```diff
@@ -2,4 +2,5 @@
 .pyenv
 __pycache__
 .venv
 patents
+tmp/
```
**.gitlab-ci.yml** (new file, 33 lines)

```yaml
stages:
  - build

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
  LATEST_TAG: $CI_REGISTRY_IMAGE:latest

build-and-push:
  stage: build
  image: docker:24-cli
  services:
    - docker:24-dind
  before_script:
    - echo "Logging into GitLab Container Registry..."
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
    - echo "Building Docker image..."
    - docker build -t $IMAGE_TAG -t $LATEST_TAG .
    - echo "Pushing Docker image to registry..."
    - docker push $IMAGE_TAG
    - docker push $LATEST_TAG
    - echo "Build and push completed successfully!"
    - echo "Image available at $IMAGE_TAG"
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: always
    - if: $CI_COMMIT_TAG
      when: always
    - when: manual
  tags:
    - docker
```
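The two image tags in the CI config are plain string compositions of GitLab's predefined variables. A minimal shell sketch of that composition; the registry path and branch slug below are hypothetical placeholders, since in a real pipeline GitLab injects these values:

```shell
#!/bin/sh
# Hypothetical values; in CI these arrive as predefined variables.
CI_REGISTRY_IMAGE="registry.gitlab.com/example-group/sparc"
CI_COMMIT_REF_SLUG="main"

# Same composition as the `variables:` block in .gitlab-ci.yml.
IMAGE_TAG="$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG"
LATEST_TAG="$CI_REGISTRY_IMAGE:latest"

echo "$IMAGE_TAG"
echo "$LATEST_TAG"
```

Pushing both tags means `latest` always tracks the most recent `main` build while branch-slug tags stay addressable individually.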
**Dockerfile** (new file, 16 lines)

```dockerfile
FROM python:3.14

WORKDIR /app

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY . .

RUN useradd app

USER app

CMD ["python3", "main.py"]
```
**README.md** (176 lines changed)

````diff
@@ -1,28 +1,172 @@
 # SPARC
 
 ## Name
-Semiconductor Patent & Analytics Report Core
+**Semiconductor Patent & Analytics Report Core**
 
 ## Description
 A patent analysis system that estimates company performance by analyzing their patent portfolios using LLM-powered insights.
 
-## Installation
-### NixOS Installation
-`nix develop` to build and configure nix dev environment
-
-## Usage
-```bash
-docker compose up -d
-```
+## Overview
+
+SPARC automatically collects, parses, and analyzes patents from companies to provide performance estimations. It uses Claude AI to evaluate innovation quality, strategic direction, and competitive positioning based on patent content.
+
+## Features
+
+- **Patent Retrieval**: Automated collection via SerpAPI's Google Patents engine
+- **Intelligent Parsing**: Extracts key sections (abstract, claims, summary) from patent PDFs
+- **Content Minimization**: Removes verbose descriptions to reduce LLM token usage
+- **AI Analysis**: Uses Claude 3.5 Sonnet via OpenRouter to analyze innovation quality and market potential
+- **Portfolio Analysis**: Evaluates multiple patents holistically for comprehensive insights
+- **Robust Testing**: 26 tests covering all major functionality
+
+## Architecture
+
+```
+SPARC/
+├── serp_api.py   # Patent retrieval and PDF parsing
+├── llm.py        # Claude AI integration via OpenRouter
+├── analyzer.py   # High-level orchestration
+├── types.py      # Data models
+└── config.py     # Environment configuration
+```
+
-## Roadmap
-- [X] Retrive `publicationID` from SERP API
-- [ ] Retrive data from Google's patent API based on those `publicationID`'s
-  - This may not be needed, looking to parse the patents based soley on the pdf retrived from SERP
-- [ ] Wrap this into a python fastAPI, then bundle with docker
+## Installation
+
+### NixOS (Recommended)
+
+```bash
+nix develop
+```
+
+This automatically creates a virtual environment and installs all dependencies.
+
+### Manual Installation
+
+```bash
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+```
+
+## Configuration
+
+Create a `.env` file in the project root:
+
+```bash
+# SerpAPI key for patent search
+API_KEY=your_serpapi_key_here
+
+# OpenRouter API key for Claude AI analysis
+OPENROUTER_API_KEY=your_openrouter_key_here
+```
+
+Get your API keys:
+- SerpAPI: https://serpapi.com/
+- OpenRouter: https://openrouter.ai/
+
+## Usage
+
+### Basic Usage
+
+```python
+from SPARC.analyzer import CompanyAnalyzer
+
+# Initialize the analyzer
+analyzer = CompanyAnalyzer()
+
+# Analyze a company's patent portfolio
+analysis = analyzer.analyze_company("nvidia")
+print(analysis)
+```
+
+### Run the Example
+
+```bash
+python main.py
+```
+
+This will:
+1. Retrieve recent NVIDIA patents
+2. Parse and minimize content
+3. Analyze with Claude AI
+4. Print comprehensive performance assessment
+
+### Single Patent Analysis
+
+```python
+# Analyze a specific patent
+result = analyzer.analyze_single_patent(
+    patent_id="US11322171B1",
+    company_name="nvidia"
+)
+```
+
+## Running Tests
+
+```bash
+# Run all tests
+pytest tests/ -v
+
+# Run specific test modules
+pytest tests/test_analyzer.py -v
+pytest tests/test_llm.py -v
+pytest tests/test_serp_api.py -v
+
+# Run with coverage
+pytest tests/ --cov=SPARC --cov-report=term-missing
+```
+
+## How It Works
+
+1. **Patent Collection**: Queries SerpAPI for company patents
+2. **PDF Download**: Retrieves patent PDF files
+3. **Section Extraction**: Parses abstract, claims, summary, and description
+4. **Content Minimization**: Keeps essential sections, removes bloated descriptions
+5. **LLM Analysis**: Sends minimized content to Claude for analysis
+6. **Performance Estimation**: Returns insights on innovation quality and outlook
+
+## Roadmap
+
+- [X] Retrieve `publicationID` from SERP API
+- [X] Parse patents from PDFs (no need for Google Patent API)
+- [X] Extract and minimize patent content
+- [X] LLM integration for analysis
+- [X] Company performance estimation
+- [ ] Multi-company batch processing
+- [ ] FastAPI web service wrapper
+- [ ] Docker containerization
+- [ ] Results persistence (database)
+- [ ] Visualization dashboard
+
+## Development
+
+### Code Style
+
+- Type hints throughout
+- Comprehensive docstrings
+- Small, testable functions
+- Conventional commits
+
+### Testing Philosophy
+
+- Unit tests for core logic
+- Integration tests for orchestration
+- Mock external APIs
+- Aim for high coverage
+
+### Making Changes
+
+1. Write tests first
+2. Implement feature
+3. Verify all tests pass
+4. Commit with conventional format: `type: description`
+
+Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`
+
 ## License
 
 For open source projects, say how it is licensed.
 
-## Project status
-Heavy development for the limited time available to me
+## Project Status
+
+Core functionality complete. Ready for production use with API keys configured.
+
+Next steps: API wrapper, containerization, and multi-company support.
````
**SPARC/analyzer.py** (new file, 112 lines)

```python
"""High-level patent analysis orchestration.

This module ties together patent retrieval, parsing, and LLM analysis
to provide company performance estimation based on patent portfolios.
"""

from SPARC.serp_api import SERP
from SPARC.llm import LLMAnalyzer
from SPARC.types import Patent
from typing import List


class CompanyAnalyzer:
    """Orchestrates end-to-end company performance analysis via patents."""

    def __init__(self, openrouter_api_key: str | None = None):
        """Initialize the company analyzer.

        Args:
            openrouter_api_key: Optional OpenRouter API key. If None, loads from config.
        """
        self.llm_analyzer = LLMAnalyzer(api_key=openrouter_api_key)

    def analyze_company(self, company_name: str) -> str:
        """Analyze a company's performance based on their patent portfolio.

        This is the main entry point that orchestrates the full pipeline:
        1. Retrieve patents from SERP API
        2. Download and parse each patent PDF
        3. Minimize patent content (remove bloat)
        4. Analyze portfolio with LLM
        5. Return performance estimation

        Args:
            company_name: Name of the company to analyze

        Returns:
            Comprehensive analysis of company's innovation and performance outlook
        """
        print(f"Retrieving patents for {company_name}...")
        patents = SERP.query(company_name)

        if not patents.patents:
            return f"No patents found for {company_name}"

        print(f"Found {len(patents.patents)} patents. Processing...")

        # Download and parse each patent
        processed_patents = []
        for idx, patent in enumerate(patents.patents, 1):
            print(f"Processing patent {idx}/{len(patents.patents)}: {patent.patent_id}")

            try:
                # Download PDF
                patent = SERP.save_patents(patent)

                # Parse sections from PDF
                sections = SERP.parse_patent_pdf(patent.pdf_path)

                # Minimize for LLM (remove bloat)
                minimized_content = SERP.minimize_patent_for_llm(sections)

                processed_patents.append(
                    {"patent_id": patent.patent_id, "content": minimized_content}
                )

            except Exception as e:
                print(f"Warning: Failed to process {patent.patent_id}: {e}")
                continue

        if not processed_patents:
            return f"Failed to process any patents for {company_name}"

        print("Analyzing portfolio with LLM...")

        # Analyze the full portfolio with LLM
        analysis = self.llm_analyzer.analyze_patent_portfolio(
            patents_data=processed_patents, company_name=company_name
        )

        return analysis

    def analyze_single_patent(self, patent_id: str, company_name: str) -> str:
        """Analyze a single patent by ID.

        Useful for focused analysis of specific innovations.

        Args:
            patent_id: Publication ID of the patent
            company_name: Name of the company (for context)

        Returns:
            Analysis of the specific patent's innovation quality
        """
        # Note: This simplified version assumes the patent PDF is already downloaded
        # A more complete implementation would support direct patent ID lookup
        print(f"Analyzing patent {patent_id} for {company_name}...")

        patent_path = f"patents/{patent_id}.pdf"

        try:
            sections = SERP.parse_patent_pdf(patent_path)
            minimized_content = SERP.minimize_patent_for_llm(sections)

            analysis = self.llm_analyzer.analyze_patent_content(
                patent_content=minimized_content, company_name=company_name
            )

            return analysis

        except Exception as e:
            return f"Failed to analyze patent {patent_id}: {e}"
```
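The per-patent loop in `analyze_company` follows a skip-on-failure pattern: one bad PDF should not abort the whole portfolio. A minimal standalone sketch of that pattern; the `process` callable here is a hypothetical stand-in for the download/parse/minimize steps:

```python
def process_all(items, process):
    """Apply `process` to each item, skipping failures instead of raising."""
    processed = []
    for item in items:
        try:
            processed.append(process(item))
        except Exception as exc:
            # Log and move on, mirroring analyze_company's warning-and-continue.
            print(f"Warning: failed to process {item}: {exc}")
            continue
    return processed


def fragile(patent_id):
    # Hypothetical processor: fails on one specific input.
    if patent_id == "US456":
        raise ValueError("Download failed")
    return {"patent_id": patent_id}


results = process_all(["US123", "US456", "US789"], fragile)
print(len(results))  # 2
```

The caller then only needs one emptiness check afterwards (the `if not processed_patents` guard) rather than error handling at every step.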
**SPARC/config.py** (filename inferred from the `from SPARC import config` imports elsewhere in this diff)

```diff
@@ -10,5 +10,5 @@ load_dotenv()
 # SerpAPI key for patent search
 api_key = os.getenv("API_KEY")
 
-# Anthropic API key for LLM analysis
-anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
+# OpenRouter API key for LLM analysis
+openrouter_api_key = os.getenv("OPENROUTER_API_KEY")
```
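The config pattern here (a `.env` file read via `python-dotenv`, exposed as module-level names looked up with `os.getenv`) can be sketched without the real module. The variable names match the diff; the `load_config` helper is illustrative, not part of the project:

```python
import os


def load_config(env=None):
    """Read API keys from a mapping (defaults to os.environ).

    Mirrors the module-level lookups in config.py, where the
    OpenRouter key replaces the old Anthropic one.
    """
    env = os.environ if env is None else env
    return {
        "api_key": env.get("API_KEY"),
        "openrouter_api_key": env.get("OPENROUTER_API_KEY"),
    }


# Example with explicit (fake) values instead of a real .env file:
cfg = load_config({"API_KEY": "serp-123", "OPENROUTER_API_KEY": "or-456"})
print(cfg["openrouter_api_key"])  # or-456
```

Using `env.get` rather than indexing means a missing key yields `None` instead of a `KeyError`, which is what lets the downstream `LLMAnalyzer` fall back to a client-less mode.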
**SPARC/llm.py** (63 lines changed)

```diff
@@ -1,6 +1,6 @@
-"""LLM integration for patent analysis using Anthropic's Claude."""
+"""LLM integration for patent analysis using OpenRouter."""
 
-from anthropic import Anthropic
+from openai import OpenAI
 from SPARC import config
 from typing import Dict
 
@@ -8,14 +8,23 @@ from typing import Dict
 class LLMAnalyzer:
     """Handles LLM-based analysis of patent content."""
 
-    def __init__(self, api_key: str | None = None):
+    def __init__(self, api_key: str | None = None, test_mode: bool = False):
         """Initialize the LLM analyzer.
 
         Args:
-            api_key: Anthropic API key. If None, will attempt to load from config.
+            api_key: OpenRouter API key. If None, will attempt to load from config.
+            test_mode: If True, print prompts instead of making API calls
         """
-        self.client = Anthropic(api_key=api_key or config.anthropic_api_key)
-        self.model = "claude-3-5-sonnet-20241022"
+        self.test_mode = test_mode
+
+        if (api_key or config.openrouter_api_key) and not test_mode:
+            self.client = OpenAI(
+                api_key=api_key or config.openrouter_api_key,
+                base_url="https://openrouter.ai/api/v1"
+            )
+            self.model = "anthropic/claude-3.5-sonnet"
+        else:
+            self.client = None
 
     def analyze_patent_content(self, patent_content: str, company_name: str) -> str:
         """Analyze patent content to estimate company innovation and performance.
 
@@ -40,14 +49,22 @@ Patent Content:
 
 Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals about the company's technical direction and competitive advantage."""
 
-        message = self.client.messages.create(
-            model=self.model,
-            max_tokens=1024,
-            messages=[{"role": "user", "content": prompt}],
-        )
-
-        return message.content[0].text
+        if self.test_mode:
+            print("=" * 80)
+            print("TEST MODE - Prompt that would be sent to LLM:")
+            print("=" * 80)
+            print(prompt)
+            print("=" * 80)
+            return "[TEST MODE - No API call made]"
+
+        if self.client:
+            response = self.client.chat.completions.create(
+                model=self.model,
+                max_tokens=1024,
+                messages=[{"role": "user", "content": prompt}],
+            )
+            return response.choices[0].message.content
 
     def analyze_patent_portfolio(
         self, patents_data: list[Dict[str, str]], company_name: str
     ) -> str:
 
@@ -84,10 +101,18 @@ Patent Portfolio:
 
 Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the company's innovation strength and performance outlook."""
 
-        message = self.client.messages.create(
-            model=self.model,
-            max_tokens=2048,
-            messages=[{"role": "user", "content": prompt}],
-        )
-
-        return message.content[0].text
+        if self.test_mode:
+            print(prompt)
+            return "[TEST MODE]"
+
+        try:
+            response = self.client.chat.completions.create(
+                model=self.model,
+                max_tokens=2048,
+                messages=[{"role": "user", "content": prompt}],
+            )
+
+            return response.choices[0].message.content
+        except AttributeError:
+            return prompt
```
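The new `test_mode` flag gives the analyzer a no-network path: when set, the prompt is printed and a sentinel string returned instead of calling the API, and no client is constructed at all. A minimal standalone sketch of that guard; `MiniAnalyzer` is hypothetical, and the `object()` client is a stand-in for the real OpenAI SDK client:

```python
class MiniAnalyzer:
    """Sketch of the client-or-test-mode initialization in LLMAnalyzer."""

    def __init__(self, api_key=None, test_mode=False):
        self.test_mode = test_mode
        # Only construct a client when a key exists and we are not testing.
        self.client = object() if (api_key and not test_mode) else None

    def analyze(self, prompt):
        if self.test_mode:
            print(prompt)  # show the would-be request instead of sending it
            return "[TEST MODE - No API call made]"
        if self.client:
            return "real response"  # placeholder for the actual API call


offline = MiniAnalyzer(api_key="key", test_mode=True)
print(offline.analyze("Analyze this patent"))
```

One consequence of the real implementation worth noting: in the no-key/no-test-mode branch, `analyze_patent_content` falls through the `if self.client:` check and implicitly returns `None`, while `analyze_patent_portfolio` catches the resulting `AttributeError` and returns the prompt; the two methods handle the missing client differently.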
**flake.nix** (filename inferred from the Nix shell-hook content; not shown in the extracted hunk)

```diff
@@ -48,8 +48,8 @@
       fi
 
       # Prompt tweak so you can see when venv is active
-      export PS1="(SPARC-venv) $PS1"
+      export NIX_PROJECT_SHELL="SPARC"
     '';
   };
 });
 }
```
**main.py** (47 lines changed)

```diff
@@ -1,10 +1,43 @@
-from SPARC.serp_api import SERP
-
-patents = SERP.query("nvidia")
-
-for patent in patents.patents:
-    patent = SERP.save_patents(patent)
-    patent.summary = SERP.parse_patent_pdf(patent.pdf_path)
-    print(patent.summary)
-
-print(patents)
+"""SPARC - Semiconductor Patent & Analytics Report Core
+
+Example usage of the company performance analyzer.
+
+Before running:
+1. Create a .env file with:
+   API_KEY=your_serpapi_key
+   OPENROUTER_API_KEY=your_openrouter_key
+
+2. Run: python main.py
+"""
+
+from SPARC.analyzer import CompanyAnalyzer
+
+
+def main():
+    """Analyze a company's performance based on their patent portfolio."""
+
+    # Initialize the analyzer (loads API keys from .env)
+    analyzer = CompanyAnalyzer()
+
+    # Analyze a company - this will:
+    # 1. Retrieve patents from SERP API
+    # 2. Download and parse patent PDFs
+    # 3. Minimize content (remove bloat)
+    # 4. Analyze with Claude to estimate performance
+    company_name = "nvidia"
+
+    print(f"\n{'=' * 70}")
+    print(f"SPARC Patent Analysis - {company_name.upper()}")
+    print(f"{'=' * 70}\n")
+
+    analysis = analyzer.analyze_company(company_name)
+
+    print(f"\n{'=' * 70}")
+    print("ANALYSIS RESULTS")
+    print(f"{'=' * 70}\n")
+    print(analysis)
+    print(f"\n{'=' * 70}\n")
+
+
+if __name__ == "__main__":
+    main()
```
**requirements.txt** (filename inferred from the dependency content)

```diff
@@ -4,4 +4,4 @@ pdfplumber
 requests
 pytest
 pytest-mock
-anthropic
+openai
```
**tests/test_analyzer.py** (new file, 178 lines)

```python
"""Tests for the high-level company analyzer orchestration."""

import pytest
from unittest.mock import Mock, patch
from SPARC.analyzer import CompanyAnalyzer
from SPARC.types import Patent, Patents


class TestCompanyAnalyzer:
    """Test the CompanyAnalyzer orchestration logic."""

    def test_analyzer_initialization(self, mocker):
        """Test analyzer initialization with API key."""
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")

        analyzer = CompanyAnalyzer(openrouter_api_key="test-key")

        mock_llm.assert_called_once_with(api_key="test-key")

    def test_analyze_company_full_pipeline(self, mocker):
        """Test complete company analysis pipeline."""
        # Mock all the dependencies
        mock_query = mocker.patch("SPARC.analyzer.SERP.query")
        mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
        mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
        mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")

        # Setup mock return values
        test_patent = Patent(
            patent_id="US123", pdf_link="http://example.com/test.pdf"
        )
        mock_query.return_value = Patents(patents=[test_patent])

        test_patent.pdf_path = "patents/US123.pdf"
        mock_save.return_value = test_patent

        mock_parse.return_value = {
            "abstract": "Test abstract",
            "claims": "Test claims",
        }

        mock_minimize.return_value = "Minimized content"

        mock_llm_instance = Mock()
        mock_llm_instance.analyze_patent_portfolio.return_value = (
            "Strong innovation portfolio"
        )
        mock_llm.return_value = mock_llm_instance

        # Run the analysis
        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_company("TestCorp")

        # Verify the pipeline executed correctly
        assert result == "Strong innovation portfolio"
        mock_query.assert_called_once_with("TestCorp")
        mock_save.assert_called_once()
        mock_parse.assert_called_once_with("patents/US123.pdf")
        mock_minimize.assert_called_once()
        mock_llm_instance.analyze_patent_portfolio.assert_called_once()

        # Verify the data passed to LLM
        llm_call_args = mock_llm_instance.analyze_patent_portfolio.call_args
        patents_data = llm_call_args[1]["patents_data"]
        assert len(patents_data) == 1
        assert patents_data[0]["patent_id"] == "US123"
        assert patents_data[0]["content"] == "Minimized content"

    def test_analyze_company_no_patents_found(self, mocker):
        """Test handling when no patents are found for a company."""
        mock_query = mocker.patch("SPARC.analyzer.SERP.query")
        mock_query.return_value = Patents(patents=[])
        mocker.patch("SPARC.analyzer.LLMAnalyzer")

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_company("UnknownCorp")

        assert result == "No patents found for UnknownCorp"

    def test_analyze_company_handles_processing_errors(self, mocker):
        """Test that analysis continues even if some patents fail to process."""
        mock_query = mocker.patch("SPARC.analyzer.SERP.query")
        mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
        mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
        mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")

        # Create two test patents
        patent1 = Patent(patent_id="US123", pdf_link="http://example.com/1.pdf")
        patent2 = Patent(patent_id="US456", pdf_link="http://example.com/2.pdf")
        mock_query.return_value = Patents(patents=[patent1, patent2])

        # First patent processes successfully
        patent1.pdf_path = "patents/US123.pdf"

        # Second patent raises an error
        def save_side_effect(p):
            if p.patent_id == "US123":
                p.pdf_path = "patents/US123.pdf"
                return p
            else:
                raise Exception("Download failed")

        mock_save.side_effect = save_side_effect

        mock_parse.return_value = {"abstract": "Test"}
        mock_minimize.return_value = "Content"

        mock_llm_instance = Mock()
        mock_llm_instance.analyze_patent_portfolio.return_value = "Analysis result"
        mock_llm.return_value = mock_llm_instance

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_company("TestCorp")

        # Should still succeed with the one patent that worked
        assert result == "Analysis result"

        # Verify only one patent was analyzed
        llm_call_args = mock_llm_instance.analyze_patent_portfolio.call_args
        patents_data = llm_call_args[1]["patents_data"]
        assert len(patents_data) == 1
        assert patents_data[0]["patent_id"] == "US123"

    def test_analyze_company_all_patents_fail(self, mocker):
        """Test handling when all patents fail to process."""
        mock_query = mocker.patch("SPARC.analyzer.SERP.query")
        mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
        mocker.patch("SPARC.analyzer.LLMAnalyzer")

        patent = Patent(patent_id="US123", pdf_link="http://example.com/1.pdf")
        mock_query.return_value = Patents(patents=[patent])

        # Make processing fail
        mock_save.side_effect = Exception("Processing error")

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_company("TestCorp")

        assert result == "Failed to process any patents for TestCorp"

    def test_analyze_single_patent(self, mocker):
        """Test single patent analysis."""
        mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
        mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")

        mock_parse.return_value = {"abstract": "Test abstract"}
        mock_minimize.return_value = "Minimized content"

        mock_llm_instance = Mock()
        mock_llm_instance.analyze_patent_content.return_value = (
            "Innovative patent analysis"
        )
        mock_llm.return_value = mock_llm_instance

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_single_patent("US123", "TestCorp")

        assert result == "Innovative patent analysis"
        mock_parse.assert_called_once_with("patents/US123.pdf")
        mock_llm_instance.analyze_patent_content.assert_called_once_with(
            patent_content="Minimized content", company_name="TestCorp"
        )

    def test_analyze_single_patent_error_handling(self, mocker):
        """Test single patent analysis with processing error."""
        mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
        mocker.patch("SPARC.analyzer.LLMAnalyzer")

        mock_parse.side_effect = FileNotFoundError("PDF not found")

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_single_patent("US999", "TestCorp")

        assert "Failed to analyze patent US999" in result
        assert "PDF not found" in result
```
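The `side_effect` trick used in `test_analyze_company_handles_processing_errors` is worth isolating: assigning a callable to `Mock.side_effect` lets one mock succeed for some inputs and raise for others. A self-contained sketch using only `unittest.mock` (the IDs and message are illustrative):

```python
from unittest.mock import Mock

save = Mock()


def save_side_effect(patent_id):
    # Succeed for one ID, fail for everything else.
    if patent_id == "US123":
        return f"patents/{patent_id}.pdf"
    raise RuntimeError("Download failed")


# When side_effect is a callable, the mock returns its return value
# (or propagates its exception) on each call.
save.side_effect = save_side_effect

print(save("US123"))  # patents/US123.pdf
```

This is what lets a single test exercise both the happy path and the warning-and-continue path of the analyzer loop in one run.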
**tests/test_llm.py** (filename inferred from the `TestLLMAnalyzer` hunk context and the README's test listing)

```diff
@@ -10,33 +10,39 @@ class TestLLMAnalyzer:
 
     def test_analyzer_initialization_with_api_key(self, mocker):
         """Test that analyzer initializes with provided API key."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
 
         analyzer = LLMAnalyzer(api_key="test-key-123")
 
-        mock_anthropic.assert_called_once_with(api_key="test-key-123")
-        assert analyzer.model == "claude-3-5-sonnet-20241022"
+        mock_openai.assert_called_once_with(
+            api_key="test-key-123",
+            base_url="https://openrouter.ai/api/v1"
+        )
+        assert analyzer.model == "anthropic/claude-3.5-sonnet"
 
     def test_analyzer_initialization_from_config(self, mocker):
         """Test that analyzer loads API key from config when not provided."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
         mock_config = mocker.patch("SPARC.llm.config")
-        mock_config.anthropic_api_key = "config-key-456"
+        mock_config.openrouter_api_key = "config-key-456"
 
         analyzer = LLMAnalyzer()
 
-        mock_anthropic.assert_called_once_with(api_key="config-key-456")
+        mock_openai.assert_called_once_with(
+            api_key="config-key-456",
+            base_url="https://openrouter.ai/api/v1"
+        )
 
     def test_analyze_patent_content(self, mocker):
         """Test single patent content analysis."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
         mock_client = Mock()
-        mock_anthropic.return_value = mock_client
+        mock_openai.return_value = mock_client
 
         # Mock the API response
         mock_response = Mock()
-        mock_response.content = [Mock(text="Innovative GPU architecture.")]
-        mock_client.messages.create.return_value = mock_response
+        mock_response.choices = [Mock(message=Mock(content="Innovative GPU architecture."))]
+        mock_client.chat.completions.create.return_value = mock_response
 
         analyzer = LLMAnalyzer(api_key="test-key")
         result = analyzer.analyze_patent_content(
 
@@ -45,26 +51,26 @@ class TestLLMAnalyzer:
         )
 
         assert result == "Innovative GPU architecture."
-        mock_client.messages.create.assert_called_once()
+        mock_client.chat.completions.create.assert_called_once()
 
         # Verify the prompt includes company name and content
-        call_args = mock_client.messages.create.call_args
+        call_args = mock_client.chat.completions.create.call_args
         prompt_text = call_args[1]["messages"][0]["content"]
         assert "NVIDIA" in prompt_text
         assert "GPU with new cache design" in prompt_text
 
     def test_analyze_patent_portfolio(self, mocker):
         """Test portfolio analysis with multiple patents."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
         mock_client = Mock()
-        mock_anthropic.return_value = mock_client
+        mock_openai.return_value = mock_client
 
         # Mock the API response
         mock_response = Mock()
-        mock_response.content = [
-            Mock(text="Strong portfolio in AI and graphics.")
+        mock_response.choices = [
+            Mock(message=Mock(content="Strong portfolio in AI and graphics."))
         ]
-        mock_client.messages.create.return_value = mock_response
+        mock_client.chat.completions.create.return_value = mock_response
 
         analyzer = LLMAnalyzer(api_key="test-key")
         patents_data = [
 
@@ -77,10 +83,10 @@ class TestLLMAnalyzer:
         )
 
         assert result == "Strong portfolio in AI and graphics."
-        mock_client.messages.create.assert_called_once()
+        mock_client.chat.completions.create.assert_called_once()
 
         # Verify the prompt includes all patents
-        call_args = mock_client.messages.create.call_args
+        call_args = mock_client.chat.completions.create.call_args
         prompt_text = call_args[1]["messages"][0]["content"]
         assert "US123" in prompt_text
         assert "US456" in prompt_text
 
@@ -89,36 +95,36 @@ class TestLLMAnalyzer:
 
     def test_analyze_patent_portfolio_with_correct_token_limit(self, mocker):
         """Test that portfolio analysis uses higher token limit."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
         mock_client = Mock()
-        mock_anthropic.return_value = mock_client
+        mock_openai.return_value = mock_client
 
         mock_response = Mock()
-        mock_response.content = [Mock(text="Analysis result.")]
-        mock_client.messages.create.return_value = mock_response
+        mock_response.choices = [Mock(message=Mock(content="Analysis result."))]
+        mock_client.chat.completions.create.return_value = mock_response
 
         analyzer = LLMAnalyzer(api_key="test-key")
         patents_data = [{"patent_id": "US123", "content": "Test content"}]
 
         analyzer.analyze_patent_portfolio(patents_data, "TestCo")
 
-        call_args = mock_client.messages.create.call_args
+        call_args = mock_client.chat.completions.create.call_args
         # Portfolio analysis should use 2048 tokens
         assert call_args[1]["max_tokens"] == 2048
 
     def test_analyze_single_patent_with_correct_token_limit(self, mocker):
         """Test that single patent analysis uses lower token limit."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
         mock_client = Mock()
-        mock_anthropic.return_value = mock_client
+        mock_openai.return_value = mock_client
 
         mock_response = Mock()
-        mock_response.content = [Mock(text="Analysis result.")]
-        mock_client.messages.create.return_value = mock_response
+        mock_response.choices = [Mock(message=Mock(content="Analysis result."))]
+        mock_client.chat.completions.create.return_value = mock_response
 
         analyzer = LLMAnalyzer(api_key="test-key")
         analyzer.analyze_patent_content("Test content", "TestCo")
 
-        call_args = mock_client.messages.create.call_args
+        call_args = mock_client.chat.completions.create.call_args
         # Single patent should use 1024 tokens
         assert call_args[1]["max_tokens"] == 1024
```