Compare commits

10 commits: d7cf80f02f ... c6843ac115

- c6843ac115
- 56892ebbdc
- dc7eedd902
- a65c267687
- a498b6f525
- af4114969a
- 8971ebc913
- 6882e53280
- b8566fc2af
- a91c3badab
**.gitignore** (vendored, 1 addition)

```diff
@@ -3,3 +3,4 @@
 __pycache__
 .venv
 patents
+tmp/
```
**.gitlab-ci.yml** (new file, 33 lines)

```yaml
stages:
  - build

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
  LATEST_TAG: $CI_REGISTRY_IMAGE:latest

build-and-push:
  stage: build
  image: docker:24-cli
  services:
    - docker:24-dind
  before_script:
    - echo "Logging into GitLab Container Registry..."
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
    - echo "Building Docker image..."
    - docker build -t $IMAGE_TAG -t $LATEST_TAG .
    - echo "Pushing Docker image to registry..."
    - docker push $IMAGE_TAG
    - docker push $LATEST_TAG
    - echo "Build and push completed successfully!"
    - echo "Image available at $IMAGE_TAG"
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: always
    - if: $CI_COMMIT_TAG
      when: always
    - when: manual
  tags:
    - docker
```
**Dockerfile** (new file, 16 lines)

```dockerfile
FROM python:3.14

WORKDIR /app

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY . .

RUN useradd app

USER app

CMD ["python3", "main.py"]
```
**README.md**

````diff
@@ -1,28 +1,172 @@
 # SPARC
 
-## Name
-Semiconductor Patent & Analytics Report Core
+**Semiconductor Patent & Analytics Report Core**
 
-## Description
+A patent analysis system that estimates company performance by analyzing their patent portfolios using LLM-powered insights.
 
-## Installation
-### NixOS Installation
-`nix develop` to build and configure nix dev environment
+## Overview
 
-## Usage
-```bash
-docker compose up -d
-```
+SPARC automatically collects, parses, and analyzes patents from companies to provide performance estimations. It uses Claude AI to evaluate innovation quality, strategic direction, and competitive positioning based on patent content.
 
-## Roadmap
-- [X] Retrive `publicationID` from SERP API
-- [ ] Retrive data from Google's patent API based on those `publicationID`'s
-  - This may not be needed, looking to parse the patents based soley on the pdf retrived from SERP
-- [ ] Wrap this into a python fastAPI, then bundle with docker
+## Features
+
+- **Patent Retrieval**: Automated collection via SerpAPI's Google Patents engine
+- **Intelligent Parsing**: Extracts key sections (abstract, claims, summary) from patent PDFs
+- **Content Minimization**: Removes verbose descriptions to reduce LLM token usage
+- **AI Analysis**: Uses Claude 3.5 Sonnet via OpenRouter to analyze innovation quality and market potential
+- **Portfolio Analysis**: Evaluates multiple patents holistically for comprehensive insights
+- **Robust Testing**: 26 tests covering all major functionality
+
+## Architecture
+
+```
+SPARC/
+├── serp_api.py   # Patent retrieval and PDF parsing
+├── llm.py        # Claude AI integration via OpenRouter
+├── analyzer.py   # High-level orchestration
+├── types.py      # Data models
+└── config.py     # Environment configuration
+```
+
+## Installation
+
+### NixOS (Recommended)
+
+```bash
+nix develop
+```
+
+This automatically creates a virtual environment and installs all dependencies.
+
+### Manual Installation
+
+```bash
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+```
+
+## Configuration
+
+Create a `.env` file in the project root:
+
+```bash
+# SerpAPI key for patent search
+API_KEY=your_serpapi_key_here
+
+# OpenRouter API key for Claude AI analysis
+OPENROUTER_API_KEY=your_openrouter_key_here
+```
+
+Get your API keys:
+- SerpAPI: https://serpapi.com/
+- OpenRouter: https://openrouter.ai/
+
+## Usage
+
+### Basic Usage
+
+```python
+from SPARC.analyzer import CompanyAnalyzer
+
+# Initialize the analyzer
+analyzer = CompanyAnalyzer()
+
+# Analyze a company's patent portfolio
+analysis = analyzer.analyze_company("nvidia")
+print(analysis)
+```
+
+### Run the Example
+
+```bash
+python main.py
+```
+
+This will:
+1. Retrieve recent NVIDIA patents
+2. Parse and minimize content
+3. Analyze with Claude AI
+4. Print comprehensive performance assessment
+
+### Single Patent Analysis
+
+```python
+# Analyze a specific patent
+result = analyzer.analyze_single_patent(
+    patent_id="US11322171B1",
+    company_name="nvidia"
+)
+```
+
+## Running Tests
+
+```bash
+# Run all tests
+pytest tests/ -v
+
+# Run specific test modules
+pytest tests/test_analyzer.py -v
+pytest tests/test_llm.py -v
+pytest tests/test_serp_api.py -v
+
+# Run with coverage
+pytest tests/ --cov=SPARC --cov-report=term-missing
+```
+
+## How It Works
+
+1. **Patent Collection**: Queries SerpAPI for company patents
+2. **PDF Download**: Retrieves patent PDF files
+3. **Section Extraction**: Parses abstract, claims, summary, and description
+4. **Content Minimization**: Keeps essential sections, removes bloated descriptions
+5. **LLM Analysis**: Sends minimized content to Claude for analysis
+6. **Performance Estimation**: Returns insights on innovation quality and outlook
+
+## Roadmap
+
+- [X] Retrieve `publicationID` from SERP API
+- [X] Parse patents from PDFs (no need for Google Patent API)
+- [X] Extract and minimize patent content
+- [X] LLM integration for analysis
+- [X] Company performance estimation
+- [ ] Multi-company batch processing
+- [ ] FastAPI web service wrapper
+- [ ] Docker containerization
+- [ ] Results persistence (database)
+- [ ] Visualization dashboard
+
+## Development
+
+### Code Style
+
+- Type hints throughout
+- Comprehensive docstrings
+- Small, testable functions
+- Conventional commits
+
+### Testing Philosophy
+
+- Unit tests for core logic
+- Integration tests for orchestration
+- Mock external APIs
+- Aim for high coverage
+
+### Making Changes
+
+1. Write tests first
+2. Implement feature
+3. Verify all tests pass
+4. Commit with conventional format: `type: description`
+
+Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`
 
 ## License
 
 For open source projects, say how it is licensed.
 
-## Project status
-Heavy development for the limited time available to me
+## Project Status
+
+Core functionality complete. Ready for production use with API keys configured.
+
+Next steps: API wrapper, containerization, and multi-company support.
````
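For illustration, the "Content Minimization" step listed in the README's features can be reduced to a small sketch. The `minimize_for_llm` helper below is hypothetical (the project's actual `SERP.minimize_patent_for_llm` is not shown in this diff); it only demonstrates the idea of keeping high-signal sections and dropping the verbose description:

```python
def minimize_for_llm(sections: dict[str, str],
                     keep: tuple[str, ...] = ("abstract", "claims", "summary")) -> str:
    # Keep only the requested sections that are actually present; the bulky
    # "description" section is simply never included.
    parts = [f"{name.upper()}:\n{sections[name]}" for name in keep if sections.get(name)]
    return "\n\n".join(parts)

sections = {
    "abstract": "A GPU with a new cache design.",
    "claims": "1. A cache comprising...",
    "description": "verbose boilerplate " * 500,  # dropped to save LLM tokens
}
minimized = minimize_for_llm(sections)
print(minimized)
```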
**SPARC/analyzer.py** (new file, 112 lines)

```python
"""High-level patent analysis orchestration.

This module ties together patent retrieval, parsing, and LLM analysis
to provide company performance estimation based on patent portfolios.
"""

from SPARC.serp_api import SERP
from SPARC.llm import LLMAnalyzer
from SPARC.types import Patent
from typing import List


class CompanyAnalyzer:
    """Orchestrates end-to-end company performance analysis via patents."""

    def __init__(self, openrouter_api_key: str | None = None):
        """Initialize the company analyzer.

        Args:
            openrouter_api_key: Optional OpenRouter API key. If None, loads from config.
        """
        self.llm_analyzer = LLMAnalyzer(api_key=openrouter_api_key)

    def analyze_company(self, company_name: str) -> str:
        """Analyze a company's performance based on their patent portfolio.

        This is the main entry point that orchestrates the full pipeline:
        1. Retrieve patents from SERP API
        2. Download and parse each patent PDF
        3. Minimize patent content (remove bloat)
        4. Analyze portfolio with LLM
        5. Return performance estimation

        Args:
            company_name: Name of the company to analyze

        Returns:
            Comprehensive analysis of company's innovation and performance outlook
        """
        print(f"Retrieving patents for {company_name}...")
        patents = SERP.query(company_name)

        if not patents.patents:
            return f"No patents found for {company_name}"

        print(f"Found {len(patents.patents)} patents. Processing...")

        # Download and parse each patent
        processed_patents = []
        for idx, patent in enumerate(patents.patents, 1):
            print(f"Processing patent {idx}/{len(patents.patents)}: {patent.patent_id}")

            try:
                # Download PDF
                patent = SERP.save_patents(patent)

                # Parse sections from PDF
                sections = SERP.parse_patent_pdf(patent.pdf_path)

                # Minimize for LLM (remove bloat)
                minimized_content = SERP.minimize_patent_for_llm(sections)

                processed_patents.append(
                    {"patent_id": patent.patent_id, "content": minimized_content}
                )

            except Exception as e:
                print(f"Warning: Failed to process {patent.patent_id}: {e}")
                continue

        if not processed_patents:
            return f"Failed to process any patents for {company_name}"

        print(f"Analyzing portfolio with LLM...")

        # Analyze the full portfolio with LLM
        analysis = self.llm_analyzer.analyze_patent_portfolio(
            patents_data=processed_patents, company_name=company_name
        )

        return analysis

    def analyze_single_patent(self, patent_id: str, company_name: str) -> str:
        """Analyze a single patent by ID.

        Useful for focused analysis of specific innovations.

        Args:
            patent_id: Publication ID of the patent
            company_name: Name of the company (for context)

        Returns:
            Analysis of the specific patent's innovation quality
        """
        # Note: This simplified version assumes the patent PDF is already downloaded
        # A more complete implementation would support direct patent ID lookup
        print(f"Analyzing patent {patent_id} for {company_name}...")

        patent_path = f"patents/{patent_id}.pdf"

        try:
            sections = SERP.parse_patent_pdf(patent_path)
            minimized_content = SERP.minimize_patent_for_llm(sections)

            analysis = self.llm_analyzer.analyze_patent_content(
                patent_content=minimized_content, company_name=company_name
            )

            return analysis

        except Exception as e:
            return f"Failed to analyze patent {patent_id}: {e}"
```
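The per-patent loop in `analyze_company` tolerates individual failures rather than aborting the whole run. The pattern reduces to this standalone sketch (item names and the `process` helper are illustrative, not project code):

```python
items = ["US123", "BAD", "US456"]

def process(pid: str) -> str:
    if pid == "BAD":
        raise ValueError("download failed")
    return f"content:{pid}"

processed = []
for idx, pid in enumerate(items, 1):
    print(f"Processing {idx}/{len(items)}: {pid}")
    try:
        processed.append({"patent_id": pid, "content": process(pid)})
    except Exception as e:
        # Log and skip the failing item; the rest of the batch still runs.
        print(f"Warning: Failed to process {pid}: {e}")
        continue

print([p["patent_id"] for p in processed])
```

Only if every item fails does the caller bail out with an error string, mirroring the `if not processed_patents:` guard above.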
**SPARC/config.py**

```diff
@@ -10,5 +10,5 @@ load_dotenv()
 # SerpAPI key for patent search
 api_key = os.getenv("API_KEY")
 
-# Anthropic API key for LLM analysis
-anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
+# OpenRouter API key for LLM analysis
+openrouter_api_key = os.getenv("OPENROUTER_API_KEY")
```
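The config module's pattern is just environment lookup after `load_dotenv()` has populated `os.environ`. A minimal sketch, using a directly set environment variable in place of a real `.env` file (`"demo-key"` is a placeholder, not a real credential):

```python
import os

# Stand-in for an OPENROUTER_API_KEY=... entry that python-dotenv would load from .env
os.environ["OPENROUTER_API_KEY"] = "demo-key"

openrouter_api_key = os.getenv("OPENROUTER_API_KEY")
print(openrouter_api_key)
```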
**SPARC/llm.py**

```diff
@@ -1,6 +1,6 @@
-"""LLM integration for patent analysis using Anthropic's Claude."""
+"""LLM integration for patent analysis using OpenRouter."""
 
-from anthropic import Anthropic
+from openai import OpenAI
 from SPARC import config
 from typing import Dict
@@ -8,14 +8,23 @@ from typing import Dict
 class LLMAnalyzer:
     """Handles LLM-based analysis of patent content."""
 
-    def __init__(self, api_key: str | None = None):
+    def __init__(self, api_key: str | None = None, test_mode: bool = False):
         """Initialize the LLM analyzer.
 
         Args:
-            api_key: Anthropic API key. If None, will attempt to load from config.
+            api_key: OpenRouter API key. If None, will attempt to load from config.
+            test_mode: If True, print prompts instead of making API calls
         """
-        self.client = Anthropic(api_key=api_key or config.anthropic_api_key)
-        self.model = "claude-3-5-sonnet-20241022"
+        self.test_mode = test_mode
+
+        if (api_key or config.openrouter_api_key) and not test_mode:
+            self.client = OpenAI(
+                api_key=api_key or config.openrouter_api_key,
+                base_url="https://openrouter.ai/api/v1"
+            )
+            self.model = "anthropic/claude-3.5-sonnet"
+        else:
+            self.client = None
 
     def analyze_patent_content(self, patent_content: str, company_name: str) -> str:
         """Analyze patent content to estimate company innovation and performance.
@@ -40,13 +49,21 @@ Patent Content:
 Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals about the company's technical direction and competitive advantage."""
 
-        message = self.client.messages.create(
-            model=self.model,
-            max_tokens=1024,
-            messages=[{"role": "user", "content": prompt}],
-        )
-
-        return message.content[0].text
+        if self.test_mode:
+            print("=" * 80)
+            print("TEST MODE - Prompt that would be sent to LLM:")
+            print("=" * 80)
+            print(prompt)
+            print("=" * 80)
+            return "[TEST MODE - No API call made]"
+
+        if self.client:
+            response = self.client.chat.completions.create(
+                model=self.model,
+                max_tokens=1024,
+                messages=[{"role": "user", "content": prompt}],
+            )
+            return response.choices[0].message.content
 
     def analyze_patent_portfolio(
         self, patents_data: list[Dict[str, str]], company_name: str
@@ -84,10 +101,18 @@ Patent Portfolio:
 Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the company's innovation strength and performance outlook."""
 
-        message = self.client.messages.create(
-            model=self.model,
-            max_tokens=2048,
-            messages=[{"role": "user", "content": prompt}],
-        )
-
-        return message.content[0].text
+        if self.test_mode:
+            print(prompt)
+            return "[TEST MODE]"
+
+        try:
+            response = self.client.chat.completions.create(
+                model=self.model,
+                max_tokens=2048,
+                messages=[{"role": "user", "content": prompt}],
+            )
+
+            return response.choices[0].message.content
+        except AttributeError:
+            return prompt
```
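The `AttributeError` fallback in `analyze_patent_portfolio` is easy to miss: when no key is configured, `self.client` is `None`, the attribute access inside the `try` raises, and the raw prompt is returned instead of a model response. A standalone sketch of that control flow (class and method names here are hypothetical, and `client.chat(...)` stands in for the real `chat.completions.create` call):

```python
class PortfolioCaller:
    """Hypothetical reduction of the LLMAnalyzer fallback behavior."""

    def __init__(self, client=None):
        self.client = client  # None when no API key is configured

    def call(self, prompt: str) -> str:
        try:
            # With client=None, this attribute access raises AttributeError.
            return self.client.chat(prompt)
        except AttributeError:
            return prompt  # degrade gracefully: hand back the raw prompt

result = PortfolioCaller().call("Analyze these patents")
print(result)
```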
**flake.nix**

```diff
@@ -48,7 +48,7 @@
 fi
 
 # Prompt tweak so you can see when venv is active
-export PS1="(SPARC-venv) $PS1"
+export NIX_PROJECT_SHELL="SPARC"
 '';
 };
 });
```
**main.py**

```diff
@@ -1,10 +1,43 @@
-from SPARC.serp_api import SERP
-
-patents = SERP.query("nvidia")
-
-for patent in patents.patents:
-    patent = SERP.save_patents(patent)
-    patent.summary = SERP.parse_patent_pdf(patent.pdf_path)
-    print(patent.summary)
-
-print(patents)
+"""SPARC - Semiconductor Patent & Analytics Report Core
+
+Example usage of the company performance analyzer.
+
+Before running:
+1. Create a .env file with:
+   API_KEY=your_serpapi_key
+   OPENROUTER_API_KEY=your_openrouter_key
+
+2. Run: python main.py
+"""
+
+from SPARC.analyzer import CompanyAnalyzer
+
+
+def main():
+    """Analyze a company's performance based on their patent portfolio."""
+
+    # Initialize the analyzer (loads API keys from .env)
+    analyzer = CompanyAnalyzer()
+
+    # Analyze a company - this will:
+    # 1. Retrieve patents from SERP API
+    # 2. Download and parse patent PDFs
+    # 3. Minimize content (remove bloat)
+    # 4. Analyze with Claude to estimate performance
+    company_name = "nvidia"
+
+    print(f"\n{'=' * 70}")
+    print(f"SPARC Patent Analysis - {company_name.upper()}")
+    print(f"{'=' * 70}\n")
+
+    analysis = analyzer.analyze_company(company_name)
+
+    print(f"\n{'=' * 70}")
+    print("ANALYSIS RESULTS")
+    print(f"{'=' * 70}\n")
+    print(analysis)
+    print(f"\n{'=' * 70}\n")
+
+
+if __name__ == "__main__":
+    main()
```
**requirements.txt**

```diff
@@ -4,4 +4,4 @@ pdfplumber
 requests
 pytest
 pytest-mock
-anthropic
+openai
```
**tests/test_analyzer.py** (new file, 178 lines)

```python
"""Tests for the high-level company analyzer orchestration."""

import pytest
from unittest.mock import Mock, patch
from SPARC.analyzer import CompanyAnalyzer
from SPARC.types import Patent, Patents


class TestCompanyAnalyzer:
    """Test the CompanyAnalyzer orchestration logic."""

    def test_analyzer_initialization(self, mocker):
        """Test analyzer initialization with API key."""
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")

        analyzer = CompanyAnalyzer(openrouter_api_key="test-key")

        mock_llm.assert_called_once_with(api_key="test-key")

    def test_analyze_company_full_pipeline(self, mocker):
        """Test complete company analysis pipeline."""
        # Mock all the dependencies
        mock_query = mocker.patch("SPARC.analyzer.SERP.query")
        mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
        mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
        mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")

        # Setup mock return values
        test_patent = Patent(
            patent_id="US123", pdf_link="http://example.com/test.pdf"
        )
        mock_query.return_value = Patents(patents=[test_patent])

        test_patent.pdf_path = "patents/US123.pdf"
        mock_save.return_value = test_patent

        mock_parse.return_value = {
            "abstract": "Test abstract",
            "claims": "Test claims",
        }

        mock_minimize.return_value = "Minimized content"

        mock_llm_instance = Mock()
        mock_llm_instance.analyze_patent_portfolio.return_value = (
            "Strong innovation portfolio"
        )
        mock_llm.return_value = mock_llm_instance

        # Run the analysis
        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_company("TestCorp")

        # Verify the pipeline executed correctly
        assert result == "Strong innovation portfolio"
        mock_query.assert_called_once_with("TestCorp")
        mock_save.assert_called_once()
        mock_parse.assert_called_once_with("patents/US123.pdf")
        mock_minimize.assert_called_once()
        mock_llm_instance.analyze_patent_portfolio.assert_called_once()

        # Verify the data passed to LLM
        llm_call_args = mock_llm_instance.analyze_patent_portfolio.call_args
        patents_data = llm_call_args[1]["patents_data"]
        assert len(patents_data) == 1
        assert patents_data[0]["patent_id"] == "US123"
        assert patents_data[0]["content"] == "Minimized content"

    def test_analyze_company_no_patents_found(self, mocker):
        """Test handling when no patents are found for a company."""
        mock_query = mocker.patch("SPARC.analyzer.SERP.query")
        mock_query.return_value = Patents(patents=[])
        mocker.patch("SPARC.analyzer.LLMAnalyzer")

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_company("UnknownCorp")

        assert result == "No patents found for UnknownCorp"

    def test_analyze_company_handles_processing_errors(self, mocker):
        """Test that analysis continues even if some patents fail to process."""
        mock_query = mocker.patch("SPARC.analyzer.SERP.query")
        mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
        mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
        mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")

        # Create two test patents
        patent1 = Patent(patent_id="US123", pdf_link="http://example.com/1.pdf")
        patent2 = Patent(patent_id="US456", pdf_link="http://example.com/2.pdf")
        mock_query.return_value = Patents(patents=[patent1, patent2])

        # First patent processes successfully
        patent1.pdf_path = "patents/US123.pdf"

        # Second patent raises an error
        def save_side_effect(p):
            if p.patent_id == "US123":
                p.pdf_path = "patents/US123.pdf"
                return p
            else:
                raise Exception("Download failed")

        mock_save.side_effect = save_side_effect

        mock_parse.return_value = {"abstract": "Test"}
        mock_minimize.return_value = "Content"

        mock_llm_instance = Mock()
        mock_llm_instance.analyze_patent_portfolio.return_value = "Analysis result"
        mock_llm.return_value = mock_llm_instance

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_company("TestCorp")

        # Should still succeed with the one patent that worked
        assert result == "Analysis result"

        # Verify only one patent was analyzed
        llm_call_args = mock_llm_instance.analyze_patent_portfolio.call_args
        patents_data = llm_call_args[1]["patents_data"]
        assert len(patents_data) == 1
        assert patents_data[0]["patent_id"] == "US123"

    def test_analyze_company_all_patents_fail(self, mocker):
        """Test handling when all patents fail to process."""
        mock_query = mocker.patch("SPARC.analyzer.SERP.query")
        mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
        mocker.patch("SPARC.analyzer.LLMAnalyzer")

        patent = Patent(patent_id="US123", pdf_link="http://example.com/1.pdf")
        mock_query.return_value = Patents(patents=[patent])

        # Make processing fail
        mock_save.side_effect = Exception("Processing error")

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_company("TestCorp")

        assert result == "Failed to process any patents for TestCorp"

    def test_analyze_single_patent(self, mocker):
        """Test single patent analysis."""
        mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
        mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")

        mock_parse.return_value = {"abstract": "Test abstract"}
        mock_minimize.return_value = "Minimized content"

        mock_llm_instance = Mock()
        mock_llm_instance.analyze_patent_content.return_value = (
            "Innovative patent analysis"
        )
        mock_llm.return_value = mock_llm_instance

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_single_patent("US123", "TestCorp")

        assert result == "Innovative patent analysis"
        mock_parse.assert_called_once_with("patents/US123.pdf")
        mock_llm_instance.analyze_patent_content.assert_called_once_with(
            patent_content="Minimized content", company_name="TestCorp"
        )

    def test_analyze_single_patent_error_handling(self, mocker):
        """Test single patent analysis with processing error."""
        mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
        mocker.patch("SPARC.analyzer.LLMAnalyzer")

        mock_parse.side_effect = FileNotFoundError("PDF not found")

        analyzer = CompanyAnalyzer()
        result = analyzer.analyze_single_patent("US999", "TestCorp")

        assert "Failed to analyze patent US999" in result
        assert "PDF not found" in result
```
**tests/test_llm.py**

```diff
@@ -10,33 +10,39 @@ class TestLLMAnalyzer:
     def test_analyzer_initialization_with_api_key(self, mocker):
         """Test that analyzer initializes with provided API key."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
 
         analyzer = LLMAnalyzer(api_key="test-key-123")
 
-        mock_anthropic.assert_called_once_with(api_key="test-key-123")
-        assert analyzer.model == "claude-3-5-sonnet-20241022"
+        mock_openai.assert_called_once_with(
+            api_key="test-key-123",
+            base_url="https://openrouter.ai/api/v1"
+        )
+        assert analyzer.model == "anthropic/claude-3.5-sonnet"
 
     def test_analyzer_initialization_from_config(self, mocker):
         """Test that analyzer loads API key from config when not provided."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
         mock_config = mocker.patch("SPARC.llm.config")
-        mock_config.anthropic_api_key = "config-key-456"
+        mock_config.openrouter_api_key = "config-key-456"
 
         analyzer = LLMAnalyzer()
 
-        mock_anthropic.assert_called_once_with(api_key="config-key-456")
+        mock_openai.assert_called_once_with(
+            api_key="config-key-456",
+            base_url="https://openrouter.ai/api/v1"
+        )
 
     def test_analyze_patent_content(self, mocker):
         """Test single patent content analysis."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
         mock_client = Mock()
-        mock_anthropic.return_value = mock_client
+        mock_openai.return_value = mock_client
 
         # Mock the API response
         mock_response = Mock()
-        mock_response.content = [Mock(text="Innovative GPU architecture.")]
-        mock_client.messages.create.return_value = mock_response
+        mock_response.choices = [Mock(message=Mock(content="Innovative GPU architecture."))]
+        mock_client.chat.completions.create.return_value = mock_response
 
         analyzer = LLMAnalyzer(api_key="test-key")
         result = analyzer.analyze_patent_content(
@@ -45,26 +51,26 @@ class TestLLMAnalyzer:
         )
 
         assert result == "Innovative GPU architecture."
-        mock_client.messages.create.assert_called_once()
+        mock_client.chat.completions.create.assert_called_once()
 
         # Verify the prompt includes company name and content
-        call_args = mock_client.messages.create.call_args
+        call_args = mock_client.chat.completions.create.call_args
         prompt_text = call_args[1]["messages"][0]["content"]
         assert "NVIDIA" in prompt_text
         assert "GPU with new cache design" in prompt_text
 
     def test_analyze_patent_portfolio(self, mocker):
         """Test portfolio analysis with multiple patents."""
-        mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
+        mock_openai = mocker.patch("SPARC.llm.OpenAI")
         mock_client = Mock()
-        mock_anthropic.return_value = mock_client
+        mock_openai.return_value = mock_client
 
         # Mock the API response
         mock_response = Mock()
-        mock_response.content = [
-            Mock(text="Strong portfolio in AI and graphics.")
-        ]
-        mock_client.messages.create.return_value = mock_response
+        mock_response.choices = [
+            Mock(message=Mock(content="Strong portfolio in AI and graphics."))
+        ]
+        mock_client.chat.completions.create.return_value = mock_response
 
         analyzer = LLMAnalyzer(api_key="test-key")
         patents_data = [
@@ -77,10 +83,10 @@ class TestLLMAnalyzer:
         )
 
         assert result == "Strong portfolio in AI and graphics."
-        mock_client.messages.create.assert_called_once()
+        mock_client.chat.completions.create.assert_called_once()
 
         # Verify the prompt includes all patents
-        call_args = mock_client.messages.create.call_args
+        call_args = mock_client.chat.completions.create.call_args
         prompt_text = call_args[1]["messages"][0]["content"]
         assert "US123" in prompt_text
         assert "US456" in prompt_text
```
|
assert "US456" in prompt_text
|
||||||
@ -89,36 +95,36 @@ class TestLLMAnalyzer:
|
|||||||
|
|
||||||
def test_analyze_patent_portfolio_with_correct_token_limit(self, mocker):
|
def test_analyze_patent_portfolio_with_correct_token_limit(self, mocker):
|
||||||
"""Test that portfolio analysis uses higher token limit."""
|
"""Test that portfolio analysis uses higher token limit."""
|
||||||
mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
|
mock_openai = mocker.patch("SPARC.llm.OpenAI")
|
||||||
mock_client = Mock()
|
mock_client = Mock()
|
||||||
mock_anthropic.return_value = mock_client
|
mock_openai.return_value = mock_client
|
||||||
|
|
||||||
mock_response = Mock()
|
mock_response = Mock()
|
||||||
mock_response.content = [Mock(text="Analysis result.")]
|
mock_response.choices = [Mock(message=Mock(content="Analysis result."))]
|
||||||
mock_client.messages.create.return_value = mock_response
|
mock_client.chat.completions.create.return_value = mock_response
|
||||||
|
|
||||||
analyzer = LLMAnalyzer(api_key="test-key")
|
analyzer = LLMAnalyzer(api_key="test-key")
|
||||||
patents_data = [{"patent_id": "US123", "content": "Test content"}]
|
patents_data = [{"patent_id": "US123", "content": "Test content"}]
|
||||||
|
|
||||||
analyzer.analyze_patent_portfolio(patents_data, "TestCo")
|
analyzer.analyze_patent_portfolio(patents_data, "TestCo")
|
||||||
|
|
||||||
call_args = mock_client.messages.create.call_args
|
call_args = mock_client.chat.completions.create.call_args
|
||||||
# Portfolio analysis should use 2048 tokens
|
# Portfolio analysis should use 2048 tokens
|
||||||
assert call_args[1]["max_tokens"] == 2048
|
assert call_args[1]["max_tokens"] == 2048
|
||||||
|
|
||||||
def test_analyze_single_patent_with_correct_token_limit(self, mocker):
|
def test_analyze_single_patent_with_correct_token_limit(self, mocker):
|
||||||
"""Test that single patent analysis uses lower token limit."""
|
"""Test that single patent analysis uses lower token limit."""
|
||||||
mock_anthropic = mocker.patch("SPARC.llm.Anthropic")
|
mock_openai = mocker.patch("SPARC.llm.OpenAI")
|
||||||
mock_client = Mock()
|
mock_client = Mock()
|
||||||
mock_anthropic.return_value = mock_client
|
mock_openai.return_value = mock_client
|
||||||
|
|
||||||
mock_response = Mock()
|
mock_response = Mock()
|
||||||
mock_response.content = [Mock(text="Analysis result.")]
|
mock_response.choices = [Mock(message=Mock(content="Analysis result."))]
|
||||||
mock_client.messages.create.return_value = mock_response
|
mock_client.chat.completions.create.return_value = mock_response
|
||||||
|
|
||||||
analyzer = LLMAnalyzer(api_key="test-key")
|
analyzer = LLMAnalyzer(api_key="test-key")
|
||||||
analyzer.analyze_patent_content("Test content", "TestCo")
|
analyzer.analyze_patent_content("Test content", "TestCo")
|
||||||
|
|
||||||
call_args = mock_client.messages.create.call_args
|
call_args = mock_client.chat.completions.create.call_args
|
||||||
# Single patent should use 1024 tokens
|
# Single patent should use 1024 tokens
|
||||||
assert call_args[1]["max_tokens"] == 1024
|
assert call_args[1]["max_tokens"] == 1024
|
||||||
|
|||||||