merge: resolve conflicts for S3 storage branch with main

Integrates S3/MinIO storage backend with structured logging changes from main. Both boto3 and apscheduler retained in requirements.txt.
Merge pull request 'feat: add webhook notification support for job completion' (#66) from feature/webhooks into main
2026-03-26 12:09:24 +00:00 · 2026-03-26 12:08:08 +00:00 · 2026-03-26 12:07:46 +00:00 · 2026-03-26 12:06:57 +00:00 · 2026-03-26 12:06:33 +00:00 · 2026-03-26 12:06:04 +00:00
41 changed files with 6795 additions and 304 deletions
@@ -1,21 +1,60 @@
 # SPARC Configuration
 # ---- Application Environment ----
 # Set to "production" or "staging" in deployed environments.
 # The API will refuse to start with the default JWT secret unless APP_ENV=development.
 APP_ENV=development
 # ---- API Keys ----
 # SerpAPI key for patent search
 API_KEY=your_serpapi_key_here
 # OpenRouter API key for LLM analysis
 OPENROUTER_API_KEY=your_openrouter_key_here
-# Database configuration
+# ---- Database ----
 # All messages are stored in the database for persistence and caching
-DATABASE_URL=postgresql://postgres:postgres@localhost:5432/sparc
-# Cache configuration
-# When USE_CACHE=true: check database for cached responses before making API calls
-# When USE_CACHE=false: always make fresh API calls (still stores results in database)
-# Default: true
-USE_CACHE=true
-# JWT Secret for authentication
+# PostgreSQL credentials (used by docker-compose)
+POSTGRES_USER=postgres
+POSTGRES_PASSWORD=change-me-to-a-secure-password
+POSTGRES_DB=sparc
+# Full database URL (must match the credentials above)
+DATABASE_URL=postgresql://postgres:change-me-to-a-secure-password@localhost:5432/sparc
 # ---- Authentication ----
 # JWT Secret for signing tokens
 # IMPORTANT: Change this to a secure random string in production
 JWT_SECRET=your-secure-jwt-secret-change-in-production
 # ---- CORS ----
 # Comma-separated list of allowed origins for CORS
 # Defaults to http://localhost:3000,http://localhost:5173 when unset
 # CORS_ORIGINS=https://sparc.example.com,https://app.example.com
 # ---- Storage ----
 # Backend for patent PDF storage: "local" (default) or "s3"
 STORAGE_BACKEND=local
 # S3/MinIO settings (only used when STORAGE_BACKEND=s3)
 # S3_BUCKET=sparc-patents
 # S3_ENDPOINT_URL=http://localhost:9000
 # AWS_ACCESS_KEY_ID=minioadmin
 # AWS_SECRET_ACCESS_KEY=minioadmin
 # To start MinIO locally: docker compose --profile s3 up -d minio
 # ---- Cache ----
 # When USE_CACHE=true: check database for cached responses before making API calls
 # When USE_CACHE=false: always make fresh API calls (still stores results in database)
 USE_CACHE=true
 # ---- Webhooks ----
 # Comma-separated list of webhook URLs for job completion and alert notifications
 # Supports generic HTTP POST and Slack/Discord incoming webhooks
 # WEBHOOK_URLS=https://hooks.slack.com/services/XXX,https://example.com/webhook
@@ -9,7 +9,43 @@ on:
  workflow_dispatch:
 jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Install system dependencies
        shell: sh
        run: |
          apk add --no-cache git python3 py3-pip gcc musl-dev libpq-dev python3-dev
      - name: Checkout code
        shell: sh
        run: |
          git clone http://gitea.gitea.svc.cluster.local/${{ gitea.repository }}.git .
          git checkout ${{ gitea.sha }}
      - name: Install Python dependencies
        shell: sh
        run: |
          pip3 install --break-system-packages -r requirements.txt ruff
      - name: Run ruff linter
        shell: sh
        run: |
          ruff check SPARC/ tests/
      - name: Run pytest
        shell: sh
        env:
          DATABASE_URL: "sqlite://"
          API_KEY: "test-key"
          OPENROUTER_API_KEY: "test-key"
          JWT_SECRET: "test-secret-for-ci"
          APP_ENV: "development"
        run: |
          python3 -m pytest tests/ -v --tb=short -x
  build-api:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Install dependencies
@@ -81,6 +117,7 @@ jobs:
          echo "API image available at ${{ steps.tags.outputs.IMAGE_TAG }}"
  build-frontend:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Install dependencies
@@ -0,0 +1,57 @@
 name: Test and Lint
 on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
  workflow_dispatch:
 jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Install system dependencies
        shell: sh
        run: |
          apk add --no-cache git python3 py3-pip gcc musl-dev libpq-dev python3-dev
      - name: Checkout code
        shell: sh
        run: |
          git clone http://gitea.gitea.svc.cluster.local/${{ gitea.repository }}.git .
          git checkout ${{ gitea.sha }}
      - name: Install Python dependencies
        shell: sh
        run: |
          pip3 install --break-system-packages -r requirements.txt ruff
      - name: Run ruff linter
        shell: sh
        run: |
          ruff check SPARC/ tests/
      - name: Install Node.js and frontend dependencies
        shell: sh
        run: |
          apk add --no-cache nodejs npm
          cd frontend && npm ci
      - name: Run TypeScript type check
        shell: sh
        run: |
          cd frontend && npx tsc --noEmit
      - name: Run pytest
        shell: sh
        env:
          DATABASE_URL: "sqlite://"
          API_KEY: "test-key"
          OPENROUTER_API_KEY: "test-key"
          JWT_SECRET: "test-secret-for-ci"
          APP_ENV: "development"
        run: |
          python3 -m pytest tests/ -v --tb=short -x
@@ -54,6 +54,21 @@ docker-compose up -d
 # - API Docs: http://localhost:8000/docs
 ```
 #### Patent PDF Storage
 The API stores downloaded patent PDFs in a `patents/` directory. In Docker,
 this is mounted as a bind mount (`./patents:/app/patents`) so that PDFs persist
 across container restarts.
 If you deploy to a different environment, ensure the `patents/` directory is a
 persistent volume. Without it, PDFs will be re-downloaded on every analysis.
 ```yaml
 # docker-compose.yml excerpt
 volumes:
  - ./patents:/app/patents
 ```
 ### NixOS
 ```bash
@@ -1,3 +1,4 @@
-from .types import Patents, Patent
+from .types import Patent as Patent
 from .types import Patents as Patents
-all = ["Patents", "Patent"]
+__all__ = ["Patents", "Patent"]
@@ -5,14 +5,17 @@ to provide company performance estimation based on patent portfolios.
 """
 import hashlib
 import logging
 from concurrent.futures import ThreadPoolExecutor, as_completed
 from typing import Callable
 from SPARC import config
 logger = logging.getLogger(__name__)
 from SPARC.database import DatabaseClient
-from SPARC.serp_api import SERP
 from SPARC.llm import LLMAnalyzer
-from SPARC.types import Patent, Patents, CompanyAnalysisResult, BatchAnalysisResult
+from SPARC.serp_api import SERP
+from SPARC.types import BatchAnalysisResult, CompanyAnalysisResult, Patent, Patents
 class CompanyAnalyzer:
@@ -52,13 +55,13 @@ class CompanyAnalyzer:
            query_hash = hashlib.sha256(company_name.lower().encode()).hexdigest()
            cached_ids = self.db.get_cached_serp_query(query_hash)
            if cached_ids is not None:
-                print(f"Using cached SERP results for {company_name} ({len(cached_ids)} patents)")
+                logger.info("Using cached SERP results for %s (%d patents)", company_name, len(cached_ids))
                patents = Patents(patents=[
                    Patent(patent_id=pid, pdf_link="")
                    for pid in cached_ids
                ])
            else:
-                print(f"Retrieving patents for {company_name}...")
+                logger.info("Retrieving patents for %s...", company_name)
                patents = SERP.query(company_name)
                # Cache the SERP results
                if patents.patents:
@@ -66,12 +69,13 @@ class CompanyAnalyzer:
                        company_name=company_name,
                        query_hash=query_hash,
                        patent_ids=[p.patent_id for p in patents.patents],
                        ttl_hours=config.serp_cache_ttl_hours,
                    )
        if not patents.patents:
            return f"No patents found for {company_name}"
-        print(f"Found {len(patents.patents)} patents. Processing...")
+        logger.info("Found %d patents. Processing...", len(patents.patents))
        # Download, parse, and minimize patents in parallel
        processed_patents = []
@@ -87,12 +91,12 @@ class CompanyAnalyzer:
                    if result:
                        processed_patents.append(result)
                except Exception as e:
-                    print(f"Warning: Failed to process {patent.patent_id}: {e}")
+                    logger.warning("Failed to process %s: %s", patent.patent_id, e)
        if not processed_patents:
            return f"Failed to process any patents for {company_name}"
-        print(f"Analyzing portfolio with LLM...")
+        logger.info("Analyzing portfolio with LLM...")
        # Analyze the full portfolio with LLM
        analysis = self.llm_analyzer.analyze_patent_portfolio(
@@ -104,21 +108,44 @@ class CompanyAnalyzer:
    def analyze_single_patent(self, patent_id: str, company_name: str) -> str:
        """Analyze a single patent by ID.
-        Useful for focused analysis of specific innovations.
+        If the patent PDF is not already on disk, this method attempts to
        download it automatically by looking up the PDF link in the database
        cache. If the link is not cached either, a ``FileNotFoundError`` is
        raised with instructions on how to obtain the PDF.
        Args:
-          patent_id: Publication ID of the patent
+          patent_id: Publication ID of the patent (e.g. "US-11234567-B2")
          company_name: Name of the company (for context)
        Returns:
          Analysis of the specific patent's innovation quality
        Raises:
          FileNotFoundError: If the patent PDF cannot be found or downloaded.
        """
-        # Note: This simplified version assumes the patent PDF is already downloaded
-        # A more complete implementation would support direct patent ID lookup
-        print(f"Analyzing patent {patent_id} for {company_name}...")
+        import os
+        logger.info("Analyzing patent %s for %s...", patent_id, company_name)
        patent_path = f"patents/{patent_id}.pdf"
        if not os.path.exists(patent_path):
            # Attempt to download the PDF automatically from cached metadata
            cached = self.db.get_cached_patent(patent_id)
            pdf_link = cached.get("pdf_link") if cached else None
            if pdf_link:
                logger.info("PDF not on disk; downloading %s from cached link", patent_id)
                patent = SERP.save_patents(
                    Patent(patent_id=patent_id, pdf_link=pdf_link)
                )
                patent_path = patent.pdf_path
            else:
                raise FileNotFoundError(
                    f"Patent PDF not found at '{patent_path}' and no download link is "
                    f"cached for '{patent_id}'. Run a company analysis first to populate "
                    f"the cache, or call SERP.save_patents() with the patent's PDF link."
                )
        try:
            sections = SERP.parse_patent_pdf(patent_path)
            minimized_content = SERP.minimize_patent_for_llm(sections)
@@ -129,6 +156,8 @@ class CompanyAnalyzer:
            return analysis
        except FileNotFoundError:
            raise
        except Exception as e:
            return f"Failed to analyze patent {patent_id}: {e}"
@@ -169,7 +198,7 @@ class CompanyAnalyzer:
            return {"patent_id": patent.patent_id, "content": minimized_content}
        except Exception as e:
-            print(f"Warning: Failed to process {patent.patent_id}: {e}")
+            logger.warning("Failed to process %s: %s", patent.patent_id, e)
            return None
    def _analyze_company_safe(self, company_name: str) -> CompanyAnalysisResult:
@@ -240,7 +269,7 @@ class CompanyAnalyzer:
        results: list[CompanyAnalysisResult] = []
        total = len(companies)
-        print(f"Starting batch analysis of {total} companies...")
+        logger.info("Starting batch analysis of %d companies...", total)
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            future_to_company = {
@@ -257,8 +286,8 @@ class CompanyAnalyzer:
                    result = future.result()
                    results.append(result)
-                    status = "✓" if result.success else "✗"
+                    status = "OK" if result.success else "FAIL"
-                    print(f"[{completed}/{total}] {status} {company}")
+                    logger.info("[%d/%d] %s %s", completed, total, status, company)
                    if progress_callback:
                        progress_callback(company, completed, total)
@@ -273,12 +302,12 @@ class CompanyAnalyzer:
                            error=str(e),
                        )
                    )
-                    print(f"[{completed}/{total}] ✗ {company}: {e}")
+                    logger.error("[%d/%d] FAIL %s: %s", completed, total, company, e)
        successful = sum(1 for r in results if r.success)
        failed = total - successful
-        print(f"\nBatch complete: {successful} succeeded, {failed} failed")
+        logger.info("Batch complete: %d succeeded, %d failed", successful, failed)
        return BatchAnalysisResult(
            results=results,
@@ -304,20 +333,20 @@ class CompanyAnalyzer:
        results: list[CompanyAnalysisResult] = []
        total = len(companies)
-        print(f"Starting sequential analysis of {total} companies...")
+        logger.info("Starting sequential analysis of %d companies...", total)
        for idx, company in enumerate(companies, 1):
-            print(f"\n[{idx}/{total}] Analyzing {company}...")
+            logger.info("[%d/%d] Analyzing %s...", idx, total, company)
            result = self._analyze_company_safe(company)
            results.append(result)
-            status = "✓" if result.success else "✗"
+            status = "OK" if result.success else "FAIL"
-            print(f"[{idx}/{total}] {status} {company}")
+            logger.info("[%d/%d] %s %s", idx, total, status, company)
        successful = sum(1 for r in results if r.success)
        failed = total - successful
-        print(f"\nBatch complete: {successful} succeeded, {failed} failed")
+        logger.info("Batch complete: %d succeeded, %d failed", successful, failed)
        return BatchAnalysisResult(
            results=results,
@@ -7,20 +7,27 @@ from contextlib import asynccontextmanager
 from datetime import datetime
 from typing import Annotated, List
-from fastapi import BackgroundTasks, Depends, FastAPI, HTTPException, Query
+from fastapi import BackgroundTasks, Depends, FastAPI, HTTPException, Query, Request
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import JSONResponse, StreamingResponse
 from pydantic import BaseModel, EmailStr, Field
 from slowapi import Limiter
 from slowapi.errors import RateLimitExceeded
 from slowapi.util import get_remote_address
 from SPARC import config
 from SPARC.analyzer import CompanyAnalyzer
 from SPARC.auth import (
    TokenResponse,
    UserResponse,
    check_jwt_secret,
    close_db_client,
    create_tokens,
    decode_token,
    get_current_admin,
    get_current_user,
    get_db_client,
    init_db_client,
 )
 from SPARC.types import BatchAnalysisResult, CompanyAnalysisResult
@@ -70,6 +77,13 @@ class JobStatus(BaseModel):
    error: str | None = None
 class PaginatedJobsResponse(BaseModel):
    """Paginated response for job listings."""
    items: list["JobStatus"]
    next_cursor: str | None = None
 class HealthResponse(BaseModel):
    """Health check response."""
@@ -149,6 +163,8 @@ _analyzer: CompanyAnalyzer | None = None
 async def lifespan(app: FastAPI):
    """Initialize resources on startup, clean up on shutdown."""
    global _analyzer
    check_jwt_secret()
    init_db_client()
    _analyzer = CompanyAnalyzer()
    # Mark any jobs that were running/pending before the restart as failed
    from SPARC.database import DatabaseClient
@@ -160,9 +176,13 @@ async def lifespan(app: FastAPI):
        import logging
        logging.getLogger(__name__).warning("Marked %d stale jobs as failed on startup", stale)
    _db.close()
    # Start scheduled analysis if tracked companies are configured
    from SPARC.scheduler import start_scheduler
    start_scheduler()
    yield
-    # Cleanup if needed
+    # Cleanup
    _analyzer = None
    close_db_client()
 app = FastAPI(
@@ -173,10 +193,26 @@ app = FastAPI(
    root_path=config.root_path,
 )
 # Rate limiter (in-memory storage, suitable for single-instance deployments)
 limiter = Limiter(key_func=get_remote_address)
 app.state.limiter = limiter
@app.exception_handler(RateLimitExceeded)
 async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
    """Return 429 with Retry-After header when rate limit is exceeded."""
    retry_after = getattr(exc, "retry_after", 60)
    return JSONResponse(
        status_code=429,
        content={"detail": "Rate limit exceeded. Please try again later."},
        headers={"Retry-After": str(retry_after)},
    )
 # Add CORS middleware for React frontend
 app.add_middleware(
    CORSMiddleware,
-    allow_origins=["http://localhost:3000", "http://localhost:5173"],
+    allow_origins=config.cors_origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
@@ -187,7 +223,8 @@ app.add_middleware(
@app.post("/auth/register", response_model=UserResponse, tags=["Auth"])
-async def register(request: RegisterRequest):
+@limiter.limit("5/minute")
+async def register(request: Request, body: RegisterRequest):
    """Register a new user.
    The first registered user automatically becomes an admin.
@@ -199,8 +236,8 @@ async def register(request: RegisterRequest):
    role = "admin" if user_count == 0 else "user"
    user = db.create_user(
-        email=request.email,
+        email=body.email,
-        password=request.password,
+        password=body.password,
        role=role,
    )
@@ -219,11 +256,12 @@ async def register(request: RegisterRequest):
@app.post("/auth/login", response_model=TokenResponse, tags=["Auth"])
-async def login(request: LoginRequest):
+@limiter.limit("10/minute")
+async def login(request: Request, body: LoginRequest):
    """Authenticate user and return JWT tokens."""
    db = get_db_client()
-    user = db.authenticate_user(request.email, request.password)
+    user = db.authenticate_user(body.email, body.password)
    if not user:
        raise HTTPException(
@@ -341,6 +379,60 @@ async def delete_user(
    return {"message": "User deleted"}
 # ============== Tracked Companies Endpoints ==============
 class TrackCompanyRequest(BaseModel):
    """Request to add a company to tracking."""
    company_name: str = Field(..., min_length=1, max_length=255)
@app.get("/admin/tracked", tags=["Admin"])
 async def list_tracked_companies(
    _: UserResponse = Depends(get_current_admin),
 ):
    """List all tracked companies (admin only)."""
    db = get_db_client()
    return db.list_tracked_companies()
@app.post("/admin/tracked", tags=["Admin"])
 async def add_tracked_company(
    request: TrackCompanyRequest,
    _: UserResponse = Depends(get_current_admin),
 ):
    """Add a company to the tracked list (admin only)."""
    db = get_db_client()
    result = db.add_tracked_company(request.company_name)
    if not result:
        raise HTTPException(status_code=409, detail="Company already tracked")
    return result
@app.delete("/admin/tracked/{company_name}", tags=["Admin"])
 async def remove_tracked_company(
    company_name: str,
    _: UserResponse = Depends(get_current_admin),
 ):
    """Remove a company from the tracked list (admin only)."""
    db = get_db_client()
    removed = db.remove_tracked_company(company_name)
    if not removed:
        raise HTTPException(status_code=404, detail="Company not found in tracking list")
    return {"message": f"Stopped tracking {company_name}"}
@app.get("/admin/alerts", tags=["Admin"])
 async def list_alerts(
    limit: int = Query(default=50, ge=1, le=200),
    _: UserResponse = Depends(get_current_admin),
 ):
    """List recent alerts from scheduled analysis (admin only)."""
    db = get_db_client()
    return db.list_alerts(limit=limit)
 # ============== Analytics Endpoint ==============
@@ -361,6 +453,61 @@ async def get_analytics(
    )
 # ============== Export Endpoints ==============
@app.get("/export/{company_name}", tags=["Export"])
 async def export_company_csv(
    company_name: str,
    _: UserResponse = Depends(get_current_user),
 ):
    """Export analysis results for a company as a CSV file.
    Returns all stored analysis records for the given company, including
    analysis type, model used, response text, and timestamp.
    Args:
        company_name: Company name to export results for
    Returns:
        CSV file download
    """
    import csv
    import io
    db = get_db_client()
    # Query all non-cached analysis results for this company
    with db.get_conn() as conn:
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT company_name, analysis_type, model, response, timestamp
                FROM llm_messages
                WHERE LOWER(company_name) = LOWER(%s) AND is_cached = FALSE
                ORDER BY timestamp DESC
                """,
                (company_name,),
            )
            rows = cur.fetchall()
    if not rows:
        raise HTTPException(status_code=404, detail=f"No analysis results found for '{company_name}'")
    output = io.StringIO()
    writer = csv.writer(output)
    writer.writerow(["company_name", "analysis_type", "model", "analysis", "timestamp"])
    for row in rows:
        writer.writerow(row)
    output.seek(0)
    safe_name = company_name.replace(" ", "_").lower()
    return StreamingResponse(
        iter([output.getvalue()]),
        media_type="text/csv",
        headers={"Content-Disposition": f'attachment; filename="sparc_{safe_name}_export.csv"'},
    )
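The export endpoint above builds the CSV in memory with `csv.writer` over `io.StringIO`. The following is a minimal sketch of the same pattern outside FastAPI, assuming rows arrive as plain tuples the way a database cursor would return them. `rows_to_csv` and `safe_filename` are hypothetical helpers that do not appear in this PR, and the stricter filename filter is an assumption of this sketch, not what the diff does (the diff only replaces spaces and lowercases).

```python
import csv
import io
import re

HEADER = ["company_name", "analysis_type", "model", "analysis", "timestamp"]


def rows_to_csv(rows):
    """Serialize row tuples to CSV text; csv.writer quotes commas and newlines."""
    output = io.StringIO()
    writer = csv.writer(output)
    writer.writerow(HEADER)
    writer.writerows(rows)
    return output.getvalue()


def safe_filename(company_name):
    """Hypothetical helper: keep only characters that are safe inside a header value."""
    slug = re.sub(r"[^a-z0-9_-]+", "_", company_name.lower()).strip("_")
    return f"sparc_{slug or 'company'}_export.csv"


# Round-trip a row containing a comma and an embedded newline.
text = rows_to_csv(
    [("Acme, Inc.", "portfolio", "some-model", "line 1\nline 2", "2026-03-26")]
)
parsed = list(csv.reader(io.StringIO(text)))
```

Filtering the filename this way would keep quotes or control characters in a company name out of the `Content-Disposition` header; whether that is needed depends on how company names are validated upstream.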
 # ============== System Endpoints ==============
@@ -401,6 +548,38 @@ async def analyze_company(
    return _convert_result(result)
@app.get(
    "/analyze/patent/{patent_id}",
    tags=["Analysis"],
 )
 async def analyze_single_patent(
    patent_id: str,
    company_name: str = Query(description="Company name for analysis context"),
    _: UserResponse = Depends(get_current_user),
 ):
    """Analyze a single patent by its publication ID.
    If the patent PDF is not already cached locally, the system will attempt
    to download it automatically from a previously cached link. If no link
    is available, a 404 error is returned.
    Args:
        patent_id: Patent publication ID (e.g. "US-11234567-B2")
        company_name: Company name for analysis context
    Returns:
        Analysis text for the patent
    """
    if not _analyzer:
        raise HTTPException(status_code=503, detail="Analyzer not initialized")
    try:
        analysis = _analyzer.analyze_single_patent(patent_id, company_name)
        return {"patent_id": patent_id, "company_name": company_name, "analysis": analysis}
    except FileNotFoundError as e:
        raise HTTPException(status_code=404, detail=str(e))
@app.post(
    "/analyze/batch",
    response_model=BatchAnalysisResponse,
@@ -491,8 +670,25 @@ def _run_batch_job(job_id: str, companies: list[str], max_workers: int):
            progress=100,
            result_json=_json.dumps(batch_response.model_dump(), default=str),
        )
        # Fire webhook notification
        from SPARC.webhooks import notify_job_completed
        notify_job_completed(
            job_id=job_id,
            status="completed",
            total_companies=result.total_companies,
            successful=result.successful,
            failed=result.failed,
        )
    except Exception as e:
        db.update_job(job_id, status="failed", error=str(e))
        from SPARC.webhooks import notify_job_completed
        notify_job_completed(
            job_id=job_id,
            status="failed",
            total_companies=len(companies),
            successful=0,
            failed=len(companies),
        )
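`SPARC.webhooks` is not shown in this part of the diff, so the following is only a sketch of what a notifier with the keyword arguments used above might do, not the PR's implementation. `build_payload` and `post_json` are hypothetical names. The sketch assumes the standard library's `urllib.request`, a `{"text": ...}` body for Slack incoming webhooks, a `{"content": ...}` body for Discord webhooks, and a generic JSON body for everything else.

```python
import json
import urllib.request


def build_payload(url, job_id, status, total_companies, successful, failed):
    """Pick a body shape based on the webhook host (hypothetical helper)."""
    summary = (
        f"Job {job_id} {status}: "
        f"{successful}/{total_companies} succeeded, {failed} failed"
    )
    if "hooks.slack.com" in url:
        return {"text": summary}
    if "discord.com/api/webhooks" in url:
        return {"content": summary}
    return {
        "event": "job_completed",
        "job_id": job_id,
        "status": status,
        "total_companies": total_companies,
        "successful": successful,
        "failed": failed,
    }


def post_json(url, payload, timeout=5.0):
    """POST JSON and report success; network errors return False instead of raising."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

Swallowing network errors in `post_json` is a design choice of the sketch: a dead webhook endpoint should probably not turn a completed job into a failed one.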
@app.post("/analyze/batch/async", response_model=JobStatus, tags=["Analysis"])
@@ -549,24 +745,51 @@ async def get_job_status(
    return _job_row_to_status(job_row)
-@app.get("/jobs", response_model=list[JobStatus], tags=["Jobs"])
+@app.get("/jobs", response_model=PaginatedJobsResponse, tags=["Jobs"])
 async def list_jobs(
    status: Annotated[
        str | None,
        Query(description="Filter by status: pending, running, completed, failed"),
    ] = None,
    limit: Annotated[int, Query(ge=1, le=100)] = 10,
    cursor: Annotated[
        str | None,
        Query(description="Opaque cursor from a previous response's next_cursor field"),
    ] = None,
    _: UserResponse = Depends(get_current_user),
 ):
-    """List all analysis jobs.
+    """List analysis jobs with cursor-based pagination.
    Pass ``limit`` to control page size. The response includes a ``next_cursor``
    field; pass it back as the ``cursor`` query parameter to fetch the next page.
    When ``next_cursor`` is ``null``, there are no more results.
    Existing clients that use only ``limit`` (without ``cursor``) continue to
    work without modification.
    Args:
        status: Optional filter by job status
        limit: Maximum number of jobs to return (default 10, max 100)
        cursor: Opaque pagination cursor from a previous response
    Returns:
-        List of job statuses
+        Paginated list of job statuses
    """
    db = _get_job_db()
-    job_rows = db.list_jobs(status=status, limit=limit)
-    return [_job_row_to_status(row) for row in job_rows]
+    # Fetch one extra to determine if there is a next page
+    job_rows = db.list_jobs(status=status, limit=limit + 1, cursor=cursor)
    has_next = len(job_rows) > limit
    if has_next:
        job_rows = job_rows[:limit]
    items = [_job_row_to_status(row) for row in job_rows]
    next_cursor = None
    if has_next and job_rows:
        last = job_rows[-1]
        created = last["created_at"]
        ts = created.isoformat() if hasattr(created, "isoformat") else str(created)
        next_cursor = f"{ts}|{last['job_id']}"
    return PaginatedJobsResponse(items=items, next_cursor=next_cursor)
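The "fetch one extra row" pagination above can be exercised without a database. The sketch below assumes jobs are ordered newest first by `(created_at, job_id)` and that the cursor is the `"<iso timestamp>|<job_id>"` string built in the handler. `fetch_page`, `encode_cursor`, and `decode_cursor` are hypothetical helpers; how `db.list_jobs` actually interprets the cursor is not shown in this diff.

```python
from datetime import datetime, timedelta


def encode_cursor(row):
    """Build the opaque cursor from the last row of a page."""
    return f"{row['created_at'].isoformat()}|{row['job_id']}"


def decode_cursor(cursor):
    """Split the cursor back into (created_at, job_id)."""
    ts, _, job_id = cursor.partition("|")
    return datetime.fromisoformat(ts), job_id


def fetch_page(jobs, limit, cursor=None):
    """Return (items, next_cursor) for jobs ordered newest first."""
    ordered = sorted(
        jobs, key=lambda r: (r["created_at"], r["job_id"]), reverse=True
    )
    if cursor is not None:
        boundary = decode_cursor(cursor)
        # Keep only rows strictly older than the last row already returned.
        ordered = [r for r in ordered if (r["created_at"], r["job_id"]) < boundary]
    rows = ordered[: limit + 1]  # one extra row signals that another page exists
    has_next = len(rows) > limit
    rows = rows[:limit]
    next_cursor = encode_cursor(rows[-1]) if has_next and rows else None
    return rows, next_cursor


base = datetime(2026, 3, 26, 12, 0, 0)
jobs = [
    {"job_id": f"job-{i}", "created_at": base + timedelta(minutes=i)}
    for i in range(5)
]

# Client loop: keep passing next_cursor back until it comes back as None.
seen, cursor = [], None
while True:
    items, cursor = fetch_page(jobs, limit=2, cursor=cursor)
    seen.extend(r["job_id"] for r in items)
    if cursor is None:
        break
```

Including `job_id` in the cursor breaks ties between jobs created at the same timestamp, which a timestamp-only cursor would skip or repeat.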
@@ -13,11 +13,25 @@ from SPARC import config
 from SPARC.database import DatabaseClient
 # JWT Configuration
-JWT_SECRET = os.getenv("JWT_SECRET", "sparc-secret-key-change-in-production")
+_DEFAULT_JWT_SECRET = "sparc-secret-key-change-in-production"
 JWT_SECRET = os.getenv("JWT_SECRET", _DEFAULT_JWT_SECRET)
 JWT_ALGORITHM = "HS256"
 ACCESS_TOKEN_EXPIRE_MINUTES = 30
 REFRESH_TOKEN_EXPIRE_DAYS = 7
 def check_jwt_secret() -> None:
    """Refuse to start with the default JWT secret in non-development environments.
    Raises:
        RuntimeError: If JWT_SECRET is the default value and APP_ENV is not 'development'.
    """
    if JWT_SECRET == _DEFAULT_JWT_SECRET and config.app_env != "development":
        raise RuntimeError(
            f"FATAL: JWT_SECRET is set to the default value and APP_ENV={config.app_env!r}. "
            "Set a secure JWT_SECRET environment variable before running in non-development environments."
        )
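The startup guard above can be checked in isolation. This is a minimal sketch that takes the secret and environment as arguments instead of reading module globals; `check_secret` and `generate_secret` are hypothetical names, and using `secrets.token_urlsafe` to produce a replacement value is a suggestion of this sketch rather than something the PR prescribes.

```python
import secrets

DEFAULT_SECRET = "sparc-secret-key-change-in-production"


def check_secret(secret, app_env):
    """Raise if the default secret is used outside development."""
    if secret == DEFAULT_SECRET and app_env != "development":
        raise RuntimeError(
            f"JWT_SECRET is set to the default value and APP_ENV={app_env!r}."
        )


def generate_secret():
    """Return a random URL-safe string suitable for a JWT_SECRET value."""
    return secrets.token_urlsafe(64)


check_secret(DEFAULT_SECRET, "development")   # allowed: local development
check_secret(generate_secret(), "production")  # allowed: non-default secret

try:
    check_secret(DEFAULT_SECRET, "production")
    refused = False
except RuntimeError:
    refused = True
```

Note that the guard compares against one known default only; a weak but non-default secret still passes.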
 security = HTTPBearer()
@@ -132,11 +146,36 @@ def decode_token(token: str) -> Optional[TokenPayload]:
        return None
 # Shared database client singleton, initialized at startup via init_db_client()
 _db_client: DatabaseClient | None = None
 def init_db_client() -> None:
    """Initialize the shared database client. Call once at app startup."""
    global _db_client
    _db_client = DatabaseClient(config.database_url)
    _db_client.connect()
 def close_db_client() -> None:
    """Close the shared database client. Call at app shutdown."""
    global _db_client
    if _db_client:
        _db_client.close()
        _db_client = None
 def get_db_client() -> DatabaseClient:
-    """Get database client for auth operations."""
-    client = DatabaseClient(config.database_url)
-    client.connect()
-    return client
+    """Get the shared pooled database client for auth operations.
+
+    Returns the module-level singleton DatabaseClient. If not yet initialized
+    (e.g., during tests), creates a new instance as a fallback.
    """
    global _db_client
    if _db_client is None:
        _db_client = DatabaseClient(config.database_url)
        _db_client.connect()
    return _db_client
 async def get_current_user(
@@ -2,11 +2,20 @@
 Loads environment variables from .env file for API keys and other secrets.
 """
-from dotenv import load_dotenv
+import logging
 import os
 from dotenv import load_dotenv
 load_dotenv()
 # Logging configuration
 log_level = os.getenv("LOG_LEVEL", "INFO").upper()
 logging.basicConfig(
    level=getattr(logging, log_level, logging.INFO),
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
 )
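The `LOG_LEVEL` lookup above falls back to `INFO` when the name is not an attribute of the `logging` module. A small sketch of that resolution follows; `resolve_level` is a hypothetical helper, and the extra `isinstance` check is an addition of this sketch, covering names such as `BASIC_FORMAT` that exist on the module but are not levels.

```python
import logging


def resolve_level(name):
    """Map a level name to its numeric value, defaulting to INFO."""
    level = getattr(logging, name.upper(), logging.INFO)
    # Guard against module attributes that are not numeric levels.
    return level if isinstance(level, int) else logging.INFO


debug_level = resolve_level("debug")
unknown_level = resolve_level("bogus")
non_level = resolve_level("basic_format")
```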
 # SerpAPI key for patent search
 api_key = os.getenv("API_KEY")
@@ -30,6 +39,32 @@ use_database = os.getenv("USE_DATABASE", "false").lower() in ("true", "1", "yes"
 patent_search_days = int(os.getenv("PATENT_SEARCH_DAYS", "90"))
 patent_thread_workers = int(os.getenv("PATENT_THREAD_WORKERS", "5"))
 # LLM model to use via OpenRouter (e.g. "anthropic/claude-3.5-sonnet", "openai/gpt-4o")
 model = os.getenv("MODEL", "anthropic/claude-3.5-sonnet")
 # SERP cache TTL in hours (how long cached search results are considered fresh)
 serp_cache_ttl_hours = int(os.getenv("SERP_CACHE_TTL_HOURS", "24"))
 # Root path for running behind a reverse proxy (e.g., "/api" when served at /api/)
 # This ensures OpenAPI docs work correctly when accessed via the proxy
 root_path = os.getenv("ROOT_PATH", "")
 # Application environment: "development", "staging", or "production"
 # Used for safety checks (e.g., refusing default JWT secret in production)
 app_env = os.getenv("APP_ENV", "development")
 # Storage backend: "local" (default) or "s3" for S3/MinIO object storage
 storage_backend = os.getenv("STORAGE_BACKEND", "local")
 s3_bucket = os.getenv("S3_BUCKET", "sparc-patents")
 s3_endpoint_url = os.getenv("S3_ENDPOINT_URL", "")
 s3_access_key = os.getenv("AWS_ACCESS_KEY_ID", "")
 s3_secret_key = os.getenv("AWS_SECRET_ACCESS_KEY", "")
 # CORS allowed origins (comma-separated)
 # Defaults to localhost dev origins when unset
 _cors_origins_raw = os.getenv("CORS_ORIGINS", "")
 cors_origins: list[str] = (
    [o.strip() for o in _cors_origins_raw.split(",") if o.strip()]
    if _cors_origins_raw
    else ["http://localhost:3000", "http://localhost:5173"]
 )
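The `CORS_ORIGINS` parsing above can be sketched as a pure function, which makes its edge cases easy to see. `parse_origins` is a hypothetical name; the logic mirrors the expression in the diff.

```python
DEFAULT_ORIGINS = ("http://localhost:3000", "http://localhost:5173")


def parse_origins(raw, default=DEFAULT_ORIGINS):
    """Split a comma-separated origin list, trimming blanks and empty entries."""
    if not raw:
        return list(default)
    return [o.strip() for o in raw.split(",") if o.strip()]


unset = parse_origins("")
configured = parse_origins("https://sparc.example.com, https://app.example.com ,")
only_separators = parse_origins(" , ")
```

One consequence worth knowing: a value made only of commas and spaces is non-empty, so the dev defaults are not used and the result is an empty list, which would allow no origins at all.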
@@ -1,14 +1,15 @@
 """Database client for storing and retrieving LLM messages and user authentication."""
 import contextlib
-import psycopg2
-from psycopg2.pool import ThreadedConnectionPool
-from psycopg2.extras import RealDictCursor
-from typing import Dict, List, Optional
-from datetime import datetime, timedelta
-import json
 import hashlib
+import json
+from datetime import datetime, timedelta
+from typing import Dict, List, Optional
 import bcrypt
+import psycopg2
+from psycopg2.extras import RealDictCursor
+from psycopg2.pool import ThreadedConnectionPool
 class DatabaseClient:
@@ -191,6 +192,35 @@ class DatabaseClient:
                ON jobs(status)
            """)
            # Create tracked companies table for scheduled analysis
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS tracked_companies (
                    id SERIAL PRIMARY KEY,
                    company_name VARCHAR(255) UNIQUE NOT NULL,
                    last_patent_count INTEGER DEFAULT 0,
                    last_analysis_at TIMESTAMP,
                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                )
            """)
            # Create alerts table for significant changes
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS alerts (
                    id SERIAL PRIMARY KEY,
                    company_name VARCHAR(255) NOT NULL,
                    alert_type VARCHAR(50) NOT NULL,
                    message TEXT NOT NULL,
                    old_value NUMERIC,
                    new_value NUMERIC,
                    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                )
            """)
            cursor.execute("""
                CREATE INDEX IF NOT EXISTS idx_alerts_company
                ON alerts(company_name)
            """)
            self.conn.commit()
    @staticmethod
@@ -221,8 +251,6 @@ class DatabaseClient:
        Returns:
            Cached message dict if found, None otherwise
        """
-        self.connect()
        prompt_hash = self.hash_prompt(prompt)
        query = """
@@ -245,10 +273,11 @@ class DatabaseClient:
        query += " ORDER BY timestamp DESC LIMIT 1"
-        with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-            cursor.execute(query, params)
-            result = cursor.fetchone()
-            return dict(result) if result else None
+        with self.get_conn() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                cursor.execute(query, params)
+                result = cursor.fetchone()
+                return dict(result) if result else None
    def store_message(
        self,
@@ -276,33 +305,32 @@ class DatabaseClient:
        Returns:
            The ID of the inserted record
        """
-        self.connect()
        prompt_hash = self.hash_prompt(prompt)
-        with self.conn.cursor() as cursor:
-            cursor.execute(
-                """
-                INSERT INTO llm_messages
-                (prompt, prompt_hash, response, company_name, analysis_type, model, metadata, token_usage, is_cached)
-                VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)
-                RETURNING id
-                """,
-                (
-                    prompt,
-                    prompt_hash,
-                    response,
-                    company_name,
-                    analysis_type,
-                    model,
-                    json.dumps(metadata) if metadata else None,
-                    json.dumps(token_usage) if token_usage else None,
-                    is_cached,
-                ),
-            )
-            message_id = cursor.fetchone()[0]
-            self.conn.commit()
+        with self.get_conn() as conn:
+            with conn.cursor() as cursor:
+                cursor.execute(
+                    """
+                    INSERT INTO llm_messages
+                    (prompt, prompt_hash, response, company_name, analysis_type, model, metadata, token_usage, is_cached)
+                    VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)
+                    RETURNING id
+                    """,
+                    (
+                        prompt,
+                        prompt_hash,
+                        response,
+                        company_name,
+                        analysis_type,
+                        model,
+                        json.dumps(metadata) if metadata else None,
+                        json.dumps(token_usage) if token_usage else None,
+                        is_cached,
+                    ),
+                )
+                message_id = cursor.fetchone()[0]
+            conn.commit()
            return message_id
@@ -324,8 +352,6 @@ class DatabaseClient:
        Returns:
            List of message dictionaries
        """
-        self.connect()
        query = "SELECT * FROM llm_messages WHERE 1=1"
        params = []
@@ -340,9 +366,10 @@ class DatabaseClient:
        query += " ORDER BY timestamp DESC LIMIT %s OFFSET %s"
        params.extend([limit, offset])
-        with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-            cursor.execute(query, params)
-            return [dict(row) for row in cursor.fetchall()]
+        with self.get_conn() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                cursor.execute(query, params)
+                return [dict(row) for row in cursor.fetchall()]
    def get_analytics(self, days: int = 30) -> Dict:
        """Get analytics on message usage.
@@ -353,53 +380,52 @@ class DatabaseClient:
        Returns:
            Dictionary with analytics data
        """
-        self.connect()
-        with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-            # Total messages
-            cursor.execute(
-                """
-                SELECT COUNT(*) as total_messages
-                FROM llm_messages
-                WHERE timestamp >= NOW() - INTERVAL '%s days'
-                """,
-                (days,),
-            )
-            total = cursor.fetchone()["total_messages"]
-            # Messages by company
-            cursor.execute(
-                """
-                SELECT company_name, COUNT(*) as count
-                FROM llm_messages
-                WHERE timestamp >= NOW() - INTERVAL '%s days'
-                GROUP BY company_name
-                ORDER BY count DESC
-                LIMIT 10
-                """,
-                (days,),
-            )
-            by_company = cursor.fetchall()
-            # Messages by type
-            cursor.execute(
-                """
-                SELECT analysis_type, COUNT(*) as count
-                FROM llm_messages
-                WHERE timestamp >= NOW() - INTERVAL '%s days'
-                GROUP BY analysis_type
-                ORDER BY count DESC
-                """,
-                (days,),
-            )
-            by_type = cursor.fetchall()
-            return {
-                "total_messages": total,
-                "by_company": [dict(row) for row in by_company],
-                "by_type": [dict(row) for row in by_type],
-                "period_days": days,
-            }
+        with self.get_conn() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                # Total messages
+                cursor.execute(
+                    """
+                    SELECT COUNT(*) as total_messages
+                    FROM llm_messages
+                    WHERE timestamp >= NOW() - INTERVAL '%s days'
+                    """,
+                    (days,),
+                )
+                total = cursor.fetchone()["total_messages"]
+                # Messages by company
+                cursor.execute(
+                    """
+                    SELECT company_name, COUNT(*) as count
+                    FROM llm_messages
+                    WHERE timestamp >= NOW() - INTERVAL '%s days'
+                    GROUP BY company_name
+                    ORDER BY count DESC
+                    LIMIT 10
+                    """,
+                    (days,),
+                )
+                by_company = cursor.fetchall()
+                # Messages by type
+                cursor.execute(
+                    """
+                    SELECT analysis_type, COUNT(*) as count
+                    FROM llm_messages
+                    WHERE timestamp >= NOW() - INTERVAL '%s days'
+                    GROUP BY analysis_type
+                    ORDER BY count DESC
+                    """,
+                    (days,),
+                )
+                by_type = cursor.fetchall()
+                return {
+                    "total_messages": total,
+                    "by_company": [dict(row) for row in by_company],
+                    "by_type": [dict(row) for row in by_type],
+                    "period_days": days,
+                }
    # Patent Cache Methods
@@ -571,20 +597,45 @@ class DatabaseClient:
        self,
        status: Optional[str] = None,
        limit: int = 10,
+        cursor: Optional[str] = None,
     ) -> List[Dict]:
-        """List jobs, optionally filtered by status."""
-        query = "SELECT * FROM jobs"
+        """List jobs with optional status filter and cursor-based pagination.
+
+        Args:
+            status: Optional status filter (pending, running, completed, failed).
+            limit: Maximum number of jobs to return.
+            cursor: Opaque cursor (``created_at|job_id``) from a previous
+                response. When provided, only jobs older than the cursor are
+                returned.
+        Returns:
+            List of job dicts ordered by created_at descending.
+        """
+        conditions: list[str] = []
         params: list = []
         if status:
-            query += " WHERE status = %s"
+            conditions.append("status = %s")
             params.append(status)
-        query += " ORDER BY created_at DESC LIMIT %s"
+
+        if cursor:
+            try:
+                ts_str, cursor_job_id = cursor.rsplit("|", 1)
+                conditions.append("(created_at, job_id) < (%s, %s)")
+                params.extend([ts_str, cursor_job_id])
+            except ValueError:
+                pass  # Ignore malformed cursors; return from start
+        query = "SELECT * FROM jobs"
+        if conditions:
+            query += " WHERE " + " AND ".join(conditions)
+        query += " ORDER BY created_at DESC, job_id DESC LIMIT %s"
         params.append(limit)
         with self.get_conn() as conn:
-            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
-                cursor.execute(query, params)
-                return [dict(row) for row in cursor.fetchall()]
+            with conn.cursor(cursor_factory=RealDictCursor) as cur:
+                cur.execute(query, params)
+                return [dict(row) for row in cur.fetchall()]
    def mark_stale_jobs_failed(self) -> int:
        """Mark any jobs in 'running' or 'pending' state as 'failed'.
@@ -650,25 +701,23 @@ class DatabaseClient:
        Returns:
            Created user dict or None if email exists
        """
-        self.connect()
        password_hash = self.hash_password(password)
        try:
-            with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-                cursor.execute(
-                    """
-                    INSERT INTO users (email, password_hash, role)
-                    VALUES (%s, %s, %s)
-                    RETURNING id, email, role, created_at
-                    """,
-                    (email, password_hash, role),
-                )
-                user = cursor.fetchone()
-                self.conn.commit()
+            with self.get_conn() as conn:
+                with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                    cursor.execute(
+                        """
+                        INSERT INTO users (email, password_hash, role)
+                        VALUES (%s, %s, %s)
+                        RETURNING id, email, role, created_at
+                        """,
+                        (email, password_hash, role),
+                    )
+                    user = cursor.fetchone()
+                conn.commit()
                return dict(user) if user else None
        except psycopg2.errors.UniqueViolation:
-            self.conn.rollback()
            return None
    def authenticate_user(self, email: str, password: str) -> Optional[Dict]:
@@ -681,23 +730,22 @@ class DatabaseClient:
        Returns:
            User dict if authenticated, None otherwise
        """
-        self.connect()
-        with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-            cursor.execute(
-                "SELECT * FROM users WHERE email = %s",
-                (email,),
-            )
-            user = cursor.fetchone()
-
-            if user and self.verify_password(password, user["password_hash"]):
-                return {
-                    "id": user["id"],
-                    "email": user["email"],
-                    "role": user["role"],
-                    "created_at": user["created_at"],
-                }
-            return None
+        with self.get_conn() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                cursor.execute(
+                    "SELECT * FROM users WHERE email = %s",
+                    (email,),
+                )
+                user = cursor.fetchone()
+                if user and self.verify_password(password, user["password_hash"]):
+                    return {
+                        "id": user["id"],
+                        "email": user["email"],
+                        "role": user["role"],
+                        "created_at": user["created_at"],
+                    }
+                return None
    def get_user_by_id(self, user_id: int) -> Optional[Dict]:
        """Get a user by ID.
@@ -708,15 +756,14 @@ class DatabaseClient:
        Returns:
            User dict or None
        """
-        self.connect()
-
-        with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-            cursor.execute(
-                "SELECT id, email, role, created_at FROM users WHERE id = %s",
-                (user_id,),
-            )
-            user = cursor.fetchone()
-            return dict(user) if user else None
+        with self.get_conn() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                cursor.execute(
+                    "SELECT id, email, role, created_at FROM users WHERE id = %s",
+                    (user_id,),
+                )
+                user = cursor.fetchone()
+                return dict(user) if user else None
    def get_user_by_email(self, email: str) -> Optional[Dict]:
        """Get a user by email.
@@ -727,15 +774,14 @@ class DatabaseClient:
        Returns:
            User dict or None
        """
-        self.connect()
-
-        with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-            cursor.execute(
-                "SELECT id, email, role, created_at FROM users WHERE email = %s",
-                (email,),
-            )
-            user = cursor.fetchone()
-            return dict(user) if user else None
+        with self.get_conn() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                cursor.execute(
+                    "SELECT id, email, role, created_at FROM users WHERE email = %s",
+                    (email,),
+                )
+                user = cursor.fetchone()
+                return dict(user) if user else None
    def get_all_users(self, limit: int = 100, offset: int = 0) -> List[Dict]:
        """Get all users (admin only).
@@ -747,19 +793,18 @@ class DatabaseClient:
        Returns:
            List of user dicts
        """
-        self.connect()
-
-        with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-            cursor.execute(
-                """
-                SELECT id, email, role, created_at
-                FROM users
-                ORDER BY created_at DESC
-                LIMIT %s OFFSET %s
-                """,
-                (limit, offset),
-            )
-            return [dict(row) for row in cursor.fetchall()]
+        with self.get_conn() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                cursor.execute(
+                    """
+                    SELECT id, email, role, created_at
+                    FROM users
+                    ORDER BY created_at DESC
+                    LIMIT %s OFFSET %s
+                    """,
+                    (limit, offset),
+                )
+                return [dict(row) for row in cursor.fetchall()]
    def update_user_role(self, user_id: int, role: str) -> Optional[Dict]:
        """Update a user's role (admin only).
@@ -771,20 +816,19 @@ class DatabaseClient:
        Returns:
            Updated user dict or None
        """
-        self.connect()
-
-        with self.conn.cursor(cursor_factory=RealDictCursor) as cursor:
-            cursor.execute(
-                """
-                UPDATE users
-                SET role = %s, updated_at = CURRENT_TIMESTAMP
-                WHERE id = %s
-                RETURNING id, email, role, created_at
-                """,
-                (role, user_id),
-            )
-            user = cursor.fetchone()
-            self.conn.commit()
+        with self.get_conn() as conn:
+            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
+                cursor.execute(
+                    """
+                    UPDATE users
+                    SET role = %s, updated_at = CURRENT_TIMESTAMP
+                    WHERE id = %s
+                    RETURNING id, email, role, created_at
+                    """,
+                    (role, user_id),
+                )
+                user = cursor.fetchone()
+            conn.commit()
            return dict(user) if user else None
    def delete_user(self, user_id: int) -> bool:
@@ -796,12 +840,11 @@ class DatabaseClient:
        Returns:
            True if deleted
        """
-        self.connect()
-
-        with self.conn.cursor() as cursor:
-            cursor.execute("DELETE FROM users WHERE id = %s", (user_id,))
-            deleted = cursor.rowcount > 0
-            self.conn.commit()
+        with self.get_conn() as conn:
+            with conn.cursor() as cursor:
+                cursor.execute("DELETE FROM users WHERE id = %s", (user_id,))
+                deleted = cursor.rowcount > 0
+            conn.commit()
            return deleted
    def get_user_count(self) -> int:
@@ -810,8 +853,85 @@ class DatabaseClient:
        Returns:
            Number of users
        """
-        self.connect()
-        with self.conn.cursor() as cursor:
-            cursor.execute("SELECT COUNT(*) FROM users")
-            return cursor.fetchone()[0]
+        with self.get_conn() as conn:
+            with conn.cursor() as cursor:
+                cursor.execute("SELECT COUNT(*) FROM users")
+                return cursor.fetchone()[0]
+    # Tracked Companies Methods
+
+    def add_tracked_company(self, company_name: str) -> Optional[Dict]:
        """Add a company to the tracking list."""
        with self.get_conn() as conn:
            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
                try:
                    cursor.execute(
                        "INSERT INTO tracked_companies (company_name) VALUES (%s) RETURNING *",
                        (company_name,),
                    )
                    row = cursor.fetchone()
                    conn.commit()
                    return dict(row) if row else None
                except Exception:
                    conn.rollback()
                    return None
    def remove_tracked_company(self, company_name: str) -> bool:
        """Remove a company from the tracking list."""
        with self.get_conn() as conn:
            with conn.cursor() as cursor:
                cursor.execute(
                    "DELETE FROM tracked_companies WHERE LOWER(company_name) = LOWER(%s)",
                    (company_name,),
                )
                conn.commit()
                return cursor.rowcount > 0
    def list_tracked_companies(self) -> List[Dict]:
        """List all tracked companies."""
        with self.get_conn() as conn:
            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
                cursor.execute("SELECT * FROM tracked_companies ORDER BY company_name")
                return [dict(row) for row in cursor.fetchall()]
    def update_tracked_company(
        self, company_name: str, patent_count: int
    ) -> None:
        """Update the last analysis stats for a tracked company."""
        with self.get_conn() as conn:
            with conn.cursor() as cursor:
                cursor.execute(
                    """UPDATE tracked_companies
                       SET last_patent_count = %s, last_analysis_at = CURRENT_TIMESTAMP
                       WHERE LOWER(company_name) = LOWER(%s)""",
                    (patent_count, company_name),
                )
                conn.commit()
    def store_alert(
        self,
        company_name: str,
        alert_type: str,
        message: str,
        old_value: float | None = None,
        new_value: float | None = None,
    ) -> None:
        """Record an alert for a significant change."""
        with self.get_conn() as conn:
            with conn.cursor() as cursor:
                cursor.execute(
                    """INSERT INTO alerts (company_name, alert_type, message, old_value, new_value)
                       VALUES (%s, %s, %s, %s, %s)""",
                    (company_name, alert_type, message, old_value, new_value),
                )
                conn.commit()
    def list_alerts(self, limit: int = 50) -> List[Dict]:
        """List recent alerts."""
        with self.get_conn() as conn:
            with conn.cursor(cursor_factory=RealDictCursor) as cursor:
                cursor.execute(
                    "SELECT * FROM alerts ORDER BY created_at DESC LIMIT %s",
                    (limit,),
                )
                return [dict(row) for row in cursor.fetchall()]
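Review note on `list_jobs`: the cursor-based pagination described in its docstring can be sketched without a database. The block below rebuilds only the query construction. `build_jobs_query` and `next_cursor` are hypothetical names for illustration, and how the API actually encodes the cursor it hands back to clients is an assumption, since that code is not shown in this diff.

```python
from typing import Optional


def build_jobs_query(
    status: Optional[str] = None,
    limit: int = 10,
    cursor: Optional[str] = None,
) -> tuple[str, list]:
    """Build SQL and parameters the way list_jobs does in the diff."""
    conditions: list[str] = []
    params: list = []
    if status:
        conditions.append("status = %s")
        params.append(status)
    if cursor:
        try:
            ts_str, job_id = cursor.rsplit("|", 1)
        except ValueError:
            pass  # Malformed cursor: fall back to the first page.
        else:
            # Row-value comparison keeps the ordering stable when several
            # jobs share the same created_at timestamp.
            conditions.append("(created_at, job_id) < (%s, %s)")
            params.extend([ts_str, job_id])
    query = "SELECT * FROM jobs"
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    query += " ORDER BY created_at DESC, job_id DESC LIMIT %s"
    params.append(limit)
    return query, params


def next_cursor(rows: list[dict]) -> Optional[str]:
    """Hypothetical helper: encode the last row as created_at|job_id."""
    if not rows:
        return None
    last = rows[-1]
    return f"{last['created_at']}|{last['job_id']}"


page = [{"created_at": "2026-03-26 12:00:00", "job_id": "job-2"}]
query, params = build_jobs_query(status="running", limit=5, cursor=next_cursor(page))
print(query)
print(params)
```

Because a malformed cursor is ignored silently, a client that sends a corrupted cursor gets the first page again rather than an error. That matches the behavior in the diff but may be worth surfacing to callers.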
@@ -1,9 +1,14 @@
 """LLM integration for patent analysis using OpenRouter."""
+import logging
+from typing import Dict
 from openai import OpenAI
 from SPARC import config
 from SPARC.database import DatabaseClient
-from typing import Dict
+
+logger = logging.getLogger(__name__)
 class LLMAnalyzer:
@@ -20,7 +25,7 @@ class LLMAnalyzer:
        """
        self.test_mode = test_mode
        self.use_cache = use_cache if use_cache is not None else config.use_cache
-        self.model = "anthropic/claude-3.5-sonnet"
+        self.model = config.model
        # Always initialize database client for storage and caching
        self.db_client = DatabaseClient(config.database_url)
@@ -59,11 +64,7 @@ Patent Content:
 Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals about the company's technical direction and competitive advantage."""
        if self.test_mode:
-            print("=" * 80)
-            print("TEST MODE - Prompt that would be sent to LLM:")
-            print("=" * 80)
-            print(prompt)
-            print("=" * 80)
+            logger.debug("TEST MODE - Prompt that would be sent to LLM:\n%s", prompt)
            return "[TEST MODE - No API call made]"
        # Check cache first
@@ -165,7 +166,7 @@ Patent Portfolio:
 Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the company's innovation strength and performance outlook."""
        if self.test_mode:
-            print(prompt)
+            logger.debug("TEST MODE - Portfolio prompt:\n%s", prompt)
            return "[TEST MODE]"
        metadata = {
@@ -0,0 +1,109 @@
 """Scheduled patent analysis for tracked companies.
 Uses APScheduler to periodically re-analyze tracked companies and
 detect significant changes in patent counts.
 """
 import logging
 import os
 from SPARC import config
 from SPARC.analyzer import CompanyAnalyzer
 from SPARC.database import DatabaseClient
 logger = logging.getLogger(__name__)
 # Configurable via environment variable (in hours, default 24)
 SCHEDULE_INTERVAL_HOURS = int(os.getenv("SCHEDULE_INTERVAL_HOURS", "24"))
 # Patent count change threshold (percentage) to trigger an alert
 CHANGE_THRESHOLD_PERCENT = int(os.getenv("CHANGE_THRESHOLD_PERCENT", "20"))
 def run_scheduled_analysis() -> None:
    """Re-analyze all tracked companies and check for significant changes."""
    db = DatabaseClient(config.database_url)
    db.connect()
    db.initialize_schema()
    tracked = db.list_tracked_companies()
    if not tracked:
        logger.info("No tracked companies configured; skipping scheduled analysis")
        return
    logger.info("Running scheduled analysis for %d tracked companies", len(tracked))
    analyzer = CompanyAnalyzer(db_client=db)
    for company_row in tracked:
        name = company_row["company_name"]
        old_count = company_row.get("last_patent_count", 0) or 0
        try:
            result = analyzer._analyze_company_safe(name)
            if result.success:
                new_count = result.patent_count
                # Update tracking record
                db.update_tracked_company(name, new_count)
                # Check for significant change
                if old_count > 0:
                    delta_pct = abs(new_count - old_count) / old_count * 100
                    if delta_pct >= CHANGE_THRESHOLD_PERCENT:
                        direction = "increased" if new_count > old_count else "decreased"
                        message = (
                            f"Patent count for {name} {direction} by {delta_pct:.0f}% "
                            f"({old_count} -> {new_count})"
                        )
                        logger.warning("ALERT: %s", message)
                        db.store_alert(
                            company_name=name,
                            alert_type="patent_count_change",
                            message=message,
                            old_value=old_count,
                            new_value=new_count,
                        )
                elif new_count > 0:
                    # First analysis -- record baseline
                    logger.info("Baseline for %s: %d patents", name, new_count)
            else:
                logger.warning("Scheduled analysis failed for %s: %s", name, result.error)
        except Exception as e:
            logger.error("Error analyzing tracked company %s: %s", name, e)
    db.close()
    logger.info("Scheduled analysis complete")
 def start_scheduler() -> None:
    """Start the APScheduler background scheduler.
    Safe to call at application startup. If apscheduler is not installed,
    the function logs a warning and returns without starting anything.
    """
    try:
        from apscheduler.schedulers.background import BackgroundScheduler
    except ImportError:
        logger.warning(
            "apscheduler not installed; scheduled analysis disabled. "
            "Install with: pip install apscheduler"
        )
        return
    scheduler = BackgroundScheduler()
    scheduler.add_job(
        run_scheduled_analysis,
        "interval",
        hours=SCHEDULE_INTERVAL_HOURS,
        id="scheduled_patent_analysis",
        replace_existing=True,
    )
    scheduler.start()
    logger.info(
        "Scheduled patent analysis started (every %d hours, threshold %d%%)",
        SCHEDULE_INTERVAL_HOURS,
        CHANGE_THRESHOLD_PERCENT,
    )
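Review note on the scheduler: the alert rule inside `run_scheduled_analysis` is easy to test once it is pulled out of the loop. This is a minimal sketch, assuming a hypothetical pure function `change_alert` that reproduces the threshold logic and message format from the file above. It is not part of this change.

```python
from typing import Optional


def change_alert(
    name: str, old_count: int, new_count: int, threshold_pct: float = 20
) -> Optional[str]:
    """Return an alert message when the relative change meets the threshold."""
    if old_count <= 0:
        # No baseline yet, so a percentage change cannot be computed.
        return None
    delta_pct = abs(new_count - old_count) / old_count * 100
    if delta_pct < threshold_pct:
        return None
    direction = "increased" if new_count > old_count else "decreased"
    return (
        f"Patent count for {name} {direction} by {delta_pct:.0f}% "
        f"({old_count} -> {new_count})"
    )


print(change_alert("Acme", 100, 125))
print(change_alert("Acme", 100, 110))
print(change_alert("Acme", 0, 40))
```

With this rule a company whose stored count is 0 never raises an alert, and a drop to zero patents is reported as a 100% decrease.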
@@ -1,12 +1,29 @@
-import os
-import serpapi
-from SPARC import config
+import io
+import logging
 import re
-import pdfplumber  # pip install pdfplumber
-import requests
 from datetime import datetime, timedelta
 from typing import Dict
-from SPARC.types import Patents, Patent
+
+import pdfplumber  # pip install pdfplumber
+import requests
+import serpapi
+from SPARC import config
+from SPARC.storage import StorageBackend, get_storage_backend
+from SPARC.types import Patent, Patents
+logger = logging.getLogger(__name__)
+# Module-level storage instance (lazy-initialized)
+_storage: StorageBackend | None = None
+def _get_storage() -> StorageBackend:
+    global _storage
+    if _storage is None:
+        _storage = get_storage_backend()
+    return _storage
 class SERP:
  def query(company: str, days_back: int = None) -> Patents:
@@ -41,6 +58,7 @@ class SERP:
      "tbs": date_filter,
      "api_key": config.api_key,
    }
+    logger.info("Querying Google Patents for '%s' (last %d days)", company, days_back)
    search = serpapi.search(params)
    # Convert results to Patent objects, skipping any without PDF links
    patent_ids = []
@@ -49,13 +67,16 @@ class SERP:
        pdf_link = patent.get("pdf")
        if pdf_link:
            patent_ids.append(Patent(patent_id=patent["publication_number"], pdf_link=pdf_link, summary=None))
-        # Patents without PDF links are skipped (see docstring for details)
+        else:
+            logger.debug("Skipping patent %s (no PDF link)", patent.get("publication_number", "unknown"))
+    logger.info("Found %d patents with PDF links for '%s'", len(patent_ids), company)
    return Patents(patents=patent_ids)
   def save_patents(patent: Patent) -> Patent:
-    """
-    Save the patent PDF to the patents folder, skipping download if already cached.
+    """Save the patent PDF to storage, skipping download if already cached.
+
+    Uses the configured storage backend (local filesystem or S3).
    Args:
      patent: Patent object
@@ -63,35 +84,51 @@ class SERP:
    Returns:
      Patent object with updated PDF path
    """
-    pdf_path = f"patents/{patent.patent_id}.pdf"
-    os.makedirs("patents", exist_ok=True)
-    if not (os.path.exists(pdf_path) and os.path.getsize(pdf_path) > 0):
+    storage = _get_storage()
+    key = f"{patent.patent_id}.pdf"
+    if not storage.exists(key):
+      logger.info("Downloading PDF for %s", patent.patent_id)
       response = requests.get(patent.pdf_link)
-      with open(pdf_path, "wb") as f:
-        f.write(response.content)
+      storage.write(key, response.content)
+      logger.debug("Saved %d bytes for %s", len(response.content), patent.patent_id)
+    else:
+      logger.debug("Using cached PDF for %s", patent.patent_id)
-    patent.pdf_path = pdf_path
+    patent.pdf_path = storage.path_for(key)
    return patent
  def parse_patent_pdf(pdf_path: str) -> Dict:
    """Extract structured sections from patent PDF.
    Extracts all major sections from a patent PDF including abstract,
-    claims, summary, and detailed description.
+    claims, summary, and detailed description. Supports both local file
+    paths and S3 URIs (s3://bucket/key).
    Args:
-      pdf_path: Path to the patent PDF file
+      pdf_path: Local path or S3 URI to the patent PDF file
    Returns:
      Dictionary containing all extracted sections
    """
+    logger.debug("Parsing patent PDF: %s", pdf_path)
-    with pdfplumber.open(pdf_path) as pdf:
+    if pdf_path.startswith("s3://"):
+      # Read from S3 via storage backend
+      storage = _get_storage()
+      # Extract key from "s3://bucket/key"
+      key = pdf_path.split("/", 3)[-1]
+      data = storage.read(key)
+      pdf_file: io.BytesIO | str = io.BytesIO(data)
+    else:
+      pdf_file = pdf_path
+    with pdfplumber.open(pdf_file) as pdf:
       # Extract all text
       full_text = ""
       for page in pdf.pages:
         full_text += page.extract_text() + "\n"
+      logger.debug("Extracted text from %d pages (%d chars)", len(pdf.pages), len(full_text))
    # Define section patterns (common in patents)
    sections = {
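Review note on `parse_patent_pdf`: the key is taken from the URI with `pdf_path.split("/", 3)[-1]`. The sketch below compares that expression with a stricter split. `split_s3_uri` is a hypothetical helper for illustration only, and the example bucket and key names are assumptions.

```python
def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split s3://bucket/key into (bucket, key); hypothetical helper."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, sep, key = uri[len(prefix):].partition("/")
    if not bucket or not sep or not key:
        raise ValueError(f"S3 URI must include a bucket and a key: {uri!r}")
    return bucket, key


uri = "s3://sparc-patents/reports/US-12345678-B2.pdf"
print(split_s3_uri(uri))
# The expression used in the diff yields the same key for a well-formed URI.
print(uri.split("/", 3)[-1])
# For a URI with no key, the same expression returns the bucket name instead.
print("s3://sparc-patents".split("/", 3)[-1])
```

For well-formed URIs both approaches agree, including keys that contain slashes. The difference only shows up on a URI with no key, where the expression in the diff would pass the bucket name to `storage.read`.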
@@ -0,0 +1,171 @@
 """Patent PDF storage abstraction.
 Provides a unified interface for reading and writing patent PDF files,
 with pluggable backends for local filesystem and S3-compatible object
 storage (e.g., MinIO, AWS S3).
 """
 import logging
 import os
 from abc import ABC, abstractmethod
 from SPARC import config
 logger = logging.getLogger(__name__)
 class StorageBackend(ABC):
    """Abstract base class for patent PDF storage."""
    @abstractmethod
    def read(self, key: str) -> bytes:
        """Read a file by key.
        Args:
            key: Storage key (e.g., "US-12345678-B2.pdf")
        Returns:
            File contents as bytes.
        Raises:
            FileNotFoundError: If the file does not exist.
        """
    @abstractmethod
    def write(self, key: str, data: bytes) -> None:
        """Write data to storage.
        Args:
            key: Storage key (e.g., "US-12345678-B2.pdf")
            data: File contents as bytes.
        """
    @abstractmethod
    def exists(self, key: str) -> bool:
        """Check if a file exists in storage.
        Args:
            key: Storage key.
        Returns:
            True if the file exists and has non-zero size.
        """
    @abstractmethod
    def path_for(self, key: str) -> str:
        """Return a path or URI suitable for downstream consumers.
        For local storage this is a filesystem path; for S3 it is an
        ``s3://bucket/key`` URI (callers that need a local file should
        use read() and write to a temporary location).
        """
 class LocalStorageBackend(StorageBackend):
    """Store patent PDFs on the local filesystem under a directory."""
    def __init__(self, base_dir: str = "patents"):
        self.base_dir = base_dir
        os.makedirs(self.base_dir, exist_ok=True)
    def _full_path(self, key: str) -> str:
        return os.path.join(self.base_dir, key)
    def read(self, key: str) -> bytes:
        path = self._full_path(key)
        if not os.path.exists(path):
            raise FileNotFoundError(f"File not found: {path}")
        with open(path, "rb") as f:
            return f.read()
    def write(self, key: str, data: bytes) -> None:
        path = self._full_path(key)
        os.makedirs(os.path.dirname(path) or self.base_dir, exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
        logger.debug("Wrote %d bytes to %s", len(data), path)
    def exists(self, key: str) -> bool:
        path = self._full_path(key)
        return os.path.exists(path) and os.path.getsize(path) > 0
    def path_for(self, key: str) -> str:
        return self._full_path(key)
 class S3StorageBackend(StorageBackend):
    """Store patent PDFs in an S3-compatible bucket."""
    def __init__(
        self,
        bucket: str,
        endpoint_url: str = "",
        access_key: str = "",
        secret_key: str = "",
    ):
        import boto3
        kwargs: dict = {}
        if endpoint_url:
            kwargs["endpoint_url"] = endpoint_url
        if access_key and secret_key:
            kwargs["aws_access_key_id"] = access_key
            kwargs["aws_secret_access_key"] = secret_key
        self.s3 = boto3.client("s3", **kwargs)
        self.bucket = bucket
        # Ensure bucket exists (useful for MinIO local dev)
        try:
            self.s3.head_bucket(Bucket=self.bucket)
        except Exception:
            try:
                self.s3.create_bucket(Bucket=self.bucket)
                logger.info("Created S3 bucket: %s", self.bucket)
            except Exception as e:
                logger.warning("Could not create bucket %s: %s", self.bucket, e)
    def read(self, key: str) -> bytes:
        try:
            response = self.s3.get_object(Bucket=self.bucket, Key=key)
            return response["Body"].read()
        except self.s3.exceptions.NoSuchKey as e:
            raise FileNotFoundError(f"S3 object not found: s3://{self.bucket}/{key}") from e
        except Exception as e:
            # Some S3-compatible servers report a missing key only via the error text
            if "NoSuchKey" in str(e) or "404" in str(e):
                raise FileNotFoundError(f"S3 object not found: s3://{self.bucket}/{key}") from e
            raise
    def write(self, key: str, data: bytes) -> None:
        self.s3.put_object(
            Bucket=self.bucket,
            Key=key,
            Body=data,
            ContentType="application/pdf",
        )
        logger.debug("Wrote %d bytes to s3://%s/%s", len(data), self.bucket, key)
    def exists(self, key: str) -> bool:
        try:
            response = self.s3.head_object(Bucket=self.bucket, Key=key)
            return response["ContentLength"] > 0
        except Exception:
            return False
    def path_for(self, key: str) -> str:
        return f"s3://{self.bucket}/{key}"
 def get_storage_backend() -> StorageBackend:
    """Factory: return the configured storage backend instance."""
    backend = config.storage_backend.lower()
    if backend == "s3":
        logger.info("Using S3 storage backend (bucket=%s)", config.s3_bucket)
        return S3StorageBackend(
            bucket=config.s3_bucket,
            endpoint_url=config.s3_endpoint_url,
            access_key=config.s3_access_key,
            secret_key=config.s3_secret_key,
        )
    logger.info("Using local storage backend")
    return LocalStorageBackend()
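
The parser change recovers the object key from an `s3://bucket/key` path with `pdf_path.split("/", 3)[-1]`, and `S3StorageBackend.path_for()` produces paths in that same form. A minimal sketch of the same parsing with `urllib.parse`, which also rejects malformed input, is shown below. `split_s3_uri` is a hypothetical helper for illustration and is not part of this change.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split "s3://bucket/key" into (bucket, key).

    Hypothetical helper (not in the PR). Raises ValueError when the
    scheme is not s3 or when the bucket or key is missing.
    Caveat: keys containing "?" or "#" would be cut off by urlparse
    and need different handling.
    """
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    return parsed.netloc, key


bucket, key = split_s3_uri("s3://patents/US-12345678-B2.pdf")
print(bucket, key)
```

Compared with the plain `split`, this variant fails loudly on a path such as `s3://bucket` instead of silently treating the bucket name as the key.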
@@ -4,7 +4,7 @@ from datetime import datetime
@dataclass
 class Patent:
-    patent_id: int
+    patent_id: str
    pdf_link: str
    pdf_path: str | None = None
    summary: dict | None = None
@@ -0,0 +1,139 @@
 """Webhook notifications for job completion and alert events.
 Sends JSON payloads to configured webhook URLs with retry logic.
 Supports generic HTTP POST and Slack-compatible text payloads.
 """
 import logging
 import os
 import time
 from datetime import datetime
 from typing import Any
 import requests
 logger = logging.getLogger(__name__)
 # Comma-separated list of webhook URLs (env var based config)
 _WEBHOOK_URLS_RAW = os.getenv("WEBHOOK_URLS", "")
 WEBHOOK_URLS: list[str] = [
    url.strip() for url in _WEBHOOK_URLS_RAW.split(",") if url.strip()
 ]
 MAX_RETRIES = 3
 BACKOFF_BASE = 2  # seconds
 def _is_slack_url(url: str) -> bool:
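    """Check if a URL looks like a Slack or Discord incoming webhook.

    Note: Discord accepts the Slack-style ``text`` payload only on webhook
    URLs that end in ``/slack``.
    """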
    return "hooks.slack.com" in url or "discord.com/api/webhooks" in url
 def _build_payload(event_type: str, data: dict[str, Any], slack: bool = False) -> dict:
    """Build the webhook payload.
    Args:
        event_type: Type of event (e.g., "job_completed", "alert")
        data: Event-specific data
        slack: If True, wrap in Slack-compatible ``text`` format
    Returns:
        JSON-serializable payload dict
    """
    payload = {
        "event": event_type,
        "timestamp": datetime.utcnow().isoformat() + "Z",
        **data,
    }
    if slack:
        # Build a human-readable summary for Slack/Discord
        lines = [f"*[SPARC] {event_type}*"]
        for key, value in data.items():
            lines.append(f"  {key}: {value}")
        return {"text": "\n".join(lines)}
    return payload
 def _send_with_retry(url: str, payload: dict) -> bool:
    """Send a POST request with exponential backoff retry.
    Args:
        url: Webhook URL
        payload: JSON payload to send
    Returns:
        True if delivered successfully, False after all retries exhausted
    """
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            response = requests.post(url, json=payload, timeout=10)
            if response.status_code < 300:
                logger.debug("Webhook delivered to %s (attempt %d)", url, attempt)
                return True
            logger.warning(
                "Webhook %s returned %d (attempt %d/%d)",
                url, response.status_code, attempt, MAX_RETRIES,
            )
        except requests.RequestException as e:
            logger.warning(
                "Webhook delivery failed for %s (attempt %d/%d): %s",
                url, attempt, MAX_RETRIES, e,
            )
        if attempt < MAX_RETRIES:
            wait = BACKOFF_BASE ** attempt
            time.sleep(wait)
    logger.error("Webhook permanently failed for %s after %d attempts", url, MAX_RETRIES)
    return False
 def notify(event_type: str, data: dict[str, Any]) -> None:
    """Fire all configured webhooks for an event.
    Safe to call even when no webhooks are configured (returns immediately).
    Args:
        event_type: Event identifier (e.g., "job_completed", "patent_alert")
        data: Event data to include in the payload
    """
    if not WEBHOOK_URLS:
        return
    for url in WEBHOOK_URLS:
        slack = _is_slack_url(url)
        payload = _build_payload(event_type, data, slack=slack)
        _send_with_retry(url, payload)
 def notify_job_completed(
    job_id: str,
    status: str,
    total_companies: int,
    successful: int,
    failed: int,
 ) -> None:
    """Send notification when a batch job completes."""
    notify("job_completed", {
        "job_id": job_id,
        "status": status,
        "total_companies": total_companies,
        "successful": successful,
        "failed": failed,
        "summary": f"Batch job {job_id}: {successful}/{total_companies} succeeded",
    })
 def notify_alert(
    company_name: str,
    alert_type: str,
    message: str,
 ) -> None:
    """Send notification for a tracked company alert."""
    notify("patent_alert", {
        "company_name": company_name,
        "alert_type": alert_type,
        "message": message,
    })
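
The retry loop in `_send_with_retry` sleeps `BACKOFF_BASE ** attempt` seconds after every failed attempt except the last. A small sketch of the resulting schedule follows. `backoff_schedule` is a hypothetical helper used only to make the timing explicit, with defaults copied from `MAX_RETRIES` and `BACKOFF_BASE` above.

```python
def backoff_schedule(max_retries: int = 3, base: int = 2) -> list[int]:
    """Seconds slept after each failed attempt, excluding the final one.

    Mirrors the loop in _send_with_retry: attempts are numbered from 1,
    and no sleep happens once the last attempt has failed.
    """
    return [base ** attempt for attempt in range(1, max_retries)]


waits = backoff_schedule()
print(waits)       # [2, 4]
print(sum(waits))  # 6
```

With the defaults, a URL that never responds adds 6 seconds of sleep on top of the per-request timeouts, and `notify()` processes URLs sequentially, so several failing webhooks delay the caller by the sum of their retries.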
@@ -3,15 +3,15 @@ services:
    image: postgres:16-alpine
    container_name: sparc-postgres
    environment:
-      POSTGRES_USER: postgres
+      POSTGRES_USER: ${POSTGRES_USER}
-      POSTGRES_PASSWORD: postgres
+      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
-      POSTGRES_DB: sparc
+      POSTGRES_DB: ${POSTGRES_DB}
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
-      test: ["CMD-SHELL", "pg_isready -U postgres"]
+      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
      interval: 5s
      timeout: 5s
      retries: 5
@@ -22,7 +22,7 @@ services:
    container_name: sparc-init-db
    command: python scripts/init_database.py
    environment:
-      DATABASE_URL: postgresql://postgres:postgres@postgres:5432/sparc
+      DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
    depends_on:
      postgres:
        condition: service_healthy
@@ -35,9 +35,11 @@ services:
    environment:
      API_KEY: ${API_KEY}
      OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
-      DATABASE_URL: postgresql://postgres:postgres@postgres:5432/sparc
+      DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
      USE_CACHE: "true"
      JWT_SECRET: ${JWT_SECRET:-sparc-secret-key-change-in-production}
      CORS_ORIGINS: ${CORS_ORIGINS:-}
      APP_ENV: ${APP_ENV:-development}
      ROOT_PATH: /api
    ports:
      - "8000:8000"
@@ -50,6 +52,29 @@ services:
      - ./patents:/app/patents
    restart: unless-stopped
  # Optional: MinIO for S3-compatible local object storage
  # Enable by setting STORAGE_BACKEND=s3 in .env and starting Compose with --profile s3
  minio:
    image: minio/minio:latest
    container_name: sparc-minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: ${AWS_ACCESS_KEY_ID:-minioadmin}
      MINIO_ROOT_PASSWORD: ${AWS_SECRET_ACCESS_KEY:-minioadmin}
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - minio_data:/data
    healthcheck:
      test: ["CMD", "mc", "ready", "local"]
      interval: 10s
      timeout: 5s
      retries: 3
    restart: unless-stopped
    profiles:
      - s3
  dashboard:
    build: ./frontend
    container_name: sparc-dashboard
@@ -61,3 +86,4 @@ services:
 volumes:
  postgres_data:
  minio_data:
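
The Compose file builds `DATABASE_URL` by substituting `${POSTGRES_USER}`, `${POSTGRES_PASSWORD}`, and `${POSTGRES_DB}` directly into the URL. Compose performs plain text substitution, so a password containing characters such as `@`, `/`, or `:` would yield an ambiguous URL. The sketch below shows one way to build the same URL with percent-encoding. `build_database_url` is a hypothetical helper, and the host and port defaults are assumptions taken from the example `.env`.

```python
from urllib.parse import quote


def build_database_url(
    user: str,
    password: str,
    db: str,
    host: str = "localhost",  # assumption: matches the example .env
    port: int = 5432,
) -> str:
    """Build a PostgreSQL URL, percent-encoding each credential component."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(db, safe='')}"
    )


print(build_database_url("postgres", "p@ss/word", "sparc"))
# postgresql://postgres:p%40ss%2Fword@localhost:5432/sparc
```

If the URL stays templated in Compose, an alternative is to restrict the password to URL-safe characters or to store it already percent-encoded in `.env`.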
@@ -7,6 +7,15 @@
    <title>SPARC Dashboard</title>
  </head>
  <body>
    <script>
      // Prevent FOUC: apply saved theme before first render
      (function() {
        var theme = localStorage.getItem('theme');
        if (theme === 'dark' || (!theme && window.matchMedia('(prefers-color-scheme: dark)').matches)) {
          document.documentElement.classList.add('dark');
        }
      })();
    </script>
    <div id="root"></div>
    <script type="module" src="/src/main.tsx"></script>
  </body>
@@ -7,12 +7,13 @@
    "dev": "vite",
    "build": "tsc -b && vite build",
    "lint": "eslint .",
    "typecheck": "tsc --noEmit",
    "preview": "vite preview"
  },
  "dependencies": {
    "@tanstack/react-query": "^5.51.0",
    "axios": "^1.7.2",
-    "lucide-react": "^0.400.0",
+    "lucide-react": "^1.7.0",
    "react": "^18.3.1",
    "react-dom": "^18.3.1",
    "react-router-dom": "^6.24.0",
@@ -1,6 +1,7 @@
 import { BrowserRouter, Routes, Route, Navigate } from 'react-router-dom';
 import { QueryClient, QueryClientProvider } from '@tanstack/react-query';
 import { AuthProvider } from './context/AuthContext';
 import { ThemeProvider } from './context/ThemeContext';
 import { Layout } from './components/Layout';
 import { ProtectedRoute } from './components/ProtectedRoute';
 import { Login } from './pages/Login';
@@ -22,6 +23,7 @@ const queryClient = new QueryClient({
 function App() {
  return (
    <ThemeProvider>
    <QueryClientProvider client={queryClient}>
      <AuthProvider>
        <BrowserRouter>
@@ -61,6 +63,7 @@ function App() {
        </BrowserRouter>
      </AuthProvider>
    </QueryClientProvider>
    </ThemeProvider>
  );
 }
@@ -126,6 +126,23 @@ export const analysisApi = {
  },
 };
 // Export API
 export const exportApi = {
  exportCsv: async (companyName: string): Promise<void> => {
    const response = await api.get(`/export/${encodeURIComponent(companyName)}`, {
      responseType: 'blob',
    });
    const url = window.URL.createObjectURL(new Blob([response.data]));
    const link = document.createElement('a');
    link.href = url;
    link.setAttribute('download', `sparc_${companyName.toLowerCase().replace(/\s+/g, '_')}_export.csv`);
    document.body.appendChild(link);
    link.click();
    link.remove();
    window.URL.revokeObjectURL(url);
  },
 };
 // Analytics API
 export const analyticsApi = {
  getAnalytics: async (days = 30): Promise<Analytics> => {
@@ -1,9 +1,11 @@
 import { Outlet, NavLink, useNavigate } from 'react-router-dom';
 import { useAuth } from '../context/AuthContext';
-import { Search, Layers, BarChart3, Info, Users, LogOut } from 'lucide-react';
+import { useTheme } from '../context/ThemeContext';
 import { Search, Layers, BarChart3, Info, Users, LogOut, Sun, Moon } from 'lucide-react';
 export function Layout() {
  const { user, isAdmin, logout } = useAuth();
  const { theme, toggleTheme } = useTheme();
  const navigate = useNavigate();
  const handleLogout = () => {
@@ -23,7 +25,7 @@ export function Layout() {
  }
  return (
-    <div className="min-h-screen bg-gradient-to-br from-bg-dark to-indigo-950">
+    <div className="min-h-screen bg-gradient-to-br from-bg-dark to-slate-100 dark:to-indigo-950">
      {/* Header */}
      <header className="bg-bg-card/80 backdrop-blur-lg border-b border-primary/20">
        <div className="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8">
@@ -63,6 +65,13 @@ export function Layout() {
            {/* User menu */}
            <div className="flex items-center gap-4">
              <button
                onClick={toggleTheme}
                className="p-2 rounded-lg text-text-secondary hover:text-text-primary hover:bg-bg-card-hover transition-all"
                aria-label={theme === 'dark' ? 'Switch to light mode' : 'Switch to dark mode'}
              >
                {theme === 'dark' ? <Sun size={18} /> : <Moon size={18} />}
              </button>
              <div className="text-right hidden sm:block">
                <div className="text-sm font-medium text-text-primary">{user?.email}</div>
                <div className="text-xs text-text-secondary capitalize">{user?.role}</div>
@@ -12,7 +12,7 @@ export function ProtectedRoute({ children, requireAdmin = false }: ProtectedRout
  if (isLoading) {
    return (
-      <div className="min-h-screen bg-gradient-to-br from-bg-dark to-indigo-950 flex items-center justify-center">
+      <div className="min-h-screen bg-gradient-to-br from-bg-dark to-slate-100 dark:to-indigo-950 flex items-center justify-center">
        <div className="animate-spin rounded-full h-12 w-12 border-t-2 border-b-2 border-primary"></div>
      </div>
    );
@@ -0,0 +1,48 @@
 import { createContext, useContext, useEffect, useState } from 'react';
 type Theme = 'light' | 'dark';
 interface ThemeContextType {
  theme: Theme;
  toggleTheme: () => void;
 }
 const ThemeContext = createContext<ThemeContextType | undefined>(undefined);
 function getInitialTheme(): Theme {
  const stored = localStorage.getItem('theme');
  if (stored === 'light' || stored === 'dark') return stored;
  return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
 }
 export function ThemeProvider({ children }: { children: React.ReactNode }) {
  const [theme, setTheme] = useState<Theme>(getInitialTheme);
  useEffect(() => {
    const root = document.documentElement;
    if (theme === 'dark') {
      root.classList.add('dark');
    } else {
      root.classList.remove('dark');
    }
    localStorage.setItem('theme', theme);
  }, [theme]);
  const toggleTheme = () => {
    setTheme((prev) => (prev === 'dark' ? 'light' : 'dark'));
  };
  return (
    <ThemeContext.Provider value={{ theme, toggleTheme }}>
      {children}
    </ThemeContext.Provider>
  );
 }
 export function useTheme() {
  const context = useContext(ThemeContext);
  if (!context) {
    throw new Error('useTheme must be used within a ThemeProvider');
  }
  return context;
 }
@@ -2,6 +2,26 @@
@tailwind components;
@tailwind utilities;
 /* Light mode (default) */
 :root {
  --color-bg-dark: #f1f5f9;
  --color-bg-card: #ffffff;
  --color-bg-card-hover: #e2e8f0;
  --color-text-primary: #0f172a;
  --color-text-secondary: #475569;
  --color-border: #cbd5e1;
 }
 /* Dark mode */
 .dark {
  --color-bg-dark: #0f172a;
  --color-bg-card: #1e293b;
  --color-bg-card-hover: #334155;
  --color-text-primary: #f8fafc;
  --color-text-secondary: #94a3b8;
  --color-border: #334155;
 }
 body {
  font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
  -webkit-font-smoothing: antialiased;
@@ -15,7 +35,7 @@ body {
 }
 ::-webkit-scrollbar-track {
-  background: #1e293b;
+  background: var(--color-bg-card);
 }
 ::-webkit-scrollbar-thumb {
@@ -30,5 +50,5 @@ body {
 /* Selection */
 ::selection {
  background: rgba(99, 102, 241, 0.3);
-  color: #f8fafc;
+  color: var(--color-text-primary);
 }
@@ -1,7 +1,7 @@
 import { useState } from 'react';
 import { useMutation } from '@tanstack/react-query';
-import { analysisApi } from '../api/client';
+import { analysisApi, exportApi } from '../api/client';
-import { Search, CheckCircle, AlertCircle, Clock, FileText } from 'lucide-react';
+import { Search, CheckCircle, AlertCircle, Clock, FileText, Download } from 'lucide-react';
 import type { CompanyAnalysis } from '../types';
 export function Analysis() {
@@ -106,9 +106,18 @@ export function Analysis() {
          {/* Analysis Content */}
          {result.success && result.analysis && (
            <div className="bg-bg-card/60 backdrop-blur-lg border border-primary/15 rounded-2xl p-6">
-              <h3 className="text-lg font-semibold text-text-primary border-b-2 border-primary/30 pb-2 mb-4">
+              <div className="flex items-center justify-between border-b-2 border-primary/30 pb-2 mb-4">
-                AI Analysis Results
+                <h3 className="text-lg font-semibold text-text-primary">
-              </h3>
+                  AI Analysis Results
                </h3>
                <button
                  onClick={() => exportApi.exportCsv(result.company_name)}
                  className="flex items-center gap-2 text-sm bg-primary/20 hover:bg-primary/30 text-primary font-medium px-3 py-1.5 rounded-lg transition-colors"
                >
                  <Download size={14} />
                  Export CSV
                </button>
              </div>
              <div className="prose prose-invert max-w-none">
                <div className="text-text-primary whitespace-pre-wrap leading-relaxed">
                  {result.analysis}
@@ -9,15 +9,38 @@ const COLORS = ['#6366f1', '#0ea5e9', '#10b981', '#f59e0b', '#ef4444', '#8b5cf6'
 export function AnalyticsPage() {
  const [days, setDays] = useState(30);
-  const { data, isLoading, isError } = useQuery({
+  const { data, isLoading, isError, refetch } = useQuery({
    queryKey: ['analytics', days],
    queryFn: () => analyticsApi.getAnalytics(days),
  });
  if (isLoading) {
    return (
-      <div className="flex items-center justify-center min-h-[400px]">
+      <div className="space-y-6">
-        <div className="animate-spin rounded-full h-12 w-12 border-t-2 border-b-2 border-primary"></div>
+        <div>
          <h2 className="text-xl font-semibold text-text-primary border-b-2 border-primary/30 pb-2 mb-2">
            Analytics Dashboard
          </h2>
          <p className="text-text-secondary">Loading analytics data...</p>
        </div>
        {/* Skeleton cards */}
        <div className="grid grid-cols-1 md:grid-cols-3 gap-4">
          {[1, 2, 3].map((i) => (
            <div key={i} className="bg-gradient-to-br from-primary/10 to-secondary/10 border border-primary/20 rounded-xl p-5 text-center animate-pulse">
              <div className="h-9 w-16 bg-primary/20 rounded mx-auto mb-2" />
              <div className="h-4 w-24 bg-primary/10 rounded mx-auto" />
            </div>
          ))}
        </div>
        {/* Skeleton charts */}
        <div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
          {[1, 2].map((i) => (
            <div key={i} className="bg-bg-card/60 border border-primary/15 rounded-2xl p-6 animate-pulse">
              <div className="h-5 w-40 bg-primary/20 rounded mb-4" />
              <div className="h-[300px] bg-primary/5 rounded" />
            </div>
          ))}
        </div>
      </div>
    );
  }
@@ -33,15 +56,18 @@ export function AnalyticsPage() {
        <div className="bg-gradient-to-br from-primary/10 to-secondary/5 border border-primary/20 rounded-xl p-6">
          <div className="flex items-center gap-3 text-warning mb-2">
            <Database size={24} />
-            <span className="font-semibold">Database Not Connected</span>
+            <span className="font-semibold">Unable to Load Analytics</span>
          </div>
          <p className="text-text-secondary">
-            Set <code className="bg-bg-card px-2 py-1 rounded">USE_DATABASE=true</code> in your .env file to enable analytics tracking.
+            Could not connect to the analytics database. Ensure PostgreSQL is running and
            <code className="bg-bg-card px-2 py-1 rounded mx-1">DATABASE_URL</code> is configured correctly.
          </p>
-        </div>
+          <button
-        <div className="flex items-center gap-2 bg-secondary/10 border border-secondary/20 text-secondary rounded-xl px-4 py-3">
+            onClick={() => refetch()}
-          <AlertCircle size={18} />
+            className="mt-3 text-sm bg-primary/20 hover:bg-primary/30 text-primary font-medium px-4 py-2 rounded-lg transition-colors"
-          <span>Analytics features require storing analysis results in PostgreSQL for historical tracking.</span>
+          >
            Retry
          </button>
        </div>
      </div>
    );
@@ -114,9 +114,21 @@ export function Batch() {
      {/* Error */}
      {mutation.isError && (
-        <div className="flex items-center gap-2 bg-error/10 border border-error/20 text-error rounded-xl px-4 py-3">
+        <div className="bg-error/10 border border-error/20 rounded-xl px-4 py-3">
-          <AlertCircle size={18} />
+          <div className="flex items-center gap-2 text-error">
-          <span>Batch analysis failed. Please try again.</span>
+            <AlertCircle size={18} />
            <span className="font-semibold">Batch analysis failed</span>
          </div>
          <p className="text-text-secondary text-sm mt-1 ml-7">
            {mutation.error instanceof Error ? mutation.error.message : 'An unexpected error occurred.'}
            {' '}Check your connection and try again.
          </p>
          <button
            onClick={() => mutation.reset()}
            className="ml-7 mt-2 text-sm text-primary hover:text-primary-dark underline"
          >
            Dismiss
          </button>
        </div>
      )}
@@ -31,7 +31,7 @@ export function Login() {
  };
  return (
-    <div className="min-h-screen bg-gradient-to-br from-bg-dark to-indigo-950 flex items-center justify-center px-4">
+    <div className="min-h-screen bg-gradient-to-br from-bg-dark to-slate-100 dark:to-indigo-950 flex items-center justify-center px-4">
      <div className="w-full max-w-md">
        {/* Brand */}
        <div className="text-center mb-8">
@@ -40,7 +40,7 @@ export function Register() {
  };
  return (
-    <div className="min-h-screen bg-gradient-to-br from-bg-dark to-indigo-950 flex items-center justify-center px-4">
+    <div className="min-h-screen bg-gradient-to-br from-bg-dark to-slate-100 dark:to-indigo-950 flex items-center justify-center px-4">
      <div className="w-full max-w-md">
        {/* Brand */}
        <div className="text-center mb-8">
@@ -4,6 +4,7 @@ export default {
    "./index.html",
    "./src/**/*.{js,ts,jsx,tsx}",
  ],
  darkMode: 'class',
  theme: {
    extend: {
      colors: {
@@ -16,15 +17,15 @@ export default {
        warning: '#f59e0b',
        error: '#ef4444',
        bg: {
-          dark: '#0f172a',
+          dark: 'var(--color-bg-dark)',
-          card: '#1e293b',
+          card: 'var(--color-bg-card)',
-          'card-hover': '#334155',
+          'card-hover': 'var(--color-bg-card-hover)',
        },
        text: {
-          primary: '#f8fafc',
+          primary: 'var(--color-text-primary)',
-          secondary: '#94a3b8',
+          secondary: 'var(--color-text-secondary)',
        },
-        border: '#334155',
+        border: 'var(--color-border)',
      },
    },
  },
@@ -14,3 +14,6 @@ numpy
 pandas
 bcrypt
 PyJWT
 slowapi
 apscheduler
 boto3
@@ -0,0 +1,8 @@
 [lint]
 select = ["E", "F", "I"]
 ignore = [
    "E501",  # line too long (handled by formatter)
 ]
 [lint.per-file-ignores]
 "tests/*" = ["E402", "F841"]  # allow import not at top of file, unused vars (mocks) in tests
@@ -1,9 +1,11 @@
 """Tests for the high-level company analyzer orchestration."""
 from unittest.mock import MagicMock, Mock
 import pytest
-from unittest.mock import Mock, patch, call, MagicMock
+
 from SPARC.analyzer import CompanyAnalyzer
-from SPARC.types import Patent, Patents, CompanyAnalysisResult, BatchAnalysisResult
+from SPARC.types import BatchAnalysisResult, Patent, Patents
@pytest.fixture(autouse=True)
@@ -24,7 +26,7 @@ class TestCompanyAnalyzer:
        """Test analyzer initialization with API key."""
        mock_llm = mocker.patch("SPARC.analyzer.LLMAnalyzer")
-        analyzer = CompanyAnalyzer(openrouter_api_key="test-key")
+        _analyzer = CompanyAnalyzer(openrouter_api_key="test-key")  # noqa: F841
        mock_llm.assert_called_once_with(api_key="test-key")
@@ -1,12 +1,13 @@
 """Tests for FastAPI web service endpoints."""
 import pytest
 from datetime import datetime
-from unittest.mock import Mock, patch
+from unittest.mock import Mock
 import pytest
 from fastapi.testclient import TestClient
 from SPARC.api import app
-from SPARC.types import CompanyAnalysisResult, BatchAnalysisResult
+from SPARC.types import BatchAnalysisResult, CompanyAnalysisResult
@pytest.fixture
@@ -0,0 +1,302 @@
 """Tests for JWT authentication flow: register, login, protected routes, refresh, admin access."""
 from datetime import datetime, timezone
 from unittest.mock import MagicMock, patch
 import pytest
 from fastapi.testclient import TestClient
 from SPARC.api import app
 from SPARC.auth import create_access_token, create_refresh_token
@pytest.fixture
 def client():
    """Create test client."""
    return TestClient(app)
@pytest.fixture(autouse=True)
 def mock_db(monkeypatch):
    """Mock the database client used by auth endpoints.
    Returns a MagicMock with all DB methods pre-configured.
    """
    db = MagicMock()
    # Default: no users exist
    db.get_user_count.return_value = 0
    db.get_user_by_id.return_value = None
    db.get_user_by_email.return_value = None
    db.authenticate_user.return_value = None
    db.create_user.return_value = None
    db.get_all_users.return_value = []
    db.update_user_role.return_value = None
    db.delete_user.return_value = False
    with patch("SPARC.api.get_db_client", return_value=db), \
         patch("SPARC.auth.get_db_client", return_value=db):
        yield db
 def _make_admin_user():
    return {
        "id": 1,
        "email": "admin@test.com",
        "role": "admin",
        "created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
    }
 def _make_regular_user():
    return {
        "id": 2,
        "email": "user@test.com",
        "role": "user",
        "created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
    }
 def _auth_header(user_dict):
    """Create an Authorization header with a valid access token for the given user."""
    token = create_access_token(user_dict["id"], user_dict["email"], user_dict["role"])
    return {"Authorization": f"Bearer {token}"}
 class TestRegister:
    """POST /auth/register"""
    def test_register_first_user_becomes_admin(self, client, mock_db):
        """First registered user should get admin role."""
        mock_db.get_user_count.return_value = 0
        mock_db.create_user.return_value = {
            "id": 1,
            "email": "admin@test.com",
            "role": "admin",
            "created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
        }
        response = client.post(
            "/auth/register",
            json={"email": "admin@test.com", "password": "securepass123"},
        )
        assert response.status_code == 200
        data = response.json()
        assert data["email"] == "admin@test.com"
        assert data["role"] == "admin"
        mock_db.create_user.assert_called_once_with(
            email="admin@test.com", password="securepass123", role="admin"
        )
    def test_register_subsequent_user_gets_user_role(self, client, mock_db):
        """Non-first user should get regular user role."""
        mock_db.get_user_count.return_value = 1
        mock_db.create_user.return_value = _make_regular_user()
        response = client.post(
            "/auth/register",
            json={"email": "user@test.com", "password": "securepass123"},
        )
        assert response.status_code == 200
        data = response.json()
        assert data["role"] == "user"
    def test_register_duplicate_email_returns_400(self, client, mock_db):
        """Registering with an existing email should return 400."""
        mock_db.get_user_count.return_value = 1
        mock_db.create_user.return_value = None  # indicates duplicate
        response = client.post(
            "/auth/register",
            json={"email": "existing@test.com", "password": "securepass123"},
        )
        assert response.status_code == 400
        assert "already registered" in response.json()["detail"].lower()
 class TestLogin:
    """POST /auth/login"""
    def test_login_valid_credentials_returns_tokens(self, client, mock_db):
        """Valid credentials should return access and refresh tokens."""
        user = _make_regular_user()
        mock_db.authenticate_user.return_value = user
        response = client.post(
            "/auth/login",
            json={"email": "user@test.com", "password": "correctpassword"},
        )
        assert response.status_code == 200
        data = response.json()
        assert "access_token" in data
        assert "refresh_token" in data
        assert data["token_type"] == "bearer"
    def test_login_invalid_credentials_returns_401(self, client, mock_db):
        """Invalid credentials should return 401."""
        mock_db.authenticate_user.return_value = None
        response = client.post(
            "/auth/login",
            json={"email": "user@test.com", "password": "wrongpassword"},
        )
        assert response.status_code == 401
        assert "invalid" in response.json()["detail"].lower()
 class TestGetMe:
    """GET /auth/me"""
    def test_valid_access_token_returns_user(self, client, mock_db):
        """A valid access token should return the user's data."""
        user = _make_regular_user()
        mock_db.get_user_by_id.return_value = user
        response = client.get("/auth/me", headers=_auth_header(user))
        assert response.status_code == 200
        data = response.json()
        assert data["email"] == "user@test.com"
        assert data["id"] == 2
    def test_missing_token_returns_401(self, client):
        """A request without a token should be rejected with 401 (or 403 from HTTPBearer)."""
        response = client.get("/auth/me")
        assert response.status_code in (401, 403)
    def test_expired_token_returns_401(self, client, mock_db):
        """An expired token should return 401."""
        # Create a token that has already expired
        from datetime import timedelta
        import jwt as pyjwt
        from SPARC.auth import JWT_ALGORITHM, JWT_SECRET
        payload = {
            "sub": "1",
            "email": "user@test.com",
            "role": "user",
            "exp": datetime.now(timezone.utc) - timedelta(hours=1),
            "type": "access",
        }
        expired_token = pyjwt.encode(payload, JWT_SECRET, algorithm=JWT_ALGORITHM)
        response = client.get(
            "/auth/me", headers={"Authorization": f"Bearer {expired_token}"}
        )
        assert response.status_code == 401
    def test_refresh_token_as_access_returns_401(self, client, mock_db):
        """Using a refresh token as an access token should return 401."""
        user = _make_regular_user()
        refresh_token = create_refresh_token(user["id"], user["email"], user["role"])
        response = client.get(
            "/auth/me", headers={"Authorization": f"Bearer {refresh_token}"}
        )
        assert response.status_code == 401
 class TestRefreshToken:
    """POST /auth/refresh"""
    def test_valid_refresh_token_returns_new_tokens(self, client, mock_db):
        """A valid refresh token should issue new access and refresh tokens."""
        user = _make_regular_user()
        mock_db.get_user_by_id.return_value = user
        refresh = create_refresh_token(user["id"], user["email"], user["role"])
        response = client.post(
            "/auth/refresh", json={"refresh_token": refresh}
        )
        assert response.status_code == 200
        data = response.json()
        assert "access_token" in data
        assert "refresh_token" in data
    def test_invalid_refresh_token_returns_401(self, client, mock_db):
        """An invalid refresh token should return 401."""
        response = client.post(
            "/auth/refresh", json={"refresh_token": "invalid-token-string"}
        )
        assert response.status_code == 401
    def test_access_token_as_refresh_returns_401(self, client, mock_db):
        """Using an access token as a refresh token should return 401."""
        user = _make_regular_user()
        access = create_access_token(user["id"], user["email"], user["role"])
        response = client.post(
            "/auth/refresh", json={"refresh_token": access}
        )
        assert response.status_code == 401
 class TestAdminUsers:
    """GET /admin/users and PATCH /admin/users/{id}/role"""
    def test_admin_can_list_users(self, client, mock_db):
        """Admin token should allow listing users."""
        admin = _make_admin_user()
        mock_db.get_user_by_id.return_value = admin
        mock_db.get_all_users.return_value = [admin, _make_regular_user()]
        response = client.get("/admin/users", headers=_auth_header(admin))
        assert response.status_code == 200
        data = response.json()
        assert len(data) == 2
    def test_regular_user_cannot_list_users(self, client, mock_db):
        """Regular user token should be rejected with 403."""
        user = _make_regular_user()
        mock_db.get_user_by_id.return_value = user
        response = client.get("/admin/users", headers=_auth_header(user))
        assert response.status_code == 403
    def test_no_token_cannot_list_users(self, client):
        """No token should be rejected."""
        response = client.get("/admin/users")
        assert response.status_code in (401, 403)
    def test_admin_can_change_user_role(self, client, mock_db):
        """Admin should be able to change another user's role."""
        admin = _make_admin_user()
        mock_db.get_user_by_id.return_value = admin
        mock_db.update_user_role.return_value = {
            "id": 2,
            "email": "user@test.com",
            "role": "admin",
            "created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
        }
        response = client.patch(
            "/admin/users/2/role",
            json={"role": "admin"},
            headers=_auth_header(admin),
        )
        assert response.status_code == 200
        assert response.json()["role"] == "admin"
    def test_admin_cannot_change_own_role(self, client, mock_db):
        """Admin should not be able to change their own role."""
        admin = _make_admin_user()
        mock_db.get_user_by_id.return_value = admin
        response = client.patch(
            "/admin/users/1/role",
            json={"role": "user"},
            headers=_auth_header(admin),
        )
        assert response.status_code == 400
        assert "own role" in response.json()["detail"].lower()
@@ -1,7 +1,9 @@
 """Tests for LLM analysis functionality."""
 from unittest.mock import Mock
 import pytest
-from unittest.mock import Mock, MagicMock, patch
+
 from SPARC.llm import LLMAnalyzer
@@ -0,0 +1,97 @@
 """Tests for rate limiting on auth endpoints."""
 import pytest
 from unittest.mock import MagicMock, patch
 from fastapi.testclient import TestClient
 from SPARC.api import app
@pytest.fixture
 def client():
    """Create test client with rate limiter enabled."""
    return TestClient(app)
@pytest.fixture(autouse=True)
 def reset_limiter():
    """Reset rate limiter storage between tests."""
    from SPARC.api import limiter
    limiter.reset()
    yield
 class TestRateLimiting:
    """Test rate limiting on login and register endpoints."""
    @patch("SPARC.api.get_db_client")
    def test_login_allows_requests_under_limit(self, mock_db_client, client):
        """Login endpoint allows requests under the rate limit."""
        mock_db = MagicMock()
        mock_db.authenticate_user.return_value = None
        mock_db_client.return_value = mock_db
        # Should allow at least a few requests
        for _ in range(5):
            response = client.post(
                "/auth/login",
                json={"email": "test@example.com", "password": "password123"},
            )
            # 401 is expected (invalid credentials), not 429
            assert response.status_code == 401
    @patch("SPARC.api.get_db_client")
    def test_login_rate_limited_after_threshold(self, mock_db_client, client):
        """Login endpoint returns 429 after exceeding rate limit."""
        mock_db = MagicMock()
        mock_db.authenticate_user.return_value = None
        mock_db_client.return_value = mock_db
        # Send more than the limit (10/minute)
        statuses = []
        for _ in range(15):
            response = client.post(
                "/auth/login",
                json={"email": "test@example.com", "password": "password123"},
            )
            statuses.append(response.status_code)
        # At least one should be 429
        assert 429 in statuses, f"Expected 429 in statuses but got: {set(statuses)}"
    @patch("SPARC.api.get_db_client")
    def test_register_rate_limited_after_threshold(self, mock_db_client, client):
        """Register endpoint returns 429 after exceeding rate limit."""
        mock_db = MagicMock()
        mock_db.get_user_count.return_value = 1
        mock_db.create_user.return_value = None  # triggers 400 (email exists)
        mock_db_client.return_value = mock_db
        # Send more than the limit (5/minute)
        statuses = []
        for _ in range(10):
            response = client.post(
                "/auth/register",
                json={"email": "test@example.com", "password": "password123"},
            )
            statuses.append(response.status_code)
        # At least one should be 429
        assert 429 in statuses, f"Expected 429 in statuses but got: {set(statuses)}"
    @patch("SPARC.api.get_db_client")
    def test_rate_limit_returns_retry_after_header(self, mock_db_client, client):
        """Rate limited responses include a Retry-After header."""
        mock_db = MagicMock()
        mock_db.authenticate_user.return_value = None
        mock_db_client.return_value = mock_db
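        # Exhaust the limit (10/minute); fail if no request is ever throttled
        for _ in range(15):
            response = client.post(
                "/auth/login",
                json={"email": "test@example.com", "password": "password123"},
            )
            if response.status_code == 429:
                assert "Retry-After" in response.headers
                break
        else:
            pytest.fail("Expected a 429 response within 15 requests")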
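The behavior these tests probe (a per-client request budget, then a rejection carrying a Retry-After value) can be illustrated with a plain fixed-window counter. This is a stand-in sketch only, not the slowapi limiter the commit uses; `FixedWindowLimiter`, its parameters, and the fixed-window strategy itself are assumptions made for illustration.

```python
import math
import time


class FixedWindowLimiter:
    """Hypothetical fixed-window limiter; NOT the slowapi implementation."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so tests can control time
        self._hits = {}  # key -> (window_start, request_count)

    def check(self, key):
        """Return (allowed, headers) for one request from `key`."""
        now = self.clock()
        start, count = self._hits.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window elapsed: start a fresh one
        if count >= self.limit:
            # Tell the client how long until the current window ends
            retry_after = max(1, math.ceil(start + self.window - now))
            return False, {"Retry-After": str(retry_after)}
        self._hits[key] = (start, count + 1)
        return True, {}

    def reset(self):
        """Forget all counters (what a per-test reset fixture would call)."""
        self._hits.clear()


limiter = FixedWindowLimiter(limit=10)
outcomes = [limiter.check("127.0.0.1")[0] for _ in range(15)]
```

With a limit of 10, the first ten calls in a window are allowed and the rest are refused, which is the pattern `test_login_rate_limited_after_threshold` looks for as a 429 among the statuses.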
@@ -0,0 +1,116 @@
 """Tests for security hardening: JWT secret startup check, CORS config, credential handling."""
 import os
 from unittest.mock import patch
 import pytest
 class TestJWTSecretStartupCheck:
    """Test the startup guard that refuses default JWT secret in non-dev environments."""
    def test_default_secret_in_production_raises(self):
        """Starting with default secret and APP_ENV=production must raise RuntimeError."""
        with patch.dict(os.environ, {"APP_ENV": "production"}):
            # Reload config to pick up the new APP_ENV
            import importlib
            import SPARC.config
            importlib.reload(SPARC.config)
            from SPARC.auth import _DEFAULT_JWT_SECRET, check_jwt_secret
            # Patch JWT_SECRET to the default
            with patch("SPARC.auth.JWT_SECRET", _DEFAULT_JWT_SECRET):
                with pytest.raises(RuntimeError, match="FATAL.*JWT_SECRET"):
                    check_jwt_secret()
            # Restore config
            with patch.dict(os.environ, {"APP_ENV": "development"}):
                importlib.reload(SPARC.config)
    def test_default_secret_in_development_succeeds(self):
        """Starting with default secret and APP_ENV=development must not raise."""
        with patch.dict(os.environ, {"APP_ENV": "development"}):
            import importlib
            import SPARC.config
            importlib.reload(SPARC.config)
            from SPARC.auth import _DEFAULT_JWT_SECRET, check_jwt_secret
            with patch("SPARC.auth.JWT_SECRET", _DEFAULT_JWT_SECRET):
                # Should not raise
                check_jwt_secret()
            # Restore
            importlib.reload(SPARC.config)
    def test_custom_secret_in_production_succeeds(self):
        """Starting with a custom secret in production must not raise."""
        with patch.dict(os.environ, {"APP_ENV": "production"}):
            import importlib
            import SPARC.config
            importlib.reload(SPARC.config)
            from SPARC.auth import check_jwt_secret
            with patch("SPARC.auth.JWT_SECRET", "my-secure-random-secret-abc123"):
                # Should not raise
                check_jwt_secret()
            with patch.dict(os.environ, {"APP_ENV": "development"}):
                importlib.reload(SPARC.config)
    def test_default_secret_unset_env_succeeds(self):
        """When APP_ENV is unset (defaults to development), default secret is allowed."""
        # Build an environment without APP_ENV, then swap it in for the test
        env = os.environ.copy()
        env.pop("APP_ENV", None)
        with patch.dict(os.environ, env, clear=True):
            import importlib
            import SPARC.config
            importlib.reload(SPARC.config)
            from SPARC.auth import _DEFAULT_JWT_SECRET, check_jwt_secret
            with patch("SPARC.auth.JWT_SECRET", _DEFAULT_JWT_SECRET):
                # Should not raise (defaults to development)
                check_jwt_secret()
            with patch.dict(os.environ, {"APP_ENV": "development"}):
                importlib.reload(SPARC.config)
 class TestCORSConfig:
    """Test that CORS origins are configurable via environment variable."""
    def test_default_cors_origins(self):
        """When CORS_ORIGINS is empty (treated the same as unset), defaults to localhost origins."""
        with patch.dict(os.environ, {"CORS_ORIGINS": ""}):
            import importlib
            import SPARC.config
            importlib.reload(SPARC.config)
            assert SPARC.config.cors_origins == [
                "http://localhost:3000",
                "http://localhost:5173",
            ]
    def test_custom_cors_origins(self):
        """Setting CORS_ORIGINS configures allowed origins."""
        with patch.dict(os.environ, {"CORS_ORIGINS": "https://sparc.example.com,https://app.example.com"}):
            import importlib
            import SPARC.config
            importlib.reload(SPARC.config)
            assert SPARC.config.cors_origins == [
                "https://sparc.example.com",
                "https://app.example.com",
            ]
            # Restore
            with patch.dict(os.environ, {"CORS_ORIGINS": ""}):
                importlib.reload(SPARC.config)
    def test_single_cors_origin(self):
        """A single origin without comma works correctly."""
        with patch.dict(os.environ, {"CORS_ORIGINS": "https://sparc.example.com"}):
            import importlib
            import SPARC.config
            importlib.reload(SPARC.config)
            assert SPARC.config.cors_origins == ["https://sparc.example.com"]
            with patch.dict(os.environ, {"CORS_ORIGINS": ""}):
                importlib.reload(SPARC.config)
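The three CORS tests above pin down a small parsing contract: an empty value falls back to the two localhost origins, and a comma-separated value becomes a list. A minimal sketch of a parser that satisfies that contract follows. `parse_cors_origins` is a hypothetical name, and the whitespace stripping is an assumption; the actual logic in `SPARC/config.py` is not shown in this diff.

```python
# Defaults taken from the assertions in test_default_cors_origins
DEFAULT_CORS_ORIGINS = ["http://localhost:3000", "http://localhost:5173"]


def parse_cors_origins(raw):
    """Split a comma-separated origin list; fall back to defaults when empty.

    Hypothetical helper, not the project's implementation.
    """
    # Treat None and "" alike, and drop empty items such as a trailing comma
    origins = [item.strip() for item in (raw or "").split(",") if item.strip()]
    return origins or list(DEFAULT_CORS_ORIGINS)


single = parse_cors_origins("https://sparc.example.com")
multiple = parse_cors_origins("https://sparc.example.com,https://app.example.com")
fallback = parse_cors_origins("")
```

Returning a copy of the defaults keeps callers from mutating the module-level list by accident.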
@@ -1,9 +1,8 @@
 """Tests for SERP API patent retrieval and parsing functionality."""
 import os
 import pytest
-from unittest.mock import patch, Mock
 from datetime import datetime, timedelta
+from unittest.mock import Mock
 from SPARC.serp_api import SERP
 from SPARC.types import Patent
Author	SHA1	Message	Date
agent-company	a6c92fde9f	merge: resolve conflicts for S3 storage branch with main Integrates S3/MinIO storage backend with structured logging changes from main. Both boto3 and apscheduler retained in requirements.txt.	2026-03-26 12:09:24 +00:00
AI-Manager	a4db9439f5	Merge pull request 'feat: add webhook notification support for job completion' (#66) from feature/webhooks into main	2026-03-26 12:08:08 +00:00
AI-Manager	bbea16387d	Merge pull request 'feat: implement scheduled/recurring analysis with change alerting' (#65) from feature/scheduled-analysis into main	2026-03-26 12:07:46 +00:00
AI-Manager	4e2bcae18a	Merge pull request 'feat: add CSV export for company analysis results' (#60) from feature/export-csv into main	2026-03-26 12:06:57 +00:00
AI-Manager	b66b8332b6	Merge pull request 'feat: add dark/light mode toggle with localStorage persistence' (#57) from feature/dark-mode into main	2026-03-26 12:06:33 +00:00
AI-Manager	c42bf5bf71	Merge pull request 'feat: add cursor-based pagination to /jobs endpoint' (#59) from feature/cursor-pagination into main	2026-03-26 12:06:04 +00:00
AI-Manager	02991b6648	Merge pull request 'feat: add loading skeletons and error retry to Batch and Analytics' (#56) from feature/loading-error-states into main	2026-03-26 12:05:41 +00:00
AI-Manager	ab74904845	Merge pull request 'fix: auto-download patent PDF in analyze_single_patent' (#55) from feature/fix-single-patent-download into main	2026-03-26 12:05:10 +00:00
AI-Manager	92197440bf	Merge pull request 'feat: add structured logging to serp_api.py' (#54) from feature/structured-logging into main	2026-03-26 12:04:59 +00:00
AI-Manager	301a773622	Merge pull request 'ci: add tsc --noEmit TypeScript type checking to CI pipeline' (#53) from feature/ci-tsc-lint into main	2026-03-26 12:04:39 +00:00
agent-company	2e6b8c7445	feat: add webhook notification support for job completion and alerts Send HTTP POST notifications to configured webhook URLs when batch jobs complete or when scheduled analysis detects significant changes. - Add SPARC/webhooks.py with retry logic (3 attempts, exponential backoff) - Support generic HTTP POST and Slack-compatible text payloads - Integrate into batch job completion handler in api.py - Configure via WEBHOOK_URLS env var (comma-separated) - Payload includes event type, job ID, status, and summary Closes leeworks-agents/SPARC#23 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:32:07 +00:00
agent-company	f33447eef8	feat: implement scheduled/recurring analysis with change alerting Add APScheduler-based background task that periodically re-analyzes tracked companies and alerts on significant patent count changes. - Add tracked_companies and alerts tables to database schema - Add SPARC/scheduler.py with configurable interval and threshold - Add admin endpoints: GET/POST/DELETE /admin/tracked, GET /admin/alerts - Scheduler starts at app startup; interval via SCHEDULE_INTERVAL_HOURS - Change threshold configurable via CHANGE_THRESHOLD_PERCENT env var - apscheduler is optional; graceful fallback if not installed Closes leeworks-agents/SPARC#22 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:30:43 +00:00
agent-company	1bd9dccdb8	feat: add CSV export for company analysis results Add GET /export/{company_name} backend endpoint that returns analysis records as a downloadable CSV file. Add Export CSV button to the Analysis page that triggers the download via the API. Closes leeworks-agents/SPARC#20 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:20:51 +00:00
agent-company	3b6411869d	feat: add cursor-based pagination to /jobs endpoint Add a cursor query parameter to GET /jobs and return a next_cursor field in the response envelope. Existing clients using only limit continue to work without modification. The cursor is an opaque token encoding created_at and job_id for stable keyset pagination. Closes leeworks-agents/SPARC#25 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:19:01 +00:00
agent-company	9a43f85259	feat: add S3/MinIO object storage support for patent PDFs Introduce a StorageBackend abstraction (local filesystem and S3) for patent PDF storage. When STORAGE_BACKEND=s3, PDFs are read/written via boto3 to an S3-compatible bucket instead of the local filesystem. - Add SPARC/storage.py with LocalStorageBackend and S3StorageBackend - Update serp_api.py save_patents and parse_patent_pdf to use storage - Add storage config vars to config.py and .env.example - Add optional MinIO service to docker-compose.yml (--profile s3) - Add boto3 to requirements.txt Closes leeworks-agents/SPARC#38 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:17:24 +00:00
agent-company	a4aa968434	feat: add dark/light mode toggle with localStorage persistence - Enable Tailwind "class" dark mode strategy - Use CSS custom properties for theme colors (bg, text, border) - Add ThemeProvider context with toggle and localStorage persistence - Add Sun/Moon toggle button in the header navigation - Inline script in index.html prevents FOUC on page load - All pages (Layout, Login, Register, ProtectedRoute) support both modes - Default theme follows system preference (prefers-color-scheme) Closes leeworks-agents/SPARC#33 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:15:11 +00:00
agent-company	153eb3b968	feat: improve loading and error states on Batch and Analytics pages Analytics page now shows skeleton loaders (cards and chart placeholders) while data loads, and displays a retry button when the API call fails. Batch page error state now shows the actual error message and suggests user action. Closes leeworks-agents/SPARC#16 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:11:47 +00:00
agent-company	ecc2c37bcd	fix: auto-download patent PDF in analyze_single_patent before reading When the PDF is not on disk, analyze_single_patent now looks up the cached PDF link from the database and downloads it automatically. If no link is cached, a clear FileNotFoundError is raised. Also adds a GET /analyze/patent/{patent_id} API endpoint that exposes this functionality and returns 404 when the PDF cannot be obtained. Closes leeworks-agents/SPARC#36 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:08:34 +00:00
agent-company	0b4d712fc5	feat: add structured logging to serp_api.py Add module-level logger to serp_api.py with INFO-level messages for patent queries and PDF downloads, and DEBUG-level messages for cache hits and parsing details. All three target files (analyzer.py, serp_api.py, llm.py) now use structured logging with no print() calls. Closes leeworks-agents/SPARC#46 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:07:07 +00:00
agent-company	4696838fb8	ci: add tsc --noEmit TypeScript type checking to CI pipeline Upgrade lucide-react to v1.7.0 for proper TypeScript declarations and add a TypeScript type check step to the test workflow. Both ruff (Python) and tsc --noEmit (TypeScript) now block merging on failure. Closes leeworks-agents/SPARC#52 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:05:55 +00:00
AI-Manager	55c131cb32	Merge pull request 'ci: add pytest and ruff linting to CI workflow' (#32) from feature/ci-testing-linting into main	2026-03-26 07:04:31 +00:00
agent-company	fbb72fe2a5	ci: add pytest and ruff linting to CI, fix all lint errors - Add test job to build.yaml that runs pytest and ruff before building images - Add standalone test.yaml workflow for PRs - Add ruff.toml with E/F/I rules configured - Fix all ruff lint errors: sort imports, remove unused imports, fix re-exports - Build jobs now depend on test job passing (needs: test) Closes leeworks-agents/SPARC#18 Closes leeworks-agents/SPARC#19 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 07:04:00 +00:00
AI-Manager	e484baaf5f	Merge pull request 'feat: configurable LLM model, SERP cache TTL, structured logging, fix type' (#29) from feature/p2-config-improvements into main	2026-03-26 07:03:08 +00:00
AI-Manager	069f1c343c	Merge pull request 'refactor(db): shared pooled DatabaseClient singleton' (#30) from feature/db-client-pooling into main	2026-03-26 07:02:46 +00:00
agent-company	d366443b38	refactor(db): use shared pooled DatabaseClient singleton instead of per-call instances - Replace get_db_client() creating new DatabaseClient on every call with a module-level singleton initialized once at startup via init_db_client() - Add init_db_client() and close_db_client() lifecycle functions called from FastAPI lifespan handler - Migrate all DatabaseClient methods from legacy self.connect()/self.conn to pooled self.get_conn() context manager for thread-safe connection reuse - Pool is properly torn down on application shutdown Closes leeworks-agents/SPARC#7 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 06:03:56 +00:00
agent-company	b000146585	feat: configurable LLM model, SERP cache TTL, structured logging, fix patent_id type - Make LLM model configurable via MODEL env var, default anthropic/claude-3.5-sonnet (#12) - Expose SERP cache TTL as SERP_CACHE_TTL_HOURS env var, default 24 hours (#13) - Fix Patent.patent_id type annotation from int to str in types.py (#14) - Replace all print() calls with structured logging in analyzer.py and llm.py (#11) - Add LOG_LEVEL config with basicConfig setup in config.py - Add model and serp_cache_ttl_hours to config.py Closes leeworks-agents/SPARC#11 Closes leeworks-agents/SPARC#12 Closes leeworks-agents/SPARC#13 Closes leeworks-agents/SPARC#14 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 06:03:25 +00:00
AI-Manager	35d105b14e	Merge pull request 'feat(auth): add rate limiting to login and register endpoints' (#28) from feature/rate-limiting into main	2026-03-26 05:04:46 +00:00
AI-Manager	6fcf170d93	Merge pull request 'feat(jobs): persist async batch job state in PostgreSQL' (#34) from feature/persist-job-state into main	2026-03-26 05:04:26 +00:00
AI-Manager	5a42e216ba	Merge pull request 'docs: patent PDF storage docs, FileNotFoundError, frontend lockfile' (#31) from feature/p2-docs-and-lockfile into main	2026-03-26 05:04:01 +00:00
AI-Manager	24ab341d9b	Merge pull request 'test(auth): add comprehensive JWT authentication test suite' (#35) from feature/jwt-auth-tests into main	2026-03-26 05:03:29 +00:00
AI-Manager	878fedfbb8	Merge pull request 'feat(security): JWT startup guard, configurable CORS, externalize DB creds' (#27) from feature/p1-security-hardening into main	2026-03-26 05:03:16 +00:00
agent-company	ae9f257dcb	test(auth): add comprehensive JWT authentication test suite Add 17 tests in tests/test_auth.py covering all auth flows: - Registration: first user admin, subsequent user, duplicate email - Login: valid credentials, invalid credentials - Protected routes: valid token, missing token, expired token, wrong token type - Token refresh: valid refresh, invalid refresh, access-as-refresh rejected - Admin endpoints: list users, change role, own-role prevention, permission checks All tests use mocked database (no live DB required). Closes leeworks-agents/SPARC#10 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:24:12 +00:00
agent-company	3dac88ec90	docs: document patent PDF storage, add FileNotFoundError, commit lockfile - Add docstring to analyze_single_patent explaining the PDF prerequisite - Raise FileNotFoundError with helpful message when PDF is missing - Add patent PDF storage section to README with Docker volume mount example - Commit frontend/package-lock.json for reproducible builds Closes leeworks-agents/SPARC#15 Closes leeworks-agents/SPARC#17 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:17:09 +00:00
agent-company	e2d750146c	feat(auth): add rate limiting to login and register endpoints - Add slowapi rate limiter: 10 req/min for /auth/login, 5 req/min for /auth/register - Return HTTP 429 with Retry-After header when limit is exceeded - Add slowapi to requirements.txt - Add 4 passing tests for rate limit behavior Closes leeworks-agents/SPARC#9 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:08:22 +00:00
agent-company	47cddcbeaf	feat(security): add JWT startup guard, configurable CORS, and externalize DB credentials - Add check_jwt_secret() that refuses default JWT secret when APP_ENV != development - Make CORS origins configurable via CORS_ORIGINS env var (comma-separated) - Replace hardcoded postgres credentials in docker-compose.yml with env var references - Add APP_ENV and cors_origins to config.py - Update .env.example with all required variables and documentation - Add tests for JWT startup guard and CORS configuration Closes leeworks-agents/SPARC#4 Closes leeworks-agents/SPARC#5 Closes leeworks-agents/SPARC#6 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:06:31 +00:00
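
The last commit in the list describes a `check_jwt_secret()` guard that refuses the default JWT secret when `APP_ENV != development`. A rough sketch of that rule is below, under stated assumptions: the real function in `SPARC/auth.py` takes no arguments and reads module-level configuration, whereas this version takes explicit parameters so it can run on its own. `check_jwt_secret_sketch` and `PLACEHOLDER_DEFAULT_SECRET` are hypothetical names, and the error text is only shaped to match the `FATAL.*JWT_SECRET` pattern the tests expect.

```python
# Hypothetical placeholder; the real default lives in SPARC.auth and is not shown here.
PLACEHOLDER_DEFAULT_SECRET = "placeholder-default-secret"


def check_jwt_secret_sketch(jwt_secret, app_env="development"):
    """Raise RuntimeError if the default secret is used outside development.

    Illustrative only: explicit arguments stand in for the module-level
    JWT_SECRET and APP_ENV settings the real guard is assumed to read.
    """
    if app_env != "development" and jwt_secret == PLACEHOLDER_DEFAULT_SECRET:
        raise RuntimeError(
            "FATAL: JWT_SECRET is still the default value; "
            f"set a secure secret before running with APP_ENV={app_env}."
        )


# Both of these pass silently, mirroring the "must not raise" tests
check_jwt_secret_sketch(PLACEHOLDER_DEFAULT_SECRET, "development")
check_jwt_secret_sketch("my-secure-random-secret-abc123", "production")
```

Comparing against `"development"` rather than listing the deployed environments means an unrecognized `APP_ENV` value such as `staging` is treated as non-development and still triggers the guard.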