SPARC

Author	SHA1	Message	Date
agent-company	03f8f7fa79	merge: resolve trend-charts conflicts with export and tracked endpoints Keeps both analytics/trends endpoint and export endpoint from main.	2026-03-26 12:12:09 +00:00
agent-company	a6c92fde9f	merge: resolve conflicts for S3 storage branch with main Integrates S3/MinIO storage backend with structured logging changes from main. Both boto3 and apscheduler retained in requirements.txt.	2026-03-26 12:09:24 +00:00
AI-Manager	a4db9439f5	Merge pull request 'feat: add webhook notification support for job completion' (#66) from feature/webhooks into main	2026-03-26 12:08:08 +00:00
AI-Manager	bbea16387d	Merge pull request 'feat: implement scheduled/recurring analysis with change alerting' (#65) from feature/scheduled-analysis into main	2026-03-26 12:07:46 +00:00
AI-Manager	4e2bcae18a	Merge pull request 'feat: add CSV export for company analysis results' (#60) from feature/export-csv into main	2026-03-26 12:06:57 +00:00
AI-Manager	c42bf5bf71	Merge pull request 'feat: add cursor-based pagination to /jobs endpoint' (#59) from feature/cursor-pagination into main	2026-03-26 12:06:04 +00:00
AI-Manager	ab74904845	Merge pull request 'fix: auto-download patent PDF in analyze_single_patent' (#55) from feature/fix-single-patent-download into main	2026-03-26 12:05:10 +00:00
agent-company	2e6b8c7445	feat: add webhook notification support for job completion and alerts Send HTTP POST notifications to configured webhook URLs when batch jobs complete or when scheduled analysis detects significant changes. - Add SPARC/webhooks.py with retry logic (3 attempts, exponential backoff) - Support generic HTTP POST and Slack-compatible text payloads - Integrate into batch job completion handler in api.py - Configure via WEBHOOK_URLS env var (comma-separated) - Payload includes event type, job ID, status, and summary Closes leeworks-agents/SPARC#23 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:32:07 +00:00
agent-company	f33447eef8	feat: implement scheduled/recurring analysis with change alerting Add APScheduler-based background task that periodically re-analyzes tracked companies and alerts on significant patent count changes. - Add tracked_companies and alerts tables to database schema - Add SPARC/scheduler.py with configurable interval and threshold - Add admin endpoints: GET/POST/DELETE /admin/tracked, GET /admin/alerts - Scheduler starts at app startup; interval via SCHEDULE_INTERVAL_HOURS - Change threshold configurable via CHANGE_THRESHOLD_PERCENT env var - apscheduler is optional; graceful fallback if not installed Closes leeworks-agents/SPARC#22 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:30:43 +00:00
agent-company	52972bbff0	feat: add patent trend charts to the Analytics page Add GET /analytics/trends endpoint returning per-company analysis counts by month and analysis type distribution over time. Render these as a line chart (analyses per company) and stacked bar chart (analysis types) on the Analytics page using recharts. Closes leeworks-agents/SPARC#24 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:23:47 +00:00
agent-company	1bd9dccdb8	feat: add CSV export for company analysis results Add GET /export/{company_name} backend endpoint that returns analysis records as a downloadable CSV file. Add Export CSV button to the Analysis page that triggers the download via the API. Closes leeworks-agents/SPARC#20 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:20:51 +00:00
agent-company	3b6411869d	feat: add cursor-based pagination to /jobs endpoint Add a cursor query parameter to GET /jobs and return a next_cursor field in the response envelope. Existing clients using only limit continue to work without modification. The cursor is an opaque token encoding created_at and job_id for stable keyset pagination. Closes leeworks-agents/SPARC#25 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:19:01 +00:00
agent-company	9a43f85259	feat: add S3/MinIO object storage support for patent PDFs Introduce a StorageBackend abstraction (local filesystem and S3) for patent PDF storage. When STORAGE_BACKEND=s3, PDFs are read/written via boto3 to an S3-compatible bucket instead of the local filesystem. - Add SPARC/storage.py with LocalStorageBackend and S3StorageBackend - Update serp_api.py save_patents and parse_patent_pdf to use storage - Add storage config vars to config.py and .env.example - Add optional MinIO service to docker-compose.yml (--profile s3) - Add boto3 to requirements.txt Closes leeworks-agents/SPARC#38 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:17:24 +00:00
agent-company	ecc2c37bcd	fix: auto-download patent PDF in analyze_single_patent before reading When the PDF is not on disk, analyze_single_patent now looks up the cached PDF link from the database and downloads it automatically. If no link is cached, a clear FileNotFoundError is raised. Also adds a GET /analyze/patent/{patent_id} API endpoint that exposes this functionality and returns 404 when the PDF cannot be obtained. Closes leeworks-agents/SPARC#36 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:08:34 +00:00
agent-company	0b4d712fc5	feat: add structured logging to serp_api.py Add module-level logger to serp_api.py with INFO-level messages for patent queries and PDF downloads, and DEBUG-level messages for cache hits and parsing details. All three target files (analyzer.py, serp_api.py, llm.py) now use structured logging with no print() calls. Closes leeworks-agents/SPARC#46 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:07:07 +00:00
agent-company	fbb72fe2a5	ci: add pytest and ruff linting to CI, fix all lint errors - Add test job to build.yaml that runs pytest and ruff before building images - Add standalone test.yaml workflow for PRs - Add ruff.toml with E/F/I rules configured - Fix all ruff lint errors: sort imports, remove unused imports, fix re-exports - Build jobs now depend on test job passing (needs: test) Closes leeworks-agents/SPARC#18 Closes leeworks-agents/SPARC#19 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 07:04:00 +00:00
AI-Manager	e484baaf5f	Merge pull request 'feat: configurable LLM model, SERP cache TTL, structured logging, fix type' (#29) from feature/p2-config-improvements into main	2026-03-26 07:03:08 +00:00
agent-company	d366443b38	refactor(db): use shared pooled DatabaseClient singleton instead of per-call instances - Replace get_db_client() creating new DatabaseClient on every call with a module-level singleton initialized once at startup via init_db_client() - Add init_db_client() and close_db_client() lifecycle functions called from FastAPI lifespan handler - Migrate all DatabaseClient methods from legacy self.connect()/self.conn to pooled self.get_conn() context manager for thread-safe connection reuse - Pool is properly torn down on application shutdown Closes leeworks-agents/SPARC#7 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 06:03:56 +00:00
agent-company	b000146585	feat: configurable LLM model, SERP cache TTL, structured logging, fix patent_id type - Make LLM model configurable via MODEL env var, default anthropic/claude-3.5-sonnet (#12) - Expose SERP cache TTL as SERP_CACHE_TTL_HOURS env var, default 24 hours (#13) - Fix Patent.patent_id type annotation from int to str in types.py (#14) - Replace all print() calls with structured logging in analyzer.py and llm.py (#11) - Add LOG_LEVEL config with basicConfig setup in config.py - Add model and serp_cache_ttl_hours to config.py Closes leeworks-agents/SPARC#11 Closes leeworks-agents/SPARC#12 Closes leeworks-agents/SPARC#13 Closes leeworks-agents/SPARC#14 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 06:03:25 +00:00
AI-Manager	35d105b14e	Merge pull request 'feat(auth): add rate limiting to login and register endpoints' (#28) from feature/rate-limiting into main	2026-03-26 05:04:46 +00:00
AI-Manager	6fcf170d93	Merge pull request 'feat(jobs): persist async batch job state in PostgreSQL' (#34) from feature/persist-job-state into main	2026-03-26 05:04:26 +00:00
AI-Manager	5a42e216ba	Merge pull request 'docs: patent PDF storage docs, FileNotFoundError, frontend lockfile' (#31) from feature/p2-docs-and-lockfile into main	2026-03-26 05:04:01 +00:00
agent-company	96d5d27b17	feat(jobs): persist async batch job state in PostgreSQL - Add jobs table to database schema (job_id, status, progress, result_json, etc.) - Add DatabaseClient methods: create_job, update_job, get_job, list_jobs - Add mark_stale_jobs_failed() called at startup to handle interrupted jobs - Refactor _run_batch_job and job endpoints to read/write from PostgreSQL - Remove in-memory _jobs dict; job state now survives API restarts - Update init_database.py to list all tables in output Closes leeworks-agents/SPARC#8 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:22:57 +00:00
agent-company	3dac88ec90	docs: document patent PDF storage, add FileNotFoundError, commit lockfile - Add docstring to analyze_single_patent explaining the PDF prerequisite - Raise FileNotFoundError with helpful message when PDF is missing - Add patent PDF storage section to README with Docker volume mount example - Commit frontend/package-lock.json for reproducible builds Closes leeworks-agents/SPARC#15 Closes leeworks-agents/SPARC#17 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:17:09 +00:00
agent-company	e2d750146c	feat(auth): add rate limiting to login and register endpoints - Add slowapi rate limiter: 10 req/min for /auth/login, 5 req/min for /auth/register - Return HTTP 429 with Retry-After header when limit is exceeded - Add slowapi to requirements.txt - Add 4 passing tests for rate limit behavior Closes leeworks-agents/SPARC#9 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:08:22 +00:00
agent-company	47cddcbeaf	feat(security): add JWT startup guard, configurable CORS, and externalize DB credentials - Add check_jwt_secret() that refuses default JWT secret when APP_ENV != development - Make CORS origins configurable via CORS_ORIGINS env var (comma-separated) - Replace hardcoded postgres credentials in docker-compose.yml with env var references - Add APP_ENV and cors_origins to config.py - Update .env.example with all required variables and documentation - Add tests for JWT startup guard and CORS configuration Closes leeworks-agents/SPARC#4 Closes leeworks-agents/SPARC#5 Closes leeworks-agents/SPARC#6 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 04:06:31 +00:00
0xWheatyz	9c971dac72	fix(analyzer): route _analyze_company_safe through cache-aware path _analyze_company_safe was calling SERP.query directly, bypassing the SERP query cache in analyze_company. Now delegates fully to analyze_company() and reads patent_count from the serp_queries cache.	2026-03-24 15:02:19 -04:00
0xWheatyz	1a297eb60b	feat(analyzer): integrate DB patent and SERP query caching Before querying SERP API, check serp_queries cache (24h TTL). Before downloading/parsing each patent, check patents table for cached minimized_content. Store results after processing so repeated analyses skip all network I/O and PDF parsing entirely.	2026-03-24 14:35:24 -04:00
0xWheatyz	3154f6b732	feat(database): add patent/serp caching tables and connection pooling - Add patents table (patent_id PK, raw_sections JSONB, minimized_content) - Add serp_queries table (query_hash unique, result_patent_ids, expires_at) - Add cache methods: get/store_patent, get/store_serp_query - Replace single connection with ThreadedConnectionPool (min=2, max=10) - Add get_conn() context manager for thread-safe connection checkout - Legacy single-connection path preserved for backwards compatibility	2026-03-24 14:34:33 -04:00
0xWheatyz	b9bb3dc1cd	perf(analyzer): parallelize patent download/parse/minimize with threads Replace the sequential per-patent loop with a ThreadPoolExecutor (workers controlled by PATENT_THREAD_WORKERS config). Each patent is processed independently in _process_single_patent, which is thread-safe since SERP methods are stateless and operate on separate files.	2026-03-24 14:32:23 -04:00
0xWheatyz	90f9cfc826	fix(serp): replace hardcoded date range with rolling window The SERP query had a frozen date range (Oct-Nov 2025) that returned stale patents. Now computes a rolling window from config (PATENT_SEARCH_DAYS, default 90 days). Also adds filesystem-level PDF caching to skip re-downloading existing patent PDFs, and adds PATENT_THREAD_WORKERS config for upcoming parallel processing.	2026-03-24 14:31:43 -04:00
0xWheatyz	d387bbbdf3	fix(analyzer): eliminate double SERP.query() call per company analysis _analyze_company_safe called SERP.query() then passed the company name to analyze_company() which called SERP.query() again — doubling API usage. Now analyze_company() accepts an optional patents param so callers can pass pre-fetched results through.	2026-03-24 14:16:49 -04:00
0xWheatyz	2815deb221	fix(api): configure root_path for OpenAPI docs behind reverse proxy Add ROOT_PATH environment variable support so FastAPI generates correct URLs for Swagger UI when served behind nginx at /api. This fixes the "invalid version field" error when accessing /api/docs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-15 11:48:11 -04:00
0xWheatyz	ebba983a1d	fix(auth): ensure JWT sub claim is RFC 7519 compliant string - Change TokenPayload.sub type from int to str per JWT RFC 7519 - Add user_id property to TokenPayload for int conversion - Update token creation to serialize user_id as string - Update token consumers to use payload.user_id - Change dashboard port from 3000 to 8080 - Add pydantic[email] for email validation - Update default admin email to admin@sparc.dev 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-14 14:22:30 -04:00
0xWheatyz	9c98b948d3	feat(api): add authentication and analytics endpoints Protect all analysis endpoints with JWT authentication: - Require valid access token for analysis operations - Add CORS middleware for React frontend (localhost:3000, 5173) Add auth endpoints: - POST /auth/register - user registration (first user becomes admin) - POST /auth/login - JWT token issuance - POST /auth/refresh - token refresh - GET /auth/me - current user info Add admin endpoints: - GET /admin/users - list all users - PATCH /admin/users/{id}/role - update user role - DELETE /admin/users/{id} - delete user Add analytics endpoint: - GET /analytics - usage statistics by company and type Update .env.example with USE_CACHE and JWT_SECRET config 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-14 13:40:48 -04:00
0xWheatyz	af52107ed8	feat(backend): add response caching and user management Replace USE_DATABASE toggle with USE_CACHE for smarter LLM response handling: - Add prompt hashing for efficient cache lookups - Cache API responses in database to reduce token usage - Always store responses for analytics (cache or fresh) Add user authentication infrastructure: - User table with bcrypt password hashing - CRUD operations for user management - Role-based access control (admin/user) Dependencies: add bcrypt and PyJWT for auth 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-14 13:40:34 -04:00
0xWheatyz	0107691c90	feat(auth): add JWT authentication module Add standalone auth module with JWT token handling: - Access and refresh token generation/validation - FastAPI dependency functions for route protection - Admin role verification for privileged endpoints - Secure password handling integration with database 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-14 13:40:28 -04:00
0xWheatyz	4e419166e8	fix: skip patents without PDF links in SERP query Not all Google Patents results include PDF download links. Previously this caused a KeyError when accessing patent["pdf"]. Now patents without PDF links are gracefully skipped with documentation explaining when this occurs (recent filings, international patents, restricted access). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-13 15:37:24 -04:00
0xWheatyz	3479ba8a46	feat: add FastAPI web service wrapper - Create REST API with endpoints for single and batch analysis - Add async job support for long-running batch operations - Implement job status tracking and listing endpoints - Add 9 tests for API endpoints - Update requirements.txt with fastapi, uvicorn, httpx - Document API usage in README 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-12 23:26:26 -04:00
0xWheatyz	1c6d903301	feat: add multi-company batch processing - Add CompanyAnalysisResult and BatchAnalysisResult dataclasses - Implement analyze_companies() for concurrent batch analysis - Implement analyze_companies_sequential() for rate-limited scenarios - Add progress callback support for monitoring batch jobs - Include 5 new tests for batch processing functionality - Fix pre-existing test mock issue in test_llm.py 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-12 23:23:07 -04:00
0xWheatyz	44456cb073	feat: add database mode for LLM message storage and analytics Implements a database mode that stores LLM prompts and responses in PostgreSQL instead of making API calls. This enables: - Testing without consuming API credits - Collecting analytics on usage patterns - Development and debugging workflows Changes: - Added DatabaseClient class for PostgreSQL operations - Modified LLMAnalyzer to support database/API mode toggle - Added USE_DATABASE config flag to switch between modes - Included Docker Compose setup for PostgreSQL - Added utility scripts for database init and analytics viewing - Comprehensive documentation in DATABASE_MODE.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-03-10 21:13:13 -04:00
0xWheatyz	af4114969a	feat: migrate from Anthropic API to OpenRouter Replace direct Anthropic API integration with OpenRouter to enable more flexible LLM provider access while maintaining Claude 3.5 Sonnet. Changes: - Replace anthropic package with openai in requirements.txt - Update config to use OPENROUTER_API_KEY instead of ANTHROPIC_API_KEY - Migrate LLMAnalyzer from Anthropic client to OpenAI client with OpenRouter base URL (https://openrouter.ai/api/v1) - Update model identifier to OpenRouter format: anthropic/claude-3.5-sonnet - Convert API calls from messages.create() to chat.completions.create() - Update response parsing to match OpenAI format - Rename API key parameter in CompanyAnalyzer from anthropic_api_key to openrouter_api_key - Update all tests to mock OpenAI client instead of Anthropic - Fix client initialization to accept direct API key parameter 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-22 12:26:56 -05:00
0xWheatyz	6882e53280	tests: testing modes have been added in an attempt to tune without wasting tokens.	2026-02-19 22:46:15 -05:00
0xWheatyz	a91c3badab	feat: implement company performance estimation orchestration Created CompanyAnalyzer class that orchestrates the complete pipeline: 1. Retrieves patents via SERP API 2. Downloads and parses PDFs 3. Minimizes content (removes bloat) 4. Analyzes portfolio with LLM 5. Returns performance estimation Features: - Full company portfolio analysis - Single patent analysis support - Robust error handling (continues on partial failures) - Progress logging for user visibility Updated main.py with clean example usage demonstrating the high-level API. Added comprehensive test suite (7 tests) covering: - Full pipeline integration - Error handling at each stage - Single patent analysis - Edge cases (no patents, all failures) All 26 tests passing. This completes the core functionality for patent-based company performance estimation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-19 18:57:10 -05:00
0xWheatyz	d7cf80f02f	feat: add LLM integration for patent analysis Implemented LLMAnalyzer class using Anthropic's Claude API for: - Single patent content analysis - Portfolio-wide analysis across multiple patents - Configurable API key management via environment variables Key features: - Uses Claude 3.5 Sonnet for high-quality analysis - Structured prompts for innovation assessment - Token limits optimized per use case (1024 for single, 2048 for portfolio) - Analyzes: innovation quality, market potential, strategic direction Updated config.py to support ANTHROPIC_API_KEY environment variable. Added comprehensive test suite (6 tests) covering: - Initialization from config and direct API key - Single patent analysis - Portfolio analysis - Token limit validation All 19 tests passing. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-19 18:55:35 -05:00
0xWheatyz	26a23c02ae	feat: add patent content minimization for LLM consumption Implemented minimize_patent_for_llm() function that reduces patent content by keeping only essential sections (abstract, claims, summary) and explicitly excludes the verbose detailed description section. This reduces token usage while preserving core innovation details needed for company performance estimation. Added comprehensive test coverage (5 new tests) for: - Essential section inclusion - Description section exclusion - Missing section handling - Empty section handling - Section separator formatting All 13 tests passing. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-19 18:54:07 -05:00
0xWheatyz	58f2bdc238	refactor: remove duplicate patent_api.py module Removed SPARC/patent_api.py as it contained duplicate implementations of parse_patent_pdf, extract_section, and clean_patent_text functions that are already present in SPARC/serp_api.py as static methods. The serp_api.py implementation is actively used in main.py, while patent_api.py was unused legacy code. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-19 18:49:31 -05:00
0xWheatyz	63a9889e5b	feat: patent retrieval and semi-processed	2025-12-08 19:33:02 -05:00
0xWheatyz	5569f20b8b	refactor: dataclasses are now defined as types in types.py	2025-11-27 19:22:43 -05:00
0xWheatyz	f9066279af	chore: removable text	2025-11-23 23:07:38 +00:00

52 Commits
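
Commit 3b6411869d describes the /jobs cursor as an opaque token encoding created_at and job_id for stable keyset pagination. The following is a minimal sketch of that idea, not the repository's implementation: the base64-encoded JSON token format, the function names (encode_cursor, decode_cursor, list_jobs_page), and the in-memory list standing in for the jobs table are all assumptions made for illustration.

```python
import base64
import json
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


def encode_cursor(created_at: datetime, job_id: str) -> str:
    """Pack the sort key of the last returned row into an opaque token.

    Assumption: base64-encoded JSON; the real token format is not shown in the log.
    """
    raw = json.dumps({"created_at": created_at.isoformat(), "job_id": job_id})
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def decode_cursor(cursor: str) -> Tuple[datetime, str]:
    """Recover (created_at, job_id) from a token produced by encode_cursor."""
    data = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return datetime.fromisoformat(data["created_at"]), data["job_id"]


def list_jobs_page(jobs: List[Dict], limit: int = 2,
                   cursor: Optional[str] = None) -> Dict:
    """Return one page of jobs, newest first, plus a next_cursor field.

    Hypothetical stand-in for GET /jobs: a list replaces the database table.
    """
    # Order by (created_at, job_id) so ties on created_at are still stable.
    ordered = sorted(jobs, key=lambda j: (j["created_at"], j["job_id"]),
                     reverse=True)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        # Keyset step: keep only rows strictly after the boundary in sort order.
        ordered = [j for j in ordered
                   if (j["created_at"], j["job_id"]) < boundary]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = (encode_cursor(page[-1]["created_at"], page[-1]["job_id"])
                   if has_more else None)
    return {"jobs": page, "next_cursor": next_cursor}


# Walk every page of five sample jobs, two at a time.
sample_jobs = [
    {"job_id": f"job-{i}",
     "created_at": datetime(2026, 3, 26, 12, i, tzinfo=timezone.utc)}
    for i in range(5)
]
seen_ids: List[str] = []
token: Optional[str] = None
while True:
    result = list_jobs_page(sample_jobs, limit=2, cursor=token)
    seen_ids.extend(j["job_id"] for j in result["jobs"])
    token = result["next_cursor"]
    if token is None:
        break
```

Because the boundary is a (created_at, job_id) pair rather than an offset, rows inserted between requests do not shift later pages, and a client that passes only limit simply receives the first page, consistent with the backward compatibility the commit message mentions.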