Compare commits

95 Commits

Author SHA1 Message Date
agent-company 3dfa651f2d Add rate limiting dashboard to admin panel
- Enhance GET /admin/rate-limits with per-IP breakdown, 24h throttled
  count, and hourly time-series of rejected requests
- Add _rejected_log deque for time-series tracking of throttled requests
- Add AdminRateLimits React page with auto-refresh (configurable 15s/30s/1m),
  summary cards, throttled-over-time bar chart, endpoint table, per-IP table
- Add TypeScript types (RateLimitStatsResponse) and adminApi.getRateLimits()
- Wire up /admin/rate-limits route and nav link (admin-only)
- Expand unit tests to 10 cases: auth, empty state, per-IP breakdown,
  throttled_24h count, time-series structure, response shape contract

Closes leeworks-agents/SPARC#1686

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-19 15:39:45 +00:00
AI-Manager 313800215c Merge pull request 'Add rate limit stats to admin panel' (#1682) from feature/1675-rate-limit-admin into main
Merge PR #1682
2026-05-19 00:12:56 +00:00
AI-Manager 222f29deb1 Merge pull request 'Add cursor-based pagination to /analyze/batch and /jobs' (#1681) from feature/1669-cursor-pagination into main
Merge PR #1681
2026-05-19 00:12:48 +00:00
AI-Manager e6d95bbf57 Merge pull request 'Add stricter input validation for company names' (#1680) from feature/1670-company-name-validation into main
Merge PR #1680
2026-05-19 00:12:42 +00:00
AI-Manager 68484ef4b1 Merge pull request 'Update ROADMAP.md: mark completed P1 and P2 items as done' (#1679) from feature/1678-update-roadmap into main
Merge PR #1679
2026-05-19 00:12:34 +00:00
agent-company a0cb9a5773 Add rate limit status and usage statistics to admin panel
Add GET /admin/rate-limits endpoint (admin-only) that returns current
rate limit configuration and request statistics for all rate-limited
endpoints (/auth/register and /auth/login). Tracks total requests and
rejection counts via in-memory counters.

Includes tests for admin access, non-admin rejection, empty state,
request tracking, and configuration display.

Closes leeworks-agents/SPARC#1675

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-18 21:53:01 +00:00
agent-company 857b3444df Add cursor-based pagination to GET /analyze/batch and update /jobs defaults
Add a new GET /analyze/batch endpoint that returns stored analysis results
with cursor-based pagination (default limit 50, max 200). Also update the
existing /jobs endpoint defaults from limit=10/max=100 to limit=50/max=200
for consistency.

The database layer gains a list_analyses() method with cursor support using
(timestamp, id) ordering, matching the existing list_jobs() pattern.

Includes tests for pagination behavior, boundary limits, cursor forwarding,
company name filtering, and empty result sets.

Closes leeworks-agents/SPARC#1669

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-18 21:49:22 +00:00
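
The `(timestamp, id)` keyset pagination described in the commit above can be illustrated with an in-memory sketch. Everything here is an assumption for illustration: the function names, the base64-encoded JSON cursor format, the newest-first ordering, and the `items`/`next_cursor` envelope keys are hypothetical, and the real `list_analyses()` would express the same boundary as a SQL predicate instead of a list filter.

```python
from __future__ import annotations

import base64
import json
from typing import Any


def encode_cursor(timestamp: str, row_id: int) -> str:
    """Pack the last row's sort key into an opaque, URL-safe token."""
    raw = json.dumps([timestamp, row_id]).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> tuple[str, int]:
    """Recover the (timestamp, id) sort key from a cursor token."""
    timestamp, row_id = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return timestamp, row_id


def list_page(
    rows: list[dict[str, Any]],
    cursor: str | None = None,
    limit: int = 50,
    max_limit: int = 200,
) -> dict[str, Any]:
    """Return one page of rows, newest first, plus a cursor for the next page."""
    limit = max(1, min(limit, max_limit))  # clamp to the documented bounds
    ordered = sorted(rows, key=lambda r: (r["timestamp"], r["id"]), reverse=True)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        # Keep only rows strictly "after" the boundary in descending order.
        ordered = [r for r in ordered if (r["timestamp"], r["id"]) < boundary]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = encode_cursor(page[-1]["timestamp"], page[-1]["id"]) if has_more else None
    return {"items": page, "next_cursor": next_cursor}
```

Including the id as a tie-breaker keeps pages stable when several rows share a timestamp, which is the usual reason to prefer a composite key over a plain offset.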
agent-company a95129904e Add stricter input validation for company names on analysis endpoints
Add a CompanyName validated type enforcing 2-100 character length and
allowing only alphanumeric characters, spaces, hyphens, ampersands, and
periods. Applied to all endpoints accepting company names: /analyze,
/analyze/patent, /analyze/batch, /admin/tracked, and /export.

Includes unit tests covering too-short, too-long, special character,
leading-character, and valid edge cases for both single and batch
endpoints.

Closes leeworks-agents/SPARC#1670

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-18 21:38:44 +00:00
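
The validation rules in the commit above can be sketched with a single regular expression. This is an illustration only: the `validate_company_name` helper and the exact pattern are hypothetical, the requirement that the first character be alphanumeric is inferred from the "leading-character" test case, and restricting "alphanumeric" to ASCII is an assumption.

```python
import re

# Assumed pattern: one alphanumeric character, then 1-99 more characters drawn
# from letters, digits, spaces, ampersands, periods, and hyphens (2-100 total).
_COMPANY_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9 &.\-]{1,99}")


def validate_company_name(value: str) -> str:
    """Return the name unchanged if it is valid, otherwise raise ValueError."""
    if not _COMPANY_NAME_RE.fullmatch(value):
        raise ValueError(
            "company name must be 2-100 characters, start with a letter or digit, "
            "and contain only letters, digits, spaces, '&', '.', or '-'"
        )
    return value
```

Using `fullmatch` rather than `match` matters here: it rejects names that begin validly but contain a disallowed character further along.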
agent-company 7c6eed8d72 Update ROADMAP.md to mark completed P1 and P2 items as done
Move seven completed items from the P1 and P2 sections into the
Completed section: in-memory jobs persistence, export endpoint tests,
tracked company admin tests, webhook integration tests, S3 storage
tests, auto-download path tests, and scheduler DatabaseClient refactor.

The P2 section now only lists the two genuinely open items: cursor-based
pagination (Issue #1669) and request validation (Issue #1670).

Closes leeworks-agents/SPARC#1678

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-18 21:29:14 +00:00
AI-Manager 4c411e1e0b Merge pull request 'Add tests for tracked company admin endpoints and scheduler' (#1667) from feature/1656-tracked-company-admin-tests into main
Merge: Add tests for tracked company admin endpoints and scheduler integration

Closes #1656
2026-04-20 23:05:57 +00:00
agent-company 6165d66760 Fix scheduler tests to use get_db_client after scheduler refactor
The scheduler was refactored (PR #1665) to use the pooled
get_db_client() from SPARC.auth instead of creating its own
DatabaseClient. Update test mocks accordingly and remove the
db.close() assertion since the pooled client is no longer closed
by the scheduler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 23:05:42 +00:00
agent-company e610dea9a9 Merge remote-tracking branch 'origin/main' into feature/1656-tracked-company-admin-tests 2026-04-20 23:04:59 +00:00
AI-Manager b5f10d2032 Merge pull request 'Add API tests for export endpoints (CSV and PDF)' (#1668) from feature/1655-export-endpoint-tests into main
Merge: Add API tests for export endpoints (CSV and PDF)

Closes #1655
2026-04-20 23:04:23 +00:00
AI-Manager b5d8b0b344 Merge pull request 'Add webhook integration tests for retry logic and payloads' (#1666) from feature/1657-webhook-integration-tests into main
Merge: Add webhook integration tests for retry logic and payloads

Closes #1657
2026-04-20 23:04:19 +00:00
AI-Manager 1170356b2b Merge pull request 'Add S3/MinIO storage backend tests for storage.py' (#1663) from feature/1660-s3-storage-tests into main
Merge: Add S3/MinIO storage backend tests for storage.py

Closes #1660
2026-04-20 23:04:05 +00:00
AI-Manager 84341b3ec4 Merge pull request 'Add test coverage for analyze_single_patent auto-download path' (#1662) from feature/1661-analyze-single-patent-tests into main
Merge: Add test coverage for analyze_single_patent auto-download path

Closes #1661
2026-04-20 23:04:00 +00:00
AI-Manager 0639fb3649 Merge pull request 'Update ROADMAP.md to reflect completed work and add next-horizon items' (#1664) from feature/1659-update-roadmap into main
Merge: Update ROADMAP.md to reflect completed work and add next-horizon items

Closes #1659
2026-04-20 23:03:56 +00:00
AI-Manager b032bf0c90 Merge pull request 'Refactor scheduler.py to use pooled DatabaseClient' (#1665) from feature/1658-scheduler-pooled-db into main
Merge: Refactor scheduler.py to use pooled DatabaseClient

Closes #1658
2026-04-20 23:03:43 +00:00
agent-company a2f81b0396 Add test coverage for analyze_single_patent auto-download path
7 test cases covering:
- PDF on disk analyzed directly (no download)
- Auto-download from cached metadata link when PDF missing
- FileNotFoundError when no cached link available
- Cached patent without pdf_link raises FileNotFoundError
- Analysis pipeline failure returns error string gracefully
- Model override parameter forwarded to LLM
- FileNotFoundError during parsing re-raised (not swallowed)

Closes leeworks-agents/SPARC#1661

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 19:21:53 +00:00
agent-company 63ca18e9bf Add S3/MinIO storage backend tests for storage.py
21 test cases covering:
- S3StorageBackend: read, write, exists, path_for with mocked boto3
- Error handling: NoSuchKey exception, generic 404, non-404 re-raise
- Bucket auto-creation on init and graceful handling of creation failure
- Constructor credential/endpoint passthrough
- LocalStorageBackend: round-trip read/write, missing file, empty file
- get_storage_backend() factory: local/s3 selection, case-insensitivity

Closes leeworks-agents/SPARC#1660

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 19:20:06 +00:00
agent-company 4cb1a6ed21 Update ROADMAP.md to reflect completed work and add next-horizon items
Move all completed items (security hardening, structured logging, dark mode,
export, webhooks, scheduled analysis, multi-model, trend charts, CI, etc.)
into a new Completed section. Reorganize remaining P1/P2/P3 items to reflect
current priorities. Add new next-horizon items: historical diffing, patent
classification tagging, user API keys, batch export, and multi-tenant support.

Closes leeworks-agents/SPARC#1659

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 19:18:22 +00:00
agent-company 417b7ab31e Refactor scheduler.py to use the application-level pooled DatabaseClient
Replace the per-invocation DatabaseClient creation in
run_scheduled_analysis() with the shared pooled client from
SPARC.auth.get_db_client(). This avoids creating a new database
connection on every scheduler tick, which could exhaust the connection
pool under load.

Key changes:
- Import get_db_client from SPARC.auth instead of DatabaseClient
- Remove manual connect/initialize_schema/close calls
- Remove unused SPARC.config import

Closes leeworks-agents/SPARC#1658

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 19:16:54 +00:00
agent-company 2eabb1d704 Add webhook integration tests covering retry logic and Slack/Discord payloads
22 test cases covering:
- Slack/Discord URL detection
- Generic vs Slack payload formatting
- Exponential backoff retry logic with network/timeout error handling
- Multi-URL dispatch with format auto-detection
- notify_job_completed() and notify_alert() helpers

Closes leeworks-agents/SPARC#1657

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 19:15:34 +00:00
agent-company fc942b2aa4 Add tests for tracked company admin endpoints and scheduler integration
20 test cases covering:
- GET/POST/DELETE /admin/tracked endpoints with admin auth enforcement
- GET /admin/alerts with limit parameter and auth
- scheduler.run_scheduled_analysis() for multi-company analysis, alert
  triggering on significant patent count changes, graceful failure handling

Closes leeworks-agents/SPARC#1656

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 19:14:29 +00:00
agent-company 44a162056d Add API tests for export endpoints (CSV and PDF)
Covers GET /export/{company_name} and /export/{company_name}/pdf with
13 test cases: successful export, 404 on missing data, auth enforcement,
filename sanitization, XML-special character handling in PDF, and
multi-row output validation.

Closes leeworks-agents/SPARC#1655

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 19:11:42 +00:00
AI-Manager a07a0c7fbe Merge pull request 'Fix remaining dark mode issue in Analysis page prose block' (#1628) from feature/1605-dark-mode into main
Fix remaining dark mode issue in Analysis page prose block (#1628)
2026-04-20 06:41:59 +00:00
AI-Manager 43fd2c9575 Merge pull request 'Expand JWT auth integration tests to 33 cases' (#1627) from feature/1624-jwt-auth-tests into main
Expand JWT auth integration tests to 33 cases (#1627)
2026-04-20 06:41:47 +00:00
agent-company d4d43cf9b8 Fix prose-invert to only apply in dark mode on Analysis page
The prose-invert class was applied unconditionally, causing inverted
(light) text in light mode within the AI analysis results section.
Changed to dark:prose-invert so it only activates when dark mode is
enabled.

Note: The broader dark mode feature (issue #1605) is already fully
implemented -- ThemeContext, toggle button, CSS variables, dark:
variants across all pages. This fix addresses the only remaining
unstyled element.

Closes leeworks-agents/SPARC#1605

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 06:08:02 +00:00
agent-company 2f2b6382fa Expand JWT auth integration tests from 17 to 33 cases
Add comprehensive edge-case coverage for issue #1624:

- Admin delete user endpoint (5 tests): successful delete, self-delete
  prevention, nonexistent user 404, non-admin 403, missing token rejection
- Admin role change gaps (2 tests): nonexistent user 404, non-admin 403
- Input validation (3 tests): invalid email 422, short password 422,
  missing fields 422 for both register and login
- Token edge cases (4 tests): malformed token, wrong-secret token,
  deleted user token, deleted user refresh
- Token claim verification (1 test): login tokens contain correct claims

All tests use mocked DB fixtures and require no live database.

Closes leeworks-agents/SPARC#1624

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 06:05:54 +00:00
AI-Manager 1319530f04 Merge pull request 'ci: enable ruff linting and pytest in CI pipeline' (#1568) from feature/1559-1560-enable-ci-linting-and-tests into main
Merge PR #1568: ci: enable ruff linting and pytest in CI pipeline

Closes #1559
Closes #1560
2026-04-19 23:08:07 +00:00
agent-company b32eebff8a ci: enable ruff linting and pytest in CI pipeline
Uncomment the ruff check and pytest steps in the Gitea Actions build
workflow so that linting violations and test failures block image builds.
Fix all pre-existing ruff violations (E402 import ordering in analyzer.py,
F821 undefined name in api.py, I001 unsorted imports in test files, F401
unused import in test_rate_limit.py).

Closes leeworks-agents/SPARC#1559
Closes leeworks-agents/SPARC#1560

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 20:06:10 +00:00
0xWheatyz 68ee19025a ci(build): use docker.io package instead of docker-ce in build jobs
The Debian Bullseye runner image doesn't have the Docker CE
repository configured. docker.io is available from default repos.
2026-04-02 21:28:26 -04:00
0xWheatyz ef97710d1c ci(build): another docker install candidate 2026-04-02 21:21:22 -04:00
0xWheatyz 88812b5967 ci(build): updated the apt command 2026-04-02 21:15:41 -04:00
0xWheatyz 90e58949fc ci: updated the docker install candidate 2026-04-02 21:11:09 -04:00
0xWheatyz bd10925c97 chore: updated package-lock.json 2026-04-02 21:06:34 -04:00
0xWheatyz 89fec43aa2 ci(build): use apt-get with correct Ubuntu package names
Replace apt with apt-get, add -y flag, fix Alpine-style package names
(py3-pip → python3-pip, docker-cli → docker.io), and drop musl-dev.
2026-04-02 20:59:11 -04:00
0xWheatyz 02e1c41126 ci(linters): removed ruff requirement, as it was causing working builds to fail 2026-04-02 20:57:17 -04:00
0xWheatyz c17a0d006a ci: fix pip install 2026-04-02 20:49:15 -04:00
0xWheatyz c6760a39a1 ci(test): use apt-get with correct Ubuntu packages in workflow
Replace Alpine-style commands (apk, py3-pip, musl-dev) and incorrect
apt usage with proper apt-get invocations and Debian package names for
the ubuntu-latest runner.
2026-04-02 20:47:46 -04:00
0xWheatyz 2ae6280566 ci: fix test to use apt instead of apk 2026-04-02 20:45:41 -04:00
0xWheatyz 9745ed75a8 feat(docker): add registry images to compose services
Add gitea.leeworks.dev image references alongside build directives so
`docker compose up` pulls pre-built images while `--build` still builds
from local sources.
2026-04-02 20:27:56 -04:00
0xWheatyz c649eaf343 fix(proxy): remove double slash in nginx API proxy_pass
API_URL already includes a trailing slash, so the extra slash in
proxy_pass produced //auth/login paths, causing 404s. Also clear
ROOT_PATH since nginx strips /api/ before proxying.
2026-04-02 20:21:47 -04:00
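
The double-slash problem fixed in the commit above is ordinary string concatenation, and can be reproduced outside nginx. The sketch below is illustrative only: `join_url` is a hypothetical helper and the `http://api:8000/` base URL is made up for the example.

```python
def join_url(base: str, path: str) -> str:
    """Join a base URL and a path with exactly one slash between them."""
    return base.rstrip("/") + "/" + path.lstrip("/")


api_url = "http://api:8000/"  # hypothetical value that already ends in a slash

# Naive concatenation with an extra slash reproduces the broken path.
broken = api_url + "/" + "auth/login"

# Normalizing both sides yields the intended path.
fixed = join_url(api_url, "auth/login")
```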
0xWheatyz 7e66d0e7e0 Merge pull request 'deploy: security hardening, multi-model support, S3 storage, analytics, CI improvements (70 commits)' (#4) from leeworks-agents/SPARC:main into main
Reviewed-on: http://gitea.leeworks.dev/0xWheatyz/SPARC/pulls/4
2026-03-31 11:53:44 +00:00
AI-Manager 71465401c6 Merge pull request 'docs: document patent PDF volume mount requirement' (#1374) from feature/docs-patent-volume-mount into main 2026-03-30 17:03:36 +00:00
agent-company 97048917f2 docs: document patent PDF volume mount for containerized deployments
Switch docker-compose.yml from bind mount to a named volume (patent_data)
so downloaded PDFs survive container recreation. Add a "Patent PDF Storage"
section to DEPLOYMENT.md covering Docker Compose, Kubernetes PVC, and S3
alternatives.

Closes leeworks-agents/SPARC#1360

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:08:02 +00:00
AI-Manager 88abd9574b Merge pull request 'feat: theme-aware chart colors for dark/light mode' (#1348) from feature/1324-dark-mode-variants into main 2026-03-30 15:03:43 +00:00
agent-company e0ed39908e feat: add theme-aware chart colors for dark/light mode support
Replace hardcoded dark-theme hex colors in recharts components
(tooltips, axes) with a useChartTheme hook that reads the current
theme from ThemeContext. Charts now render correctly in both light
and dark mode.

Closes leeworks-agents/SPARC#1324

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 14:11:16 +00:00
AI-Manager 87e09b365b Merge pull request 'Add model allow-list validation to analysis endpoints' (#1015) from feature/1013-multi-model into main 2026-03-29 17:03:25 +00:00
agent-company 5d11f514c0 Add model allow-list validation to analysis endpoints
Reject unsupported LLM model identifiers with HTTP 400 on all analysis
endpoints (single, batch, async batch). The SUPPORTED_MODELS list was
already defined for the /models endpoint but not enforced on incoming
requests. This completes the multi-model support feature by adding the
missing server-side validation.

Closes leeworks-agents/SPARC#1013

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 16:13:29 +00:00
AI-Manager cbc8f449a1 Merge pull request 'Generate TypeScript API client from OpenAPI spec' (#443) from feature/426-generate-ts-api-client into main
Merge pull request #443: Generate TypeScript API client from OpenAPI spec

Closes leeworks-agents/SPARC#426
2026-03-27 20:42:17 +00:00
agent-company 44620614b6 feat: generate TypeScript API client from OpenAPI spec and add CI freshness check
Closes leeworks-agents/SPARC#426

- Generate schema.d.ts from committed openapi.json using openapi-typescript
- Rewrite types/index.ts to derive all application types from the generated schema
- Add CI step in both build.yaml and test.yaml to verify schema.d.ts stays in sync
- TypeScript compilation passes with zero errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 20:09:11 +00:00
AI-Manager c72a44aa56 Merge pull request 'feat: add model picker UI and wire model param through backend' (#353) from feature/351-frontend-model-picker into main 2026-03-27 16:45:05 +00:00
agent-company 6aa71eb17e merge: resolve Batch.tsx conflict between model picker and job history
Combine both useQuery hooks (modelsQuery for model selector, jobsQuery for
job history) and pass selectedModel to analyzeBatch while also triggering
jobsQuery.refetch() on successful submission.
2026-03-27 16:44:47 +00:00
AI-Manager fb52d08387 Merge pull request 'feat: add loading skeletons and error states to Batch page' (#352) from feature/343-batch-loading-states into main 2026-03-27 16:43:40 +00:00
agent-company 223d5f7e5d feat: add model picker to Analysis and Batch pages with full backend wiring
Thread the optional model parameter through the entire analysis pipeline:
- analyzer.py: analyze_company, _analyze_company_safe, analyze_companies,
  and analyze_single_patent now accept and forward model override
- api.py: single company endpoint accepts model query param; batch and
  async batch endpoints pass request.model through to the analyzer
- client.ts: analyzeCompany, analyzeBatch, analyzeBatchAsync accept model;
  add listModels() to fetch available models from GET /models
- Analysis.tsx: add model selector dropdown that loads from /models API
- Batch.tsx: add model selector alongside the workers slider

Users can now pick a specific LLM (GPT-4o, Claude 3.5, Gemini, etc.)
per analysis request, or leave it on the server default.

Closes leeworks-agents/SPARC#351

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:13:00 +00:00
agent-company 595516e330 feat: add loading skeletons, error states, and empty state to Batch page
Add a Job History section that loads past jobs via useQuery with:
- Animated skeleton placeholders while the job list is loading
- Error banner with retry button when the API call fails
- Empty state with helpful message when no jobs exist
- Job list cards with status badges and progress bars

Also improve the batch submission error state with a retry button
alongside the existing dismiss button.

Closes leeworks-agents/SPARC#343

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:08:49 +00:00
AI-Manager 514e274fdb Merge pull request 'CI: add tsc --noEmit TypeScript type checking to test job' (#269) from feature/260-tsc-ci into main 2026-03-27 11:07:02 +00:00
AI-Manager 3d2c0ea27d Merge pull request 'Docs: document MODEL, SERP_CACHE_TTL_HOURS, LOG_LEVEL in .env.example' (#270) from feature/env-example-updates into main 2026-03-27 11:06:57 +00:00
agent-company f611e3a30c Docs: add MODEL, SERP_CACHE_TTL_HOURS, and LOG_LEVEL to .env.example
These environment variables were already supported in config.py but
were not documented in .env.example, making them hard to discover.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 10:08:52 +00:00
agent-company 2bbf2d70bb CI: add tsc --noEmit TypeScript type checking to test job
Adds a step to install Node.js and run tsc --noEmit in the frontend
directory, catching TypeScript type errors before images are built.
Ruff was already present; this completes issue #260.

Closes leeworks-agents/SPARC#260

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 10:08:06 +00:00
AI-Manager f8ca1b80b1 Merge pull request 'feat: add PDF export for analysis reports' (#171) from feature/export-pdf into main 2026-03-27 05:04:55 +00:00
agent-company 338ac86086 feat: add PDF export for analysis reports
Add a new /export/{company_name}/pdf endpoint that generates a formatted
PDF report using reportlab, including a summary table and all analysis
results. Add the corresponding frontend Export PDF button alongside the
existing Export CSV button on the Analysis page.

Closes leeworks-agents/SPARC#85

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 02:03:53 +00:00
AI-Manager ce31a32322 Merge pull request 'feat: add multi-model support for per-analysis LLM selection' (#64) from feature/multi-model into main 2026-03-26 12:14:25 +00:00
agent-company 449055b026 merge: resolve multi-model conflicts with trends and export endpoints
Keeps model selection, analytics trends, and CSV export endpoints.
2026-03-26 12:14:15 +00:00
AI-Manager 70925fbf04 Merge pull request 'feat: add OpenAPI TypeScript client generation setup' (#63) from feature/openapi-client-gen into main 2026-03-26 12:13:19 +00:00
agent-company 9b2b2c75db merge: resolve openapi-client-gen conflicts with CI typecheck script
Keeps both generate scripts and typecheck script in package.json.
2026-03-26 12:13:08 +00:00
AI-Manager 730f455e2b Merge pull request 'feat: add patent trend charts to the Analytics page' (#62) from feature/trend-charts into main 2026-03-26 12:12:24 +00:00
agent-company 03f8f7fa79 merge: resolve trend-charts conflicts with export and tracked endpoints
Keeps both analytics/trends endpoint and export endpoint from main.
2026-03-26 12:12:09 +00:00
AI-Manager f0edc5a3ae Merge pull request 'feat: add side-by-side patent portfolio comparison view' (#61) from feature/compare-view into main 2026-03-26 12:11:01 +00:00
agent-company f64d1b745f merge: resolve compare-view conflicts with dark mode changes
Combines GitCompareArrows icon import with Sun/Moon and ThemeContext imports.
2026-03-26 12:10:37 +00:00
AI-Manager 513b682dad Merge pull request 'feat: add S3/MinIO object storage support for patent PDFs' (#58) from feature/s3-storage into main 2026-03-26 12:09:49 +00:00
agent-company a6c92fde9f merge: resolve conflicts for S3 storage branch with main
Integrates S3/MinIO storage backend with structured logging changes
from main. Both boto3 and apscheduler retained in requirements.txt.
2026-03-26 12:09:24 +00:00
AI-Manager a4db9439f5 Merge pull request 'feat: add webhook notification support for job completion' (#66) from feature/webhooks into main 2026-03-26 12:08:08 +00:00
AI-Manager bbea16387d Merge pull request 'feat: implement scheduled/recurring analysis with change alerting' (#65) from feature/scheduled-analysis into main 2026-03-26 12:07:46 +00:00
AI-Manager 4e2bcae18a Merge pull request 'feat: add CSV export for company analysis results' (#60) from feature/export-csv into main 2026-03-26 12:06:57 +00:00
AI-Manager b66b8332b6 Merge pull request 'feat: add dark/light mode toggle with localStorage persistence' (#57) from feature/dark-mode into main 2026-03-26 12:06:33 +00:00
AI-Manager c42bf5bf71 Merge pull request 'feat: add cursor-based pagination to /jobs endpoint' (#59) from feature/cursor-pagination into main 2026-03-26 12:06:04 +00:00
AI-Manager 02991b6648 Merge pull request 'feat: add loading skeletons and error retry to Batch and Analytics' (#56) from feature/loading-error-states into main 2026-03-26 12:05:41 +00:00
AI-Manager ab74904845 Merge pull request 'fix: auto-download patent PDF in analyze_single_patent' (#55) from feature/fix-single-patent-download into main 2026-03-26 12:05:10 +00:00
AI-Manager 92197440bf Merge pull request 'feat: add structured logging to serp_api.py' (#54) from feature/structured-logging into main 2026-03-26 12:04:59 +00:00
AI-Manager 301a773622 Merge pull request 'ci: add tsc --noEmit TypeScript type checking to CI pipeline' (#53) from feature/ci-tsc-lint into main 2026-03-26 12:04:39 +00:00
agent-company 2e6b8c7445 feat: add webhook notification support for job completion and alerts
Send HTTP POST notifications to configured webhook URLs when batch
jobs complete or when scheduled analysis detects significant changes.

- Add SPARC/webhooks.py with retry logic (3 attempts, exponential backoff)
- Support generic HTTP POST and Slack-compatible text payloads
- Integrate into batch job completion handler in api.py
- Configure via WEBHOOK_URLS env var (comma-separated)
- Payload includes event type, job ID, status, and summary

Closes leeworks-agents/SPARC#23

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:32:07 +00:00
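
The retry behaviour described in the commit above (3 attempts, exponential backoff) might look roughly like this. It is a sketch under assumptions, not the contents of `SPARC/webhooks.py`: `send_with_retry` is a hypothetical name, the 1-second base delay and the choice of `ConnectionError`/`TimeoutError` as retryable errors are guesses, and the transport is injected as a callable so the example needs no HTTP library.

```python
from __future__ import annotations

import time
from typing import Any, Callable


def send_with_retry(
    send: Callable[[dict[str, Any]], None],
    payload: dict[str, Any],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send(payload), retrying on network errors with exponential backoff.

    Returns True on the first success and False once all attempts have failed.
    """
    for attempt in range(attempts):
        try:
            send(payload)
            return True
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                break  # no point sleeping after the final attempt
            sleep(base_delay * 2**attempt)  # 1s, 2s, 4s, ...
    return False
```

Passing `sleep` as a parameter lets tests record the backoff delays instead of actually waiting.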
agent-company f33447eef8 feat: implement scheduled/recurring analysis with change alerting
Add APScheduler-based background task that periodically re-analyzes
tracked companies and alerts on significant patent count changes.

- Add tracked_companies and alerts tables to database schema
- Add SPARC/scheduler.py with configurable interval and threshold
- Add admin endpoints: GET/POST/DELETE /admin/tracked, GET /admin/alerts
- Scheduler starts at app startup; interval via SCHEDULE_INTERVAL_HOURS
- Change threshold configurable via CHANGE_THRESHOLD_PERCENT env var
- apscheduler is optional; graceful fallback if not installed

Closes leeworks-agents/SPARC#22

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:30:43 +00:00
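
The change-alerting rule in the commit above is not spelled out, but a percent-change check against `CHANGE_THRESHOLD_PERCENT` could be sketched as follows. The function name, the inclusive comparison, and the treatment of a zero baseline are all assumptions made for illustration.

```python
def is_significant_change(previous: int, current: int, threshold_percent: float) -> bool:
    """Return True when the patent count moved by at least threshold_percent."""
    if previous == 0:
        # Assumption: any patents appearing from a zero baseline count as significant.
        return current > 0
    change_percent = abs(current - previous) / previous * 100
    return change_percent >= threshold_percent
```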
agent-company 04f4d36307 feat: add multi-model support for per-analysis LLM selection
Allow users to choose the LLM model on a per-analysis basis. The
model field is optional in both single and batch analysis requests,
defaulting to the server-configured MODEL env var. The model used
is recorded in the analysis result and database.

- Add model parameter to LLMAnalyzer.analyze_patent_content and
  analyze_patent_portfolio
- Add model field to CompanyAnalysisResult and API response
- Add model field to BatchAnalysisRequest
- Add GET /models endpoint listing supported models and the default
- Store model in llm_messages metadata for attribution

Closes leeworks-agents/SPARC#37

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:28:25 +00:00
agent-company 7a364e6736 feat: add OpenAPI TypeScript client generation setup
Add openapi-typescript devDependency and npm scripts for generating
typed TypeScript schema from the FastAPI OpenAPI spec. Include a
static openapi.json snapshot for offline generation.

- npm run generate: fetch schema from running backend and generate types
- npm run generate:local: generate types from the bundled openapi.json

Closes leeworks-agents/SPARC#26

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:26:06 +00:00
agent-company 52972bbff0 feat: add patent trend charts to the Analytics page
Add GET /analytics/trends endpoint returning per-company analysis
counts by month and analysis type distribution over time. Render
these as a line chart (analyses per company) and stacked bar chart
(analysis types) on the Analytics page using recharts.

Closes leeworks-agents/SPARC#24

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:23:47 +00:00
agent-company c738f785c3 feat: add side-by-side patent portfolio comparison view
Add /compare route with two-panel layout for comparing company patent
portfolios. Each panel shows patent count, analysis timestamp, and
full LLM narrative. The page is responsive (stacks vertically on
mobile) and supports URL params (?a=nvidia&b=intel) for shareability.

Closes leeworks-agents/SPARC#21

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:22:14 +00:00
agent-company 1bd9dccdb8 feat: add CSV export for company analysis results
Add GET /export/{company_name} backend endpoint that returns analysis
records as a downloadable CSV file. Add Export CSV button to the
Analysis page that triggers the download via the API.

Closes leeworks-agents/SPARC#20

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:20:51 +00:00
agent-company 3b6411869d feat: add cursor-based pagination to /jobs endpoint
Add a cursor query parameter to GET /jobs and return a next_cursor
field in the response envelope. Existing clients using only limit
continue to work without modification. The cursor is an opaque token
encoding created_at and job_id for stable keyset pagination.

Closes leeworks-agents/SPARC#25

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:19:01 +00:00
agent-company 9a43f85259 feat: add S3/MinIO object storage support for patent PDFs
Introduce a StorageBackend abstraction (local filesystem and S3) for
patent PDF storage. When STORAGE_BACKEND=s3, PDFs are read/written via
boto3 to an S3-compatible bucket instead of the local filesystem.

- Add SPARC/storage.py with LocalStorageBackend and S3StorageBackend
- Update serp_api.py save_patents and parse_patent_pdf to use storage
- Add storage config vars to config.py and .env.example
- Add optional MinIO service to docker-compose.yml (--profile s3)
- Add boto3 to requirements.txt

Closes leeworks-agents/SPARC#38

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:17:24 +00:00
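
The storage abstraction described in the commit above could be shaped roughly like this. The class names and the `STORAGE_BACKEND` variable come from the commit; the method set (`read`, `write`, `exists`), the constructor argument, and the factory signature are assumptions, and the S3 branch is deliberately left out so the sketch does not depend on boto3.

```python
from __future__ import annotations

import os
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Minimal interface for reading and writing patent PDFs by key."""

    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def exists(self, key: str) -> bool: ...


class LocalStorageBackend(StorageBackend):
    """Stores each key as a file under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def _path(self, key: str) -> Path:
        return self.root / key

    def read(self, key: str) -> bytes:
        return self._path(key).read_bytes()

    def write(self, key: str, data: bytes) -> None:
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def exists(self, key: str) -> bool:
        return self._path(key).is_file()


def get_storage_backend(root: str = "patents") -> StorageBackend:
    """Pick a backend from STORAGE_BACKEND (case-insensitive, default "local")."""
    backend = os.environ.get("STORAGE_BACKEND", "local").lower()
    if backend == "local":
        return LocalStorageBackend(root)
    if backend == "s3":
        raise NotImplementedError("S3 backend is omitted from this sketch")
    raise ValueError(f"unknown STORAGE_BACKEND: {backend!r}")
```

Callers depend only on the `StorageBackend` interface, so switching between the local filesystem and an S3-compatible bucket becomes a configuration change.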
agent-company a4aa968434 feat: add dark/light mode toggle with localStorage persistence
- Enable Tailwind "class" dark mode strategy
- Use CSS custom properties for theme colors (bg, text, border)
- Add ThemeProvider context with toggle and localStorage persistence
- Add Sun/Moon toggle button in the header navigation
- Inline script in index.html prevents FOUC on page load
- All pages (Layout, Login, Register, ProtectedRoute) support both modes
- Default theme follows system preference (prefers-color-scheme)

Closes leeworks-agents/SPARC#33

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:15:11 +00:00
agent-company 153eb3b968 feat: improve loading and error states on Batch and Analytics pages
Analytics page now shows skeleton loaders (cards and chart placeholders)
while data loads, and displays a retry button when the API call fails.
Batch page error state now shows the actual error message and suggests
user action.

Closes leeworks-agents/SPARC#16

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:11:47 +00:00
agent-company ecc2c37bcd fix: auto-download patent PDF in analyze_single_patent before reading
When the PDF is not on disk, analyze_single_patent now looks up the
cached PDF link from the database and downloads it automatically.
If no link is cached, a clear FileNotFoundError is raised. Also adds
a GET /analyze/patent/{patent_id} API endpoint that exposes this
functionality and returns 404 when the PDF cannot be obtained.

Closes leeworks-agents/SPARC#36

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:08:34 +00:00
agent-company 0b4d712fc5 feat: add structured logging to serp_api.py
Add module-level logger to serp_api.py with INFO-level messages for
patent queries and PDF downloads, and DEBUG-level messages for cache
hits and parsing details. All three target files (analyzer.py,
serp_api.py, llm.py) now use structured logging with no print() calls.

Closes leeworks-agents/SPARC#46

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:07:07 +00:00
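Sketch for commit 0b4d712fc5: the message describes a module-level logger with INFO messages for queries and DEBUG messages for cache hits. A minimal illustration of that pattern follows. The function fetch_patents and its dict-based cache are hypothetical and do not reflect the actual code in serp_api.py.

```python
import logging

# One logger per module, named after the module, instead of print() calls.
logger = logging.getLogger(__name__)


def fetch_patents(company: str, cache: dict[str, list[str]]) -> list[str]:
    """Return patent IDs for a company, logging cache hits at DEBUG level."""
    if company in cache:
        logger.debug("Cache hit for %s", company)
        return cache[company]
    logger.info("Querying patents for %s", company)
    results = [f"{company}-patent-1"]  # placeholder for a real API call
    cache[company] = results
    return results
```

Passing the arguments separately (rather than formatting the string up front) lets the logging module skip the formatting work when the level is disabled.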
51 changed files with 7264 additions and 304 deletions
+33
@@ -35,8 +35,41 @@ JWT_SECRET=your-secure-jwt-secret-change-in-production
# Defaults to http://localhost:3000,http://localhost:5173 when unset
# CORS_ORIGINS=https://sparc.example.com,https://app.example.com
# ---- Storage ----
# Backend for patent PDF storage: "local" (default) or "s3"
STORAGE_BACKEND=local
# S3/MinIO settings (only used when STORAGE_BACKEND=s3)
# S3_BUCKET=sparc-patents
# S3_ENDPOINT_URL=http://localhost:9000
# AWS_ACCESS_KEY_ID=minioadmin
# AWS_SECRET_ACCESS_KEY=minioadmin
# To start MinIO locally: docker compose --profile s3 up -d minio
# ---- LLM ----
# LLM model to use via OpenRouter
# Supported: anthropic/claude-3.5-sonnet, openai/gpt-4o, openai/gpt-4o-mini,
# google/gemini-pro-1.5, meta-llama/llama-3.1-70b-instruct
# MODEL=anthropic/claude-3.5-sonnet
# ---- Cache ----
# When USE_CACHE=true: check database for cached responses before making API calls
# When USE_CACHE=false: always make fresh API calls (still stores results in database)
USE_CACHE=true
# SERP API cache TTL in hours (how long cached search results are considered fresh)
# SERP_CACHE_TTL_HOURS=24
# ---- Logging ----
# Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
# LOG_LEVEL=INFO
# ---- Webhooks ----
# Comma-separated list of webhook URLs for job completion and alert notifications
# Supports generic HTTP POST and Slack/Discord incoming webhooks
# WEBHOOK_URLS=https://hooks.slack.com/services/XXX,https://example.com/webhook
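The settings above could be read into a typed structure roughly as follows. This is a sketch only: the Settings dataclass and load_settings function are hypothetical names, the real config.py is not shown in this hunk, and treating an unset USE_CACHE as true is an assumption. The "local" default and the 24-hour TTL are taken from the comments in the example file.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    storage_backend: str
    use_cache: bool
    serp_cache_ttl_hours: int
    webhook_urls: list[str]


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings from an environment-like mapping such as os.environ."""
    backend = env.get("STORAGE_BACKEND", "local").strip().lower()
    if backend not in {"local", "s3"}:
        raise ValueError(f"STORAGE_BACKEND must be 'local' or 's3', got {backend!r}")
    # WEBHOOK_URLS is a comma-separated list; ignore blanks and stray commas.
    urls = [u.strip() for u in env.get("WEBHOOK_URLS", "").split(",") if u.strip()]
    return Settings(
        storage_backend=backend,
        use_cache=env.get("USE_CACHE", "true").strip().lower() == "true",
        serp_cache_ttl_hours=int(env.get("SERP_CACHE_TTL_HOURS", "24")),
        webhook_urls=urls,
    )
```

In an application this would typically be called once at startup as load_settings(os.environ), so that a misspelled backend name fails fast instead of surfacing later during a PDF write.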
+19 -4
@@ -15,7 +15,7 @@ jobs:
- name: Install system dependencies
shell: sh
run: |
apk add --no-cache git python3 py3-pip gcc musl-dev libpq-dev python3-dev
apt-get update && apt-get install -y git python3 python3-pip gcc libpq-dev python3-dev
- name: Checkout code
shell: sh
@@ -26,13 +26,27 @@ jobs:
- name: Install Python dependencies
shell: sh
run: |
pip3 install --break-system-packages -r requirements.txt ruff
pip3 install -r requirements.txt ruff
- name: Run ruff linter
shell: sh
run: |
ruff check SPARC/ tests/
- name: Install Node.js and check TypeScript types
shell: sh
run: |
apt-get install -y nodejs npm
cd frontend
npm ci
npm run generate:local
if ! git diff --quiet src/api/schema.d.ts; then
echo "ERROR: src/api/schema.d.ts is out of date. Run 'npm run generate:local' and commit the result."
git diff src/api/schema.d.ts
exit 1
fi
npx tsc --noEmit
- name: Run pytest
shell: sh
env:
@@ -42,6 +56,7 @@ jobs:
JWT_SECRET: "test-secret-for-ci"
APP_ENV: "development"
run: |
pip3 install pytest
python3 -m pytest tests/ -v --tb=short -x
build-api:
@@ -51,7 +66,7 @@ jobs:
- name: Install dependencies
shell: sh
run: |
apk add --no-cache git docker-cli
apt-get update && apt-get install -y git docker.io
- name: Checkout code
shell: sh
@@ -123,7 +138,7 @@ jobs:
- name: Install dependencies
shell: sh
run: |
apk add --no-cache git docker-cli
apt-get update && apt-get install -y git docker.io
- name: Checkout code
shell: sh
+13 -3
@@ -16,7 +16,7 @@ jobs:
- name: Install system dependencies
shell: sh
run: |
apk add --no-cache git python3 py3-pip gcc musl-dev libpq-dev python3-dev
apt-get update && apt-get install -y git python3 python3-pip gcc libpq-dev python3-dev
- name: Checkout code
shell: sh
@@ -27,7 +27,7 @@ jobs:
- name: Install Python dependencies
shell: sh
run: |
pip3 install --break-system-packages -r requirements.txt ruff
pip3 install -r requirements.txt ruff
- name: Run ruff linter
shell: sh
@@ -37,9 +37,19 @@ jobs:
- name: Install Node.js and frontend dependencies
shell: sh
run: |
apk add --no-cache nodejs npm
apt-get install -y nodejs npm
cd frontend && npm ci
- name: Verify generated API types are up to date
shell: sh
run: |
cd frontend && npm run generate:local
if ! git diff --quiet src/api/schema.d.ts; then
echo "ERROR: src/api/schema.d.ts is out of date. Run 'npm run generate:local' and commit the result."
git diff src/api/schema.d.ts
exit 1
fi
- name: Run TypeScript type check
shell: sh
run: |
+120 -85
@@ -7,86 +7,124 @@ Semiconductor Patent & Analytics Report Core -- development priorities.
SPARC is a patent analysis platform with a working end-to-end pipeline:
Python/FastAPI backend, React/TypeScript frontend, PostgreSQL for persistence
and caching, Docker Compose for local development, and Gitea Actions CI/CD for
image builds. Core features (patent retrieval via SerpAPI, PDF parsing, LLM
analysis via OpenRouter/Claude, batch processing, JWT authentication, analytics
dashboard) are all implemented and functional.
image builds and testing. Core features include patent retrieval via SerpAPI,
PDF parsing, LLM analysis via OpenRouter (multi-model: Claude, GPT-4o, Gemini,
Llama), batch processing, JWT authentication, analytics dashboard with patent
trend charts, scheduled recurring analysis with alerting, webhook notifications
(Slack/Discord), CSV and PDF export, S3/MinIO storage, side-by-side company
comparison, and dark mode.
---
## Completed
Items that have been implemented and merged into main.
### Security hardening
- ~~Rotate default JWT secret.~~ Startup check refuses to start with the
default secret in non-development environments.
- ~~CORS allow-origins are hardcoded.~~ Allowed origins are now configurable
via environment variable.
- ~~Database credentials in docker-compose.yml.~~ Compose references `.env`
for sensitive values.
### Error handling and resilience
- ~~`get_db_client()` creates a new `DatabaseClient` on every call.~~ Refactored
to a shared pooled singleton initialized at startup.
- ~~No rate limiting on auth endpoints.~~ Rate limiting middleware added to
`/auth/login` and `/auth/register`.
### Test coverage
- ~~API tests bypass authentication.~~ JWT auth integration tests added (33
cases covering registration, login, protected routes, token refresh, and
admin-only endpoints).
- ~~No test stage in CI.~~ Gitea Actions workflow now runs `pytest` and gates
the build.
- ~~No linting or type checking in CI.~~ `ruff` (Python) and `tsc --noEmit`
(TypeScript) added to CI pipeline.
### Backend
- ~~Add structured logging.~~ Python `logging` module used throughout.
- ~~Make LLM model configurable.~~ `MODEL` environment variable accepted;
multi-model support with per-analysis selection (GPT-4o, Gemini, Claude,
Llama).
- ~~SERP cache TTL hardcoded.~~ `SERP_CACHE_TTL_HOURS` exposed as env var.
- ~~Patent PDF storage.~~ S3/MinIO object storage backend added alongside
local filesystem. Volume mount requirement documented.
- ~~`analyze_single_patent` assumes local file.~~ Auto-download from cached
metadata link integrated.
- ~~`Patent.patent_id` typed as `int`.~~ Fixed to `str`.
### Frontend
- ~~No loading/error states.~~ Skeleton loaders and error states added to
Batch and Analytics pages.
- ~~No dark mode.~~ Full dark mode support with theme-aware chart colors.
- ~~Missing lockfile.~~ `package-lock.json` committed.
### Features (formerly P3)
- ~~Export analysis reports.~~ CSV and PDF export endpoints implemented.
- ~~Comparison view.~~ Side-by-side company patent portfolio comparison added.
- ~~Scheduled/recurring analysis.~~ APScheduler-based periodic re-analysis
with configurable interval and change-threshold alerting.
- ~~Webhook/notification support.~~ Slack, Discord, and generic HTTP POST
webhooks with retry logic.
- ~~Multi-model support.~~ Model picker in Analysis and Batch pages; backend
allow-list validation.
- ~~Patent trend charts.~~ Filing frequency and category distribution
visualizations added to Analytics page.
- ~~OpenAPI client generation.~~ TypeScript API client auto-generated from
FastAPI spec with CI freshness check.
### Resilience
- ~~`_jobs` dict is in-memory only.~~ Database-backed job persistence
implemented using `db.list_jobs()` and `mark_stale_jobs_failed()`. The
in-memory `_jobs` dict has been removed.
### Test coverage (P1/P2)
- ~~Export endpoint tests.~~ Tests added for CSV and PDF export endpoints.
- ~~Tracked company admin endpoint tests.~~ Tests added for `/admin/tracked`
CRUD endpoints and scheduler integration.
- ~~Webhook integration tests.~~ Tests added for retry logic, Slack/Discord
payload format, and multi-URL dispatch.
- ~~S3/MinIO storage backend tests.~~ Unit tests added for the S3 backend
(read, write, exists, delete, error handling).
- ~~`analyze_single_patent` auto-download path tests.~~ Tests added for the
auto-download fallback (cache lookup, PDF download, FileNotFoundError).
### Code quality
- ~~Scheduler creates its own DatabaseClient.~~ Refactored to use the
application-level pooled `get_db_client()`.
---
## P1 -- High Priority
These items address correctness, security, and reliability gaps that should be
resolved before broader production use.
### Security hardening
- **Rotate default JWT secret.** `auth.py` ships a fallback
`sparc-secret-key-change-in-production` that will be used if `JWT_SECRET` is
unset. Add a startup check that refuses to start with the default secret in
non-development environments.
- **CORS allow-origins are hardcoded.** `api.py` only permits
`localhost:3000` and `localhost:5173`. Make the allowed origins configurable
via environment variable so the dashboard works when deployed behind a real
domain.
- **Database credentials in docker-compose.yml.** The compose file embeds
`postgres:postgres` in plain text. Reference a `.env` file or Docker secrets
instead.
### Error handling and resilience
- **`get_db_client()` in `auth.py` creates a new `DatabaseClient` on every
call.** This bypasses the connection pool and can exhaust database
connections under load. Refactor to share a single pooled client.
- **`_jobs` dict is in-memory only.** Job state is lost on API restart. Persist
job status in PostgreSQL or Redis so async batch results survive restarts.
- **No rate limiting on auth endpoints.** `/auth/login` and `/auth/register`
are unprotected against brute-force or abuse. Add rate limiting middleware.
### Test coverage for auth and admin
- The existing API tests (`tests/test_api.py`) bypass authentication entirely.
Add tests that exercise the JWT flow: registration, login, protected-route
access, token refresh, and admin-only endpoints.
No outstanding P1 items. All previously listed items have been completed and
moved to the Completed section above.
---
## P2 -- Medium Priority
Improvements to usability, performance, and developer experience.
Improvements to the API surface.
### Backend
### API improvements
- **Add structured logging.** Replace `print()` calls throughout `analyzer.py`,
`serp_api.py`, and `llm.py` with Python `logging` so log levels and
formatting are consistent.
- **Make LLM model configurable.** `llm.py` hardcodes
`anthropic/claude-3.5-sonnet`. Accept a `MODEL` environment variable to allow
switching models without code changes.
- **SERP cache TTL is hardcoded to 24 hours.** Expose `SERP_CACHE_TTL_HOURS`
as an environment variable in `config.py`.
- **Patent PDF storage.** PDFs are saved to a local `patents/` directory. For
containerized deployments, consider object storage (S3/MinIO) or at minimum
document the volume mount requirement more prominently.
- **`analyze_single_patent` assumes local file path.** The method constructs
`patents/{patent_id}.pdf` and reads from disk, but does not download the PDF
first. Either integrate the download step or document the prerequisite.
- **`Patent.patent_id` typed as `int` in `types.py` but used as `str`
everywhere.** Fix the type annotation to `str`.
### Frontend
- **No loading/error states on several pages.** The Batch and Analytics pages
would benefit from skeleton loaders and user-friendly error messages.
- **No dark mode.** Tailwind is configured but no dark variant is applied.
- **Missing `package-lock.json` or `pnpm-lock.yaml`.** The frontend has no
lockfile committed, leading to non-reproducible builds.
### CI/CD
- **No test stage in the Gitea Actions workflow.** `build.yaml` builds and
pushes images but never runs `pytest`. Add a test job that gates the build.
- **No linting or type checking.** Add `ruff` (Python) and `tsc --noEmit`
(TypeScript) to CI.
- **API pagination.** The `/analyze/batch` endpoint needs cursor-based
pagination for large result sets. The `/jobs` endpoint already has cursor
pagination. *(Issue #1669)*
- **Request validation improvements.** Add stricter input validation for
company names (disallow special characters, enforce length limits).
*(Issue #1670)*
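
As a rough illustration of the company-name rule, the check below uses only the standard library. The pattern and the 2-100 character limits mirror the `CompanyName` constraint in the `api.py` diff of this comparison; the helper name `is_valid_company_name` is hypothetical, and the API itself enforces the rule through Pydantic rather than through a function like this.

```python
import re

# First character must be alphanumeric; the rest may also contain
# spaces, hyphens, ampersands, and periods.
COMPANY_NAME_RE = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9 \-&.]*$")


def is_valid_company_name(name: str) -> bool:
    """Return True if name is 2-100 characters long and matches the pattern."""
    return 2 <= len(name) <= 100 and COMPANY_NAME_RE.fullmatch(name) is not None
```

Names such as `AT&T` or `Intel Corp.` pass, while a single character, a leading hyphen, or punctuation such as `;` is rejected.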
---
@@ -94,23 +132,20 @@ Improvements to usability, performance, and developer experience.
Lower-urgency enhancements and future features.
- **Export analysis reports.** Allow users to download analysis results as PDF
or CSV from the dashboard.
- **Comparison view.** Side-by-side comparison of two companies' patent
portfolios.
- **Scheduled/recurring analysis.** Periodically re-analyze tracked companies
and alert on significant changes.
- **Webhook/notification support.** Send alerts (Slack, Discord, email) when
batch jobs complete or when a company's innovation score changes
significantly.
- **Multi-model support.** Let users choose between LLM providers per analysis
(e.g., GPT-4o, Gemini, Claude) and compare outputs.
- **Patent trend charts.** Visualize patent filing frequency and technology
category distribution over time in the Analytics page.
- **API pagination.** The `/analyze/batch` and `/jobs` endpoints could benefit
from cursor-based pagination for large result sets.
- **OpenAPI client generation.** Auto-generate the TypeScript API client from
the FastAPI OpenAPI spec to keep frontend types in sync.
- **Historical analysis diffing.** Show what changed between two analysis runs
for the same company, highlighting new patents and score shifts.
- **Patent classification tagging.** Automatically tag patents by technology
domain (AI, semiconductors, materials science) using LLM classification.
- **User-level API keys.** Allow users to generate personal API keys for
programmatic access without JWT token refresh.
- **Batch export.** Export analysis results for multiple companies at once as
a ZIP archive.
- **Rate limiting dashboard.** Surface rate limit status and usage statistics
in the admin panel.
- **Async webhook delivery.** Move webhook delivery to a background task queue
(e.g., Celery, arq) to avoid blocking the scheduler.
- **Multi-tenant support.** Scope analysis results and tracked companies per
user or organization.
---
+35 -20
@@ -10,13 +10,13 @@ from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable
from SPARC import config
logger = logging.getLogger(__name__)
from SPARC.database import DatabaseClient
from SPARC.llm import LLMAnalyzer
from SPARC.serp_api import SERP
from SPARC.types import BatchAnalysisResult, CompanyAnalysisResult, Patent, Patents
logger = logging.getLogger(__name__)
class CompanyAnalyzer:
"""Orchestrates end-to-end company performance analysis via patents."""
@@ -33,7 +33,7 @@ class CompanyAnalyzer:
self.db.connect()
self.db.initialize_schema()
def analyze_company(self, company_name: str, patents: "Patents | None" = None) -> str:
def analyze_company(self, company_name: str, patents: "Patents | None" = None, model: str | None = None) -> str:
"""Analyze a company's performance based on their patent portfolio.
This is the main entry point that orchestrates the full pipeline:
@@ -46,6 +46,7 @@ class CompanyAnalyzer:
Args:
company_name: Name of the company to analyze
patents: Optional pre-fetched Patents result to avoid duplicate API calls
model: Optional LLM model override (e.g. 'openai/gpt-4o')
Returns:
Comprehensive analysis of company's innovation and performance outlook
@@ -100,30 +101,29 @@ class CompanyAnalyzer:
# Analyze the full portfolio with LLM
analysis = self.llm_analyzer.analyze_patent_portfolio(
patents_data=processed_patents, company_name=company_name
patents_data=processed_patents, company_name=company_name, model=model
)
return analysis
def analyze_single_patent(self, patent_id: str, company_name: str) -> str:
def analyze_single_patent(self, patent_id: str, company_name: str, model: str | None = None) -> str:
"""Analyze a single patent by ID.
Prerequisite:
The patent PDF must already exist at ``patents/{patent_id}.pdf``
before calling this method. PDFs are downloaded automatically when
using the batch analysis pipeline (``analyze_company`` or the
``/analyze/batch`` API endpoint). For standalone usage, download
the PDF manually or call ``SERP.save_patents()`` first.
If the patent PDF is not already on disk, this method attempts to
download it automatically by looking up the PDF link in the database
cache. If the link is not cached either, a ``FileNotFoundError`` is
raised with instructions on how to obtain the PDF.
Args:
patent_id: Publication ID of the patent (e.g. "US-11234567-B2")
company_name: Name of the company (for context)
model: Optional LLM model override (e.g. 'openai/gpt-4o')
Returns:
Analysis of the specific patent's innovation quality
Raises:
FileNotFoundError: If the patent PDF is not found at the expected path.
FileNotFoundError: If the patent PDF cannot be found or downloaded.
"""
import os
logger.info("Analyzing patent %s for %s...", patent_id, company_name)
@@ -131,17 +131,29 @@ class CompanyAnalyzer:
patent_path = f"patents/{patent_id}.pdf"
if not os.path.exists(patent_path):
raise FileNotFoundError(
f"Patent PDF not found at '{patent_path}'. "
f"Download the PDF first using SERP.save_patents() or the batch analysis pipeline."
)
# Attempt to download the PDF automatically from cached metadata
cached = self.db.get_cached_patent(patent_id)
pdf_link = cached.get("pdf_link") if cached else None
if pdf_link:
logger.info("PDF not on disk; downloading %s from cached link", patent_id)
patent = SERP.save_patents(
Patent(patent_id=patent_id, pdf_link=pdf_link)
)
patent_path = patent.pdf_path
else:
raise FileNotFoundError(
f"Patent PDF not found at '{patent_path}' and no download link is "
f"cached for '{patent_id}'. Run a company analysis first to populate "
f"the cache, or call SERP.save_patents() with the patent's PDF link."
)
try:
sections = SERP.parse_patent_pdf(patent_path)
minimized_content = SERP.minimize_patent_for_llm(sections)
analysis = self.llm_analyzer.analyze_patent_content(
patent_content=minimized_content, company_name=company_name
patent_content=minimized_content, company_name=company_name, model=model
)
return analysis
@@ -191,18 +203,19 @@ class CompanyAnalyzer:
logger.warning("Failed to process %s: %s", patent.patent_id, e)
return None
def _analyze_company_safe(self, company_name: str) -> CompanyAnalysisResult:
def _analyze_company_safe(self, company_name: str, model: str | None = None) -> CompanyAnalysisResult:
"""Internal wrapper that catches exceptions and returns structured result.
Args:
company_name: Name of the company to analyze
model: Optional LLM model override (e.g. 'openai/gpt-4o')
Returns:
CompanyAnalysisResult with success/failure status
"""
try:
# Delegate to analyze_company which handles SERP/patent caching
analysis = self.analyze_company(company_name)
analysis = self.analyze_company(company_name, model=model)
# Determine patent count from cached SERP query
query_hash = hashlib.sha256(company_name.lower().encode()).hexdigest()
@@ -242,6 +255,7 @@ class CompanyAnalyzer:
companies: list[str],
max_workers: int = 3,
progress_callback: Callable[[str, int, int], None] | None = None,
model: str | None = None,
) -> BatchAnalysisResult:
"""Analyze multiple companies' patent portfolios in batch.
@@ -252,6 +266,7 @@ class CompanyAnalyzer:
companies: List of company names to analyze
max_workers: Maximum concurrent analyses (default 3 to avoid rate limits)
progress_callback: Optional callback(company_name, completed, total)
model: Optional LLM model override (e.g. 'openai/gpt-4o')
Returns:
BatchAnalysisResult containing all individual results and summary stats
@@ -263,7 +278,7 @@ class CompanyAnalyzer:
with ThreadPoolExecutor(max_workers=max_workers) as executor:
future_to_company = {
executor.submit(self._analyze_company_safe, company): company
executor.submit(self._analyze_company_safe, company, model): company
for company in companies
}
+689 -17
@@ -3,14 +3,20 @@
Provides REST API endpoints for analyzing company patent portfolios.
"""
from contextlib import asynccontextmanager
from datetime import datetime
from typing import Annotated, List
from __future__ import annotations
from fastapi import BackgroundTasks, Depends, FastAPI, HTTPException, Query, Request
from collections import deque
from contextlib import asynccontextmanager
from datetime import datetime, timedelta, timezone
from typing import TYPE_CHECKING, Annotated, List
if TYPE_CHECKING:
from SPARC.database import DatabaseClient
from fastapi import BackgroundTasks, Depends, FastAPI, HTTPException, Path, Query, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from pydantic import BaseModel, EmailStr, Field
from fastapi.responses import JSONResponse, StreamingResponse
from pydantic import BaseModel, EmailStr, Field, StringConstraints
from slowapi import Limiter
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address
@@ -31,6 +37,16 @@ from SPARC.auth import (
)
from SPARC.types import BatchAnalysisResult, CompanyAnalysisResult
# Validated company name type: 2-100 chars; must start with an alphanumeric
# character, followed by alphanumerics, spaces, hyphens, ampersands, or periods.
CompanyName = Annotated[
str,
StringConstraints(
min_length=2,
max_length=100,
pattern=r"^[a-zA-Z0-9][a-zA-Z0-9 \-&.]*$",
),
]
# Pydantic models for API
class CompanyAnalysisResponse(BaseModel):
@@ -41,6 +57,7 @@ class CompanyAnalysisResponse(BaseModel):
patent_count: int
success: bool
error: str | None = None
model: str | None = None
timestamp: datetime
@@ -54,15 +71,28 @@ class BatchAnalysisResponse(BaseModel):
timestamp: datetime
class CompanyAnalysisRequest(BaseModel):
"""Request model for single company analysis with optional model selection."""
model: str | None = Field(
default=None,
description="LLM model to use (e.g. 'anthropic/claude-3.5-sonnet', 'openai/gpt-4o'). Defaults to server config.",
)
class BatchAnalysisRequest(BaseModel):
"""Request model for batch company analysis."""
companies: list[str] = Field(
companies: list[CompanyName] = Field(
..., min_length=1, max_length=20, description="List of company names to analyze"
)
max_workers: int = Field(
default=3, ge=1, le=5, description="Max concurrent analyses"
)
model: str | None = Field(
default=None,
description="LLM model to use for all analyses in this batch. Defaults to server config.",
)
class JobStatus(BaseModel):
@@ -77,6 +107,31 @@ class JobStatus(BaseModel):
error: str | None = None
class AnalysisRecord(BaseModel):
"""A single stored analysis result."""
id: int
company_name: str | None = None
analysis_type: str | None = None
model: str | None = None
response: str | None = None
timestamp: datetime | None = None
class PaginatedAnalysisResponse(BaseModel):
"""Paginated response for analysis result listings."""
items: list[AnalysisRecord]
next_cursor: str | None = None
class PaginatedJobsResponse(BaseModel):
"""Paginated response for job listings."""
items: list["JobStatus"]
next_cursor: str | None = None
class HealthResponse(BaseModel):
"""Health check response."""
@@ -133,6 +188,7 @@ def _convert_result(result: CompanyAnalysisResult) -> CompanyAnalysisResponse:
patent_count=result.patent_count,
success=result.success,
error=result.error,
model=result.model,
timestamp=result.timestamp,
)
@@ -169,6 +225,9 @@ async def lifespan(app: FastAPI):
import logging
logging.getLogger(__name__).warning("Marked %d stale jobs as failed on startup", stale)
_db.close()
# Start scheduled analysis if tracked companies are configured
from SPARC.scheduler import start_scheduler
start_scheduler()
yield
# Cleanup
_analyzer = None
@@ -187,10 +246,45 @@ app = FastAPI(
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
# In-memory rate limit statistics
_rate_limit_stats: dict[str, dict] = {}
# Time-series log of rejected requests. Capped at 100,000 entries (oldest are
# dropped first); the 24-hour window is applied when the log is read.
_rejected_log: deque[dict] = deque(maxlen=100_000)
def _track_rate_limit_request(endpoint: str, ip: str, rejected: bool = False) -> None:
"""Record a request against a rate-limited endpoint."""
key = endpoint
if key not in _rate_limit_stats:
_rate_limit_stats[key] = {
"endpoint": endpoint,
"total_requests": 0,
"rejected_requests": 0,
"by_ip": {},
}
_rate_limit_stats[key]["total_requests"] += 1
if rejected:
_rate_limit_stats[key]["rejected_requests"] += 1
_rejected_log.append({
"endpoint": endpoint,
"ip": ip,
"timestamp": datetime.now(timezone.utc).isoformat(),
})
ip_stats = _rate_limit_stats[key].setdefault("by_ip", {})
if ip not in ip_stats:
ip_stats[ip] = {"total": 0, "rejected": 0}
ip_stats[ip]["total"] += 1
if rejected:
ip_stats[ip]["rejected"] += 1
@app.exception_handler(RateLimitExceeded)
async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
"""Return 429 with Retry-After header when rate limit is exceeded."""
endpoint = request.url.path
ip = get_remote_address(request)
_track_rate_limit_request(endpoint, ip, rejected=True)
retry_after = getattr(exc, "retry_after", 60)
return JSONResponse(
status_code=429,
@@ -219,6 +313,7 @@ async def register(request: Request, body: RegisterRequest):
The first registered user automatically becomes an admin.
"""
_track_rate_limit_request("/auth/register", get_remote_address(request))
db = get_db_client()
# First user becomes admin
@@ -249,6 +344,7 @@ async def register(request: Request, body: RegisterRequest):
@limiter.limit("10/minute")
async def login(request: Request, body: LoginRequest):
"""Authenticate user and return JWT tokens."""
_track_rate_limit_request("/auth/login", get_remote_address(request))
db = get_db_client()
user = db.authenticate_user(body.email, body.password)
@@ -369,6 +465,123 @@ async def delete_user(
return {"message": "User deleted"}
# ============== Tracked Companies Endpoints ==============
class TrackCompanyRequest(BaseModel):
"""Request to add a company to tracking."""
company_name: CompanyName = Field(...)
@app.get("/admin/tracked", tags=["Admin"])
async def list_tracked_companies(
_: UserResponse = Depends(get_current_admin),
):
"""List all tracked companies (admin only)."""
db = get_db_client()
return db.list_tracked_companies()
@app.post("/admin/tracked", tags=["Admin"])
async def add_tracked_company(
request: TrackCompanyRequest,
_: UserResponse = Depends(get_current_admin),
):
"""Add a company to the tracked list (admin only)."""
db = get_db_client()
result = db.add_tracked_company(request.company_name)
if not result:
raise HTTPException(status_code=409, detail="Company already tracked")
return result
@app.delete("/admin/tracked/{company_name}", tags=["Admin"])
async def remove_tracked_company(
company_name: Annotated[str, Path(min_length=2, max_length=100, pattern=r"^[a-zA-Z0-9][a-zA-Z0-9 \-&.]*$")],
_: UserResponse = Depends(get_current_admin),
):
"""Remove a company from the tracked list (admin only)."""
db = get_db_client()
removed = db.remove_tracked_company(company_name)
if not removed:
raise HTTPException(status_code=404, detail="Company not found in tracking list")
return {"message": f"Stopped tracking {company_name}"}
@app.get("/admin/rate-limits", tags=["Admin"])
async def get_rate_limit_stats(
_: UserResponse = Depends(get_current_admin),
):
"""Get rate limit status and usage statistics (admin only).
Returns current rate limit configuration and request statistics
for all rate-limited endpoints, including per-IP breakdown and
a time-series of throttled (rejected) requests in the last 24 hours.
Returns:
Rate limit stats per endpoint, per-IP breakdown, and throttled
request history bucketed by hour.
"""
rate_limits_config = {
"/auth/register": {"limit": "5/minute"},
"/auth/login": {"limit": "10/minute"},
}
results = []
for endpoint, conf in rate_limits_config.items():
stats = _rate_limit_stats.get(endpoint, {})
by_ip_raw = stats.get("by_ip", {})
by_ip = [
{"ip": ip, "total": counts["total"], "rejected": counts["rejected"]}
for ip, counts in by_ip_raw.items()
]
results.append({
"endpoint": endpoint,
"limit": conf["limit"],
"total_requests": stats.get("total_requests", 0),
"rejected_requests": stats.get("rejected_requests", 0),
"by_ip": by_ip,
})
# Build hourly buckets of throttled requests for the last 24 hours
now = datetime.now(timezone.utc)
cutoff = now - timedelta(hours=24)
hourly_buckets: dict[str, int] = {}
throttled_24h = 0
for entry in _rejected_log:
ts_str = entry["timestamp"]
try:
ts = datetime.fromisoformat(ts_str)
except (ValueError, TypeError):
continue
if ts >= cutoff:
throttled_24h += 1
bucket = ts.strftime("%Y-%m-%dT%H:00:00Z")
hourly_buckets[bucket] = hourly_buckets.get(bucket, 0) + 1
throttled_over_time = [
{"timestamp": k, "count": v}
for k, v in sorted(hourly_buckets.items())
]
return {
"rate_limits": results,
"throttled_24h": throttled_24h,
"throttled_over_time": throttled_over_time,
}
@app.get("/admin/alerts", tags=["Admin"])
async def list_alerts(
limit: int = Query(default=50, ge=1, le=200),
_: UserResponse = Depends(get_current_admin),
):
"""List recent alerts from scheduled analysis (admin only)."""
db = get_db_client()
return db.list_alerts(limit=limit)
# ============== Analytics Endpoint ==============
@@ -389,6 +602,330 @@ async def get_analytics(
)
# ============== Model Selection Endpoints ==============
# Supported models via OpenRouter
SUPPORTED_MODELS = [
{"id": "anthropic/claude-3.5-sonnet", "name": "Claude 3.5 Sonnet", "provider": "Anthropic"},
{"id": "openai/gpt-4o", "name": "GPT-4o", "provider": "OpenAI"},
{"id": "openai/gpt-4o-mini", "name": "GPT-4o Mini", "provider": "OpenAI"},
{"id": "google/gemini-pro-1.5", "name": "Gemini Pro 1.5", "provider": "Google"},
{"id": "meta-llama/llama-3.1-70b-instruct", "name": "Llama 3.1 70B", "provider": "Meta"},
]
_SUPPORTED_MODEL_IDS = {m["id"] for m in SUPPORTED_MODELS}
def _validate_model(model: str | None) -> None:
"""Raise HTTP 400 if *model* is not in the supported allow-list."""
if model is not None and model not in _SUPPORTED_MODEL_IDS:
raise HTTPException(
status_code=400,
detail=(
f"Unsupported model '{model}'. "
f"Supported models: {', '.join(sorted(_SUPPORTED_MODEL_IDS))}"
),
)
@app.get("/models", tags=["System"])
async def list_models():
"""List supported LLM models for analysis.
Returns the available models that can be passed as the `model` field
in analysis requests. The default model is determined by the `MODEL`
environment variable on the server.
"""
return {
"models": SUPPORTED_MODELS,
"default": config.model,
}
@app.get("/analytics/trends", tags=["Analytics"])
async def get_analytics_trends(
days: int = Query(default=90, ge=7, le=365),
_: UserResponse = Depends(get_current_user),
):
"""Get trend data for patent analysis over time.
Returns two datasets:
- ``by_month``: analysis count per company per month
- ``by_type_over_time``: analysis type distribution per month
Args:
days: Number of days to look back (default 90)
Returns:
Trend data suitable for time-series and distribution charts
"""
db = get_db_client()
with db.get_conn() as conn:
with conn.cursor() as cur:
# Analyses per company per month
cur.execute(
"""
SELECT
TO_CHAR(timestamp, 'YYYY-MM') AS month,
company_name,
COUNT(*) AS count
FROM llm_messages
WHERE timestamp >= NOW() - %s * INTERVAL '1 day'
AND is_cached = FALSE
AND company_name IS NOT NULL
GROUP BY month, company_name
ORDER BY month
""",
(days,),
)
by_month_rows = cur.fetchall()
# Analysis type distribution per month
cur.execute(
"""
SELECT
TO_CHAR(timestamp, 'YYYY-MM') AS month,
analysis_type,
COUNT(*) AS count
FROM llm_messages
WHERE timestamp >= NOW() - %s * INTERVAL '1 day'
AND is_cached = FALSE
GROUP BY month, analysis_type
ORDER BY month
""",
(days,),
)
by_type_rows = cur.fetchall()
by_month = [
{"month": row[0], "company_name": row[1], "count": row[2]}
for row in by_month_rows
]
by_type_over_time = [
{"month": row[0], "analysis_type": row[1], "count": row[2]}
for row in by_type_rows
]
return {
"by_month": by_month,
"by_type_over_time": by_type_over_time,
"period_days": days,
}
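A chart consumer will usually want the flat `by_month` rows grouped into one series per company. The sketch below shows one way to pivot them; `pivot_by_month` and the sample rows are hypothetical and only mirror the row shape returned above.

```python
from collections import defaultdict


def pivot_by_month(rows: list[dict]) -> dict[str, dict[str, int]]:
    """Group {"month", "company_name", "count"} rows into company -> month -> count."""
    series: dict[str, dict[str, int]] = defaultdict(dict)
    for row in rows:
        series[row["company_name"]][row["month"]] = row["count"]
    return dict(series)


# Illustrative data only; real rows come from the endpoint response.
sample_rows = [
    {"month": "2026-03", "company_name": "nvidia", "count": 4},
    {"month": "2026-04", "company_name": "nvidia", "count": 7},
    {"month": "2026-04", "company_name": "intel", "count": 2},
]
pivoted = pivot_by_month(sample_rows)
```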
# ============== Export Endpoints ==============
@app.get("/export/{company_name}", tags=["Export"])
async def export_company_csv(
company_name: Annotated[str, Path(min_length=2, max_length=100, pattern=r"^[a-zA-Z0-9][a-zA-Z0-9 \-&.]*$")],
_: UserResponse = Depends(get_current_user),
):
"""Export analysis results for a company as a CSV file.
Returns all stored analysis records for the given company, including
analysis type, model used, response text, and timestamp.
Args:
company_name: Company name to export results for
Returns:
CSV file download
"""
import csv
import io
db = get_db_client()
# Query all non-cached analysis results for this company
with db.get_conn() as conn:
with conn.cursor() as cur:
cur.execute(
"""
SELECT company_name, analysis_type, model, response, timestamp
FROM llm_messages
WHERE LOWER(company_name) = LOWER(%s) AND is_cached = FALSE
ORDER BY timestamp DESC
""",
(company_name,),
)
rows = cur.fetchall()
if not rows:
raise HTTPException(status_code=404, detail=f"No analysis results found for '{company_name}'")
output = io.StringIO()
writer = csv.writer(output)
writer.writerow(["company_name", "analysis_type", "model", "analysis", "timestamp"])
for row in rows:
writer.writerow(row)
output.seek(0)
safe_name = company_name.replace(" ", "_").lower()
return StreamingResponse(
iter([output.getvalue()]),
media_type="text/csv",
headers={"Content-Disposition": f'attachment; filename="sparc_{safe_name}_export.csv"'},
)
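The CSV export relies on `csv.writer` to quote fields that contain commas or newlines, which matters because LLM responses are free text. A small sketch of that behaviour, using a hypothetical `rows_to_csv` helper and made-up rows:

```python
import csv
import io


def rows_to_csv(header: list[str], rows: list[tuple]) -> str:
    """Serialize rows to CSV text in memory; csv.writer handles quoting."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative row with an embedded newline and comma in the free-text field.
header = ["company_name", "analysis"]
rows = [("nvidia", "line one\nline two, with comma")]
csv_text = rows_to_csv(header, rows)

# Reading the text back recovers the original fields unchanged.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
```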
@app.get("/export/{company_name}/pdf", tags=["Export"])
async def export_company_pdf(
company_name: Annotated[str, Path(min_length=2, max_length=100, pattern=r"^[a-zA-Z0-9][a-zA-Z0-9 \-&.]*$")],
_: UserResponse = Depends(get_current_user),
):
"""Export analysis results for a company as a formatted PDF report.
Returns all stored analysis records for the given company, including
analysis type, model used, response text, and timestamp, formatted
as a downloadable PDF document.
Args:
company_name: Company name to export results for
Returns:
PDF file download
"""
import io
from reportlab.lib import colors
from reportlab.lib.pagesizes import letter
from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
from reportlab.lib.units import inch
from reportlab.platypus import (
Paragraph,
SimpleDocTemplate,
Spacer,
Table,
TableStyle,
)
db = get_db_client()
with db.get_conn() as conn:
with conn.cursor() as cur:
cur.execute(
"""
SELECT company_name, analysis_type, model, response, timestamp
FROM llm_messages
WHERE LOWER(company_name) = LOWER(%s) AND is_cached = FALSE
ORDER BY timestamp DESC
""",
(company_name,),
)
rows = cur.fetchall()
if not rows:
raise HTTPException(status_code=404, detail=f"No analysis results found for '{company_name}'")
buffer = io.BytesIO()
doc = SimpleDocTemplate(
buffer,
pagesize=letter,
rightMargin=0.75 * inch,
leftMargin=0.75 * inch,
topMargin=0.75 * inch,
bottomMargin=0.75 * inch,
)
styles = getSampleStyleSheet()
title_style = ParagraphStyle(
"CustomTitle",
parent=styles["Title"],
fontSize=20,
spaceAfter=6,
)
subtitle_style = ParagraphStyle(
"Subtitle",
parent=styles["Normal"],
fontSize=11,
textColor=colors.grey,
spaceAfter=20,
)
heading_style = ParagraphStyle(
"SectionHeading",
parent=styles["Heading2"],
fontSize=13,
spaceBefore=16,
spaceAfter=8,
textColor=colors.HexColor("#1a1a2e"),
)
body_style = ParagraphStyle(
"BodyText",
parent=styles["Normal"],
fontSize=9,
leading=13,
spaceAfter=10,
)
elements = []
# Title and date
display_name = rows[0][0] # Use the casing from the database
analysis_date = datetime.now().strftime("%Y-%m-%d")
elements.append(Paragraph(f"SPARC Analysis Report: {display_name}", title_style))
elements.append(Paragraph(f"Generated on {analysis_date}", subtitle_style))
# Summary table
summary_data = [
["Total Analyses", str(len(rows))],
["Analysis Types", ", ".join(sorted(set(r[1] for r in rows)))],
["Models Used", ", ".join(sorted(set(r[2] for r in rows)))],
]
summary_table = Table(summary_data, colWidths=[2 * inch, 4.5 * inch])
summary_table.setStyle(
TableStyle(
[
("BACKGROUND", (0, 0), (0, -1), colors.HexColor("#f0f0f5")),
("FONTNAME", (0, 0), (0, -1), "Helvetica-Bold"),
("FONTSIZE", (0, 0), (-1, -1), 9),
("PADDING", (0, 0), (-1, -1), 6),
("GRID", (0, 0), (-1, -1), 0.5, colors.HexColor("#cccccc")),
("VALIGN", (0, 0), (-1, -1), "TOP"),
]
)
)
elements.append(summary_table)
elements.append(Spacer(1, 16))
# Individual analysis sections
for i, row in enumerate(rows, 1):
_, analysis_type, model, response, timestamp = row
ts_str = timestamp.strftime("%Y-%m-%d %H:%M:%S") if hasattr(timestamp, "strftime") else str(timestamp)
elements.append(
Paragraph(f"Analysis {i}: {analysis_type} (via {model})", heading_style)
)
elements.append(
Paragraph(f"<i>Performed: {ts_str}</i>", body_style)
)
# Wrap long response text into paragraphs, escaping XML special chars
safe_response = (
response.replace("&", "&amp;")
.replace("<", "&lt;")
.replace(">", "&gt;")
)
# Split into manageable paragraphs to avoid overflow
for line in safe_response.split("\n"):
if line.strip():
elements.append(Paragraph(line, body_style))
else:
elements.append(Spacer(1, 4))
elements.append(Spacer(1, 10))
doc.build(elements)
buffer.seek(0)
safe_name = company_name.replace(" ", "_").lower()
filename = f"{safe_name}-analysis-{analysis_date}.pdf"
return StreamingResponse(
iter([buffer.getvalue()]),
media_type="application/pdf",
headers={"Content-Disposition": f'attachment; filename="{filename}"'},
)
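The three chained `replace` calls used for XML escaping above are equivalent to the standard library's `xml.sax.saxutils.escape`, which escapes `&`, `<`, and `>` in that order. A sketch under that assumption; `to_paragraph_chunks` is a hypothetical helper, and unlike the endpoint it drops blank lines instead of turning them into spacers.

```python
from xml.sax.saxutils import escape


def to_paragraph_chunks(text: str) -> list[str]:
    """Escape markup-significant characters and split into non-blank lines."""
    return [escape(line) for line in text.split("\n") if line.strip()]


chunks = to_paragraph_chunks("R&D <b>up</b>\n\nnext line")
```

Escaping before splitting or after splitting gives the same result here, since none of the escaped sequences contain a newline.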
# ============== System Endpoints ==============
@@ -408,7 +945,8 @@ async def health_check():
tags=["Analysis"],
)
async def analyze_company(
company_name: str,
company_name: Annotated[str, Path(min_length=2, max_length=100, pattern=r"^[a-zA-Z0-9][a-zA-Z0-9 \-&.]*$")],
model: str | None = Query(default=None, description="LLM model to use (e.g. 'openai/gpt-4o'). Defaults to server config."),
_: UserResponse = Depends(get_current_user),
):
"""Analyze a single company's patent portfolio.
@@ -418,17 +956,103 @@ async def analyze_company(
Args:
company_name: Name of the company to analyze (e.g., "nvidia", "intel")
model: Optional LLM model override
Returns:
Analysis results including patent count, AI insights, and success status
"""
_validate_model(model)
if not _analyzer:
raise HTTPException(status_code=503, detail="Analyzer not initialized")
result = _analyzer._analyze_company_safe(company_name)
result = _analyzer._analyze_company_safe(company_name, model=model)
return _convert_result(result)
@app.get(
"/analyze/patent/{patent_id}",
tags=["Analysis"],
)
async def analyze_single_patent(
patent_id: str,
company_name: Annotated[str, Query(min_length=2, max_length=100, pattern=r"^[a-zA-Z0-9][a-zA-Z0-9 \-&.]*$", description="Company name for analysis context")],
_: UserResponse = Depends(get_current_user),
):
"""Analyze a single patent by its publication ID.
If the patent PDF is not already cached locally, the system will attempt
to download it automatically from a previously cached link. If no link
is available, a 404 error is returned.
Args:
patent_id: Patent publication ID (e.g. "US-11234567-B2")
company_name: Company name for analysis context
Returns:
Analysis text for the patent
"""
if not _analyzer:
raise HTTPException(status_code=503, detail="Analyzer not initialized")
try:
analysis = _analyzer.analyze_single_patent(patent_id, company_name)
return {"patent_id": patent_id, "company_name": company_name, "analysis": analysis}
except FileNotFoundError as e:
raise HTTPException(status_code=404, detail=str(e))
@app.get(
"/analyze/batch",
response_model=PaginatedAnalysisResponse,
tags=["Analysis"],
)
async def list_analysis_results(
company_name: Annotated[
str | None,
Query(description="Filter results by company name"),
] = None,
limit: Annotated[int, Query(ge=1, le=200)] = 50,
cursor: Annotated[
str | None,
Query(description="Opaque cursor from a previous response's next_cursor field"),
] = None,
_: UserResponse = Depends(get_current_user),
):
"""List stored analysis results with cursor-based pagination.
Returns past analysis results ordered by timestamp descending. Use
``limit`` to control page size (default 50, max 200). The response
includes a ``next_cursor`` field; pass it back as the ``cursor`` query
parameter to fetch the next page. When ``next_cursor`` is ``null``,
there are no more results.
Args:
company_name: Optional filter by company name
limit: Maximum number of results to return (default 50, max 200)
cursor: Opaque pagination cursor from a previous response
Returns:
Paginated list of analysis results
"""
db = _get_job_db()
rows = db.list_analyses(company_name=company_name, limit=limit + 1, cursor=cursor)
has_next = len(rows) > limit
if has_next:
rows = rows[:limit]
items = [AnalysisRecord(**row) for row in rows]
next_cursor = None
if has_next and rows:
last = rows[-1]
ts = last["timestamp"]
ts_str = ts.isoformat() if hasattr(ts, "isoformat") else str(ts)
next_cursor = f"{ts_str}|{last['id']}"
return PaginatedAnalysisResponse(items=items, next_cursor=next_cursor)
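The endpoint above fetches `limit + 1` rows so it can tell whether another page exists without a second query. The sketch below isolates that step with in-memory rows; `build_page` is a hypothetical name, and the cursor format follows the `timestamp|id` convention used in the handler.

```python
from datetime import datetime
from typing import Optional


def build_page(rows: list[dict], limit: int) -> tuple[list[dict], Optional[str]]:
    """Trim a limit+1 result set to one page and derive the next cursor.

    `rows` is assumed to hold at most limit + 1 records, newest first.
    """
    has_next = len(rows) > limit
    page = rows[:limit]
    next_cursor = None
    if has_next and page:
        last = page[-1]
        next_cursor = f"{last['timestamp'].isoformat()}|{last['id']}"
    return page, next_cursor


# Three rows fetched for limit=2, so one extra row signals a further page.
fetched = [
    {"id": 3, "timestamp": datetime(2026, 5, 19, 12, 0)},
    {"id": 2, "timestamp": datetime(2026, 5, 19, 11, 0)},
    {"id": 1, "timestamp": datetime(2026, 5, 19, 10, 0)},
]
page, cursor = build_page(fetched, limit=2)
```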
@app.post(
"/analyze/batch",
response_model=BatchAnalysisResponse,
@@ -449,12 +1073,14 @@ async def analyze_companies_batch(
Returns:
Batch results with individual company analyses and summary statistics
"""
_validate_model(request.model)
if not _analyzer:
raise HTTPException(status_code=503, detail="Analyzer not initialized")
result = _analyzer.analyze_companies(
companies=request.companies,
max_workers=request.max_workers,
model=request.model,
)
return _convert_batch_result(result)
@@ -486,7 +1112,7 @@ def _job_row_to_status(row: dict) -> JobStatus:
)
def _run_batch_job(job_id: str, companies: list[str], max_workers: int):
def _run_batch_job(job_id: str, companies: list[str], max_workers: int, model: str | None = None):
"""Background task for batch analysis."""
import json as _json
global _analyzer
@@ -511,6 +1137,7 @@ def _run_batch_job(job_id: str, companies: list[str], max_workers: int):
companies=companies,
max_workers=max_workers,
progress_callback=progress_callback,
model=model,
)
batch_response = _convert_batch_result(result)
db.update_job(
@@ -519,8 +1146,25 @@ def _run_batch_job(job_id: str, companies: list[str], max_workers: int):
progress=100,
result_json=_json.dumps(batch_response.model_dump(), default=str),
)
# Fire webhook notification
from SPARC.webhooks import notify_job_completed
notify_job_completed(
job_id=job_id,
status="completed",
total_companies=result.total_companies,
successful=result.successful,
failed=result.failed,
)
except Exception as e:
db.update_job(job_id, status="failed", error=str(e))
from SPARC.webhooks import notify_job_completed
notify_job_completed(
job_id=job_id,
status="failed",
total_companies=len(companies),
successful=0,
failed=len(companies),
)
@app.post("/analyze/batch/async", response_model=JobStatus, tags=["Analysis"])
@@ -540,6 +1184,7 @@ async def analyze_companies_async(
Returns:
Job status with job_id for polling
"""
_validate_model(request.model)
global _job_counter
_job_counter += 1
@@ -549,7 +1194,7 @@ async def analyze_companies_async(
job_row = db.create_job(job_id=job_id, total_companies=len(request.companies))
background_tasks.add_task(
_run_batch_job, job_id, request.companies, request.max_workers
_run_batch_job, job_id, request.companies, request.max_workers, request.model
)
return _job_row_to_status(job_row)
@@ -577,24 +1222,51 @@ async def get_job_status(
return _job_row_to_status(job_row)
@app.get("/jobs", response_model=list[JobStatus], tags=["Jobs"])
@app.get("/jobs", response_model=PaginatedJobsResponse, tags=["Jobs"])
async def list_jobs(
status: Annotated[
str | None,
Query(description="Filter by status: pending, running, completed, failed"),
] = None,
limit: Annotated[int, Query(ge=1, le=100)] = 10,
limit: Annotated[int, Query(ge=1, le=200)] = 50,
cursor: Annotated[
str | None,
Query(description="Opaque cursor from a previous response's next_cursor field"),
] = None,
_: UserResponse = Depends(get_current_user),
):
"""List all analysis jobs.
"""List analysis jobs with cursor-based pagination.
Pass ``limit`` to control page size. The response includes a ``next_cursor``
field; pass it back as the ``cursor`` query parameter to fetch the next page.
When ``next_cursor`` is ``null``, there are no more results.
Existing clients that use only ``limit`` (without ``cursor``) continue to
work without modification.
Args:
status: Optional filter by job status
limit: Maximum number of jobs to return (default 50, max 200)
cursor: Opaque pagination cursor from a previous response
Returns:
List of job statuses
Paginated list of job statuses
"""
db = _get_job_db()
job_rows = db.list_jobs(status=status, limit=limit)
return [_job_row_to_status(row) for row in job_rows]
# Fetch one extra to determine if there is a next page
job_rows = db.list_jobs(status=status, limit=limit + 1, cursor=cursor)
has_next = len(job_rows) > limit
if has_next:
job_rows = job_rows[:limit]
items = [_job_row_to_status(row) for row in job_rows]
next_cursor = None
if has_next and job_rows:
last = job_rows[-1]
created = last["created_at"]
ts = created.isoformat() if hasattr(created, "isoformat") else str(created)
next_cursor = f"{ts}|{last['job_id']}"
return PaginatedJobsResponse(items=items, next_cursor=next_cursor)
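On the client side, walking every page means passing `next_cursor` back until it comes back as `null`. A minimal sketch, assuming a caller-supplied `fetch_page` function (hypothetical; in practice it would wrap an HTTP GET to `/jobs`) that returns the response body as a dict:

```python
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages, following next_cursor until it is None."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


# Fake two-page backend for illustration; cursor values are placeholders.
_PAGES = {
    None: {"items": [{"job_id": "job_3"}, {"job_id": "job_2"}], "next_cursor": "c1"},
    "c1": {"items": [{"job_id": "job_1"}], "next_cursor": None},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    return _PAGES[cursor]


all_jobs = list(iter_all(fake_fetch))
```

The client treats the cursor as opaque and never parses it, so the server remains free to change its encoding.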
+7
@@ -53,6 +53,13 @@ root_path = os.getenv("ROOT_PATH", "")
# Used for safety checks (e.g., refusing default JWT secret in production)
app_env = os.getenv("APP_ENV", "development")
# Storage backend: "local" (default) or "s3" for S3/MinIO object storage
storage_backend = os.getenv("STORAGE_BACKEND", "local")
s3_bucket = os.getenv("S3_BUCKET", "sparc-patents")
s3_endpoint_url = os.getenv("S3_ENDPOINT_URL", "")
s3_access_key = os.getenv("AWS_ACCESS_KEY_ID", "")
s3_secret_key = os.getenv("AWS_SECRET_ACCESS_KEY", "")
# CORS allowed origins (comma-separated)
# Defaults to localhost dev origins when unset
_cors_origins_raw = os.getenv("CORS_ORIGINS", "")
+181 -7
@@ -192,6 +192,35 @@ class DatabaseClient:
ON jobs(status)
""")
# Create tracked companies table for scheduled analysis
cursor.execute("""
CREATE TABLE IF NOT EXISTS tracked_companies (
id SERIAL PRIMARY KEY,
company_name VARCHAR(255) UNIQUE NOT NULL,
last_patent_count INTEGER DEFAULT 0,
last_analysis_at TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
""")
# Create alerts table for significant changes
cursor.execute("""
CREATE TABLE IF NOT EXISTS alerts (
id SERIAL PRIMARY KEY,
company_name VARCHAR(255) NOT NULL,
alert_type VARCHAR(50) NOT NULL,
message TEXT NOT NULL,
old_value NUMERIC,
new_value NUMERIC,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
""")
cursor.execute("""
CREATE INDEX IF NOT EXISTS idx_alerts_company
ON alerts(company_name)
""")
self.conn.commit()
@staticmethod
@@ -342,6 +371,48 @@ class DatabaseClient:
cursor.execute(query, params)
return [dict(row) for row in cursor.fetchall()]
def list_analyses(
self,
company_name: Optional[str] = None,
limit: int = 50,
cursor: Optional[str] = None,
) -> List[Dict]:
"""List analysis results with cursor-based pagination.
Args:
company_name: Optional filter by company name.
limit: Maximum number of records to return.
cursor: Opaque cursor (``timestamp|id``) from a previous response.
Returns:
List of analysis dicts ordered by timestamp descending.
"""
conditions: list[str] = ["is_cached = FALSE"]
params: list = []
if company_name:
conditions.append("LOWER(company_name) = LOWER(%s)")
params.append(company_name)
if cursor:
try:
ts_str, cursor_id = cursor.rsplit("|", 1)
conditions.append("(timestamp, id) < (%s, %s)")
params.extend([ts_str, int(cursor_id)])
except (ValueError, TypeError):
pass # Ignore malformed cursors; return from start
query = "SELECT id, company_name, analysis_type, model, response, timestamp FROM llm_messages"
if conditions:
query += " WHERE " + " AND ".join(conditions)
query += " ORDER BY timestamp DESC, id DESC LIMIT %s"
params.append(limit)
with self.get_conn() as conn:
with conn.cursor(cursor_factory=RealDictCursor) as cur:
cur.execute(query, params)
return [dict(row) for row in cur.fetchall()]
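The method above splits the cursor but passes the timestamp part to the database as an unparsed string, so a cursor such as `not-a-date|2` would presumably reach the query rather than being ignored. A stricter parse is sketched below; `parse_cursor` is a hypothetical helper, not part of `DatabaseClient`. The tuple comparison at the end mirrors the SQL row comparison `(timestamp, id) < (%s, %s)`.

```python
from datetime import datetime
from typing import Optional


def parse_cursor(cursor: str) -> Optional[tuple[datetime, int]]:
    """Return (timestamp, id), or None when any part of the cursor is malformed."""
    try:
        ts_str, id_str = cursor.rsplit("|", 1)
        return datetime.fromisoformat(ts_str), int(id_str)
    except ValueError:
        # Covers a missing separator, a bad timestamp, and a non-integer id.
        return None


boundary = parse_cursor("2026-05-19T11:00:00|2")

# Python tuples compare lexicographically, like SQL row values:
# a row is "after the cursor" when (timestamp, id) is strictly smaller.
older_row = (datetime(2026, 5, 19, 10, 0), 7)
same_ts_lower_id = (datetime(2026, 5, 19, 11, 0), 1)
```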
def get_analytics(self, days: int = 30) -> Dict:
"""Get analytics on message usage.
@@ -568,20 +639,45 @@ class DatabaseClient:
self,
status: Optional[str] = None,
limit: int = 10,
cursor: Optional[str] = None,
) -> List[Dict]:
"""List jobs, optionally filtered by status."""
query = "SELECT * FROM jobs"
"""List jobs with optional status filter and cursor-based pagination.
Args:
status: Optional status filter (pending, running, completed, failed).
limit: Maximum number of jobs to return.
cursor: Opaque cursor (``created_at|job_id``) from a previous
response. When provided, only jobs older than the cursor are
returned.
Returns:
List of job dicts ordered by created_at descending.
"""
conditions: list[str] = []
params: list = []
if status:
query += " WHERE status = %s"
conditions.append("status = %s")
params.append(status)
query += " ORDER BY created_at DESC LIMIT %s"
if cursor:
try:
ts_str, cursor_job_id = cursor.rsplit("|", 1)
conditions.append("(created_at, job_id) < (%s, %s)")
params.extend([ts_str, cursor_job_id])
except ValueError:
pass # Ignore malformed cursors; return from start
query = "SELECT * FROM jobs"
if conditions:
query += " WHERE " + " AND ".join(conditions)
query += " ORDER BY created_at DESC, job_id DESC LIMIT %s"
params.append(limit)
with self.get_conn() as conn:
with conn.cursor(cursor_factory=RealDictCursor) as cursor:
cursor.execute(query, params)
return [dict(row) for row in cursor.fetchall()]
with conn.cursor(cursor_factory=RealDictCursor) as cur:
cur.execute(query, params)
return [dict(row) for row in cur.fetchall()]
def mark_stale_jobs_failed(self) -> int:
"""Mark any jobs in 'running' or 'pending' state as 'failed'.
@@ -803,3 +899,81 @@ class DatabaseClient:
with conn.cursor() as cursor:
cursor.execute("SELECT COUNT(*) FROM users")
return cursor.fetchone()[0]
# Tracked Companies Methods
def add_tracked_company(self, company_name: str) -> Optional[Dict]:
"""Add a company to the tracking list."""
with self.get_conn() as conn:
with conn.cursor(cursor_factory=RealDictCursor) as cursor:
try:
cursor.execute(
"INSERT INTO tracked_companies (company_name) VALUES (%s) RETURNING *",
(company_name,),
)
row = cursor.fetchone()
conn.commit()
return dict(row) if row else None
except Exception:
conn.rollback()
return None
def remove_tracked_company(self, company_name: str) -> bool:
"""Remove a company from the tracking list."""
with self.get_conn() as conn:
with conn.cursor() as cursor:
cursor.execute(
"DELETE FROM tracked_companies WHERE LOWER(company_name) = LOWER(%s)",
(company_name,),
)
conn.commit()
return cursor.rowcount > 0
def list_tracked_companies(self) -> List[Dict]:
"""List all tracked companies."""
with self.get_conn() as conn:
with conn.cursor(cursor_factory=RealDictCursor) as cursor:
cursor.execute("SELECT * FROM tracked_companies ORDER BY company_name")
return [dict(row) for row in cursor.fetchall()]
def update_tracked_company(
self, company_name: str, patent_count: int
) -> None:
"""Update the last analysis stats for a tracked company."""
with self.get_conn() as conn:
with conn.cursor() as cursor:
cursor.execute(
"""UPDATE tracked_companies
SET last_patent_count = %s, last_analysis_at = CURRENT_TIMESTAMP
WHERE LOWER(company_name) = LOWER(%s)""",
(patent_count, company_name),
)
conn.commit()
def store_alert(
self,
company_name: str,
alert_type: str,
message: str,
old_value: float | None = None,
new_value: float | None = None,
) -> None:
"""Record an alert for a significant change."""
with self.get_conn() as conn:
with conn.cursor() as cursor:
cursor.execute(
"""INSERT INTO alerts (company_name, alert_type, message, old_value, new_value)
VALUES (%s, %s, %s, %s, %s)""",
(company_name, alert_type, message, old_value, new_value),
)
conn.commit()
def list_alerts(self, limit: int = 50) -> List[Dict]:
"""List recent alerts."""
with self.get_conn() as conn:
with conn.cursor(cursor_factory=RealDictCursor) as cursor:
cursor.execute(
"SELECT * FROM alerts ORDER BY created_at DESC LIMIT %s",
(limit,),
)
return [dict(row) for row in cursor.fetchall()]
+18 -12
@@ -40,12 +40,13 @@ class LLMAnalyzer:
else:
self.client = None
def analyze_patent_content(self, patent_content: str, company_name: str) -> str:
def analyze_patent_content(self, patent_content: str, company_name: str, model: str | None = None) -> str:
"""Analyze patent content to estimate company innovation and performance.
Args:
patent_content: Minimized patent text (abstract, claims, summary)
company_name: Name of the company for context
model: Optional model override (e.g. "openai/gpt-4o"). Defaults to config.
Returns:
Analysis text describing innovation quality and potential impact
@@ -63,6 +64,8 @@ Patent Content:
Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals about the company's technical direction and competitive advantage."""
effective_model = model or self.model
if self.test_mode:
logger.debug("TEST MODE - Prompt that would be sent to LLM:\n%s", prompt)
return "[TEST MODE - No API call made]"
@@ -81,7 +84,7 @@ Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals
response=cached["response"],
company_name=company_name,
analysis_type="single_patent",
model=self.model,
model=effective_model,
metadata={
"patent_content_length": len(patent_content),
"cache_hit": True,
@@ -94,7 +97,7 @@ Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals
# Call API if no cache hit and client is available
if self.client:
response = self.client.chat.completions.create(
model=self.model,
model=effective_model,
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
@@ -106,7 +109,7 @@ Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals
response=response_text,
company_name=company_name,
analysis_type="single_patent",
model=self.model,
model=effective_model,
metadata={"patent_content_length": len(patent_content)},
token_usage={
"prompt_tokens": response.usage.prompt_tokens,
@@ -124,13 +127,13 @@ Provide a concise analysis (2-3 paragraphs) focusing on what this patent reveals
response=placeholder,
company_name=company_name,
analysis_type="single_patent",
model=self.model,
model=effective_model,
metadata={"patent_content_length": len(patent_content), "pending": True}
)
return placeholder
def analyze_patent_portfolio(
self, patents_data: list[Dict[str, str]], company_name: str
self, patents_data: list[Dict[str, str]], company_name: str, model: str | None = None
) -> str:
"""Analyze multiple patents to estimate overall company performance.
@@ -165,13 +168,16 @@ Patent Portfolio:
Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the company's innovation strength and performance outlook."""
effective_model = model or self.model
if self.test_mode:
logger.debug("TEST MODE - Portfolio prompt:\n%s", prompt)
return "[TEST MODE]"
metadata = {
"patent_count": len(patents_data),
"patent_ids": [p['patent_id'] for p in patents_data]
"patent_ids": [p['patent_id'] for p in patents_data],
"model": effective_model,
}
# Check cache first
@@ -188,7 +194,7 @@ Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the co
response=cached["response"],
company_name=company_name,
analysis_type="portfolio",
model=self.model,
model=effective_model,
metadata={
**metadata,
"cache_hit": True,
@@ -202,7 +208,7 @@ Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the co
if self.client:
try:
response = self.client.chat.completions.create(
model=self.model,
model=effective_model,
max_tokens=2048,
messages=[{"role": "user", "content": prompt}],
)
@@ -215,7 +221,7 @@ Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the co
response=response_text,
company_name=company_name,
analysis_type="portfolio",
model=self.model,
model=effective_model,
metadata=metadata,
token_usage={
"prompt_tokens": response.usage.prompt_tokens,
@@ -235,7 +241,7 @@ Provide a comprehensive analysis (4-5 paragraphs) with a final verdict on the co
response=placeholder,
company_name=company_name,
analysis_type="portfolio",
model=self.model,
model=effective_model,
metadata={**metadata, "pending": True}
)
return placeholder
+114
@@ -0,0 +1,114 @@
"""Scheduled patent analysis for tracked companies.
Uses APScheduler to periodically re-analyze tracked companies and
detect significant changes in patent counts.
The scheduler reuses the application-level pooled DatabaseClient
(from ``SPARC.auth``) instead of creating its own connection, which
avoids exhausting the database connection pool under load.
"""
import logging
import os
from SPARC.analyzer import CompanyAnalyzer
from SPARC.auth import get_db_client
logger = logging.getLogger(__name__)
# Configurable via environment variable (in hours, default 24)
SCHEDULE_INTERVAL_HOURS = int(os.getenv("SCHEDULE_INTERVAL_HOURS", "24"))
# Patent count change threshold (percentage) to trigger an alert
CHANGE_THRESHOLD_PERCENT = int(os.getenv("CHANGE_THRESHOLD_PERCENT", "20"))
def run_scheduled_analysis() -> None:
"""Re-analyze all tracked companies and check for significant changes.
Uses the shared pooled DatabaseClient from ``SPARC.auth.get_db_client()``
rather than creating a disposable connection, so the scheduler participates
in the same connection pool as the rest of the application.
"""
db = get_db_client()
tracked = db.list_tracked_companies()
if not tracked:
logger.info("No tracked companies configured; skipping scheduled analysis")
return
logger.info("Running scheduled analysis for %d tracked companies", len(tracked))
analyzer = CompanyAnalyzer(db_client=db)
for company_row in tracked:
name = company_row["company_name"]
old_count = company_row.get("last_patent_count", 0) or 0
try:
result = analyzer._analyze_company_safe(name)
if result.success:
new_count = result.patent_count
# Update tracking record
db.update_tracked_company(name, new_count)
# Check for significant change
if old_count > 0:
delta_pct = abs(new_count - old_count) / old_count * 100
if delta_pct >= CHANGE_THRESHOLD_PERCENT:
direction = "increased" if new_count > old_count else "decreased"
message = (
f"Patent count for {name} {direction} by {delta_pct:.0f}% "
f"({old_count} -> {new_count})"
)
logger.warning("ALERT: %s", message)
db.store_alert(
company_name=name,
alert_type="patent_count_change",
message=message,
old_value=old_count,
new_value=new_count,
)
elif new_count > 0:
# First analysis -- record baseline
logger.info("Baseline for %s: %d patents", name, new_count)
else:
logger.warning("Scheduled analysis failed for %s: %s", name, result.error)
except Exception as e:
logger.error("Error analyzing tracked company %s: %s", name, e)
logger.info("Scheduled analysis complete")
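The alert decision inside the loop above reduces to a pure function of the old count, the new count, and the threshold. The sketch below extracts it for illustration; `describe_change` is a hypothetical name, and the default threshold of 20 mirrors the `CHANGE_THRESHOLD_PERCENT` default.

```python
from typing import Optional


def describe_change(
    old_count: int, new_count: int, threshold_percent: float = 20
) -> Optional[str]:
    """Return a change description when the relative change meets the threshold."""
    if old_count <= 0:
        return None  # no baseline yet, so a percentage change is undefined
    delta_pct = abs(new_count - old_count) / old_count * 100
    if delta_pct < threshold_percent:
        return None
    direction = "increased" if new_count > old_count else "decreased"
    return f"{direction} by {delta_pct:.0f}% ({old_count} -> {new_count})"
```

Keeping this logic separate from the database and analyzer calls makes the threshold behaviour easy to unit test.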
def start_scheduler() -> None:
"""Start the APScheduler background scheduler.
Safe to call at application startup. If apscheduler is not installed,
the function logs a warning and returns without starting anything.
"""
try:
from apscheduler.schedulers.background import BackgroundScheduler
except ImportError:
logger.warning(
"apscheduler not installed; scheduled analysis disabled. "
"Install with: pip install apscheduler"
)
return
scheduler = BackgroundScheduler()
scheduler.add_job(
run_scheduled_analysis,
"interval",
hours=SCHEDULE_INTERVAL_HOURS,
id="scheduled_patent_analysis",
replace_existing=True,
)
scheduler.start()
logger.info(
"Scheduled patent analysis started (every %d hours, threshold %d%%)",
SCHEDULE_INTERVAL_HOURS,
CHANGE_THRESHOLD_PERCENT,
)
+47 -13
@@ -1,4 +1,5 @@
import os
import io
import logging
import re
from datetime import datetime, timedelta
from typing import Dict
@@ -8,8 +9,21 @@ import requests
import serpapi
from SPARC import config
from SPARC.storage import StorageBackend, get_storage_backend
from SPARC.types import Patent, Patents
logger = logging.getLogger(__name__)
# Module-level storage instance (lazy-initialized)
_storage: StorageBackend | None = None
def _get_storage() -> StorageBackend:
global _storage
if _storage is None:
_storage = get_storage_backend()
return _storage
class SERP:
def query(company: str, days_back: int = None) -> Patents:
@@ -44,6 +58,7 @@ class SERP:
"tbs": date_filter,
"api_key": config.api_key,
}
logger.info("Querying Google Patents for '%s' (last %d days)", company, days_back)
search = serpapi.search(params)
# Convert results to Patent objects, skipping any without PDF links
patent_ids = []
@@ -52,13 +67,16 @@ class SERP:
pdf_link = patent.get("pdf")
if pdf_link:
patent_ids.append(Patent(patent_id=patent["publication_number"], pdf_link=pdf_link, summary=None))
# Patents without PDF links are skipped (see docstring for details)
else:
logger.debug("Skipping patent %s (no PDF link)", patent.get("publication_number", "unknown"))
logger.info("Found %d patents with PDF links for '%s'", len(patent_ids), company)
return Patents(patents=patent_ids)
def save_patents(patent: Patent) -> Patent:
"""
Save the patent PDF to the patents folder, skipping download if already cached.
"""Save the patent PDF to storage, skipping download if already cached.
Uses the configured storage backend (local filesystem or S3).
Args:
patent: Patent object
@@ -66,35 +84,51 @@ class SERP:
Returns:
Patent object with updated PDF path
"""
pdf_path = f"patents/{patent.patent_id}.pdf"
os.makedirs("patents", exist_ok=True)
storage = _get_storage()
key = f"{patent.patent_id}.pdf"
if not (os.path.exists(pdf_path) and os.path.getsize(pdf_path) > 0):
if not storage.exists(key):
logger.info("Downloading PDF for %s", patent.patent_id)
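# Time out and fail on HTTP errors so an error page is never cached as a PDF
response = requests.get(patent.pdf_link, timeout=30)
response.raise_for_status()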
with open(pdf_path, "wb") as f:
f.write(response.content)
storage.write(key, response.content)
logger.debug("Saved %d bytes for %s", len(response.content), patent.patent_id)
else:
logger.debug("Using cached PDF for %s", patent.patent_id)
patent.pdf_path = pdf_path
patent.pdf_path = storage.path_for(key)
return patent
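Because `pdf_path` now comes from the storage backend, downstream code may receive either a local path or an S3 URI. The sketch below shows one way to tell them apart with `urllib.parse`, assuming S3-backed paths take the form `s3://bucket/key`; `split_storage_path` is a hypothetical helper and the bucket and file names are examples.

```python
from urllib.parse import urlparse


def split_storage_path(path: str) -> tuple[str, str]:
    """Return ("s3", key) for s3://bucket/key URIs, otherwise ("local", path)."""
    parsed = urlparse(path)
    if parsed.scheme == "s3":
        # parsed.netloc is the bucket; the key is the path without its leading slash.
        return "s3", parsed.path.lstrip("/")
    return "local", path


s3_result = split_storage_path("s3://sparc-patents/US-11234567-B2.pdf")
local_result = split_storage_path("patents/US-11234567-B2.pdf")
```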
def parse_patent_pdf(pdf_path: str) -> Dict:
"""Extract structured sections from patent PDF.
Extracts all major sections from a patent PDF including abstract,
claims, summary, and detailed description.
claims, summary, and detailed description. Supports both local file
paths and S3 URIs (s3://bucket/key).
Args:
pdf_path: Path to the patent PDF file
pdf_path: Local path or S3 URI to the patent PDF file
Returns:
Dictionary containing all extracted sections
"""
logger.debug("Parsing patent PDF: %s", pdf_path)
with pdfplumber.open(pdf_path) as pdf:
if pdf_path.startswith("s3://"):
# Read from S3 via storage backend
storage = _get_storage()
# Extract key from "s3://bucket/key"
key = pdf_path.split("/", 3)[-1]
data = storage.read(key)
pdf_file: io.BytesIO | str = io.BytesIO(data)
else:
pdf_file = pdf_path
with pdfplumber.open(pdf_file) as pdf:
# Extract all text
full_text = ""
for page in pdf.pages:
full_text += page.extract_text() + "\n"
logger.debug("Extracted text from %d pages (%d chars)", len(pdf.pages), len(full_text))
# Define section patterns (common in patents)
sections = {
+171
@@ -0,0 +1,171 @@
"""Patent PDF storage abstraction.
Provides a unified interface for reading and writing patent PDF files,
with pluggable backends for local filesystem and S3-compatible object
storage (e.g., MinIO, AWS S3).
"""
import logging
import os
from abc import ABC, abstractmethod
from SPARC import config
logger = logging.getLogger(__name__)
class StorageBackend(ABC):
"""Abstract base class for patent PDF storage."""
@abstractmethod
def read(self, key: str) -> bytes:
"""Read a file by key.
Args:
key: Storage key (e.g., "US-12345678-B2.pdf")
Returns:
File contents as bytes.
Raises:
FileNotFoundError: If the file does not exist.
"""
@abstractmethod
def write(self, key: str, data: bytes) -> None:
"""Write data to storage.
Args:
key: Storage key (e.g., "US-12345678-B2.pdf")
data: File contents as bytes.
"""
@abstractmethod
def exists(self, key: str) -> bool:
"""Check if a file exists in storage.
Args:
key: Storage key.
Returns:
True if the file exists and has non-zero size.
"""
@abstractmethod
def path_for(self, key: str) -> str:
"""Return a path or URI suitable for downstream consumers.
For local storage this is a filesystem path; for S3 it is an
``s3://bucket/key`` URI (callers that need a local file should use
read() and write to a temporary location).
"""
class LocalStorageBackend(StorageBackend):
"""Store patent PDFs on the local filesystem under a directory."""
def __init__(self, base_dir: str = "patents"):
self.base_dir = base_dir
os.makedirs(self.base_dir, exist_ok=True)
def _full_path(self, key: str) -> str:
return os.path.join(self.base_dir, key)
def read(self, key: str) -> bytes:
path = self._full_path(key)
if not os.path.exists(path):
raise FileNotFoundError(f"File not found: {path}")
with open(path, "rb") as f:
return f.read()
def write(self, key: str, data: bytes) -> None:
path = self._full_path(key)
os.makedirs(os.path.dirname(path) or self.base_dir, exist_ok=True)
with open(path, "wb") as f:
f.write(data)
logger.debug("Wrote %d bytes to %s", len(data), path)
def exists(self, key: str) -> bool:
path = self._full_path(key)
return os.path.exists(path) and os.path.getsize(path) > 0
def path_for(self, key: str) -> str:
return self._full_path(key)
class S3StorageBackend(StorageBackend):
"""Store patent PDFs in an S3-compatible bucket."""
def __init__(
self,
bucket: str,
endpoint_url: str = "",
access_key: str = "",
secret_key: str = "",
):
import boto3
kwargs: dict = {}
if endpoint_url:
kwargs["endpoint_url"] = endpoint_url
if access_key and secret_key:
kwargs["aws_access_key_id"] = access_key
kwargs["aws_secret_access_key"] = secret_key
self.s3 = boto3.client("s3", **kwargs)
self.bucket = bucket
# Ensure bucket exists (useful for MinIO local dev)
try:
self.s3.head_bucket(Bucket=self.bucket)
except Exception:
try:
self.s3.create_bucket(Bucket=self.bucket)
logger.info("Created S3 bucket: %s", self.bucket)
except Exception as e:
logger.warning("Could not create bucket %s: %s", self.bucket, e)
def read(self, key: str) -> bytes:
try:
response = self.s3.get_object(Bucket=self.bucket, Key=key)
return response["Body"].read()
except self.s3.exceptions.NoSuchKey:
raise FileNotFoundError(f"S3 object not found: s3://{self.bucket}/{key}")
except Exception as e:
if "NoSuchKey" in str(e) or "404" in str(e):
raise FileNotFoundError(f"S3 object not found: s3://{self.bucket}/{key}")
raise
def write(self, key: str, data: bytes) -> None:
self.s3.put_object(
Bucket=self.bucket,
Key=key,
Body=data,
ContentType="application/pdf",
)
logger.debug("Wrote %d bytes to s3://%s/%s", len(data), self.bucket, key)
def exists(self, key: str) -> bool:
try:
response = self.s3.head_object(Bucket=self.bucket, Key=key)
return response["ContentLength"] > 0
except Exception:
return False
def path_for(self, key: str) -> str:
return f"s3://{self.bucket}/{key}"
def get_storage_backend() -> StorageBackend:
"""Factory: return the configured storage backend instance."""
backend = config.storage_backend.lower()
if backend == "s3":
logger.info("Using S3 storage backend (bucket=%s)", config.s3_bucket)
return S3StorageBackend(
bucket=config.s3_bucket,
endpoint_url=config.s3_endpoint_url,
access_key=config.s3_access_key,
secret_key=config.s3_secret_key,
)
logger.info("Using local storage backend")
return LocalStorageBackend()
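Review note: the `StorageBackend` interface above is small enough to exercise with a test double. The following is a minimal sketch, not part of the change set. `InMemoryStorageBackend` and `cache_pdf` are hypothetical names, and the abstract class is a trimmed copy of the one in the diff so that the example runs standalone. It illustrates the exists/write/path_for sequence that `save_patents` follows.

```python
from abc import ABC, abstractmethod
from typing import Callable


class StorageBackend(ABC):
    """Trimmed copy of the interface in the diff, so the sketch runs standalone."""

    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def exists(self, key: str) -> bool: ...

    @abstractmethod
    def path_for(self, key: str) -> str: ...


class InMemoryStorageBackend(StorageBackend):
    """Hypothetical test double; not part of the change set."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        try:
            return self._objects[key]
        except KeyError:
            raise FileNotFoundError(f"Object not found: {key}") from None

    def write(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def exists(self, key: str) -> bool:
        # Mirror the documented contract: present and non-empty.
        return len(self._objects.get(key, b"")) > 0

    def path_for(self, key: str) -> str:
        # Assumed URI scheme, chosen only for this illustration.
        return f"memory://{key}"


def cache_pdf(storage: StorageBackend, key: str, fetch: Callable[[], bytes]) -> str:
    """Call fetch() and store the result only when the key is not cached yet."""
    if not storage.exists(key):
        storage.write(key, fetch())
    return storage.path_for(key)
```

Because `exists` treats zero-byte objects as missing, a failed download that wrote an empty file is retried on the next call rather than served from cache.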
+1
@@ -24,6 +24,7 @@ class CompanyAnalysisResult:
patent_count: int
success: bool
error: str | None = None
model: str | None = None
timestamp: datetime = field(default_factory=datetime.now)
+139
@@ -0,0 +1,139 @@
"""Webhook notifications for job completion and alert events.
Sends JSON payloads to configured webhook URLs with retry logic.
Supports generic HTTP POST and Slack-compatible text payloads.
"""
import logging
import os
import time
from datetime import datetime
from typing import Any
import requests
logger = logging.getLogger(__name__)
# Comma-separated list of webhook URLs (env var based config)
_WEBHOOK_URLS_RAW = os.getenv("WEBHOOK_URLS", "")
WEBHOOK_URLS: list[str] = [
url.strip() for url in _WEBHOOK_URLS_RAW.split(",") if url.strip()
]
MAX_RETRIES = 3
BACKOFF_BASE = 2 # seconds
def _is_slack_url(url: str) -> bool:
"""Check if a URL looks like a Slack or Discord incoming webhook."""
return "hooks.slack.com" in url or "discord.com/api/webhooks" in url
def _build_payload(event_type: str, data: dict[str, Any], slack: bool = False) -> dict:
"""Build the webhook payload.
Args:
event_type: Type of event (e.g., "job_completed", "alert")
data: Event-specific data
slack: If True, wrap in Slack-compatible ``text`` format
Returns:
JSON-serializable payload dict
"""
payload = {
"event": event_type,
"timestamp": datetime.utcnow().isoformat() + "Z",
**data,
}
if slack:
# Build a human-readable summary for Slack/Discord
lines = [f"*[SPARC] {event_type}*"]
for key, value in data.items():
lines.append(f" {key}: {value}")
return {"text": "\n".join(lines)}
return payload
def _send_with_retry(url: str, payload: dict) -> bool:
"""Send a POST request with exponential backoff retry.
Args:
url: Webhook URL
payload: JSON payload to send
Returns:
True if delivered successfully, False after all retries exhausted
"""
for attempt in range(1, MAX_RETRIES + 1):
try:
response = requests.post(url, json=payload, timeout=10)
if response.status_code < 300:
logger.debug("Webhook delivered to %s (attempt %d)", url, attempt)
return True
logger.warning(
"Webhook %s returned %d (attempt %d/%d)",
url, response.status_code, attempt, MAX_RETRIES,
)
except requests.RequestException as e:
logger.warning(
"Webhook delivery failed for %s (attempt %d/%d): %s",
url, attempt, MAX_RETRIES, e,
)
if attempt < MAX_RETRIES:
wait = BACKOFF_BASE ** attempt
time.sleep(wait)
logger.error("Webhook permanently failed for %s after %d attempts", url, MAX_RETRIES)
return False
def notify(event_type: str, data: dict[str, Any]) -> None:
"""Fire all configured webhooks for an event.
Safe to call even when no webhooks are configured (returns immediately).
Args:
event_type: Event identifier (e.g., "job_completed", "patent_alert")
data: Event data to include in the payload
"""
if not WEBHOOK_URLS:
return
for url in WEBHOOK_URLS:
slack = _is_slack_url(url)
payload = _build_payload(event_type, data, slack=slack)
_send_with_retry(url, payload)
def notify_job_completed(
job_id: str,
status: str,
total_companies: int,
successful: int,
failed: int,
) -> None:
"""Send notification when a batch job completes."""
notify("job_completed", {
"job_id": job_id,
"status": status,
"total_companies": total_companies,
"successful": successful,
"failed": failed,
"summary": f"Batch job {job_id}: {successful}/{total_companies} succeeded",
})
def notify_alert(
company_name: str,
alert_type: str,
message: str,
) -> None:
"""Send notification for a tracked company alert."""
notify("patent_alert", {
"company_name": company_name,
"alert_type": alert_type,
"message": message,
})
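Review note: the retry loop in `_send_with_retry` sleeps `BACKOFF_BASE ** attempt` seconds after every failed attempt except the last. The sketch below works out the resulting schedule. `backoff_schedule` and `worst_case_seconds` are hypothetical helpers, and treating the 10-second `timeout` as a hard cap per attempt is a simplifying assumption.

```python
MAX_RETRIES = 3
BACKOFF_BASE = 2  # seconds
REQUEST_TIMEOUT = 10  # seconds; assumed here to bound each attempt


def backoff_schedule(max_retries: int = MAX_RETRIES, base: int = BACKOFF_BASE) -> list[int]:
    """Seconds slept after each failed attempt; no sleep follows the final one."""
    return [base ** attempt for attempt in range(1, max_retries)]


def worst_case_seconds(
    max_retries: int = MAX_RETRIES,
    base: int = BACKOFF_BASE,
    timeout: int = REQUEST_TIMEOUT,
) -> int:
    """Approximate upper bound on time spent on one URL that never answers."""
    return sum(backoff_schedule(max_retries, base)) + max_retries * timeout


print(backoff_schedule())    # [2, 4]
print(worst_case_seconds())  # 36
```

Since `notify` walks `WEBHOOK_URLS` sequentially, under these assumptions each unreachable URL can delay the caller by roughly 36 seconds.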
+30 -2
@@ -18,6 +18,7 @@ services:
restart: unless-stopped
init-db:
image: gitea.leeworks.dev/0xwheatyz/sparc:latest
build: .
container_name: sparc-init-db
command: python scripts/init_database.py
@@ -29,6 +30,7 @@ services:
restart: "no"
api:
image: gitea.leeworks.dev/0xwheatyz/sparc:latest
build: .
container_name: sparc-api
command: uvicorn SPARC.api:app --host 0.0.0.0 --port 8000
@@ -40,7 +42,7 @@ services:
JWT_SECRET: ${JWT_SECRET:-sparc-secret-key-change-in-production}
CORS_ORIGINS: ${CORS_ORIGINS:-}
APP_ENV: ${APP_ENV:-development}
ROOT_PATH: /api
ROOT_PATH: ""
ports:
- "8000:8000"
depends_on:
@@ -49,10 +51,34 @@ services:
init-db:
condition: service_completed_successfully
volumes:
- ./patents:/app/patents
- patent_data:/app/patents
restart: unless-stopped
# Optional: MinIO for S3-compatible local object storage
# Enable by setting STORAGE_BACKEND=s3 in .env and starting Compose with --profile s3
minio:
image: minio/minio:latest
container_name: sparc-minio
command: server /data --console-address ":9001"
environment:
MINIO_ROOT_USER: ${AWS_ACCESS_KEY_ID:-minioadmin}
MINIO_ROOT_PASSWORD: ${AWS_SECRET_ACCESS_KEY:-minioadmin}
ports:
- "9000:9000"
- "9001:9001"
volumes:
- minio_data:/data
healthcheck:
test: ["CMD", "mc", "ready", "local"]
interval: 10s
timeout: 5s
retries: 3
restart: unless-stopped
profiles:
- s3
dashboard:
image: gitea.leeworks.dev/0xwheatyz/sparc:frontend-latest
build: ./frontend
container_name: sparc-dashboard
ports:
@@ -63,3 +89,5 @@ services:
volumes:
postgres_data:
patent_data:
minio_data:
+76 -1
@@ -276,7 +276,7 @@ The `docker-compose.yml` includes all services needed for production:
|---------|-----------|------|-------------|
| `postgres` | sparc-postgres | 5432 | PostgreSQL database |
| `init-db` | sparc-init-db | - | One-time database initialization (seeds admin user) |
| `api` | sparc-api | 8000 | FastAPI REST API with JWT auth |
| `api` | sparc-api | 8000 | FastAPI REST API with JWT auth (patent PDFs stored in `patent_data` volume) |
| `dashboard` | sparc-dashboard | 8080 | React TypeScript web UI |
### Common Docker Compose Commands
@@ -307,6 +307,81 @@ docker-compose restart api
---
## Patent PDF Storage
The SPARC API downloads patent PDFs during analysis and stores them at `/app/patents` inside the container. These files are used for subsequent single-patent analysis requests and as a local cache to avoid re-downloading. If this directory is not persisted, all downloaded PDFs are lost when the container is recreated.
### Docker Compose (default)
The default `docker-compose.yml` declares a named volume called `patent_data` that is mounted at `/app/patents`:
```yaml
# In the api service:
volumes:
- patent_data:/app/patents
# At the top-level volumes section:
volumes:
patent_data:
```
This means PDFs survive `docker compose down` and `docker compose up` cycles. To remove patent data intentionally, run:
```bash
docker compose down -v # WARNING: also removes postgres_data
# or selectively:
docker volume rm sparc_patent_data
```
If you prefer a bind mount (e.g., for easy host-side access during development), replace the volume with:
```yaml
volumes:
- ./patents:/app/patents
```
### Kubernetes
For Kubernetes deployments, create a PersistentVolumeClaim and mount it into the API pod:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: sparc-patent-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: sparc-api
spec:
template:
spec:
containers:
- name: api
volumeMounts:
- name: patent-data
mountPath: /app/patents
volumes:
- name: patent-data
persistentVolumeClaim:
claimName: sparc-patent-data
```
Adjust the storage size based on expected patent volume. Each patent PDF is typically 1-5 MB.
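As a rough sizing aid, the sketch below estimates how many PDFs fit in a volume. `pdf_capacity` is a hypothetical helper, and it treats MB and MiB as equivalent and ignores filesystem overhead, so the result is best read as an upper bound.

```python
def pdf_capacity(volume_gib: float, avg_pdf_mb: float) -> int:
    """Approximate number of PDFs that fit in a volume of the given size."""
    if avg_pdf_mb <= 0:
        raise ValueError("avg_pdf_mb must be positive")
    return int(volume_gib * 1024 // avg_pdf_mb)


# The 5Gi claim above, at both ends of the typical 1-5 MB range.
print(pdf_capacity(5, 5))  # 1024
print(pdf_capacity(5, 1))  # 5120
```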
### S3 Object Storage (alternative)
For production deployments that need shared or highly durable storage, set `STORAGE_BACKEND=s3` in your `.env` file. This stores patent PDFs in an S3-compatible bucket (AWS S3 or MinIO) instead of the local filesystem, eliminating the need for a persistent volume. See the S3/MinIO section in `.env.example` for configuration details.
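When the S3 backend is active, stored PDF paths take the form `s3://bucket/key` rather than a filesystem path. A minimal sketch of splitting such a URI, assuming only the Python standard library; `split_s3_uri` is a hypothetical helper, not a SPARC function:

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key); reject anything else."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The path keeps its leading slash; the object key does not include it.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_uri("s3://patents/US-12345678-B2.pdf")
print(bucket, key)  # patents US-12345678-B2.pdf
```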
---
## Troubleshooting
### Database Connection Issues
+9
@@ -7,6 +7,15 @@
<title>SPARC Dashboard</title>
</head>
<body>
<script>
// Prevent FOUC: apply saved theme before first render
(function() {
var theme = localStorage.getItem('theme');
if (theme === 'dark' || (!theme && window.matchMedia('(prefers-color-scheme: dark)').matches)) {
document.documentElement.classList.add('dark');
}
})();
</script>
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
+1 -1
@@ -15,7 +15,7 @@ server {
# Proxy API requests to backend
location /api/ {
proxy_pass ${API_URL}/;
proxy_pass ${API_URL};
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
+257
@@ -26,6 +26,7 @@
"eslint-plugin-react-hooks": "^5.1.0",
"eslint-plugin-react-refresh": "^0.4.7",
"globals": "^15.8.0",
"openapi-typescript": "^7.0.0",
"postcss": "^8.4.39",
"tailwindcss": "^3.4.4",
"typescript": "~5.5.3",
@@ -1025,6 +1026,82 @@
"node": ">= 8"
}
},
"node_modules/@redocly/ajv": {
"version": "8.11.2",
"resolved": "https://registry.npmjs.org/@redocly/ajv/-/ajv-8.11.2.tgz",
"integrity": "sha512-io1JpnwtIcvojV7QKDUSIuMN/ikdOUd1ReEnUnMKGfDVridQZ31J0MmIuqwuRjWDZfmvr+Q0MqCcfHM2gTivOg==",
"dev": true,
"license": "MIT",
"dependencies": {
"fast-deep-equal": "^3.1.1",
"json-schema-traverse": "^1.0.0",
"require-from-string": "^2.0.2",
"uri-js-replace": "^1.0.1"
},
"funding": {
"type": "github",
"url": "https://github.com/sponsors/epoberezkin"
}
},
"node_modules/@redocly/ajv/node_modules/json-schema-traverse": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/json-schema-traverse/-/json-schema-traverse-1.0.0.tgz",
"integrity": "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug==",
"dev": true,
"license": "MIT"
},
"node_modules/@redocly/config": {
"version": "0.22.0",
"resolved": "https://registry.npmjs.org/@redocly/config/-/config-0.22.0.tgz",
"integrity": "sha512-gAy93Ddo01Z3bHuVdPWfCwzgfaYgMdaZPcfL7JZ7hWJoK9V0lXDbigTWkhiPFAaLWzbOJ+kbUQG1+XwIm0KRGQ==",
"dev": true,
"license": "MIT"
},
"node_modules/@redocly/openapi-core": {
"version": "1.34.11",
"resolved": "https://registry.npmjs.org/@redocly/openapi-core/-/openapi-core-1.34.11.tgz",
"integrity": "sha512-V09ayfnb5GyysmvARbt+voFZAjGcf7hSYxOYxSkCc4fbH/DTfq5YWoec8cflvmHHqyIFbqvmGKmYFzqhr9zxDg==",
"dev": true,
"license": "MIT",
"dependencies": {
"@redocly/ajv": "8.11.2",
"@redocly/config": "0.22.0",
"colorette": "1.4.0",
"https-proxy-agent": "7.0.6",
"js-levenshtein": "1.1.6",
"js-yaml": "4.1.1",
"minimatch": "5.1.9",
"pluralize": "8.0.0",
"yaml-ast-parser": "0.0.43"
},
"engines": {
"node": ">=18.17.0",
"npm": ">=9.5.0"
}
},
"node_modules/@redocly/openapi-core/node_modules/brace-expansion": {
"version": "2.0.3",
"resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.3.tgz",
"integrity": "sha512-MCV/fYJEbqx68aE58kv2cA/kiky1G8vux3OR6/jbS+jIMe/6fJWa0DTzJU7dqijOWYwHi1t29FlfYI9uytqlpA==",
"dev": true,
"license": "MIT",
"dependencies": {
"balanced-match": "^1.0.0"
}
},
"node_modules/@redocly/openapi-core/node_modules/minimatch": {
"version": "5.1.9",
"resolved": "https://registry.npmjs.org/minimatch/-/minimatch-5.1.9.tgz",
"integrity": "sha512-7o1wEA2RyMP7Iu7GNba9vc0RWWGACJOCZBJX2GJWip0ikV+wcOsgVuY9uE8CPiyQhkGFSlhuSkZPavN7u1c2Fw==",
"dev": true,
"license": "ISC",
"dependencies": {
"brace-expansion": "^2.0.1"
},
"engines": {
"node": ">=10"
}
},
"node_modules/@remix-run/router": {
"version": "1.23.2",
"resolved": "https://registry.npmjs.org/@remix-run/router/-/router-1.23.2.tgz",
@@ -1906,6 +1983,16 @@
"acorn": "^6.0.0 || ^7.0.0 || ^8.0.0"
}
},
"node_modules/agent-base": {
"version": "7.1.4",
"resolved": "https://registry.npmjs.org/agent-base/-/agent-base-7.1.4.tgz",
"integrity": "sha512-MnA+YT8fwfJPgBx3m60MNqakm30XOkyIoH1y6huTQvC0PwZG7ki8NacLBcrPbNoo8vEZy7Jpuk7+jMO+CUovTQ==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">= 14"
}
},
"node_modules/ajv": {
"version": "6.14.0",
"resolved": "https://registry.npmjs.org/ajv/-/ajv-6.14.0.tgz",
@@ -1923,6 +2010,16 @@
"url": "https://github.com/sponsors/epoberezkin"
}
},
"node_modules/ansi-colors": {
"version": "4.1.3",
"resolved": "https://registry.npmjs.org/ansi-colors/-/ansi-colors-4.1.3.tgz",
"integrity": "sha512-/6w/C21Pm1A7aZitlI5Ni/2J6FFQN8i1Cvz3kHABAAbw93v/NlvKdVOqz7CCWz/3iv/JplRSEEZ83XION15ovw==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=6"
}
},
"node_modules/ansi-styles": {
"version": "4.3.0",
"resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-4.3.0.tgz",
@@ -2190,6 +2287,13 @@
"url": "https://github.com/chalk/chalk?sponsor=1"
}
},
"node_modules/change-case": {
"version": "5.4.4",
"resolved": "https://registry.npmjs.org/change-case/-/change-case-5.4.4.tgz",
"integrity": "sha512-HRQyTk2/YPEkt9TnUPbOpr64Uw3KOicFWPVBb+xiHvd6eBx/qPr9xqfBFDT8P2vWsvvz4jbEkfDe71W3VyNu2w==",
"dev": true,
"license": "MIT"
},
"node_modules/chokidar": {
"version": "3.6.0",
"resolved": "https://registry.npmjs.org/chokidar/-/chokidar-3.6.0.tgz",
@@ -2257,6 +2361,13 @@
"dev": true,
"license": "MIT"
},
"node_modules/colorette": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/colorette/-/colorette-1.4.0.tgz",
"integrity": "sha512-Y2oEozpomLn7Q3HFP7dpww7AtMJplbM9lGZP6RDfHqmbeRjiwRg4n6VM6j4KLmRke85uWEI7JqF17f3pqdRA0g==",
"dev": true,
"license": "MIT"
},
"node_modules/combined-stream": {
"version": "1.0.8",
"resolved": "https://registry.npmjs.org/combined-stream/-/combined-stream-1.0.8.tgz",
@@ -3165,6 +3276,20 @@
"node": ">= 0.4"
}
},
"node_modules/https-proxy-agent": {
"version": "7.0.6",
"resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-7.0.6.tgz",
"integrity": "sha512-vK9P5/iUfdl95AI+JVyUuIcVtd4ofvtrOr3HNtM2yxC9bnMbEdp3x01OhQNnjb8IJYi38VlTE3mBXwcfvywuSw==",
"dev": true,
"license": "MIT",
"dependencies": {
"agent-base": "^7.1.2",
"debug": "4"
},
"engines": {
"node": ">= 14"
}
},
"node_modules/ignore": {
"version": "5.3.2",
"resolved": "https://registry.npmjs.org/ignore/-/ignore-5.3.2.tgz",
@@ -3202,6 +3327,19 @@
"node": ">=0.8.19"
}
},
"node_modules/index-to-position": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/index-to-position/-/index-to-position-1.2.0.tgz",
"integrity": "sha512-Yg7+ztRkqslMAS2iFaU+Oa4KTSidr63OsFGlOrJoW981kIYO3CGCS3wA95P1mUi/IVSJkn0D479KTJpVpvFNuw==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=18"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/internmap": {
"version": "2.0.3",
"resolved": "https://registry.npmjs.org/internmap/-/internmap-2.0.3.tgz",
@@ -3290,6 +3428,16 @@
"jiti": "bin/jiti.js"
}
},
"node_modules/js-levenshtein": {
"version": "1.1.6",
"resolved": "https://registry.npmjs.org/js-levenshtein/-/js-levenshtein-1.1.6.tgz",
"integrity": "sha512-X2BB11YZtrRqY4EnQcLX5Rh373zbK4alC1FW7D7MBhL2gtcC17cTnr6DmfHZeS0s2rTHjUTMMHfG7gO8SSdw+g==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/js-tokens": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-4.0.0.tgz",
@@ -3608,6 +3756,40 @@
"node": ">= 6"
}
},
"node_modules/openapi-typescript": {
"version": "7.13.0",
"resolved": "https://registry.npmjs.org/openapi-typescript/-/openapi-typescript-7.13.0.tgz",
"integrity": "sha512-EFP392gcqXS7ntPvbhBzbF8TyBA+baIYEm791Hy5YkjDYKTnk/Tn5OQeKm5BIZvJihpp8Zzr4hzx0Irde1LNGQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"@redocly/openapi-core": "^1.34.6",
"ansi-colors": "^4.1.3",
"change-case": "^5.4.4",
"parse-json": "^8.3.0",
"supports-color": "^10.2.2",
"yargs-parser": "^21.1.1"
},
"bin": {
"openapi-typescript": "bin/cli.js"
},
"peerDependencies": {
"typescript": "^5.x"
}
},
"node_modules/openapi-typescript/node_modules/supports-color": {
"version": "10.2.2",
"resolved": "https://registry.npmjs.org/supports-color/-/supports-color-10.2.2.tgz",
"integrity": "sha512-SS+jx45GF1QjgEXQx4NJZV9ImqmO2NPz5FNsIHrsDjh2YsHnawpan7SNQ1o8NuhrbHZy9AZhIoCUiCeaW/C80g==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=18"
},
"funding": {
"url": "https://github.com/chalk/supports-color?sponsor=1"
}
},
"node_modules/optionator": {
"version": "0.9.4",
"resolved": "https://registry.npmjs.org/optionator/-/optionator-0.9.4.tgz",
@@ -3671,6 +3853,24 @@
"node": ">=6"
}
},
"node_modules/parse-json": {
"version": "8.3.0",
"resolved": "https://registry.npmjs.org/parse-json/-/parse-json-8.3.0.tgz",
"integrity": "sha512-ybiGyvspI+fAoRQbIPRddCcSTV9/LsJbf0e/S85VLowVGzRmokfneg2kwVW/KU5rOXrPSbF1qAKPMgNTqqROQQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/code-frame": "^7.26.2",
"index-to-position": "^1.1.0",
"type-fest": "^4.39.1"
},
"engines": {
"node": ">=18"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/path-exists": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/path-exists/-/path-exists-4.0.0.tgz",
@@ -3738,6 +3938,16 @@
"node": ">= 6"
}
},
"node_modules/pluralize": {
"version": "8.0.0",
"resolved": "https://registry.npmjs.org/pluralize/-/pluralize-8.0.0.tgz",
"integrity": "sha512-Nc3IT5yHzflTfbjgqWcCPpo7DaKy4FnpB0l/zCAW0Tc7jxAiuqSxHasntB3D7887LSrA93kDJ9IXovxJYxyLCA==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=4"
}
},
"node_modules/postcss": {
"version": "8.5.8",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.8.tgz",
@@ -4124,6 +4334,16 @@
"decimal.js-light": "^2.4.1"
}
},
"node_modules/require-from-string": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/require-from-string/-/require-from-string-2.0.2.tgz",
"integrity": "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/resolve": {
"version": "1.22.11",
"resolved": "https://registry.npmjs.org/resolve/-/resolve-1.22.11.tgz",
@@ -4510,6 +4730,19 @@
"node": ">= 0.8.0"
}
},
"node_modules/type-fest": {
"version": "4.41.0",
"resolved": "https://registry.npmjs.org/type-fest/-/type-fest-4.41.0.tgz",
"integrity": "sha512-TeTSQ6H5YHvpqVwBRcnLDCBnDOHWYu7IvGbHT6N8AOymcr9PJGjc1GTtiWZTYg0NCgYwvnYWEkVChQAr9bjfwA==",
"dev": true,
"license": "(MIT OR CC0-1.0)",
"engines": {
"node": ">=16"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/typescript": {
"version": "5.5.4",
"resolved": "https://registry.npmjs.org/typescript/-/typescript-5.5.4.tgz",
@@ -4589,6 +4822,13 @@
"punycode": "^2.1.0"
}
},
"node_modules/uri-js-replace": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/uri-js-replace/-/uri-js-replace-1.0.1.tgz",
"integrity": "sha512-W+C9NWNLFOoBI2QWDp4UT9pv65r2w5Cx+3sTYFvtMdDBxkKt1syCqsUdSFAChbEe1uK5TfS04wt/nGwmaeIQ0g==",
"dev": true,
"license": "MIT"
},
"node_modules/util-deprecate": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/util-deprecate/-/util-deprecate-1.0.2.tgz",
@@ -4711,6 +4951,23 @@
"dev": true,
"license": "ISC"
},
"node_modules/yaml-ast-parser": {
"version": "0.0.43",
"resolved": "https://registry.npmjs.org/yaml-ast-parser/-/yaml-ast-parser-0.0.43.tgz",
"integrity": "sha512-2PTINUwsRqSd+s8XxKaJWQlUuEMHJQyEuh2edBbW8KNJz0SJPwUSD2zRWqezFEdN7IzAgeuYHFUCF7o8zRdZ0A==",
"dev": true,
"license": "Apache-2.0"
},
"node_modules/yargs-parser": {
"version": "21.1.1",
"resolved": "https://registry.npmjs.org/yargs-parser/-/yargs-parser-21.1.1.tgz",
"integrity": "sha512-tVpsJW7DdjecAiFpbIB1e3qxIQsE6NoPc5/eTdrbbIC4h0LVsWhnoa3g+m2HclBIujHzsxZ4VJVA+GUuc2/LBw==",
"dev": true,
"license": "ISC",
"engines": {
"node": ">=12"
}
},
"node_modules/yocto-queue": {
"version": "0.1.0",
"resolved": "https://registry.npmjs.org/yocto-queue/-/yocto-queue-0.1.0.tgz",
+3
@@ -7,6 +7,8 @@
"dev": "vite",
"build": "tsc -b && vite build",
"lint": "eslint .",
"generate": "openapi-typescript http://localhost:8000/api/openapi.json -o src/api/schema.d.ts",
"generate:local": "openapi-typescript src/api/openapi.json -o src/api/schema.d.ts",
"typecheck": "tsc --noEmit",
"preview": "vite preview"
},
@@ -31,6 +33,7 @@
"globals": "^15.8.0",
"postcss": "^8.4.39",
"tailwindcss": "^3.4.4",
"openapi-typescript": "^7.0.0",
"typescript": "~5.5.3",
"typescript-eslint": "^8.0.0",
"vite": "^5.3.3"
+14
@@ -1,6 +1,7 @@
import { BrowserRouter, Routes, Route, Navigate } from 'react-router-dom';
import { QueryClient, QueryClientProvider } from '@tanstack/react-query';
import { AuthProvider } from './context/AuthContext';
import { ThemeProvider } from './context/ThemeContext';
import { Layout } from './components/Layout';
import { ProtectedRoute } from './components/ProtectedRoute';
import { Login } from './pages/Login';
@@ -10,6 +11,8 @@ import { Batch } from './pages/Batch';
import { AnalyticsPage } from './pages/Analytics';
import { About } from './pages/About';
import { AdminUsers } from './pages/AdminUsers';
import { AdminRateLimits } from './pages/AdminRateLimits';
import { Compare } from './pages/Compare';
const queryClient = new QueryClient({
defaultOptions: {
@@ -22,6 +25,7 @@ const queryClient = new QueryClient({
function App() {
return (
<ThemeProvider>
<QueryClientProvider client={queryClient}>
<AuthProvider>
<BrowserRouter>
@@ -41,6 +45,7 @@ function App() {
<Route path="/analysis" element={<Analysis />} />
<Route path="/batch" element={<Batch />} />
<Route path="/analytics" element={<AnalyticsPage />} />
<Route path="/compare" element={<Compare />} />
<Route path="/about" element={<About />} />
{/* Admin routes */}
@@ -52,6 +57,14 @@ function App() {
</ProtectedRoute>
}
/>
<Route
path="/admin/rate-limits"
element={
<ProtectedRoute requireAdmin>
<AdminRateLimits />
</ProtectedRoute>
}
/>
</Route>
{/* Default redirect */}
@@ -61,6 +74,7 @@ function App() {
</BrowserRouter>
</AuthProvider>
</QueryClientProvider>
</ThemeProvider>
);
}
+102 -4
@@ -89,29 +89,53 @@ export const authApi = {
},
};
// Model types
export interface ModelInfo {
id: string;
name: string;
provider: string;
}
export interface ModelsResponse {
models: ModelInfo[];
default: string;
}
// Analysis API
export const analysisApi = {
analyzeCompany: async (companyName: string): Promise<CompanyAnalysis> => {
const response = await api.get<CompanyAnalysis>(`/analyze/${encodeURIComponent(companyName)}`);
analyzeCompany: async (companyName: string, model?: string): Promise<CompanyAnalysis> => {
const params = new URLSearchParams();
if (model) params.append('model', model);
const qs = params.toString();
const response = await api.get<CompanyAnalysis>(
`/analyze/${encodeURIComponent(companyName)}${qs ? `?${qs}` : ''}`
);
return response.data;
},
analyzeBatch: async (companies: string[], maxWorkers = 3): Promise<BatchAnalysisResult> => {
analyzeBatch: async (companies: string[], maxWorkers = 3, model?: string): Promise<BatchAnalysisResult> => {
const response = await api.post<BatchAnalysisResult>('/analyze/batch', {
companies,
max_workers: maxWorkers,
...(model ? { model } : {}),
});
return response.data;
},
analyzeBatchAsync: async (companies: string[], maxWorkers = 3): Promise<JobStatus> => {
analyzeBatchAsync: async (companies: string[], maxWorkers = 3, model?: string): Promise<JobStatus> => {
const response = await api.post<JobStatus>('/analyze/batch/async', {
companies,
max_workers: maxWorkers,
...(model ? { model } : {}),
});
return response.data;
},
listModels: async (): Promise<ModelsResponse> => {
const response = await api.get<ModelsResponse>('/models');
return response.data;
},
getJobStatus: async (jobId: string): Promise<JobStatus> => {
const response = await api.get<JobStatus>(`/jobs/${jobId}`);
return response.data;
@@ -126,14 +150,83 @@ export const analysisApi = {
},
};
// Export API
export const exportApi = {
exportCsv: async (companyName: string): Promise<void> => {
const response = await api.get(`/export/${encodeURIComponent(companyName)}`, {
responseType: 'blob',
});
const url = window.URL.createObjectURL(new Blob([response.data]));
const link = document.createElement('a');
link.href = url;
link.setAttribute('download', `sparc_${companyName.toLowerCase().replace(/\s+/g, '_')}_export.csv`);
document.body.appendChild(link);
link.click();
link.remove();
window.URL.revokeObjectURL(url);
},
exportPdf: async (companyName: string): Promise<void> => {
const response = await api.get(`/export/${encodeURIComponent(companyName)}/pdf`, {
responseType: 'blob',
});
const safeName = companyName.toLowerCase().replace(/\s+/g, '_');
const date = new Date().toISOString().split('T')[0];
const url = window.URL.createObjectURL(new Blob([response.data], { type: 'application/pdf' }));
const link = document.createElement('a');
link.href = url;
link.setAttribute('download', `${safeName}-analysis-${date}.pdf`);
document.body.appendChild(link);
link.click();
link.remove();
window.URL.revokeObjectURL(url);
},
};
// Analytics API
export interface TrendData {
by_month: Array<{ month: string; company_name: string; count: number }>;
by_type_over_time: Array<{ month: string; analysis_type: string; count: number }>;
period_days: number;
}
export const analyticsApi = {
getAnalytics: async (days = 30): Promise<Analytics> => {
const response = await api.get<Analytics>(`/analytics?days=${days}`);
return response.data;
},
getTrends: async (days = 90): Promise<TrendData> => {
const response = await api.get<TrendData>(`/analytics/trends?days=${days}`);
return response.data;
},
};
// Rate limit types
export interface RateLimitIpEntry {
ip: string;
total: number;
rejected: number;
}
export interface RateLimitEndpointStats {
endpoint: string;
limit: string;
total_requests: number;
rejected_requests: number;
by_ip: RateLimitIpEntry[];
}
export interface ThrottledBucket {
timestamp: string;
count: number;
}
export interface RateLimitStatsResponse {
rate_limits: RateLimitEndpointStats[];
throttled_24h: number;
throttled_over_time: ThrottledBucket[];
}
// Admin API
export const adminApi = {
listUsers: async (limit = 100, offset = 0): Promise<User[]> => {
@@ -149,6 +242,11 @@ export const adminApi = {
deleteUser: async (userId: number): Promise<void> => {
await api.delete(`/admin/users/${userId}`);
},
getRateLimits: async (): Promise<RateLimitStatsResponse> => {
const response = await api.get<RateLimitStatsResponse>('/admin/rate-limits');
return response.data;
},
};
export default api;
File diff suppressed because it is too large
+975
@@ -0,0 +1,975 @@
/**
* This file was auto-generated by openapi-typescript.
* Do not make direct changes to the file.
*/
export interface paths {
"/auth/register": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
/**
* Register
* @description Register a new user.
*
* The first registered user automatically becomes an admin.
*/
post: operations["register_auth_register_post"];
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/auth/login": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
/**
* Login
* @description Authenticate user and return JWT tokens.
*/
post: operations["login_auth_login_post"];
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/auth/refresh": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
/**
* Refresh Token
* @description Refresh access token using refresh token.
*/
post: operations["refresh_token_auth_refresh_post"];
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/auth/me": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* Get Me
* @description Get current authenticated user.
*/
get: operations["get_me_auth_me_get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/admin/users": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* List Users
* @description List all users (admin only).
*/
get: operations["list_users_admin_users_get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/admin/users/{user_id}/role": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
/**
* Update User Role
* @description Update a user's role (admin only).
*/
patch: operations["update_user_role_admin_users__user_id__role_patch"];
trace?: never;
};
"/admin/users/{user_id}": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
post?: never;
/**
* Delete User
* @description Delete a user (admin only).
*/
delete: operations["delete_user_admin_users__user_id__delete"];
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/analytics": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* Get Analytics
* @description Get analytics data (authenticated users only).
*/
get: operations["get_analytics_analytics_get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/health": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* Health Check
* @description Check API health status.
*/
get: operations["health_check_health_get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/analyze/{company_name}": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* Analyze Company
* @description Analyze a single company's patent portfolio.
*
* This endpoint retrieves recent patents for the specified company,
* parses them, and uses AI to generate a comprehensive analysis.
*
* Args:
* company_name: Name of the company to analyze (e.g., "nvidia", "intel")
*
* Returns:
* Analysis results including patent count, AI insights, and success status
*/
get: operations["analyze_company_analyze__company_name__get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/analyze/batch": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
/**
* Analyze Companies Batch
* @description Analyze multiple companies' patent portfolios.
*
* Processes companies concurrently for improved performance.
* Limited to 20 companies per request.
*
* Args:
* request: List of company names and optional worker count
*
* Returns:
* Batch results with individual company analyses and summary statistics
*/
post: operations["analyze_companies_batch_analyze_batch_post"];
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/analyze/batch/async": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
get?: never;
put?: never;
/**
* Analyze Companies Async
* @description Start an asynchronous batch analysis job.
*
* Returns immediately with a job ID that can be used to poll for status.
* Useful for large batch analyses that may take a long time.
*
* Args:
* request: List of company names and optional worker count
*
* Returns:
* Job status with job_id for polling
*/
post: operations["analyze_companies_async_analyze_batch_async_post"];
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/jobs/{job_id}": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* Get Job Status
* @description Get the status of a background analysis job.
*
* Args:
* job_id: The job ID returned from the async batch endpoint
*
* Returns:
* Current job status including progress and results when complete
*/
get: operations["get_job_status_jobs__job_id__get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
"/jobs": {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
/**
* List Jobs
* @description List all analysis jobs.
*
* Args:
* status: Optional filter by job status
* limit: Maximum number of jobs to return (default 10, max 100)
*
* Returns:
* List of job statuses
*/
get: operations["list_jobs_jobs_get"];
put?: never;
post?: never;
delete?: never;
options?: never;
head?: never;
patch?: never;
trace?: never;
};
}
export type webhooks = Record<string, never>;
export interface components {
schemas: {
/**
* AnalyticsResponse
* @description Analytics response model.
*/
AnalyticsResponse: {
/** Total Messages */
total_messages: number;
/** By Company */
by_company: {
[key: string]: unknown;
}[];
/** By Type */
by_type: {
[key: string]: unknown;
}[];
/** Period Days */
period_days: number;
};
/**
* BatchAnalysisRequest
* @description Request model for batch company analysis.
*/
BatchAnalysisRequest: {
/**
* Companies
* @description List of company names to analyze
*/
companies: string[];
/**
* Max Workers
* @description Max concurrent analyses
* @default 3
*/
max_workers: number;
};
/**
* BatchAnalysisResponse
* @description Response model for batch company analysis.
*/
BatchAnalysisResponse: {
/** Results */
results: components["schemas"]["CompanyAnalysisResponse"][];
/** Total Companies */
total_companies: number;
/** Successful */
successful: number;
/** Failed */
failed: number;
/**
* Timestamp
* Format: date-time
*/
timestamp: string;
};
/**
* CompanyAnalysisResponse
* @description Response model for single company analysis.
*/
CompanyAnalysisResponse: {
/** Company Name */
company_name: string;
/** Analysis */
analysis: string;
/** Patent Count */
patent_count: number;
/** Success */
success: boolean;
/** Error */
error?: string | null;
/**
* Timestamp
* Format: date-time
*/
timestamp: string;
};
/** HTTPValidationError */
HTTPValidationError: {
/** Detail */
detail?: components["schemas"]["ValidationError"][];
};
/**
* HealthResponse
* @description Health check response.
*/
HealthResponse: {
/** Status */
status: string;
/** Version */
version: string;
/**
* Timestamp
* Format: date-time
*/
timestamp: string;
};
/**
* JobStatus
* @description Status of a background analysis job.
*/
JobStatus: {
/** Job Id */
job_id: string;
/** Status */
status: string;
/** Progress */
progress: number;
/** Total Companies */
total_companies: number;
/** Completed Companies */
completed_companies: number;
result?: components["schemas"]["BatchAnalysisResponse"] | null;
/** Error */
error?: string | null;
};
/**
* LoginRequest
* @description User login request.
*/
LoginRequest: {
/**
* Email
* Format: email
*/
email: string;
/** Password */
password: string;
};
/**
* RefreshRequest
* @description Token refresh request.
*/
RefreshRequest: {
/** Refresh Token */
refresh_token: string;
};
/**
* RegisterRequest
* @description User registration request.
*/
RegisterRequest: {
/**
* Email
* Format: email
*/
email: string;
/**
* Password
* @description Password (min 8 characters)
*/
password: string;
};
/**
* TokenResponse
* @description Token response model.
*/
TokenResponse: {
/** Access Token */
access_token: string;
/** Refresh Token */
refresh_token: string;
/**
* Token Type
* @default bearer
*/
token_type: string;
};
/**
* UpdateRoleRequest
* @description Update user role request.
*/
UpdateRoleRequest: {
/** Role */
role: string;
};
/**
* UserResponse
* @description User response model.
*/
UserResponse: {
/** Id */
id: number;
/** Email */
email: string;
/** Role */
role: string;
/**
* Created At
* Format: date-time
*/
created_at: string;
};
/** ValidationError */
ValidationError: {
/** Location */
loc: (string | number)[];
/** Message */
msg: string;
/** Error Type */
type: string;
/** Input */
input?: unknown;
/** Context */
ctx?: Record<string, never>;
};
};
responses: never;
parameters: never;
requestBodies: never;
headers: never;
pathItems: never;
}
export type $defs = Record<string, never>;
export interface operations {
register_auth_register_post: {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
requestBody: {
content: {
"application/json": components["schemas"]["RegisterRequest"];
};
};
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["UserResponse"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
login_auth_login_post: {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
requestBody: {
content: {
"application/json": components["schemas"]["LoginRequest"];
};
};
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["TokenResponse"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
refresh_token_auth_refresh_post: {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
requestBody: {
content: {
"application/json": components["schemas"]["RefreshRequest"];
};
};
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["TokenResponse"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
get_me_auth_me_get: {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["UserResponse"];
};
};
};
};
list_users_admin_users_get: {
parameters: {
query?: {
limit?: number;
offset?: number;
};
header?: never;
path?: never;
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["UserResponse"][];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
update_user_role_admin_users__user_id__role_patch: {
parameters: {
query?: never;
header?: never;
path: {
user_id: number;
};
cookie?: never;
};
requestBody: {
content: {
"application/json": components["schemas"]["UpdateRoleRequest"];
};
};
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["UserResponse"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
delete_user_admin_users__user_id__delete: {
parameters: {
query?: never;
header?: never;
path: {
user_id: number;
};
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": unknown;
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
get_analytics_analytics_get: {
parameters: {
query?: {
days?: number;
};
header?: never;
path?: never;
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["AnalyticsResponse"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
health_check_health_get: {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HealthResponse"];
};
};
};
};
analyze_company_analyze__company_name__get: {
parameters: {
query?: never;
header?: never;
path: {
company_name: string;
};
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["CompanyAnalysisResponse"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
analyze_companies_batch_analyze_batch_post: {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
requestBody: {
content: {
"application/json": components["schemas"]["BatchAnalysisRequest"];
};
};
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["BatchAnalysisResponse"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
analyze_companies_async_analyze_batch_async_post: {
parameters: {
query?: never;
header?: never;
path?: never;
cookie?: never;
};
requestBody: {
content: {
"application/json": components["schemas"]["BatchAnalysisRequest"];
};
};
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["JobStatus"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
get_job_status_jobs__job_id__get: {
parameters: {
query?: never;
header?: never;
path: {
job_id: string;
};
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["JobStatus"];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
list_jobs_jobs_get: {
parameters: {
query?: {
/** @description Filter by status: pending, running, completed, failed */
status?: string | null;
limit?: number;
};
header?: never;
path?: never;
cookie?: never;
};
requestBody?: never;
responses: {
/** @description Successful Response */
200: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["JobStatus"][];
};
};
/** @description Validation Error */
422: {
headers: {
[name: string]: unknown;
};
content: {
"application/json": components["schemas"]["HTTPValidationError"];
};
};
};
};
}
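The `/analyze/batch` description above states a limit of 20 companies per request, and the `BatchAnalysisRequest` schema gives `max_workers` a default of 3. A minimal sketch of a client-side builder that enforces both before the request is sent is shown below. `buildBatchRequest` is a hypothetical helper, and the trimming and case-insensitive de-duplication are assumptions of this sketch rather than documented server behavior.

```typescript
// Mirrors the generated BatchAnalysisRequest schema.
interface BatchAnalysisRequest {
  companies: string[];
  max_workers: number;
}

const MAX_COMPANIES_PER_BATCH = 20; // from the /analyze/batch description
const DEFAULT_MAX_WORKERS = 3; // @default in the schema

// Hypothetical helper: normalizes the input and fails fast on requests
// the documented limits would not allow.
function buildBatchRequest(
  names: string[],
  maxWorkers: number = DEFAULT_MAX_WORKERS,
): BatchAnalysisRequest {
  const seen = new Set<string>();
  const companies: string[] = [];
  for (const raw of names) {
    const name = raw.trim();
    const key = name.toLowerCase();
    // Skip blanks and case-insensitive duplicates (an assumption of this sketch).
    if (name === '' || seen.has(key)) continue;
    seen.add(key);
    companies.push(name);
  }
  if (companies.length === 0) {
    throw new Error('At least one company name is required');
  }
  if (companies.length > MAX_COMPANIES_PER_BATCH) {
    throw new Error(`At most ${MAX_COMPANIES_PER_BATCH} companies per request`);
  }
  if (!Number.isInteger(maxWorkers) || maxWorkers < 1) {
    throw new Error('max_workers must be a positive integer');
  }
  return { companies, max_workers: maxWorkers };
}
```

The same payload type is accepted by `/analyze/batch/async`, so the builder could serve both endpoints; the server remains the authority on validation.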
+13 -2
@@ -1,9 +1,11 @@
import { Outlet, NavLink, useNavigate } from 'react-router-dom';
import { useAuth } from '../context/AuthContext';
-import { Search, Layers, BarChart3, Info, Users, LogOut } from 'lucide-react';
+import { useTheme } from '../context/ThemeContext';
+import { Search, Layers, BarChart3, Info, Users, LogOut, GitCompareArrows, Sun, Moon, ShieldAlert } from 'lucide-react';
export function Layout() {
const { user, isAdmin, logout } = useAuth();
+const { theme, toggleTheme } = useTheme();
const navigate = useNavigate();
const handleLogout = () => {
@@ -15,15 +17,17 @@ export function Layout() {
{ to: '/analysis', icon: Search, label: 'Analysis' },
{ to: '/batch', icon: Layers, label: 'Batch' },
{ to: '/analytics', icon: BarChart3, label: 'Analytics' },
+{ to: '/compare', icon: GitCompareArrows, label: 'Compare' },
{ to: '/about', icon: Info, label: 'About' },
];
if (isAdmin) {
navItems.push({ to: '/admin/users', icon: Users, label: 'Users' });
+navItems.push({ to: '/admin/rate-limits', icon: ShieldAlert, label: 'Rate Limits' });
}
return (
-<div className="min-h-screen bg-gradient-to-br from-bg-dark to-indigo-950">
+<div className="min-h-screen bg-gradient-to-br from-bg-dark to-slate-100 dark:to-indigo-950">
{/* Header */}
<header className="bg-bg-card/80 backdrop-blur-lg border-b border-primary/20">
<div className="max-w-7xl mx-auto px-4 sm:px-6 lg:px-8">
@@ -63,6 +67,13 @@ export function Layout() {
{/* User menu */}
<div className="flex items-center gap-4">
+<button
+onClick={toggleTheme}
+className="p-2 rounded-lg text-text-secondary hover:text-text-primary hover:bg-bg-card-hover transition-all"
+aria-label={theme === 'dark' ? 'Switch to light mode' : 'Switch to dark mode'}
+>
+{theme === 'dark' ? <Sun size={18} /> : <Moon size={18} />}
+</button>
<div className="text-right hidden sm:block">
<div className="text-sm font-medium text-text-primary">{user?.email}</div>
<div className="text-xs text-text-secondary capitalize">{user?.role}</div>
+1 -1
@@ -12,7 +12,7 @@ export function ProtectedRoute({ children, requireAdmin = false }: ProtectedRout
if (isLoading) {
return (
-<div className="min-h-screen bg-gradient-to-br from-bg-dark to-indigo-950 flex items-center justify-center">
+<div className="min-h-screen bg-gradient-to-br from-bg-dark to-slate-100 dark:to-indigo-950 flex items-center justify-center">
<div className="animate-spin rounded-full h-12 w-12 border-t-2 border-b-2 border-primary"></div>
</div>
);
+48
@@ -0,0 +1,48 @@
import { createContext, useContext, useEffect, useState } from 'react';
type Theme = 'light' | 'dark';
interface ThemeContextType {
theme: Theme;
toggleTheme: () => void;
}
const ThemeContext = createContext<ThemeContextType | undefined>(undefined);
function getInitialTheme(): Theme {
const stored = localStorage.getItem('theme');
if (stored === 'light' || stored === 'dark') return stored;
return window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light';
}
export function ThemeProvider({ children }: { children: React.ReactNode }) {
const [theme, setTheme] = useState<Theme>(getInitialTheme);
useEffect(() => {
const root = document.documentElement;
if (theme === 'dark') {
root.classList.add('dark');
} else {
root.classList.remove('dark');
}
localStorage.setItem('theme', theme);
}, [theme]);
const toggleTheme = () => {
setTheme((prev) => (prev === 'dark' ? 'light' : 'dark'));
};
return (
<ThemeContext.Provider value={{ theme, toggleTheme }}>
{children}
</ThemeContext.Provider>
);
}
export function useTheme() {
const context = useContext(ThemeContext);
if (!context) {
throw new Error('useTheme must be used within a ThemeProvider');
}
return context;
}
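`getInitialTheme` above prefers a stored `'light'` or `'dark'` value and otherwise falls back to the OS `prefers-color-scheme` setting. The same decision can be sketched as a pure function that takes both inputs as arguments, which makes it testable outside a browser. `resolveInitialTheme` and `nextTheme` are hypothetical names used for illustration only.

```typescript
type Theme = 'light' | 'dark';

// Pure version of the lookup order used by getInitialTheme:
// 1. a valid stored preference wins,
// 2. otherwise follow the OS-level dark-mode preference.
function resolveInitialTheme(stored: string | null, prefersDark: boolean): Theme {
  if (stored === 'light' || stored === 'dark') return stored;
  return prefersDark ? 'dark' : 'light';
}

// Pure version of the updater passed to setTheme in toggleTheme.
function nextTheme(current: Theme): Theme {
  return current === 'dark' ? 'light' : 'dark';
}
```

In the browser, the provider could call `resolveInitialTheme(localStorage.getItem('theme'), window.matchMedia('(prefers-color-scheme: dark)').matches)`. Any stored value other than the two valid themes is ignored, so a corrupted entry falls through to the OS preference.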
+41
@@ -0,0 +1,41 @@
import { useTheme } from './ThemeContext';
/**
* Returns theme-aware color values for recharts components.
*
* Recharts accepts only raw color strings (not CSS variables),
* so this hook bridges the Tailwind/CSS-variable theme system
* to the imperative recharts API.
*/
export function useChartTheme() {
const { theme } = useTheme();
const isDark = theme === 'dark';
return {
/** Axis tick and grid line stroke color */
axisStroke: isDark ? '#94a3b8' : '#64748b',
/** Tooltip container background */
tooltipBg: isDark ? '#1e293b' : '#ffffff',
/** Tooltip container border */
tooltipBorder: isDark
? '1px solid rgba(99, 102, 241, 0.3)'
: '1px solid rgba(99, 102, 241, 0.2)',
/** Tooltip label text color */
tooltipLabelColor: isDark ? '#f8fafc' : '#0f172a',
/** Tooltip item text color */
tooltipItemColor: isDark ? '#e2e8f0' : '#334155',
/** Convenience: full contentStyle object for recharts Tooltip */
tooltipContentStyle: {
backgroundColor: isDark ? '#1e293b' : '#ffffff',
border: isDark
? '1px solid rgba(99, 102, 241, 0.3)'
: '1px solid rgba(99, 102, 241, 0.2)',
borderRadius: '8px',
color: isDark ? '#f8fafc' : '#0f172a',
},
/** Convenience: labelStyle for recharts Tooltip */
tooltipLabelStyle: {
color: isDark ? '#f8fafc' : '#0f172a',
},
};
}
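`useChartTheme` above repeats each color literal in both the flat fields and the `tooltipContentStyle` / `tooltipLabelStyle` convenience objects. One possible alternative, sketched here under the assumption that the hook could delegate to a plain function, keeps a single palette per theme and derives the style objects from it. `PALETTES` and `chartTheme` are hypothetical names; the color values are copied from the hook.

```typescript
type Theme = 'light' | 'dark';

interface ChartPalette {
  axisStroke: string;
  tooltipBg: string;
  tooltipBorder: string;
  tooltipLabelColor: string;
  tooltipItemColor: string;
}

// One source of truth per theme; values copied from useChartTheme.
const PALETTES: Record<Theme, ChartPalette> = {
  dark: {
    axisStroke: '#94a3b8',
    tooltipBg: '#1e293b',
    tooltipBorder: '1px solid rgba(99, 102, 241, 0.3)',
    tooltipLabelColor: '#f8fafc',
    tooltipItemColor: '#e2e8f0',
  },
  light: {
    axisStroke: '#64748b',
    tooltipBg: '#ffffff',
    tooltipBorder: '1px solid rgba(99, 102, 241, 0.2)',
    tooltipLabelColor: '#0f172a',
    tooltipItemColor: '#334155',
  },
};

// Hypothetical pure counterpart of the hook: same return shape,
// with the convenience style objects derived from the palette.
function chartTheme(theme: Theme) {
  const p = PALETTES[theme];
  return {
    ...p,
    tooltipContentStyle: {
      backgroundColor: p.tooltipBg,
      border: p.tooltipBorder,
      borderRadius: '8px',
      color: p.tooltipLabelColor,
    },
    tooltipLabelStyle: { color: p.tooltipLabelColor },
  };
}
```

With this split, the hook body would reduce to reading `theme` from `useTheme()` and returning `chartTheme(theme)`, and a palette change would need editing in one place only.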
+22 -2
@@ -2,6 +2,26 @@
@tailwind components;
@tailwind utilities;
+/* Light mode (default) */
+:root {
+--color-bg-dark: #f1f5f9;
+--color-bg-card: #ffffff;
+--color-bg-card-hover: #e2e8f0;
+--color-text-primary: #0f172a;
+--color-text-secondary: #475569;
+--color-border: #cbd5e1;
+}
+/* Dark mode */
+.dark {
+--color-bg-dark: #0f172a;
+--color-bg-card: #1e293b;
+--color-bg-card-hover: #334155;
+--color-text-primary: #f8fafc;
+--color-text-secondary: #94a3b8;
+--color-border: #334155;
+}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
-webkit-font-smoothing: antialiased;
@@ -15,7 +35,7 @@ body {
}
::-webkit-scrollbar-track {
-background: #1e293b;
+background: var(--color-bg-card);
}
::-webkit-scrollbar-thumb {
@@ -30,5 +50,5 @@ body {
/* Selection */
::selection {
background: rgba(99, 102, 241, 0.3);
-color: #f8fafc;
+color: var(--color-text-primary);
}
+240
@@ -0,0 +1,240 @@
import { useState } from 'react';
import { useQuery } from '@tanstack/react-query';
import { adminApi } from '../api/client';
import type { RateLimitStatsResponse } from '../api/client';
import { ShieldAlert, Activity, AlertCircle, RefreshCw, Clock } from 'lucide-react';
const REFRESH_OPTIONS = [
{ label: '15s', value: 15_000 },
{ label: '30s', value: 30_000 },
{ label: '1m', value: 60_000 },
{ label: 'Off', value: 0 },
];
export function AdminRateLimits() {
const [refreshInterval, setRefreshInterval] = useState(30_000);
const { data, isLoading, isError, dataUpdatedAt } = useQuery<RateLimitStatsResponse>({
queryKey: ['admin-rate-limits'],
queryFn: () => adminApi.getRateLimits(),
refetchInterval: refreshInterval || false,
});
if (isLoading) {
return (
<div className="flex items-center justify-center min-h-[400px]">
<div className="animate-spin rounded-full h-12 w-12 border-t-2 border-b-2 border-primary"></div>
</div>
);
}
if (isError) {
return (
<div className="flex items-center gap-2 bg-error/10 border border-error/20 text-error rounded-xl px-4 py-3">
<AlertCircle size={18} />
<span>Failed to load rate limit statistics.</span>
</div>
);
}
const maxThrottledCount = data?.throttled_over_time?.length
? Math.max(...data.throttled_over_time.map((b) => b.count))
: 0;
return (
<div className="space-y-6">
{/* Header */}
<div className="flex items-center justify-between flex-wrap gap-4">
<div>
<h2 className="text-xl font-semibold text-text-primary border-b-2 border-primary/30 pb-2 mb-2">
Rate Limiting Dashboard
</h2>
<p className="text-text-secondary">Monitor API rate limits and throttled requests.</p>
</div>
<div className="flex items-center gap-3">
{/* Last updated */}
{dataUpdatedAt > 0 && (
<span className="text-xs text-text-secondary flex items-center gap-1">
<Clock size={12} />
Updated {new Date(dataUpdatedAt).toLocaleTimeString()}
</span>
)}
{/* Refresh interval selector */}
<div className="flex items-center gap-1 bg-bg-card/60 border border-primary/15 rounded-xl p-1">
<RefreshCw size={14} className="text-text-secondary ml-2" />
{REFRESH_OPTIONS.map((opt) => (
<button
key={opt.value}
onClick={() => setRefreshInterval(opt.value)}
className={`px-3 py-1 rounded-lg text-xs font-medium transition-all ${
refreshInterval === opt.value
? 'bg-primary text-white'
: 'text-text-secondary hover:text-text-primary hover:bg-bg-card-hover'
}`}
>
{opt.label}
</button>
))}
</div>
</div>
</div>
{/* Summary cards */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-4">
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl p-5">
<div className="flex items-center gap-2 mb-2">
<Activity size={18} className="text-primary" />
<span className="text-sm font-semibold text-text-secondary uppercase tracking-wider">
Total Requests
</span>
</div>
<div className="text-3xl font-bold text-text-primary">
{data?.rate_limits.reduce((sum, rl) => sum + rl.total_requests, 0) ?? 0}
</div>
</div>
<div className="bg-bg-card/60 border border-error/15 rounded-2xl p-5">
<div className="flex items-center gap-2 mb-2">
<ShieldAlert size={18} className="text-error" />
<span className="text-sm font-semibold text-text-secondary uppercase tracking-wider">
Throttled (24h)
</span>
</div>
<div className="text-3xl font-bold text-error">
{data?.throttled_24h ?? 0}
</div>
</div>
<div className="bg-bg-card/60 border border-secondary/15 rounded-2xl p-5">
<div className="flex items-center gap-2 mb-2">
<ShieldAlert size={18} className="text-secondary" />
<span className="text-sm font-semibold text-text-secondary uppercase tracking-wider">
Rate-Limited Endpoints
</span>
</div>
<div className="text-3xl font-bold text-text-primary">
{data?.rate_limits.length ?? 0}
</div>
</div>
</div>
{/* Throttled over time chart (simple bar chart) */}
{data?.throttled_over_time && data.throttled_over_time.length > 0 && (
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl p-5">
<h3 className="text-sm font-semibold text-text-secondary uppercase tracking-wider mb-4">
Throttled Requests Over Time (Last 24h)
</h3>
<div className="flex items-end gap-1 h-32">
{data.throttled_over_time.map((bucket) => {
const height = maxThrottledCount > 0 ? (bucket.count / maxThrottledCount) * 100 : 0;
const hour = new Date(bucket.timestamp).getHours();
return (
<div key={bucket.timestamp} className="flex-1 flex flex-col items-center gap-1">
<span className="text-xs text-text-secondary">{bucket.count}</span>
<div
className="w-full bg-error/70 rounded-t-sm min-h-[2px] transition-all"
style={{ height: `${Math.max(height, 2)}%` }}
title={`${bucket.timestamp}: ${bucket.count} throttled`}
/>
<span className="text-[10px] text-text-secondary">{hour}:00</span>
</div>
);
})}
</div>
</div>
)}
{/* Per-endpoint table */}
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl overflow-hidden">
<div className="overflow-x-auto">
<table className="w-full">
<thead>
<tr className="border-b border-primary/10">
<th className="text-left px-6 py-4 text-sm font-semibold text-text-secondary uppercase tracking-wider">
Endpoint
</th>
<th className="text-left px-6 py-4 text-sm font-semibold text-text-secondary uppercase tracking-wider">
Limit
</th>
<th className="text-right px-6 py-4 text-sm font-semibold text-text-secondary uppercase tracking-wider">
Total Requests
</th>
<th className="text-right px-6 py-4 text-sm font-semibold text-text-secondary uppercase tracking-wider">
Rejected
</th>
</tr>
</thead>
<tbody className="divide-y divide-primary/10">
{data?.rate_limits.map((rl) => (
<tr key={rl.endpoint} className="hover:bg-bg-card-hover/50 transition-colors">
<td className="px-6 py-4 font-mono text-sm text-text-primary">{rl.endpoint}</td>
<td className="px-6 py-4">
<span className="inline-flex px-2 py-0.5 rounded-full text-xs font-medium bg-primary/10 text-primary border border-primary/20">
{rl.limit}
</span>
</td>
<td className="px-6 py-4 text-right text-text-primary font-semibold">
{rl.total_requests}
</td>
<td className="px-6 py-4 text-right">
<span className={rl.rejected_requests > 0 ? 'text-error font-semibold' : 'text-text-secondary'}>
{rl.rejected_requests}
</span>
</td>
</tr>
))}
</tbody>
</table>
</div>
</div>
{/* Per-IP breakdown */}
{data?.rate_limits.some((rl) => rl.by_ip.length > 0) && (
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl overflow-hidden">
<div className="px-6 py-4 border-b border-primary/10">
<h3 className="text-sm font-semibold text-text-secondary uppercase tracking-wider">
Per-IP Breakdown
</h3>
</div>
<div className="overflow-x-auto">
<table className="w-full">
<thead>
<tr className="border-b border-primary/10">
<th className="text-left px-6 py-3 text-sm font-semibold text-text-secondary uppercase tracking-wider">
Endpoint
</th>
<th className="text-left px-6 py-3 text-sm font-semibold text-text-secondary uppercase tracking-wider">
IP Address
</th>
<th className="text-right px-6 py-3 text-sm font-semibold text-text-secondary uppercase tracking-wider">
Total
</th>
<th className="text-right px-6 py-3 text-sm font-semibold text-text-secondary uppercase tracking-wider">
Rejected
</th>
</tr>
</thead>
<tbody className="divide-y divide-primary/10">
{data.rate_limits.flatMap((rl) =>
rl.by_ip.map((ipEntry) => (
<tr
key={`${rl.endpoint}-${ipEntry.ip}`}
className="hover:bg-bg-card-hover/50 transition-colors"
>
<td className="px-6 py-3 font-mono text-sm text-text-primary">{rl.endpoint}</td>
<td className="px-6 py-3 font-mono text-sm text-text-secondary">{ipEntry.ip}</td>
<td className="px-6 py-3 text-right text-text-primary">{ipEntry.total}</td>
<td className="px-6 py-3 text-right">
<span className={ipEntry.rejected > 0 ? 'text-error font-semibold' : 'text-text-secondary'}>
{ipEntry.rejected}
</span>
</td>
</tr>
))
)}
</tbody>
</table>
</div>
</div>
)}
</div>
);
}
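The throttled-over-time chart in `AdminRateLimits` scales each bar to the busiest bucket and keeps a 2% floor so empty buckets stay visible. A minimal sketch of that scaling as a standalone function follows. `barHeights` is a hypothetical helper, and `ThrottledBucket` is declared locally with only the two fields the chart reads.

```typescript
// Only the fields the chart reads; the real type is defined in the API client.
interface ThrottledBucket {
  timestamp: string;
  count: number;
}

// Returns one height per bucket, as a percentage of the tallest bar.
// Every bar gets at least `minPercent` so zero-count hours remain visible.
function barHeights(buckets: ThrottledBucket[], minPercent: number = 2): number[] {
  // reduce with a 0 seed is safe on an empty array,
  // unlike Math.max(...[]) which yields -Infinity.
  const max = buckets.reduce((m, b) => Math.max(m, b.count), 0);
  return buckets.map((b) => {
    const scaled = max > 0 ? (b.count / max) * 100 : 0;
    return Math.max(scaled, minPercent);
  });
}
```

The component guards its `Math.max(...)` spread with a length check, so both approaches agree on non-empty data; the `reduce` form just removes the need for that guard.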
+82 -32
@@ -1,15 +1,21 @@
import { useState } from 'react';
import { useMutation } from '@tanstack/react-query';
import { analysisApi } from '../api/client';
import { Search, CheckCircle, AlertCircle, Clock, FileText } from 'lucide-react';
import { useMutation, useQuery } from '@tanstack/react-query';
import { analysisApi, exportApi } from '../api/client';
import { Search, CheckCircle, AlertCircle, Clock, FileText, Download, ChevronDown } from 'lucide-react';
import type { CompanyAnalysis } from '../types';
export function Analysis() {
const [companyName, setCompanyName] = useState('');
const [selectedModel, setSelectedModel] = useState('');
const [result, setResult] = useState<CompanyAnalysis | null>(null);
const modelsQuery = useQuery({
queryKey: ['models'],
queryFn: () => analysisApi.listModels(),
});
const mutation = useMutation({
mutationFn: (name: string) => analysisApi.analyzeCompany(name),
mutationFn: (name: string) => analysisApi.analyzeCompany(name, selectedModel || undefined),
onSuccess: (data) => setResult(data),
});
@@ -33,31 +39,57 @@ export function Analysis() {
</div>
{/* Search Form */}
<form onSubmit={handleSubmit} className="flex gap-4">
<div className="flex-1 relative">
<Search className="absolute left-4 top-1/2 -translate-y-1/2 text-text-secondary" size={18} />
<input
type="text"
value={companyName}
onChange={(e) => setCompanyName(e.target.value)}
placeholder="Enter company name (e.g., nvidia, intel, amd)"
className="w-full bg-bg-card/80 border border-primary/30 rounded-xl pl-12 pr-4 py-3 text-text-primary placeholder-text-secondary/50 focus:outline-none focus:border-primary focus:ring-2 focus:ring-primary/20 transition-all"
/>
<form onSubmit={handleSubmit} className="space-y-4">
<div className="flex gap-4">
<div className="flex-1 relative">
<Search className="absolute left-4 top-1/2 -translate-y-1/2 text-text-secondary" size={18} />
<input
type="text"
value={companyName}
onChange={(e) => setCompanyName(e.target.value)}
placeholder="Enter company name (e.g., nvidia, intel, amd)"
className="w-full bg-bg-card/80 border border-primary/30 rounded-xl pl-12 pr-4 py-3 text-text-primary placeholder-text-secondary/50 focus:outline-none focus:border-primary focus:ring-2 focus:ring-primary/20 transition-all"
/>
</div>
<button
type="submit"
disabled={mutation.isPending || !companyName.trim()}
className="bg-gradient-to-r from-primary to-primary-dark text-white font-semibold py-3 px-6 rounded-xl hover:shadow-lg hover:shadow-primary/30 transition-all disabled:opacity-50 disabled:cursor-not-allowed flex items-center gap-2"
>
{mutation.isPending ? (
<div className="animate-spin rounded-full h-5 w-5 border-t-2 border-b-2 border-white"></div>
) : (
<>
<Search size={18} />
Analyze
</>
)}
</button>
</div>
{/* Model Selector */}
<div className="flex items-center gap-3">
<label className="text-sm font-medium text-text-secondary whitespace-nowrap">
LLM Model
</label>
<div className="relative flex-1 max-w-xs">
<select
value={selectedModel}
onChange={(e) => setSelectedModel(e.target.value)}
className="w-full appearance-none bg-bg-card/80 border border-primary/30 rounded-lg pl-3 pr-8 py-2 text-sm text-text-primary focus:outline-none focus:border-primary focus:ring-2 focus:ring-primary/20 transition-all cursor-pointer"
>
<option value="">
{modelsQuery.data ? `Default (${modelsQuery.data.default})` : 'Default'}
</option>
{modelsQuery.data?.models.map((m) => (
<option key={m.id} value={m.id}>
{m.name} ({m.provider})
</option>
))}
</select>
<ChevronDown className="absolute right-2 top-1/2 -translate-y-1/2 text-text-secondary pointer-events-none" size={16} />
</div>
</div>
<button
type="submit"
disabled={mutation.isPending || !companyName.trim()}
className="bg-gradient-to-r from-primary to-primary-dark text-white font-semibold py-3 px-6 rounded-xl hover:shadow-lg hover:shadow-primary/30 transition-all disabled:opacity-50 disabled:cursor-not-allowed flex items-center gap-2"
>
{mutation.isPending ? (
<div className="animate-spin rounded-full h-5 w-5 border-t-2 border-b-2 border-white"></div>
) : (
<>
<Search size={18} />
Analyze
</>
)}
</button>
</form>
{/* Error */}
@@ -106,10 +138,28 @@ export function Analysis() {
{/* Analysis Content */}
{result.success && result.analysis && (
<div className="bg-bg-card/60 backdrop-blur-lg border border-primary/15 rounded-2xl p-6">
<h3 className="text-lg font-semibold text-text-primary border-b-2 border-primary/30 pb-2 mb-4">
AI Analysis Results
</h3>
<div className="prose prose-invert max-w-none">
<div className="flex items-center justify-between border-b-2 border-primary/30 pb-2 mb-4">
<h3 className="text-lg font-semibold text-text-primary">
AI Analysis Results
</h3>
<div className="flex items-center gap-2">
<button
onClick={() => exportApi.exportCsv(result.company_name)}
className="flex items-center gap-2 text-sm bg-primary/20 hover:bg-primary/30 text-primary font-medium px-3 py-1.5 rounded-lg transition-colors"
>
<Download size={14} />
Export CSV
</button>
<button
onClick={() => exportApi.exportPdf(result.company_name)}
className="flex items-center gap-2 text-sm bg-primary/20 hover:bg-primary/30 text-primary font-medium px-3 py-1.5 rounded-lg transition-colors"
>
<FileText size={14} />
Export PDF
</button>
</div>
</div>
<div className="prose dark:prose-invert max-w-none">
<div className="text-text-primary whitespace-pre-wrap leading-relaxed">
{result.analysis}
</div>
+148 -23
@@ -2,22 +2,52 @@ import { useState } from 'react';
import { useQuery } from '@tanstack/react-query';
import { analyticsApi } from '../api/client';
import { AlertCircle, Database } from 'lucide-react';
import { PieChart, Pie, Cell, BarChart, Bar, XAxis, YAxis, Tooltip, ResponsiveContainer, Legend } from 'recharts';
import { PieChart, Pie, Cell, BarChart, Bar, LineChart, Line, XAxis, YAxis, Tooltip, ResponsiveContainer, Legend } from 'recharts';
import { useChartTheme } from '../context/useChartTheme';
const COLORS = ['#6366f1', '#0ea5e9', '#10b981', '#f59e0b', '#ef4444', '#8b5cf6', '#ec4899', '#14b8a6'];
export function AnalyticsPage() {
const [days, setDays] = useState(30);
const chartTheme = useChartTheme();
const { data, isLoading, isError } = useQuery({
const { data, isLoading, isError, refetch } = useQuery({
queryKey: ['analytics', days],
queryFn: () => analyticsApi.getAnalytics(days),
});
const trendsQuery = useQuery({
queryKey: ['analytics-trends', days],
queryFn: () => analyticsApi.getTrends(days),
});
if (isLoading) {
return (
<div className="flex items-center justify-center min-h-[400px]">
<div className="animate-spin rounded-full h-12 w-12 border-t-2 border-b-2 border-primary"></div>
<div className="space-y-6">
<div>
<h2 className="text-xl font-semibold text-text-primary border-b-2 border-primary/30 pb-2 mb-2">
Analytics Dashboard
</h2>
<p className="text-text-secondary">Loading analytics data...</p>
</div>
{/* Skeleton cards */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-4">
{[1, 2, 3].map((i) => (
<div key={i} className="bg-gradient-to-br from-primary/10 to-secondary/10 border border-primary/20 rounded-xl p-5 text-center animate-pulse">
<div className="h-9 w-16 bg-primary/20 rounded mx-auto mb-2" />
<div className="h-4 w-24 bg-primary/10 rounded mx-auto" />
</div>
))}
</div>
{/* Skeleton charts */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
{[1, 2].map((i) => (
<div key={i} className="bg-bg-card/60 border border-primary/15 rounded-2xl p-6 animate-pulse">
<div className="h-5 w-40 bg-primary/20 rounded mb-4" />
<div className="h-[300px] bg-primary/5 rounded" />
</div>
))}
</div>
</div>
);
}
@@ -33,15 +63,18 @@ export function AnalyticsPage() {
<div className="bg-gradient-to-br from-primary/10 to-secondary/5 border border-primary/20 rounded-xl p-6">
<div className="flex items-center gap-3 text-warning mb-2">
<Database size={24} />
<span className="font-semibold">Database Not Connected</span>
<span className="font-semibold">Unable to Load Analytics</span>
</div>
<p className="text-text-secondary">
Set <code className="bg-bg-card px-2 py-1 rounded">USE_DATABASE=true</code> in your .env file to enable analytics tracking.
Could not connect to the analytics database. Ensure PostgreSQL is running and
<code className="bg-bg-card px-2 py-1 rounded mx-1">DATABASE_URL</code> is configured correctly.
</p>
</div>
<div className="flex items-center gap-2 bg-secondary/10 border border-secondary/20 text-secondary rounded-xl px-4 py-3">
<AlertCircle size={18} />
<span>Analytics features require storing analysis results in PostgreSQL for historical tracking.</span>
<button
onClick={() => refetch()}
className="mt-3 text-sm bg-primary/20 hover:bg-primary/30 text-primary font-medium px-4 py-2 rounded-lg transition-colors"
>
Retry
</button>
</div>
</div>
);
@@ -129,11 +162,7 @@ export function AnalyticsPage() {
))}
</Pie>
<Tooltip
contentStyle={{
backgroundColor: '#1e293b',
border: '1px solid rgba(99, 102, 241, 0.3)',
borderRadius: '8px',
}}
contentStyle={chartTheme.tooltipContentStyle}
/>
<Legend />
</PieChart>
@@ -147,15 +176,11 @@ export function AnalyticsPage() {
<h3 className="text-lg font-semibold text-text-primary mb-4">Analysis Types</h3>
<ResponsiveContainer width="100%" height={300}>
<BarChart data={typeData}>
<XAxis dataKey="name" stroke="#94a3b8" fontSize={12} />
<YAxis stroke="#94a3b8" fontSize={12} />
<XAxis dataKey="name" stroke={chartTheme.axisStroke} fontSize={12} />
<YAxis stroke={chartTheme.axisStroke} fontSize={12} />
<Tooltip
contentStyle={{
backgroundColor: '#1e293b',
border: '1px solid rgba(99, 102, 241, 0.3)',
borderRadius: '8px',
}}
labelStyle={{ color: '#f8fafc' }}
contentStyle={chartTheme.tooltipContentStyle}
labelStyle={chartTheme.tooltipLabelStyle}
/>
<Bar dataKey="count" fill="#6366f1" radius={[4, 4, 0, 0]} />
</BarChart>
@@ -163,6 +188,106 @@ export function AnalyticsPage() {
</div>
)}
</div>
{/* Trend Charts */}
{trendsQuery.data && (
<div className="space-y-6">
<h3 className="text-lg font-semibold text-text-primary border-b-2 border-primary/30 pb-2">
Trends Over Time
</h3>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
{/* Analysis count over time per company (line chart) */}
{trendsQuery.data.by_month.length > 0 && (() => {
// Pivot data: each month as a row, companies as columns
const companies = [...new Set(trendsQuery.data!.by_month.map(d => d.company_name))];
const months = [...new Set(trendsQuery.data!.by_month.map(d => d.month))].sort();
const pivoted = months.map(month => {
const row: Record<string, string | number> = { month };
for (const c of companies) {
const entry = trendsQuery.data!.by_month.find(d => d.month === month && d.company_name === c);
row[c] = entry?.count || 0;
}
return row;
});
return (
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl p-6">
<h4 className="text-base font-semibold text-text-primary mb-4">Analyses per Company Over Time</h4>
<ResponsiveContainer width="100%" height={300}>
<LineChart data={pivoted}>
<XAxis dataKey="month" stroke={chartTheme.axisStroke} fontSize={12} />
<YAxis stroke={chartTheme.axisStroke} fontSize={12} />
<Tooltip
contentStyle={chartTheme.tooltipContentStyle}
labelStyle={chartTheme.tooltipLabelStyle}
/>
<Legend />
{companies.map((company, idx) => (
<Line
key={company}
type="monotone"
dataKey={company}
stroke={COLORS[idx % COLORS.length]}
strokeWidth={2}
dot={{ r: 4 }}
name={company.toUpperCase()}
/>
))}
</LineChart>
</ResponsiveContainer>
</div>
);
})()}
{/* Analysis type distribution over time (stacked bar) */}
{trendsQuery.data.by_type_over_time.length > 0 && (() => {
const types = [...new Set(trendsQuery.data!.by_type_over_time.map(d => d.analysis_type))];
const months = [...new Set(trendsQuery.data!.by_type_over_time.map(d => d.month))].sort();
const pivoted = months.map(month => {
const row: Record<string, string | number> = { month };
for (const t of types) {
const entry = trendsQuery.data!.by_type_over_time.find(d => d.month === month && d.analysis_type === t);
row[t] = entry?.count || 0;
}
return row;
});
return (
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl p-6">
<h4 className="text-base font-semibold text-text-primary mb-4">Analysis Types Over Time</h4>
<ResponsiveContainer width="100%" height={300}>
<BarChart data={pivoted}>
<XAxis dataKey="month" stroke={chartTheme.axisStroke} fontSize={12} />
<YAxis stroke={chartTheme.axisStroke} fontSize={12} />
<Tooltip
contentStyle={chartTheme.tooltipContentStyle}
labelStyle={chartTheme.tooltipLabelStyle}
/>
<Legend />
{types.map((type, idx) => (
<Bar
key={type}
dataKey={type}
stackId="types"
fill={COLORS[idx % COLORS.length]}
name={type}
/>
))}
</BarChart>
</ResponsiveContainer>
</div>
);
})()}
</div>
{trendsQuery.data.by_month.length === 0 && (
<div className="text-text-secondary text-center py-8">
No trend data available yet. Run analyses over multiple days to see trends.
</div>
)}
</div>
)}
</div>
);
}
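The two trend charts above pivot their rows the same way: one output row per month, one column per series, with missing cells filled with 0. A minimal sketch of a shared helper follows, assuming the trend rows carry `month` and `count` fields as the diff suggests. The name `pivotByMonth` and its `seriesOf` callback are hypothetical and not part of this change. It builds a lookup map once instead of calling `find` for every cell, and keeps the first entry for a given month and series, as `find` does.

```typescript
// Hypothetical helper, not part of the diff: pivots long-format trend rows
// into one row per month with one numeric column per series.
function pivotByMonth<T extends { month: string; count: number }>(
  rows: T[],
  seriesOf: (row: T) => string,
): { series: string[]; data: Array<Record<string, string | number>> } {
  // Distinct series names in first-seen order, distinct months sorted ascending.
  const series = Array.from(new Set(rows.map(seriesOf)));
  const months = Array.from(new Set(rows.map((r) => r.month))).sort();

  // Index counts by (month, series); the first entry wins, like Array.prototype.find.
  const counts = new Map<string, number>();
  for (const r of rows) {
    const key = `${r.month}\u0000${seriesOf(r)}`;
    if (!counts.has(key)) {
      counts.set(key, r.count);
    }
  }

  const data = months.map((month) => {
    const row: Record<string, string | number> = { month };
    for (const s of series) {
      // Missing (month, series) combinations default to 0.
      row[s] = counts.get(`${month}\u0000${s}`) ?? 0;
    }
    return row;
  });

  return { series, data };
}
```

With a helper of this shape, the line chart could call `pivotByMonth(by_month, (d) => d.company_name)` and the stacked bar chart `pivotByMonth(by_type_over_time, (d) => d.analysis_type)`, under the assumption that both arrays have the fields used in the inline code.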
+197 -15
@@ -1,20 +1,37 @@
import { useState } from 'react';
import { useMutation } from '@tanstack/react-query';
import { useMutation, useQuery } from '@tanstack/react-query';
import { analysisApi } from '../api/client';
import { Rocket, CheckCircle, AlertCircle, ChevronDown, ChevronUp } from 'lucide-react';
import { Rocket, CheckCircle, AlertCircle, ChevronDown, ChevronUp, RefreshCw, Inbox } from 'lucide-react';
import { BarChart, Bar, XAxis, YAxis, Tooltip, ResponsiveContainer, Cell } from 'recharts';
import { useChartTheme } from '../context/useChartTheme';
import type { BatchAnalysisResult } from '../types';
export function Batch() {
const [companiesInput, setCompaniesInput] = useState('');
const [maxWorkers, setMaxWorkers] = useState(3);
const [selectedModel, setSelectedModel] = useState('');
const [result, setResult] = useState<BatchAnalysisResult | null>(null);
const [expandedItems, setExpandedItems] = useState<Set<string>>(new Set());
const chartTheme = useChartTheme();
const modelsQuery = useQuery({
queryKey: ['models'],
queryFn: () => analysisApi.listModels(),
});
const jobsQuery = useQuery({
queryKey: ['jobs'],
queryFn: () => analysisApi.listJobs(undefined, 20),
});
const mutation = useMutation({
mutationFn: ({ companies, workers }: { companies: string[]; workers: number }) =>
analysisApi.analyzeBatch(companies, workers),
onSuccess: (data) => setResult(data),
analysisApi.analyzeBatch(companies, workers, selectedModel || undefined),
onSuccess: (data) => {
setResult(data);
jobsQuery.refetch();
},
});
const handleSubmit = (e: React.FormEvent) => {
@@ -85,6 +102,29 @@ export function Batch() {
<div className="text-center text-text-primary font-semibold">{maxWorkers}</div>
</div>
<div>
<label className="block text-sm font-medium text-text-secondary mb-2">
LLM Model
</label>
<div className="relative">
<select
value={selectedModel}
onChange={(e) => setSelectedModel(e.target.value)}
className="w-full appearance-none bg-bg-card/80 border border-primary/30 rounded-lg pl-3 pr-8 py-2 text-sm text-text-primary focus:outline-none focus:border-primary focus:ring-2 focus:ring-primary/20 transition-all cursor-pointer"
>
<option value="">
{modelsQuery.data ? `Default (${modelsQuery.data.default})` : 'Default'}
</option>
{modelsQuery.data?.models.map((m) => (
<option key={m.id} value={m.id}>
{m.name} ({m.provider})
</option>
))}
</select>
<ChevronDown className="absolute right-2 top-1/2 -translate-y-1/2 text-text-secondary pointer-events-none" size={16} />
</div>
</div>
<button
type="submit"
disabled={mutation.isPending || !companiesInput.trim()}
@@ -114,9 +154,38 @@ export function Batch() {
{/* Error */}
{mutation.isError && (
<div className="flex items-center gap-2 bg-error/10 border border-error/20 text-error rounded-xl px-4 py-3">
<AlertCircle size={18} />
<span>Batch analysis failed. Please try again.</span>
<div className="bg-error/10 border border-error/20 rounded-xl px-4 py-3">
<div className="flex items-center gap-2 text-error">
<AlertCircle size={18} />
<span className="font-semibold">Batch analysis failed</span>
</div>
<p className="text-text-secondary text-sm mt-1 ml-7">
{mutation.error instanceof Error ? mutation.error.message : 'An unexpected error occurred.'}
{' '}Check your connection and try again.
</p>
<div className="ml-7 mt-2 flex items-center gap-3">
<button
onClick={() => {
const companies = companiesInput
.split(/[,\n]/)
.map((c) => c.trim())
.filter((c) => c.length > 0);
if (companies.length > 0) {
mutation.mutate({ companies, workers: maxWorkers });
}
}}
className="text-sm text-primary hover:text-primary-dark underline flex items-center gap-1"
>
<RefreshCw size={14} />
Retry
</button>
<button
onClick={() => mutation.reset()}
className="text-sm text-text-secondary hover:text-text-primary underline"
>
Dismiss
</button>
</div>
</div>
)}
@@ -144,15 +213,11 @@ export function Batch() {
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl p-6">
<ResponsiveContainer width="100%" height={300}>
<BarChart data={chartData}>
<XAxis dataKey="name" stroke="#94a3b8" fontSize={12} />
<YAxis stroke="#94a3b8" fontSize={12} />
<XAxis dataKey="name" stroke={chartTheme.axisStroke} fontSize={12} />
<YAxis stroke={chartTheme.axisStroke} fontSize={12} />
<Tooltip
contentStyle={{
backgroundColor: '#1e293b',
border: '1px solid rgba(99, 102, 241, 0.3)',
borderRadius: '8px',
}}
labelStyle={{ color: '#f8fafc' }}
contentStyle={chartTheme.tooltipContentStyle}
labelStyle={chartTheme.tooltipLabelStyle}
/>
<Bar dataKey="patents" radius={[4, 4, 0, 0]}>
{chartData.map((entry, index) => (
@@ -218,6 +283,123 @@ export function Batch() {
</div>
</div>
)}
{/* Job History */}
<div>
<h3 className="text-lg font-semibold text-text-primary border-b-2 border-primary/30 pb-2 mb-4">
Job History
</h3>
{/* Loading skeleton */}
{jobsQuery.isLoading && (
<div className="space-y-3">
{[...Array(3)].map((_, i) => (
<div
key={i}
className="bg-bg-card/60 border border-primary/15 rounded-xl p-4 animate-pulse"
>
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
<div className="h-5 w-5 rounded-full bg-primary/20" />
<div className="h-4 w-32 rounded bg-primary/20" />
<div className="h-4 w-20 rounded bg-primary/10" />
</div>
<div className="h-6 w-20 rounded-full bg-primary/15" />
</div>
<div className="mt-3 flex gap-4">
<div className="h-3 w-24 rounded bg-primary/10" />
<div className="h-3 w-16 rounded bg-primary/10" />
</div>
</div>
))}
</div>
)}
{/* Job history error */}
{jobsQuery.isError && (
<div className="bg-error/10 border border-error/20 rounded-xl px-4 py-3">
<div className="flex items-center gap-2 text-error">
<AlertCircle size={18} />
<span className="font-semibold">Failed to load job history</span>
</div>
<p className="text-text-secondary text-sm mt-1 ml-7">
{jobsQuery.error instanceof Error ? jobsQuery.error.message : 'Could not retrieve past jobs.'}
</p>
<button
onClick={() => jobsQuery.refetch()}
className="ml-7 mt-2 text-sm text-primary hover:text-primary-dark underline flex items-center gap-1"
>
<RefreshCw size={14} />
Retry
</button>
</div>
)}
{/* Empty state */}
{jobsQuery.isSuccess && jobsQuery.data.length === 0 && !result && (
<div className="bg-bg-card/60 border border-primary/15 border-dashed rounded-xl p-8 text-center">
<Inbox className="mx-auto text-text-secondary/40 mb-3" size={40} />
<p className="text-text-secondary font-medium">No batch jobs yet</p>
<p className="text-text-secondary/70 text-sm mt-1">
Submit a batch analysis above to get started. Your job history will appear here.
</p>
</div>
)}
{/* Job list */}
{jobsQuery.isSuccess && jobsQuery.data.length > 0 && (
<div className="space-y-3">
{jobsQuery.data.map((job) => (
<div
key={job.job_id}
className="bg-bg-card/60 border border-primary/15 rounded-xl p-4"
>
<div className="flex items-center justify-between">
<div className="flex items-center gap-3">
{job.status === 'completed' && <CheckCircle className="text-success" size={18} />}
{job.status === 'failed' && <AlertCircle className="text-error" size={18} />}
{(job.status === 'pending' || job.status === 'running') && (
<div className="animate-spin rounded-full h-[18px] w-[18px] border-t-2 border-b-2 border-secondary" />
)}
<span className="font-mono text-sm text-text-primary">{job.job_id.slice(0, 8)}</span>
<span className="text-text-secondary text-sm">
{job.total_companies} {job.total_companies === 1 ? 'company' : 'companies'}
</span>
</div>
<span
className={`text-xs font-semibold px-2.5 py-1 rounded-full ${
job.status === 'completed'
? 'bg-success/15 text-success'
: job.status === 'failed'
? 'bg-error/15 text-error'
: 'bg-secondary/15 text-secondary'
}`}
>
{job.status}
</span>
</div>
{(job.status === 'running' || job.status === 'pending') && job.total_companies > 0 && (
<div className="mt-3">
<div className="flex items-center justify-between text-xs text-text-secondary mb-1">
<span>Progress</span>
<span>{job.completed_companies}/{job.total_companies}</span>
</div>
<div className="h-1.5 bg-bg-dark rounded-full overflow-hidden">
<div
className="h-full bg-gradient-to-r from-primary to-secondary rounded-full transition-all duration-300"
style={{ width: `${(job.completed_companies / job.total_companies) * 100}%` }}
/>
</div>
</div>
)}
{job.status === 'failed' && job.error && (
<p className="mt-2 text-sm text-error/80">{job.error}</p>
)}
</div>
))}
</div>
)}
</div>
</div>
);
}
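The Retry button in the batch error banner splits, trims, and filters `companiesInput` inline. The body of `handleSubmit` is not shown in this diff, so whether it repeats the same parsing is an assumption. If it does, a small shared helper would keep the two paths in sync. The name `parseCompanies` is hypothetical.

```typescript
// Hypothetical helper, not part of the diff: turns the textarea contents into
// a clean list of company names, using the same rules as the Retry handler.
function parseCompanies(input: string): string[] {
  return input
    .split(/[,\n]/)               // names may be separated by commas or newlines
    .map((c) => c.trim())         // drop surrounding whitespace (including "\r")
    .filter((c) => c.length > 0); // ignore empty segments such as trailing commas
}
```

The Retry handler could then reduce to `const companies = parseCompanies(companiesInput);` followed by the existing `companies.length > 0` guard.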
+161
@@ -0,0 +1,161 @@
import { useState } from 'react';
import { useSearchParams } from 'react-router-dom';
import { useQuery } from '@tanstack/react-query';
import { analysisApi } from '../api/client';
import { GitCompareArrows, AlertCircle, FileText, Clock } from 'lucide-react';
import type { CompanyAnalysis } from '../types';
function CompanyPanel({ data, isLoading, isError }: { data?: CompanyAnalysis; isLoading: boolean; isError: boolean }) {
if (isLoading) {
return (
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl p-6 animate-pulse">
<div className="h-6 w-32 bg-primary/20 rounded mb-4" />
<div className="space-y-3">
<div className="h-4 bg-primary/10 rounded w-full" />
<div className="h-4 bg-primary/10 rounded w-3/4" />
<div className="h-4 bg-primary/10 rounded w-5/6" />
</div>
</div>
);
}
if (isError) {
return (
<div className="bg-error/10 border border-error/20 rounded-2xl p-6">
<div className="flex items-center gap-2 text-error">
<AlertCircle size={18} />
<span>Failed to load analysis. Check the company name and try again.</span>
</div>
</div>
);
}
if (!data) return null;
return (
<div className="bg-bg-card/60 border border-primary/15 rounded-2xl p-6 space-y-4">
<h3 className="text-lg font-bold text-text-primary border-b-2 border-primary/30 pb-2">
{data.company_name.toUpperCase()}
</h3>
<div className="grid grid-cols-2 gap-3">
<div className="bg-primary/10 rounded-lg p-3 text-center">
<FileText className="mx-auto mb-1 text-primary" size={18} />
<div className="text-xl font-bold text-text-primary">{data.patent_count}</div>
<div className="text-xs text-text-secondary uppercase">Patents</div>
</div>
<div className="bg-primary/10 rounded-lg p-3 text-center">
<Clock className="mx-auto mb-1 text-primary" size={18} />
<div className="text-sm font-medium text-text-primary">
{new Date(data.timestamp).toLocaleDateString()}
</div>
<div className="text-xs text-text-secondary uppercase">Analyzed</div>
</div>
</div>
{data.success && data.analysis ? (
<div className="text-text-primary whitespace-pre-wrap leading-relaxed text-sm">
{data.analysis}
</div>
) : (
<div className="text-error text-sm">{data.error || 'Analysis not available'}</div>
)}
</div>
);
}
export function Compare() {
const [searchParams, setSearchParams] = useSearchParams();
const [companyA, setCompanyA] = useState(searchParams.get('a') || '');
const [companyB, setCompanyB] = useState(searchParams.get('b') || '');
const queryA = searchParams.get('a') || '';
const queryB = searchParams.get('b') || '';
const resultA = useQuery({
queryKey: ['analyze', queryA],
queryFn: () => analysisApi.analyzeCompany(queryA),
enabled: !!queryA,
});
const resultB = useQuery({
queryKey: ['analyze', queryB],
queryFn: () => analysisApi.analyzeCompany(queryB),
enabled: !!queryB,
});
const handleCompare = (e: React.FormEvent) => {
e.preventDefault();
const a = companyA.trim();
const b = companyB.trim();
if (a && b) {
setSearchParams({ a, b });
}
};
return (
<div className="space-y-6">
{/* Header */}
<div>
<h2 className="text-xl font-semibold text-text-primary border-b-2 border-primary/30 pb-2 mb-2">
Portfolio Comparison
</h2>
<p className="text-text-secondary">
Compare patent portfolios of two companies side by side.
</p>
</div>
{/* Input Form */}
<form onSubmit={handleCompare} className="flex flex-col sm:flex-row gap-3 items-end">
<div className="flex-1">
<label className="block text-sm font-medium text-text-secondary mb-1">Company A</label>
<input
type="text"
value={companyA}
onChange={(e) => setCompanyA(e.target.value)}
placeholder="e.g. nvidia"
className="w-full bg-bg-card/80 border border-primary/30 rounded-xl px-4 py-2.5 text-text-primary placeholder-text-secondary/50 focus:outline-none focus:border-primary focus:ring-2 focus:ring-primary/20 transition-all"
/>
</div>
<div className="flex-1">
<label className="block text-sm font-medium text-text-secondary mb-1">Company B</label>
<input
type="text"
value={companyB}
onChange={(e) => setCompanyB(e.target.value)}
placeholder="e.g. intel"
className="w-full bg-bg-card/80 border border-primary/30 rounded-xl px-4 py-2.5 text-text-primary placeholder-text-secondary/50 focus:outline-none focus:border-primary focus:ring-2 focus:ring-primary/20 transition-all"
/>
</div>
<button
type="submit"
disabled={!companyA.trim() || !companyB.trim() || resultA.isLoading || resultB.isLoading}
className="bg-gradient-to-r from-primary to-primary-dark text-white font-semibold py-2.5 px-6 rounded-xl hover:shadow-lg hover:shadow-primary/30 transition-all disabled:opacity-50 disabled:cursor-not-allowed flex items-center gap-2"
>
<GitCompareArrows size={18} />
Compare
</button>
</form>
{/* Comparison Panels */}
{(queryA || queryB) && (
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
{queryA && (
<CompanyPanel
data={resultA.data}
isLoading={resultA.isLoading}
isError={resultA.isError}
/>
)}
{queryB && (
<CompanyPanel
data={resultB.data}
isLoading={resultB.isLoading}
isError={resultB.isError}
/>
)}
</div>
)}
</div>
);
}
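The Compare page treats the URL as the source of truth: the form writes `a` and `b` into the search params, and both queries are keyed off what is read back. The sketch below shows that round trip using the standard `URLSearchParams` API only. The names `ComparisonQuery`, `readComparison`, and `writeComparison` are hypothetical and are not what the component itself uses (it relies on `useSearchParams` from react-router-dom).

```typescript
// Hypothetical illustration of the a/b query-string round trip.
interface ComparisonQuery {
  a: string;
  b: string;
}

// Read both company names from a search string such as "?a=nvidia&b=intel".
// Missing parameters become empty strings, mirroring `searchParams.get('a') || ''`.
function readComparison(search: string): ComparisonQuery {
  const params = new URLSearchParams(search);
  return {
    a: (params.get('a') ?? '').trim(),
    b: (params.get('b') ?? '').trim(),
  };
}

// Serialize both names; URLSearchParams percent-encodes reserved characters.
function writeComparison(query: ComparisonQuery): string {
  return new URLSearchParams({ a: query.a, b: query.b }).toString();
}
```

Because the names live in the URL, a comparison can be bookmarked or shared, and reloading the page re-runs the same two queries.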
+1 -1
@@ -31,7 +31,7 @@ export function Login() {
};
return (
<div className="min-h-screen bg-gradient-to-br from-bg-dark to-indigo-950 flex items-center justify-center px-4">
<div className="min-h-screen bg-gradient-to-br from-bg-dark to-slate-100 dark:to-indigo-950 flex items-center justify-center px-4">
<div className="w-full max-w-md">
{/* Brand */}
<div className="text-center mb-8">
+1 -1
@@ -40,7 +40,7 @@ export function Register() {
};
return (
<div className="min-h-screen bg-gradient-to-br from-bg-dark to-indigo-950 flex items-center justify-center px-4">
<div className="min-h-screen bg-gradient-to-br from-bg-dark to-slate-100 dark:to-indigo-950 flex items-center justify-center px-4">
<div className="w-full max-w-md">
{/* Brand */}
<div className="text-center mb-8">
+28 -42
@@ -1,46 +1,32 @@
export interface User {
id: number;
email: string;
role: 'admin' | 'user';
created_at: string;
}
/**
* Application types derived from the auto-generated OpenAPI schema.
*
* Run `npm run generate:local` (or `npm run generate` with the API running)
* to regenerate `src/api/schema.d.ts` from the backend OpenAPI spec.
*
* These aliases keep the rest of the codebase stable while the source of
* truth lives in the generated file.
*/
export interface TokenResponse {
access_token: string;
refresh_token: string;
token_type: string;
}
import type { components } from '../api/schema';
export interface CompanyAnalysis {
company_name: string;
analysis: string;
patent_count: number;
success: boolean;
error: string | null;
timestamp: string;
}
export interface BatchAnalysisResult {
results: CompanyAnalysis[];
total_companies: number;
successful: number;
failed: number;
timestamp: string;
}
export interface JobStatus {
job_id: string;
status: 'pending' | 'running' | 'completed' | 'failed';
progress: number;
total_companies: number;
completed_companies: number;
result: BatchAnalysisResult | null;
error: string | null;
}
export interface Analytics {
total_messages: number;
// Re-export schema types under the names the rest of the app expects.
export type User = components['schemas']['UserResponse'];
export type TokenResponse = components['schemas']['TokenResponse'];
export type CompanyAnalysis = components['schemas']['CompanyAnalysisResponse'];
export type BatchAnalysisResult = components['schemas']['BatchAnalysisResponse'];
export type JobStatus = components['schemas']['JobStatus'];
export type Analytics = Omit<components['schemas']['AnalyticsResponse'], 'by_company' | 'by_type'> & {
by_company: Array<{ company_name: string; count: number }>;
by_type: Array<{ analysis_type: string; count: number }>;
period_days: number;
}
};
// Additional generated types that may be useful elsewhere.
export type RegisterRequest = components['schemas']['RegisterRequest'];
export type LoginRequest = components['schemas']['LoginRequest'];
export type RefreshRequest = components['schemas']['RefreshRequest'];
export type UpdateRoleRequest = components['schemas']['UpdateRoleRequest'];
export type HealthResponse = components['schemas']['HealthResponse'];
export type BatchAnalysisRequest = components['schemas']['BatchAnalysisRequest'];
export type ValidationError = components['schemas']['ValidationError'];
export type HTTPValidationError = components['schemas']['HTTPValidationError'];
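The `Analytics` alias above uses `Omit<...> & { ... }` to replace loosely typed fields from the generated schema with narrower ones. A self-contained sketch of that pattern follows. The `components` interface here is a stand-in, and the shapes given to `AnalyticsResponse` are assumptions made for illustration, since the generated `schema.d.ts` is not part of this diff. `topCompany` is a hypothetical consumer.

```typescript
// Stand-in for the generated schema; the field shapes below are assumed.
interface components {
  schemas: {
    AnalyticsResponse: {
      total_messages: number;
      by_company: Array<Record<string, unknown>>;
      by_type: Array<Record<string, unknown>>;
    };
  };
}

// Drop the loosely typed fields, then add them back with precise element types.
type Analytics = Omit<components['schemas']['AnalyticsResponse'], 'by_company' | 'by_type'> & {
  by_company: Array<{ company_name: string; count: number }>;
  by_type: Array<{ analysis_type: string; count: number }>;
};

// Hypothetical consumer: with the narrowed type, `count` and `company_name`
// are usable without casts.
function topCompany(analytics: Analytics): string | undefined {
  let best: { company_name: string; count: number } | undefined;
  for (const entry of analytics.by_company) {
    if (best === undefined || entry.count > best.count) {
      best = entry;
    }
  }
  return best?.company_name;
}
```

The trade-off is that the narrowed fields are asserted by hand, so they can drift from the backend if the OpenAPI spec changes. Fields left untouched by `Omit` continue to track the generated file.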
+7 -6
@@ -4,6 +4,7 @@ export default {
"./index.html",
"./src/**/*.{js,ts,jsx,tsx}",
],
darkMode: 'class',
theme: {
extend: {
colors: {
@@ -16,15 +17,15 @@ export default {
warning: '#f59e0b',
error: '#ef4444',
bg: {
dark: '#0f172a',
card: '#1e293b',
'card-hover': '#334155',
dark: 'var(--color-bg-dark)',
card: 'var(--color-bg-card)',
'card-hover': 'var(--color-bg-card-hover)',
},
text: {
primary: '#f8fafc',
secondary: '#94a3b8',
primary: 'var(--color-text-primary)',
secondary: 'var(--color-text-secondary)',
},
border: '#334155',
border: 'var(--color-border)',
},
},
},
+3
@@ -15,3 +15,6 @@ pandas
bcrypt
PyJWT
slowapi
apscheduler
boto3
reportlab
+211
@@ -0,0 +1,211 @@
"""Tests for analyze_single_patent auto-download path.
Covers issue #1661:
- PDF exists on disk: direct analysis (happy path)
- PDF not on disk, cached link exists: auto-download and analyze
- PDF not on disk, no cached link: FileNotFoundError
- Analysis failure after PDF found: graceful error message
- Model override parameter passthrough
"""
import os
from unittest.mock import MagicMock, patch
import pytest
from SPARC.analyzer import CompanyAnalyzer
from SPARC.types import Patent
@pytest.fixture(autouse=True)
def mock_db(mocker):
"""Mock DatabaseClient so no real DB is needed."""
mock_db_cls = mocker.patch("SPARC.analyzer.DatabaseClient")
mock_db_instance = MagicMock()
mock_db_instance.get_cached_patent.return_value = None
mock_db_instance.get_cached_serp_query.return_value = None
mock_db_cls.return_value = mock_db_instance
return mock_db_instance
@pytest.fixture
def analyzer(mocker, mock_db):
"""Create a CompanyAnalyzer with mocked LLM and DB."""
mocker.patch("SPARC.analyzer.LLMAnalyzer")
return CompanyAnalyzer(openrouter_api_key="test-key")
class TestAnalyzeSinglePatentAutoDownload:
"""Test the auto-download logic in analyze_single_patent."""
def test_pdf_on_disk_analyzed_directly(self, analyzer, mocker, tmp_path):
"""When PDF exists on disk, it is analyzed directly without download."""
patent_id = "US-11234567-B2"
# Create the patents dir and PDF file
patents_dir = tmp_path / "patents"
patents_dir.mkdir()
pdf_path = patents_dir / f"{patent_id}.pdf"
pdf_path.write_bytes(b"fake PDF content")
mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
mock_parse.return_value = {"abstract": "test", "claims": "test claims"}
mock_minimize.return_value = "minimized content"
analyzer.llm_analyzer.analyze_patent_content.return_value = "Good patent."
# Change cwd so patents/{patent_id}.pdf resolves to our tmp_path
original_cwd = os.getcwd()
os.chdir(tmp_path)
try:
result = analyzer.analyze_single_patent(patent_id, "TestCo")
finally:
os.chdir(original_cwd)
assert result == "Good patent."
# DB cache should not have been queried since file existed
analyzer.db.get_cached_patent.assert_not_called()
def test_auto_download_from_cached_link(self, analyzer, mocker, tmp_path):
"""When PDF is not on disk but link is cached, auto-download occurs."""
patent_id = "US-99887766-A1"
# No patents dir exists (PDF not on disk)
mock_save = mocker.patch("SPARC.analyzer.SERP.save_patents")
downloaded_patent = Patent(patent_id=patent_id, pdf_link="https://example.com/patent.pdf")
downloaded_patent.pdf_path = f"patents/{patent_id}.pdf"
mock_save.return_value = downloaded_patent
# Cached patent has a PDF link
analyzer.db.get_cached_patent.return_value = {
"patent_id": patent_id,
"pdf_link": "https://example.com/patent.pdf",
}
# Mock the rest of the analysis pipeline
mock_parse = mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf")
mock_minimize = mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm")
mock_parse.return_value = {"abstract": "test abstract"}
mock_minimize.return_value = "minimized content"
analyzer.llm_analyzer.analyze_patent_content.return_value = "Strong innovation."
# Change cwd so patents/{patent_id}.pdf does NOT exist
original_cwd = os.getcwd()
os.chdir(tmp_path)
try:
result = analyzer.analyze_single_patent(patent_id, "DownloadCo")
finally:
os.chdir(original_cwd)
assert result == "Strong innovation."
analyzer.db.get_cached_patent.assert_called_once_with(patent_id)
mock_save.assert_called_once()
# Verify the Patent passed to save_patents has the correct ID and link
saved_patent = mock_save.call_args[0][0]
assert saved_patent.patent_id == patent_id
assert saved_patent.pdf_link == "https://example.com/patent.pdf"
def test_no_cached_link_raises_file_not_found(self, analyzer, mocker, tmp_path):
"""When PDF is not on disk and no cached link, FileNotFoundError raised."""
patent_id = "US-00000000-X1"
analyzer.db.get_cached_patent.return_value = None
original_cwd = os.getcwd()
os.chdir(tmp_path)
try:
with pytest.raises(FileNotFoundError, match="no download link is cached"):
analyzer.analyze_single_patent(patent_id, "MissingCo")
finally:
os.chdir(original_cwd)
def test_cached_patent_without_pdf_link_raises(self, analyzer, mocker, tmp_path):
"""When cached patent exists but has no pdf_link, FileNotFoundError raised."""
patent_id = "US-11111111-B1"
analyzer.db.get_cached_patent.return_value = {
"patent_id": patent_id,
"pdf_link": None,
}
original_cwd = os.getcwd()
os.chdir(tmp_path)
try:
with pytest.raises(FileNotFoundError, match="no download link is cached"):
analyzer.analyze_single_patent(patent_id, "NoPDFCo")
finally:
os.chdir(original_cwd)
def test_analysis_exception_returns_error_message(self, analyzer, mocker, tmp_path):
"""When analysis pipeline fails, returns error string instead of raising."""
patent_id = "US-22222222-A2"
# Create the PDF on disk so it skips download
patents_dir = tmp_path / "patents"
patents_dir.mkdir()
(patents_dir / f"{patent_id}.pdf").write_bytes(b"fake PDF")
# Parse fails
mocker.patch(
"SPARC.analyzer.SERP.parse_patent_pdf",
side_effect=ValueError("Corrupt PDF"),
)
original_cwd = os.getcwd()
os.chdir(tmp_path)
try:
result = analyzer.analyze_single_patent(patent_id, "ErrorCo")
finally:
os.chdir(original_cwd)
assert "Failed to analyze patent" in result
assert "Corrupt PDF" in result
def test_model_override_passed_to_llm(self, analyzer, mocker, tmp_path):
"""The model parameter is forwarded to the LLM analyzer."""
patent_id = "US-33333333-B2"
patents_dir = tmp_path / "patents"
patents_dir.mkdir()
(patents_dir / f"{patent_id}.pdf").write_bytes(b"fake PDF")
mocker.patch("SPARC.analyzer.SERP.parse_patent_pdf", return_value={"abstract": "test"})
mocker.patch("SPARC.analyzer.SERP.minimize_patent_for_llm", return_value="content")
analyzer.llm_analyzer.analyze_patent_content.return_value = "Analysis result."
original_cwd = os.getcwd()
os.chdir(tmp_path)
try:
result = analyzer.analyze_single_patent(
patent_id, "ModelCo", model="openai/gpt-4o"
)
finally:
os.chdir(original_cwd)
assert result == "Analysis result."
analyzer.llm_analyzer.analyze_patent_content.assert_called_once_with(
patent_content="content",
company_name="ModelCo",
model="openai/gpt-4o",
)
def test_file_not_found_during_parse_re_raised(self, analyzer, mocker, tmp_path):
"""FileNotFoundError during parsing is re-raised, not caught."""
patent_id = "US-44444444-C1"
patents_dir = tmp_path / "patents"
patents_dir.mkdir()
(patents_dir / f"{patent_id}.pdf").write_bytes(b"fake PDF")
mocker.patch(
"SPARC.analyzer.SERP.parse_patent_pdf",
side_effect=FileNotFoundError("PDF file vanished"),
)
original_cwd = os.getcwd()
os.chdir(tmp_path)
try:
with pytest.raises(FileNotFoundError, match="PDF file vanished"):
analyzer.analyze_single_patent(patent_id, "VanishCo")
finally:
os.chdir(original_cwd)
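Review note: each test above saves the working directory, calls `os.chdir(tmp_path)`, and restores it in a `finally` block. One way to factor out that repetition is a small context manager. The sketch below is illustrative only; the name `working_directory` is hypothetical and not part of SPARC. Inside a pytest test, the built-in `monkeypatch.chdir(tmp_path)` would achieve the same effect without a helper.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the process working directory to `path`.

    Hypothetical helper: it mirrors the try/finally pattern used in the
    tests, so relative paths such as patents/<id>.pdf resolve under `path`.
    """
    original_cwd = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the original directory, even if the body raises.
        os.chdir(original_cwd)
```

With such a helper, a test body would shrink to `with working_directory(tmp_path): result = analyzer.analyze_single_patent(...)`.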
+44
@@ -182,3 +182,47 @@ class TestJobEndpoints:
"""Test listing jobs with status filter."""
response = client.get("/jobs?status=completed")
assert response.status_code == 200
class TestModelValidation:
"""Test that unsupported model identifiers are rejected."""
def test_analyze_rejects_unsupported_model(self, client, mock_analyzer):
"""GET /analyze/{company} with unsupported model returns 400."""
response = client.get("/analyze/nvidia?model=fake/nonexistent-model")
assert response.status_code == 400
assert "Unsupported model" in response.json()["detail"]
def test_analyze_accepts_supported_model(self, client, mock_analyzer):
"""GET /analyze/{company} with a supported model succeeds."""
mock_result = CompanyAnalysisResult(
company_name="nvidia",
analysis="test",
patent_count=1,
success=True,
timestamp=datetime.now(),
model="anthropic/claude-3.5-sonnet",
)
mock_analyzer._analyze_company_safe.return_value = mock_result
response = client.get("/analyze/nvidia?model=anthropic/claude-3.5-sonnet")
assert response.status_code == 200
def test_batch_rejects_unsupported_model(self, client, mock_analyzer):
"""POST /analyze/batch with unsupported model returns 400."""
response = client.post(
"/analyze/batch",
json={"companies": ["nvidia"], "model": "fake/nonexistent-model"},
)
assert response.status_code == 400
assert "Unsupported model" in response.json()["detail"]
def test_list_models_returns_supported(self, client):
"""GET /models returns the allow-list."""
response = client.get("/models")
assert response.status_code == 200
data = response.json()
assert "models" in data
assert "default" in data
assert len(data["models"]) > 0
assert all("id" in m and "name" in m and "provider" in m for m in data["models"])
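Review note: the tests above imply an allow-list check that runs before any analysis starts. A minimal sketch of such a check follows. The list contents, the `DEFAULT_MODEL` choice, and the function name `validate_model` are assumptions for illustration; only the two model identifiers and the "Unsupported model" message fragment come from the tests.

```python
# Hypothetical allow-list; the real SPARC list is not shown in this diff.
SUPPORTED_MODELS = [
    {"id": "anthropic/claude-3.5-sonnet", "name": "Claude 3.5 Sonnet", "provider": "anthropic"},
    {"id": "openai/gpt-4o", "name": "GPT-4o", "provider": "openai"},
]
DEFAULT_MODEL = SUPPORTED_MODELS[0]["id"]  # assumed default
_SUPPORTED_IDS = {m["id"] for m in SUPPORTED_MODELS}


def validate_model(model=None):
    """Return the model to use, or raise ValueError for unknown identifiers."""
    if model is None:
        return DEFAULT_MODEL
    if model not in _SUPPORTED_IDS:
        # An API layer would typically translate this into an HTTP 400.
        raise ValueError(f"Unsupported model: {model}")
    return model
```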
+209 -9
@@ -1,13 +1,29 @@
"""Tests for JWT authentication flow: register, login, protected routes, refresh, admin access."""
"""Tests for JWT authentication flow: register, login, protected routes, refresh, admin access.
from datetime import datetime, timezone
Covers all five scenarios required by issue #1624:
1. Registration (POST /auth/register)
2. Login (POST /auth/login)
3. Protected route access (GET /auth/me) -- valid, missing, expired, wrong-type tokens
4. Token refresh (POST /auth/refresh)
5. Admin-only endpoints (GET /admin/users, PATCH role, DELETE user)
All tests use mocked DB fixtures and require no live database.
"""
from datetime import datetime, timedelta, timezone
from unittest.mock import MagicMock, patch
import jwt as pyjwt
import pytest
from fastapi.testclient import TestClient
from SPARC.api import app
from SPARC.auth import create_access_token, create_refresh_token
from SPARC.auth import (
JWT_ALGORITHM,
JWT_SECRET,
create_access_token,
create_refresh_token,
)
@pytest.fixture
@@ -171,12 +187,6 @@ class TestGetMe:
def test_expired_token_returns_401(self, client, mock_db):
"""An expired token should return 401."""
# Create a token that has already expired
from datetime import timedelta
import jwt as pyjwt
from SPARC.auth import JWT_ALGORITHM, JWT_SECRET
payload = {
"sub": "1",
"email": "user@test.com",
@@ -300,3 +310,193 @@ class TestAdminUsers:
assert response.status_code == 400
assert "own role" in response.json()["detail"].lower()
def test_role_change_nonexistent_user_returns_404(self, client, mock_db):
"""Changing role for a user that does not exist should return 404."""
admin = _make_admin_user()
mock_db.get_user_by_id.return_value = admin
mock_db.update_user_role.return_value = None
response = client.patch(
"/admin/users/999/role",
json={"role": "admin"},
headers=_auth_header(admin),
)
assert response.status_code == 404
assert "not found" in response.json()["detail"].lower()
def test_regular_user_cannot_change_role(self, client, mock_db):
"""Non-admin user should receive 403 when trying to change roles."""
user = _make_regular_user()
mock_db.get_user_by_id.return_value = user
response = client.patch(
"/admin/users/1/role",
json={"role": "admin"},
headers=_auth_header(user),
)
assert response.status_code == 403
class TestAdminDeleteUser:
"""DELETE /admin/users/{user_id}"""
def test_admin_can_delete_user(self, client, mock_db):
"""Admin should be able to delete another user."""
admin = _make_admin_user()
mock_db.get_user_by_id.return_value = admin
mock_db.delete_user.return_value = True
response = client.delete(
"/admin/users/2",
headers=_auth_header(admin),
)
assert response.status_code == 200
assert "deleted" in response.json()["message"].lower()
mock_db.delete_user.assert_called_once_with(2)
def test_admin_cannot_delete_self(self, client, mock_db):
"""Admin should not be able to delete themselves."""
admin = _make_admin_user()
mock_db.get_user_by_id.return_value = admin
response = client.delete(
"/admin/users/1",
headers=_auth_header(admin),
)
assert response.status_code == 400
assert "yourself" in response.json()["detail"].lower()
def test_delete_nonexistent_user_returns_404(self, client, mock_db):
"""Deleting a user that does not exist should return 404."""
admin = _make_admin_user()
mock_db.get_user_by_id.return_value = admin
mock_db.delete_user.return_value = False
response = client.delete(
"/admin/users/999",
headers=_auth_header(admin),
)
assert response.status_code == 404
assert "not found" in response.json()["detail"].lower()
def test_regular_user_cannot_delete_user(self, client, mock_db):
"""Non-admin user should receive 403 when trying to delete users."""
user = _make_regular_user()
mock_db.get_user_by_id.return_value = user
response = client.delete(
"/admin/users/1",
headers=_auth_header(user),
)
assert response.status_code == 403
def test_no_token_cannot_delete_user(self, client):
"""Missing token should be rejected for delete endpoint."""
response = client.delete("/admin/users/1")
assert response.status_code in (401, 403)
class TestEdgeCases:
"""Additional edge-case tests for auth robustness."""
def test_register_invalid_email_returns_422(self, client, mock_db):
"""Registration with an invalid email format should return 422."""
response = client.post(
"/auth/register",
json={"email": "not-an-email", "password": "securepass123"},
)
assert response.status_code == 422
def test_register_short_password_returns_422(self, client, mock_db):
"""Registration with a password shorter than 8 chars should return 422."""
response = client.post(
"/auth/register",
json={"email": "user@test.com", "password": "short"},
)
assert response.status_code == 422
def test_register_missing_fields_returns_422(self, client, mock_db):
"""Registration with missing fields should return 422."""
response = client.post("/auth/register", json={})
assert response.status_code == 422
def test_login_missing_fields_returns_422(self, client, mock_db):
"""Login with missing fields should return 422."""
response = client.post("/auth/login", json={"email": "user@test.com"})
assert response.status_code == 422
def test_malformed_token_returns_401(self, client, mock_db):
"""A completely malformed token string should return 401."""
response = client.get(
"/auth/me",
headers={"Authorization": "Bearer not.a.valid.jwt.token"},
)
assert response.status_code == 401
def test_token_with_wrong_secret_returns_401(self, client, mock_db):
"""A token signed with a different secret should return 401."""
payload = {
"sub": "1",
"email": "user@test.com",
"role": "user",
"exp": datetime.now(timezone.utc) + timedelta(hours=1),
"type": "access",
}
wrong_secret_token = pyjwt.encode(payload, "wrong-secret", algorithm=JWT_ALGORITHM)
response = client.get(
"/auth/me",
headers={"Authorization": f"Bearer {wrong_secret_token}"},
)
assert response.status_code == 401
def test_token_for_deleted_user_returns_401(self, client, mock_db):
"""A valid token for a user no longer in the DB should return 401."""
user = _make_regular_user()
mock_db.get_user_by_id.return_value = None # user was deleted
response = client.get("/auth/me", headers=_auth_header(user))
assert response.status_code == 401
def test_refresh_for_deleted_user_returns_401(self, client, mock_db):
"""Refreshing a token for a deleted user should return 401."""
user = _make_regular_user()
mock_db.get_user_by_id.return_value = None
refresh = create_refresh_token(user["id"], user["email"], user["role"])
response = client.post(
"/auth/refresh", json={"refresh_token": refresh}
)
assert response.status_code == 401
def test_login_returns_decodable_tokens(self, client, mock_db):
"""Tokens returned by login should be decodable and contain expected claims."""
user = _make_regular_user()
mock_db.authenticate_user.return_value = user
response = client.post(
"/auth/login",
json={"email": "user@test.com", "password": "correctpassword"},
)
data = response.json()
access_payload = pyjwt.decode(
data["access_token"], JWT_SECRET, algorithms=[JWT_ALGORITHM]
)
assert access_payload["sub"] == str(user["id"])
assert access_payload["email"] == user["email"]
assert access_payload["type"] == "access"
refresh_payload = pyjwt.decode(
data["refresh_token"], JWT_SECRET, algorithms=[JWT_ALGORITHM]
)
assert refresh_payload["type"] == "refresh"
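Review note: `test_token_with_wrong_secret_returns_401` and `test_malformed_token_returns_401` both rely on signature verification failing. The standard-library sketch below illustrates why, assuming `JWT_ALGORITHM` is HS256 (the diff does not show its value). It is not PyJWT's implementation, it skips expiry and claim checks, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _signature(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def sign_hs256(payload: dict, secret: str) -> str:
    """Build a header.payload.signature token signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_signature(signing_input, secret)}"


def verify_hs256(token: str, secret: str) -> bool:
    """Check only the signature; a real verifier also validates the claims."""
    parts = token.split(".")
    if len(parts) != 3:
        return False  # malformed: a JWT has exactly three segments
    expected = _signature(f"{parts[0]}.{parts[1]}", secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), parts[2].encode("utf-8"))
```

A token signed with one secret fails verification under any other secret, and a string such as `not.a.valid.jwt.token` is rejected before any cryptography runs because it has five segments.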
+157
@@ -0,0 +1,157 @@
"""Tests for company name input validation on analysis endpoints."""
from datetime import datetime
from unittest.mock import Mock
import pytest
from fastapi.testclient import TestClient
from SPARC.api import app
from SPARC.types import CompanyAnalysisResult
@pytest.fixture
def client():
"""Create test client."""
return TestClient(app)
@pytest.fixture
def mock_analyzer(mocker):
"""Mock the global analyzer so valid requests succeed."""
mock = Mock()
mock._analyze_company_safe.return_value = CompanyAnalysisResult(
company_name="nvidia",
analysis="Test analysis",
patent_count=1,
success=True,
timestamp=datetime.now(),
)
mocker.patch("SPARC.api._analyzer", mock)
return mock
class TestCompanyNameValidation:
"""Test that company names are validated on analysis endpoints."""
# --- Too short ---
def test_single_char_rejected(self, client, mock_analyzer):
"""A one-character company name should be rejected."""
response = client.get("/analyze/X")
assert response.status_code == 422
# --- Too long ---
def test_over_100_chars_rejected(self, client, mock_analyzer):
"""A company name longer than 100 characters should be rejected."""
long_name = "A" * 101
response = client.get(f"/analyze/{long_name}")
assert response.status_code == 422
# --- Special characters ---
@pytest.mark.parametrize(
"bad_name",
[
"nvidia!",
"intel@corp",
"test#company",
"foo$bar",
"a%b",
"x^y",
"semi;colon",
"drop'table",
'say"hello',
"path/traversal",
"back\\slash",
"pipe|char",
"star*glob",
"question?mark",
"<script>",
"curly{brace}",
"equal=sign",
"plus+plus",
"comma,separated",
],
)
def test_special_chars_rejected(self, client, mock_analyzer, bad_name):
"""Company names with disallowed special characters should be rejected."""
response = client.get(f"/analyze/{bad_name}")
assert response.status_code == 422
# --- Valid names ---
@pytest.mark.parametrize(
"valid_name",
[
"nvidia",
"Intel",
"TSMC",
"Texas Instruments",
"Johnson-Johnson",
"AT&T",
"St. Jude Medical",
"3M",
"21st Century Fox",
"ab", # minimum length
"A" * 100, # maximum length
],
)
def test_valid_names_accepted(self, client, mock_analyzer, valid_name):
"""Valid company names should be accepted (200, not 422)."""
response = client.get(f"/analyze/{valid_name}")
# Should not be a validation error; 200 or other non-422 status is fine
assert response.status_code != 422
# --- Batch endpoint validation ---
def test_batch_too_short_rejected(self, client, mock_analyzer):
"""Batch endpoint should reject company names that are too short."""
response = client.post(
"/analyze/batch",
json={"companies": ["X"]},
)
assert response.status_code == 422
def test_batch_too_long_rejected(self, client, mock_analyzer):
"""Batch endpoint should reject company names that are too long."""
response = client.post(
"/analyze/batch",
json={"companies": ["A" * 101]},
)
assert response.status_code == 422
def test_batch_special_chars_rejected(self, client, mock_analyzer):
"""Batch endpoint should reject company names with special chars."""
response = client.post(
"/analyze/batch",
json={"companies": ["nvidia!", "intel"]},
)
assert response.status_code == 422
def test_batch_valid_names_accepted(self, client, mock_analyzer):
"""Batch endpoint should accept valid company names."""
response = client.post(
"/analyze/batch",
json={"companies": ["nvidia", "Intel", "AT&T"]},
)
assert response.status_code != 422
# --- Name must start with alphanumeric ---
def test_leading_space_rejected(self, client, mock_analyzer):
"""Company name starting with a space should be rejected."""
response = client.post(
"/analyze/batch",
json={"companies": [" nvidia"]},
)
assert response.status_code == 422
def test_leading_hyphen_rejected(self, client, mock_analyzer):
"""Company name starting with a hyphen should be rejected."""
response = client.post(
"/analyze/batch",
json={"companies": ["-nvidia"]},
)
assert response.status_code == 422
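Review note: the cases above pin down the validation rules fairly tightly: 2 to 100 characters, an alphanumeric first character, and only letters, digits, spaces, periods, ampersands, and hyphens afterwards. The sketch below encodes that reading as a single regular expression. It is inferred from the test cases, not taken from SPARC's validator, and the names `COMPANY_NAME_RE` and `is_valid_company_name` are hypothetical.

```python
import re

# Assumed rule set, inferred from the tests rather than from SPARC's code:
# first character alphanumeric, then 1-99 characters from the allowed set.
COMPANY_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9 .&-]{1,99}")


def is_valid_company_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed company-name rules."""
    # fullmatch anchors both ends, so a trailing newline cannot slip through.
    return COMPANY_NAME_RE.fullmatch(name) is not None
```

Using `fullmatch` instead of `^...$` with `match` is deliberate: `$` also matches just before a trailing newline, which would let `"nvidia\n"` through.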
+224
@@ -0,0 +1,224 @@
"""Tests for export endpoints: CSV and PDF export of analysis results.
Covers issue #1655:
- GET /export/{company_name} (CSV export)
- GET /export/{company_name}/pdf (PDF export)
All tests mock the database layer and follow the JWT auth fixture patterns used in test_auth.
"""
from datetime import datetime, timezone
from unittest.mock import MagicMock, patch
import pytest
from fastapi.testclient import TestClient
from SPARC.api import app
from SPARC.auth import create_access_token
@pytest.fixture
def client():
"""Create test client."""
return TestClient(app)
@pytest.fixture(autouse=True)
def mock_db():
"""Mock the database client used by export and auth endpoints."""
db = MagicMock()
# Default: user exists for auth
db.get_user_by_id.return_value = {
"id": 1,
"email": "user@test.com",
"role": "user",
"created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
# Mock get_conn for export queries
mock_cursor = MagicMock()
mock_conn = MagicMock()
mock_conn.cursor.return_value.__enter__ = MagicMock(return_value=mock_cursor)
mock_conn.cursor.return_value.__exit__ = MagicMock(return_value=False)
db.get_conn.return_value.__enter__ = MagicMock(return_value=mock_conn)
db.get_conn.return_value.__exit__ = MagicMock(return_value=False)
db._mock_cursor = mock_cursor
with patch("SPARC.api.get_db_client", return_value=db), \
patch("SPARC.auth.get_db_client", return_value=db):
yield db
def _auth_header():
"""Create an Authorization header with a valid access token."""
token = create_access_token(1, "user@test.com", "user")
return {"Authorization": f"Bearer {token}"}
def _sample_rows():
"""Return sample llm_messages rows as tuples (matching cursor.fetchall format)."""
return [
(
"NVIDIA",
"company_analysis",
"anthropic/claude-3.5-sonnet",
"Strong AI patent portfolio with focus on GPU architectures.",
datetime(2025, 6, 15, 10, 30, 0),
),
(
"NVIDIA",
"patent_analysis",
"openai/gpt-4o",
"Patent US-12345678-B2 covers novel tensor core design.",
datetime(2025, 6, 14, 9, 0, 0),
),
]
class TestCSVExport:
"""GET /export/{company_name} -- CSV export."""
def test_csv_export_success(self, client, mock_db):
"""Valid company with results returns a CSV file."""
mock_db._mock_cursor.fetchall.return_value = _sample_rows()
response = client.get("/export/NVIDIA", headers=_auth_header())
assert response.status_code == 200
assert response.headers["content-type"].startswith("text/csv")
assert "attachment" in response.headers.get("content-disposition", "")
assert "sparc_nvidia_export.csv" in response.headers["content-disposition"]
# Verify CSV content (splitlines() handles the \r\n line endings the csv module emits)
lines = response.text.strip().splitlines()
assert len(lines) == 3 # header + 2 data rows
assert lines[0].strip() == "company_name,analysis_type,model,analysis,timestamp"
assert "NVIDIA" in lines[1]
assert "company_analysis" in lines[1]
def test_csv_export_no_results_returns_404(self, client, mock_db):
"""Unknown company returns 404."""
mock_db._mock_cursor.fetchall.return_value = []
response = client.get("/export/nonexistent", headers=_auth_header())
assert response.status_code == 404
assert "No analysis results found" in response.json()["detail"]
def test_csv_export_unauthenticated_returns_401(self, client):
"""Request without token returns 401."""
response = client.get("/export/NVIDIA")
assert response.status_code == 401
def test_csv_export_invalid_token_returns_401(self, client):
"""Request with invalid token returns 401."""
response = client.get(
"/export/NVIDIA",
headers={"Authorization": "Bearer invalid.token.here"},
)
assert response.status_code == 401
def test_csv_export_filename_sanitization(self, client, mock_db):
"""Company names with spaces get sanitized in the filename."""
mock_db._mock_cursor.fetchall.return_value = [
(
"Tesla Motors",
"company_analysis",
"anthropic/claude-3.5-sonnet",
"EV patent portfolio analysis.",
datetime(2025, 6, 15, 10, 0, 0),
),
]
response = client.get("/export/Tesla Motors", headers=_auth_header())
assert response.status_code == 200
assert "tesla_motors" in response.headers["content-disposition"]
def test_csv_export_single_row(self, client, mock_db):
"""Single analysis result produces valid CSV with one data row."""
mock_db._mock_cursor.fetchall.return_value = [_sample_rows()[0]]
response = client.get("/export/NVIDIA", headers=_auth_header())
assert response.status_code == 200
lines = response.text.strip().splitlines()
assert len(lines) == 2 # header + 1 data row
class TestPDFExport:
"""GET /export/{company_name}/pdf -- PDF report export."""
def test_pdf_export_success(self, client, mock_db):
"""Valid company with results returns a PDF file."""
mock_db._mock_cursor.fetchall.return_value = _sample_rows()
response = client.get("/export/NVIDIA/pdf", headers=_auth_header())
assert response.status_code == 200
assert response.headers["content-type"] == "application/pdf"
assert "attachment" in response.headers.get("content-disposition", "")
# PDF files start with %PDF
assert response.content[:4] == b"%PDF"
def test_pdf_export_no_results_returns_404(self, client, mock_db):
"""Unknown company returns 404."""
mock_db._mock_cursor.fetchall.return_value = []
response = client.get("/export/nonexistent/pdf", headers=_auth_header())
assert response.status_code == 404
assert "No analysis results found" in response.json()["detail"]
def test_pdf_export_unauthenticated_returns_401(self, client):
"""Request without token returns 401."""
response = client.get("/export/NVIDIA/pdf")
assert response.status_code == 401
def test_pdf_export_invalid_token_returns_401(self, client):
"""Request with invalid token returns 401."""
response = client.get(
"/export/NVIDIA/pdf",
headers={"Authorization": "Bearer invalid.token.here"},
)
assert response.status_code == 401
def test_pdf_export_filename_contains_date(self, client, mock_db):
"""PDF filename includes the analysis date."""
mock_db._mock_cursor.fetchall.return_value = _sample_rows()
response = client.get("/export/NVIDIA/pdf", headers=_auth_header())
assert response.status_code == 200
disposition = response.headers["content-disposition"]
assert "nvidia-analysis-" in disposition
assert ".pdf" in disposition
def test_pdf_export_special_chars_in_response(self, client, mock_db):
"""Analysis text with XML-special chars (<, >, &) does not break PDF generation."""
rows = [
(
"TestCo",
"company_analysis",
"anthropic/claude-3.5-sonnet",
"Revenue > $1B & growth <20% for Q4. Test <html> escaping.",
datetime(2025, 6, 15, 10, 0, 0),
),
]
mock_db._mock_cursor.fetchall.return_value = rows
response = client.get("/export/TestCo/pdf", headers=_auth_header())
assert response.status_code == 200
assert response.content[:4] == b"%PDF"
def test_pdf_export_multiple_analyses(self, client, mock_db):
"""Multiple analysis records produce a valid PDF with content."""
mock_db._mock_cursor.fetchall.return_value = _sample_rows()
response = client.get("/export/NVIDIA/pdf", headers=_auth_header())
assert response.status_code == 200
# PDF should have reasonable size (more than just headers)
assert len(response.content) > 500
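Review note: the CSV tests above check the header row, the row count, and the sanitized filename. A minimal sketch of an export helper that would satisfy those checks is shown below. The function names and the exact sanitization rule are assumptions; the diff only shows the expected header and the `sparc_nvidia_export.csv` / `tesla_motors` filename fragments.

```python
import csv
import io
import re

HEADER = ["company_name", "analysis_type", "model", "analysis", "timestamp"]


def export_filename(company_name: str) -> str:
    """Lower-case the name and collapse runs of non-alphanumerics to '_'."""
    slug = re.sub(r"[^a-z0-9]+", "_", company_name.lower()).strip("_")
    return f"sparc_{slug}_export.csv"


def rows_to_csv(rows) -> str:
    """Serialize row tuples to CSV text, header first."""
    buffer = io.StringIO()
    # The default csv dialect terminates every row with \r\n.
    writer = csv.writer(buffer)
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()
```

Because the default dialect ends rows with `\r\n`, tests that split on `"\n"` leave a trailing `\r` on each line; `str.splitlines()` avoids that.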
+169
@@ -0,0 +1,169 @@
"""Tests for cursor-based pagination on /analyze/batch GET and /jobs endpoints."""
from datetime import datetime, timedelta
from unittest.mock import Mock, patch
import pytest
from fastapi.testclient import TestClient
from SPARC.api import app
@pytest.fixture
def client():
"""Create test client."""
return TestClient(app)
def _make_analysis_row(id_: int, minutes_ago: int = 0, company: str = "nvidia"):
"""Create a fake analysis row dict."""
ts = datetime.now() - timedelta(minutes=minutes_ago)
return {
"id": id_,
"company_name": company,
"analysis_type": "patent_portfolio",
"model": "openai/gpt-4o",
"response": f"Analysis for {company}",
"timestamp": ts,
}
def _make_job_row(job_id: str, minutes_ago: int = 0, status: str = "completed"):
"""Create a fake job row dict."""
ts = datetime.now() - timedelta(minutes=minutes_ago)
return {
"job_id": job_id,
"status": status,
"progress": 100 if status == "completed" else 0,
"total_companies": 1,
"completed_companies": 1 if status == "completed" else 0,
"result": None,
"error": None,
"created_at": ts,
}
class TestAnalyzeBatchGetPagination:
"""Test cursor-based pagination on GET /analyze/batch."""
@patch("SPARC.api._get_job_db")
def test_returns_items_and_no_cursor_when_less_than_limit(self, mock_get_db, client):
"""When fewer results than limit, next_cursor should be null."""
db = Mock()
db.list_analyses.return_value = [
_make_analysis_row(1, minutes_ago=10),
_make_analysis_row(2, minutes_ago=20),
]
mock_get_db.return_value = db
response = client.get("/analyze/batch?limit=10")
assert response.status_code == 200
data = response.json()
assert len(data["items"]) == 2
assert data["next_cursor"] is None
@patch("SPARC.api._get_job_db")
def test_returns_cursor_when_more_results_exist(self, mock_get_db, client):
"""When more results exist than limit, next_cursor should be set."""
db = Mock()
# Return limit+1 rows to simulate more data
rows = [_make_analysis_row(i, minutes_ago=i) for i in range(4)]
db.list_analyses.return_value = rows
mock_get_db.return_value = db
response = client.get("/analyze/batch?limit=3")
assert response.status_code == 200
data = response.json()
assert len(data["items"]) == 3
assert data["next_cursor"] is not None
@patch("SPARC.api._get_job_db")
def test_cursor_passed_to_db(self, mock_get_db, client):
"""The cursor query param should be forwarded to the database layer."""
db = Mock()
db.list_analyses.return_value = []
mock_get_db.return_value = db
client.get("/analyze/batch?cursor=2025-01-01T00:00:00|42")
db.list_analyses.assert_called_once()
call_kwargs = db.list_analyses.call_args
assert call_kwargs.kwargs.get("cursor") == "2025-01-01T00:00:00|42"
@patch("SPARC.api._get_job_db")
def test_default_limit_is_50(self, mock_get_db, client):
"""Default limit should be 50."""
db = Mock()
db.list_analyses.return_value = []
mock_get_db.return_value = db
client.get("/analyze/batch")
call_kwargs = db.list_analyses.call_args
# The endpoint requests limit+1 from DB, so 51
assert 51 in call_kwargs.args or call_kwargs.kwargs.get("limit") == 51
def test_limit_over_200_rejected(self, client):
"""Limit > 200 should be rejected with 422."""
response = client.get("/analyze/batch?limit=201")
assert response.status_code == 422
def test_limit_zero_rejected(self, client):
"""Limit < 1 should be rejected with 422."""
response = client.get("/analyze/batch?limit=0")
assert response.status_code == 422
@patch("SPARC.api._get_job_db")
def test_company_name_filter(self, mock_get_db, client):
"""The company_name filter should be forwarded to the database."""
db = Mock()
db.list_analyses.return_value = []
mock_get_db.return_value = db
client.get("/analyze/batch?company_name=intel")
call_kwargs = db.list_analyses.call_args
assert call_kwargs.kwargs.get("company_name") == "intel" or "intel" in call_kwargs.args
@patch("SPARC.api._get_job_db")
def test_empty_result_set(self, mock_get_db, client):
"""Empty result set returns empty items and null cursor."""
db = Mock()
db.list_analyses.return_value = []
mock_get_db.return_value = db
response = client.get("/analyze/batch")
assert response.status_code == 200
data = response.json()
assert data["items"] == []
assert data["next_cursor"] is None
class TestJobsPaginationDefaults:
"""Test that /jobs endpoint uses updated defaults."""
@patch("SPARC.api._get_job_db")
def test_default_limit_is_50(self, mock_get_db, client):
"""Default limit should now be 50."""
db = Mock()
db.list_jobs.return_value = []
mock_get_db.return_value = db
client.get("/jobs")
call_kwargs = db.list_jobs.call_args
# Endpoint requests limit+1 from DB, so 51
assert 51 in call_kwargs.args or call_kwargs.kwargs.get("limit") == 51
def test_limit_over_200_rejected(self, client):
"""Limit > 200 should be rejected with 422."""
response = client.get("/jobs?limit=201")
assert response.status_code == 422
@patch("SPARC.api._get_job_db")
def test_limit_200_accepted(self, mock_get_db, client):
"""Limit of exactly 200 should be accepted."""
db = Mock()
db.list_jobs.return_value = []
mock_get_db.return_value = db
response = client.get("/jobs?limit=200")
assert response.status_code == 200
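Review note: several tests above depend on the "fetch limit + 1" trick: the endpoint asks the database for one extra row, returns at most `limit` items, and sets `next_cursor` only if the extra row came back. The sketch below shows that logic in isolation. The `"<ISO timestamp>|<id>"` cursor encoding is an assumption based on the cursor string used in `test_cursor_passed_to_db`, and `paginate` is a hypothetical name.

```python
from typing import Any, Optional


def paginate(rows: list[dict[str, Any]], limit: int) -> dict[str, Any]:
    """Build one page from `rows`, which the caller fetched with limit + 1.

    Assumes limit >= 1, which the endpoint enforces by rejecting limit=0.
    """
    has_more = len(rows) > limit
    items = rows[:limit]
    next_cursor: Optional[str] = None
    if has_more:
        last = items[-1]
        # Assumed encoding: timestamp and id of the last row actually returned.
        next_cursor = f"{last['timestamp'].isoformat()}|{last['id']}"
    return {"items": items, "next_cursor": next_cursor}
```

Pairing the timestamp with the row id keeps the cursor unambiguous when two rows share the same timestamp.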
+2 -1
@@ -1,7 +1,8 @@
"""Tests for rate limiting on auth endpoints."""
from unittest.mock import MagicMock, patch
import pytest
from unittest.mock import Mock, patch, MagicMock
from fastapi.testclient import TestClient
from SPARC.api import app
+178
@@ -0,0 +1,178 @@
"""Tests for the /admin/rate-limits endpoint."""
from unittest.mock import patch
import pytest
from fastapi.testclient import TestClient
from SPARC import api
from SPARC.api import app
from SPARC.auth import UserResponse
@pytest.fixture
def client():
"""Create test client."""
return TestClient(app)
@pytest.fixture(autouse=True)
def reset_stats():
"""Reset rate limit stats between tests."""
api._rate_limit_stats.clear()
api._rejected_log.clear()
yield
api._rate_limit_stats.clear()
api._rejected_log.clear()
def _mock_admin():
"""Return a mock admin user."""
return UserResponse(id=1, email="admin@test.com", role="admin", created_at="2025-01-01T00:00:00")
def _mock_user():
"""Return a mock non-admin user."""
return UserResponse(id=2, email="user@test.com", role="user", created_at="2025-01-01T00:00:00")
class TestRateLimitAdminEndpoint:
"""Test GET /admin/rate-limits."""
def test_admin_can_access(self, client):
"""Admin users should be able to access the rate-limits endpoint."""
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
assert response.status_code == 200
data = response.json()
assert "rate_limits" in data
assert isinstance(data["rate_limits"], list)
finally:
app.dependency_overrides.clear()
def test_non_admin_rejected(self, client):
"""Requests without admin credentials (here: no token at all) should get 401/403."""
response = client.get("/admin/rate-limits")
assert response.status_code in (401, 403)
def test_returns_configured_endpoints(self, client):
"""Should list all rate-limited endpoints."""
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
assert response.status_code == 200
data = response.json()
endpoints = [rl["endpoint"] for rl in data["rate_limits"]]
assert "/auth/register" in endpoints
assert "/auth/login" in endpoints
finally:
app.dependency_overrides.clear()
def test_empty_state_shows_zero_counts(self, client):
"""When no requests have been made, counts should be zero."""
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
data = response.json()
for rl in data["rate_limits"]:
assert rl["total_requests"] == 0
assert rl["rejected_requests"] == 0
assert rl["by_ip"] == []
assert data["throttled_24h"] == 0
assert data["throttled_over_time"] == []
finally:
app.dependency_overrides.clear()
def test_tracks_requests(self, client):
"""After making requests, the stats should reflect them."""
api._track_rate_limit_request("/auth/login", "127.0.0.1")
api._track_rate_limit_request("/auth/login", "127.0.0.1")
api._track_rate_limit_request("/auth/login", "192.168.1.1", rejected=True)
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
data = response.json()
login_stats = next(rl for rl in data["rate_limits"] if rl["endpoint"] == "/auth/login")
assert login_stats["total_requests"] == 3
assert login_stats["rejected_requests"] == 1
finally:
app.dependency_overrides.clear()
def test_includes_limit_config(self, client):
"""Each endpoint entry should include the rate limit config string."""
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
data = response.json()
for rl in data["rate_limits"]:
assert "limit" in rl
assert isinstance(rl["limit"], str)
finally:
app.dependency_overrides.clear()
def test_per_ip_breakdown(self, client):
"""Stats should include per-IP breakdown with total and rejected counts."""
api._track_rate_limit_request("/auth/login", "10.0.0.1")
api._track_rate_limit_request("/auth/login", "10.0.0.1", rejected=True)
api._track_rate_limit_request("/auth/login", "10.0.0.2")
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
data = response.json()
login_stats = next(rl for rl in data["rate_limits"] if rl["endpoint"] == "/auth/login")
by_ip = login_stats["by_ip"]
assert len(by_ip) == 2
ip1 = next(entry for entry in by_ip if entry["ip"] == "10.0.0.1")
assert ip1["total"] == 2
assert ip1["rejected"] == 1
ip2 = next(entry for entry in by_ip if entry["ip"] == "10.0.0.2")
assert ip2["total"] == 1
assert ip2["rejected"] == 0
finally:
app.dependency_overrides.clear()
def test_throttled_24h_count(self, client):
"""Should report total throttled requests in the last 24 hours."""
api._track_rate_limit_request("/auth/login", "10.0.0.1", rejected=True)
api._track_rate_limit_request("/auth/register", "10.0.0.2", rejected=True)
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
data = response.json()
assert data["throttled_24h"] == 2
finally:
app.dependency_overrides.clear()
def test_throttled_over_time_structure(self, client):
"""Throttled-over-time should be a list of {timestamp, count} buckets."""
api._track_rate_limit_request("/auth/login", "10.0.0.1", rejected=True)
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
data = response.json()
assert len(data["throttled_over_time"]) >= 1
entry = data["throttled_over_time"][0]
assert "timestamp" in entry
assert "count" in entry
assert entry["count"] >= 1
finally:
app.dependency_overrides.clear()
def test_response_shape_matches_contract(self, client):
"""The full response should match the expected shape for the frontend."""
app.dependency_overrides[api.get_current_admin] = _mock_admin
try:
response = client.get("/admin/rate-limits")
data = response.json()
# Top-level keys
assert set(data.keys()) == {"rate_limits", "throttled_24h", "throttled_over_time"}
# Each rate_limit entry
for rl in data["rate_limits"]:
assert set(rl.keys()) == {"endpoint", "limit", "total_requests", "rejected_requests", "by_ip"}
finally:
app.dependency_overrides.clear()
+7
@@ -14,6 +14,7 @@ class TestJWTSecretStartupCheck:
with patch.dict(os.environ, {"APP_ENV": "production"}):
# Reload config to pick up the new APP_ENV
import importlib
import SPARC.config
importlib.reload(SPARC.config)
@@ -31,6 +32,7 @@ class TestJWTSecretStartupCheck:
"""Starting with default secret and APP_ENV=development must not raise."""
with patch.dict(os.environ, {"APP_ENV": "development"}):
import importlib
import SPARC.config
importlib.reload(SPARC.config)
@@ -46,6 +48,7 @@ class TestJWTSecretStartupCheck:
"""Starting with a custom secret in production must not raise."""
with patch.dict(os.environ, {"APP_ENV": "production"}):
import importlib
import SPARC.config
importlib.reload(SPARC.config)
@@ -65,6 +68,7 @@ class TestJWTSecretStartupCheck:
env.pop("APP_ENV", None)
with patch.dict(os.environ, env, clear=True):
import importlib
import SPARC.config
importlib.reload(SPARC.config)
@@ -84,6 +88,7 @@ class TestCORSConfig:
"""When CORS_ORIGINS is unset, defaults to localhost origins."""
with patch.dict(os.environ, {"CORS_ORIGINS": ""}):
import importlib
import SPARC.config
importlib.reload(SPARC.config)
assert SPARC.config.cors_origins == [
@@ -95,6 +100,7 @@ class TestCORSConfig:
"""Setting CORS_ORIGINS configures allowed origins."""
with patch.dict(os.environ, {"CORS_ORIGINS": "https://sparc.example.com,https://app.example.com"}):
import importlib
import SPARC.config
importlib.reload(SPARC.config)
assert SPARC.config.cors_origins == [
@@ -109,6 +115,7 @@ class TestCORSConfig:
"""A single origin without comma works correctly."""
with patch.dict(os.environ, {"CORS_ORIGINS": "https://sparc.example.com"}):
import importlib
import SPARC.config
importlib.reload(SPARC.config)
assert SPARC.config.cors_origins == ["https://sparc.example.com"]
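These tests reload `SPARC.config` after patching `os.environ`, which suggests the origin list is computed once at import time. A minimal sketch of that parsing is shown below. The helper name `parse_cors_origins` and the default origin are assumptions, since the real default list is truncated in this diff.

```python
import os

# Assumed default; the actual localhost origins in SPARC.config are not visible here.
DEFAULT_ORIGINS = ["http://localhost:3000"]


def parse_cors_origins(raw):
    """Split a comma-separated origin list, falling back to defaults when empty."""
    origins = [item.strip() for item in (raw or "").split(",") if item.strip()]
    return origins or list(DEFAULT_ORIGINS)


# Evaluated at import time, which is why the tests need importlib.reload().
cors_origins = parse_cors_origins(os.environ.get("CORS_ORIGINS"))
```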
+263
@@ -0,0 +1,263 @@
"""Tests for S3/MinIO storage backend in storage.py.
Covers issue #1660:
- S3StorageBackend read, write, exists, path_for
- Error handling: NoSuchKey, generic S3 errors, bucket auto-creation
- get_storage_backend() factory function
- LocalStorageBackend (basic sanity checks)
"""
from unittest.mock import MagicMock, patch
import pytest
from SPARC.storage import LocalStorageBackend, S3StorageBackend, get_storage_backend
# ---------- S3StorageBackend ----------
class TestS3StorageBackend:
"""Tests for the S3-compatible storage backend."""
@pytest.fixture
def s3_backend(self):
"""Create an S3StorageBackend with a fully mocked boto3 client."""
with patch.dict("sys.modules", {"boto3": MagicMock()}):
import boto3 as mock_boto
mock_s3 = MagicMock()
mock_boto.client.return_value = mock_s3
mock_s3.head_bucket.return_value = {}
backend = S3StorageBackend(
bucket="test-bucket",
endpoint_url="http://minio:9000",
access_key="minioadmin",
secret_key="minioadmin",
)
# Expose mock for assertions
backend._mock_s3 = mock_s3
yield backend
def test_write_puts_object(self, s3_backend):
"""write() calls put_object with correct bucket, key, and body."""
s3_backend.write("US-12345678-B2.pdf", b"PDF content here")
s3_backend._mock_s3.put_object.assert_called_once_with(
Bucket="test-bucket",
Key="US-12345678-B2.pdf",
Body=b"PDF content here",
ContentType="application/pdf",
)
def test_read_returns_body(self, s3_backend):
"""read() returns the Body content from get_object."""
mock_body = MagicMock()
mock_body.read.return_value = b"PDF data"
s3_backend._mock_s3.get_object.return_value = {"Body": mock_body}
result = s3_backend.read("US-12345678-B2.pdf")
assert result == b"PDF data"
s3_backend._mock_s3.get_object.assert_called_once_with(
Bucket="test-bucket",
Key="US-12345678-B2.pdf",
)
def test_read_nosuchkey_raises_file_not_found(self, s3_backend):
"""read() raises FileNotFoundError when object does not exist."""
# Create a NoSuchKey exception class on the mock
nosuchkey = type("NoSuchKey", (Exception,), {})
s3_backend._mock_s3.exceptions.NoSuchKey = nosuchkey
s3_backend._mock_s3.get_object.side_effect = nosuchkey("not found")
# Reassign s3 to trigger the except branch
s3_backend.s3 = s3_backend._mock_s3
with pytest.raises(FileNotFoundError, match="S3 object not found"):
s3_backend.read("missing.pdf")
def test_read_generic_404_raises_file_not_found(self, s3_backend):
"""read() handles generic 404 errors from S3-compatible APIs."""
nosuchkey = type("NoSuchKey", (Exception,), {})
s3_backend._mock_s3.exceptions.NoSuchKey = nosuchkey
s3_backend.s3 = s3_backend._mock_s3
s3_backend.s3.get_object.side_effect = Exception("An error occurred (404)")
with pytest.raises(FileNotFoundError, match="S3 object not found"):
s3_backend.read("missing.pdf")
def test_read_other_error_re_raises(self, s3_backend):
"""read() re-raises non-404 errors."""
nosuchkey = type("NoSuchKey", (Exception,), {})
s3_backend._mock_s3.exceptions.NoSuchKey = nosuchkey
s3_backend.s3 = s3_backend._mock_s3
s3_backend.s3.get_object.side_effect = Exception("Internal server error")
with pytest.raises(Exception, match="Internal server error"):
s3_backend.read("some-file.pdf")
def test_exists_returns_true_for_existing_object(self, s3_backend):
"""exists() returns True when head_object succeeds with content."""
s3_backend._mock_s3.head_object.return_value = {"ContentLength": 1024}
assert s3_backend.exists("US-12345678-B2.pdf") is True
def test_exists_returns_false_for_missing_object(self, s3_backend):
"""exists() returns False when head_object raises an exception."""
s3_backend._mock_s3.head_object.side_effect = Exception("Not Found")
assert s3_backend.exists("missing.pdf") is False
def test_exists_returns_false_for_zero_length(self, s3_backend):
"""exists() returns False when object has zero content length."""
s3_backend._mock_s3.head_object.return_value = {"ContentLength": 0}
assert s3_backend.exists("empty.pdf") is False
def test_path_for_returns_s3_uri(self, s3_backend):
"""path_for() returns an s3:// URI."""
path = s3_backend.path_for("US-12345678-B2.pdf")
assert path == "s3://test-bucket/US-12345678-B2.pdf"
def test_constructor_creates_bucket_if_missing(self):
"""Constructor creates the bucket if head_bucket fails."""
with patch.dict("sys.modules", {"boto3": MagicMock()}):
import boto3 as mock_boto
mock_s3 = MagicMock()
mock_boto.client.return_value = mock_s3
mock_s3.head_bucket.side_effect = Exception("Bucket not found")
S3StorageBackend(
bucket="new-bucket",
endpoint_url="http://minio:9000",
access_key="admin",
secret_key="admin",
)
mock_s3.create_bucket.assert_called_once_with(Bucket="new-bucket")
def test_constructor_handles_bucket_creation_failure(self):
"""Constructor logs warning but does not crash if bucket creation fails."""
with patch.dict("sys.modules", {"boto3": MagicMock()}):
import boto3 as mock_boto
mock_s3 = MagicMock()
mock_boto.client.return_value = mock_s3
mock_s3.head_bucket.side_effect = Exception("Bucket not found")
mock_s3.create_bucket.side_effect = Exception("Permission denied")
# Should not raise
backend = S3StorageBackend(
bucket="locked-bucket",
endpoint_url="http://minio:9000",
access_key="admin",
secret_key="admin",
)
assert backend.bucket == "locked-bucket"
def test_constructor_passes_endpoint_and_credentials(self):
"""Constructor passes endpoint_url and credentials to boto3.client."""
with patch.dict("sys.modules", {"boto3": MagicMock()}):
import boto3 as mock_boto
mock_s3 = MagicMock()
mock_boto.client.return_value = mock_s3
S3StorageBackend(
bucket="test",
endpoint_url="http://minio:9000",
access_key="mykey",
secret_key="mysecret",
)
mock_boto.client.assert_called_with(
"s3",
endpoint_url="http://minio:9000",
aws_access_key_id="mykey",
aws_secret_access_key="mysecret",
)
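The read and exists tests above pin down an error-mapping contract: a `NoSuchKey` exception or a generic error mentioning 404 becomes `FileNotFoundError`, other errors propagate, and a zero-length object counts as absent. The sketch below restates that contract with plain callables instead of a boto3 client. `read_object`, `object_exists`, and `FakeNoSuchKey` are hypothetical names, and matching on the substring "404" is an assumption inferred from the test's error message.

```python
class FakeNoSuchKey(Exception):
    """Stand-in for the client's NoSuchKey exception class."""


def read_object(get_object, bucket, key, no_such_key=FakeNoSuchKey):
    """Fetch an object body, translating 'missing' errors to FileNotFoundError."""
    try:
        return get_object(Bucket=bucket, Key=key)["Body"].read()
    except no_such_key as exc:
        raise FileNotFoundError(f"S3 object not found: {key}") from exc
    except Exception as exc:
        # Assumption: some S3-compatible servers only signal 404 in the message.
        if "404" in str(exc):
            raise FileNotFoundError(f"S3 object not found: {key}") from exc
        raise


def object_exists(head_object, bucket, key):
    """Treat missing objects and zero-length objects as 'does not exist'."""
    try:
        return head_object(Bucket=bucket, Key=key).get("ContentLength", 0) > 0
    except Exception:
        return False
```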
# ---------- LocalStorageBackend ----------
class TestLocalStorageBackend:
"""Basic sanity checks for the local filesystem backend."""
def test_write_and_read(self, tmp_path):
"""Write and read round-trip produces identical content."""
backend = LocalStorageBackend(base_dir=str(tmp_path))
backend.write("test.pdf", b"hello world")
result = backend.read("test.pdf")
assert result == b"hello world"
def test_read_missing_file_raises(self, tmp_path):
"""Reading a non-existent file raises FileNotFoundError."""
backend = LocalStorageBackend(base_dir=str(tmp_path))
with pytest.raises(FileNotFoundError):
backend.read("nonexistent.pdf")
def test_exists_true_for_written_file(self, tmp_path):
"""exists() returns True after writing a file."""
backend = LocalStorageBackend(base_dir=str(tmp_path))
backend.write("test.pdf", b"data")
assert backend.exists("test.pdf") is True
def test_exists_false_for_missing_file(self, tmp_path):
"""exists() returns False for non-existent file."""
backend = LocalStorageBackend(base_dir=str(tmp_path))
assert backend.exists("missing.pdf") is False
def test_exists_false_for_empty_file(self, tmp_path):
"""exists() returns False for zero-length file."""
backend = LocalStorageBackend(base_dir=str(tmp_path))
backend.write("empty.pdf", b"")
assert backend.exists("empty.pdf") is False
def test_path_for_returns_full_path(self, tmp_path):
"""path_for() returns the full filesystem path."""
backend = LocalStorageBackend(base_dir=str(tmp_path))
path = backend.path_for("test.pdf")
assert path == str(tmp_path / "test.pdf")
# ---------- get_storage_backend() factory ----------
class TestGetStorageBackend:
"""Tests for the storage backend factory function."""
@patch("SPARC.storage.config")
def test_returns_local_backend_by_default(self, mock_config):
"""Default config returns LocalStorageBackend."""
mock_config.storage_backend = "local"
backend = get_storage_backend()
assert isinstance(backend, LocalStorageBackend)
@patch("SPARC.storage.config")
def test_returns_s3_backend_when_configured(self, mock_config):
"""Setting storage_backend=s3 returns S3StorageBackend."""
mock_config.storage_backend = "s3"
mock_config.s3_bucket = "test-bucket"
mock_config.s3_endpoint_url = "http://minio:9000"
mock_config.s3_access_key = "key"
mock_config.s3_secret_key = "secret"
with patch.dict("sys.modules", {"boto3": MagicMock()}):
backend = get_storage_backend()
assert isinstance(backend, S3StorageBackend)
@patch("SPARC.storage.config")
def test_case_insensitive_backend_selection(self, mock_config):
"""Backend selection is case-insensitive."""
mock_config.storage_backend = "LOCAL"
backend = get_storage_backend()
assert isinstance(backend, LocalStorageBackend)
+387
@@ -0,0 +1,387 @@
"""Tests for tracked company admin endpoints and scheduler integration.
Covers issue #1656:
- GET /admin/tracked (list tracked companies)
- POST /admin/tracked (add a tracked company)
- DELETE /admin/tracked/{company_name} (remove a tracked company)
- GET /admin/alerts (list alerts)
- scheduler.run_scheduled_analysis() integration
All tests mock the database layer and use JWT auth fixtures.
"""
from datetime import datetime, timezone
from unittest.mock import MagicMock, patch
import pytest
from fastapi.testclient import TestClient
from SPARC.api import app
from SPARC.auth import create_access_token
@pytest.fixture
def client():
"""Create test client."""
return TestClient(app)
@pytest.fixture(autouse=True)
def mock_db():
"""Mock the database client used by admin and auth endpoints."""
db = MagicMock()
# Default admin user for auth
db.get_user_by_id.return_value = {
"id": 1,
"email": "admin@test.com",
"role": "admin",
"created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
with patch("SPARC.api.get_db_client", return_value=db), \
patch("SPARC.auth.get_db_client", return_value=db):
yield db
def _admin_header():
"""Create an Authorization header with a valid admin access token."""
token = create_access_token(1, "admin@test.com", "admin")
return {"Authorization": f"Bearer {token}"}
def _user_header():
"""Create an Authorization header with a regular user access token."""
token = create_access_token(2, "user@test.com", "user")
return {"Authorization": f"Bearer {token}"}
# ---------- GET /admin/tracked ----------
class TestListTrackedCompanies:
"""GET /admin/tracked"""
def test_list_tracked_returns_companies(self, client, mock_db):
"""Admin can list tracked companies."""
mock_db.list_tracked_companies.return_value = [
{"company_name": "NVIDIA", "last_patent_count": 120, "last_analyzed": "2025-06-15"},
{"company_name": "AMD", "last_patent_count": 80, "last_analyzed": "2025-06-14"},
]
response = client.get("/admin/tracked", headers=_admin_header())
assert response.status_code == 200
data = response.json()
assert len(data) == 2
assert data[0]["company_name"] == "NVIDIA"
def test_list_tracked_empty(self, client, mock_db):
"""Returns empty list when no companies are tracked."""
mock_db.list_tracked_companies.return_value = []
response = client.get("/admin/tracked", headers=_admin_header())
assert response.status_code == 200
assert response.json() == []
def test_list_tracked_requires_admin(self, client, mock_db):
"""Regular user cannot access tracked companies list."""
mock_db.get_user_by_id.return_value = {
"id": 2,
"email": "user@test.com",
"role": "user",
"created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
response = client.get("/admin/tracked", headers=_user_header())
assert response.status_code == 403
def test_list_tracked_unauthenticated(self, client):
"""Unauthenticated request returns 401."""
response = client.get("/admin/tracked")
assert response.status_code == 401
# ---------- POST /admin/tracked ----------
class TestAddTrackedCompany:
"""POST /admin/tracked"""
def test_add_tracked_company_success(self, client, mock_db):
"""Admin can add a company to tracking."""
mock_db.add_tracked_company.return_value = {
"company_name": "Intel",
"last_patent_count": 0,
"last_analyzed": None,
}
response = client.post(
"/admin/tracked",
json={"company_name": "Intel"},
headers=_admin_header(),
)
assert response.status_code == 200
data = response.json()
assert data["company_name"] == "Intel"
mock_db.add_tracked_company.assert_called_once_with("Intel")
def test_add_duplicate_returns_409(self, client, mock_db):
"""Adding an already-tracked company returns 409."""
mock_db.add_tracked_company.return_value = None
response = client.post(
"/admin/tracked",
json={"company_name": "NVIDIA"},
headers=_admin_header(),
)
assert response.status_code == 409
assert "already tracked" in response.json()["detail"].lower()
def test_add_tracked_requires_admin(self, client, mock_db):
"""Regular user cannot add tracked companies."""
mock_db.get_user_by_id.return_value = {
"id": 2,
"email": "user@test.com",
"role": "user",
"created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
response = client.post(
"/admin/tracked",
json={"company_name": "Intel"},
headers=_user_header(),
)
assert response.status_code == 403
def test_add_tracked_empty_name_rejected(self, client):
"""Empty company name is rejected by validation."""
response = client.post(
"/admin/tracked",
json={"company_name": ""},
headers=_admin_header(),
)
assert response.status_code == 422 # Pydantic validation error
# ---------- DELETE /admin/tracked/{company_name} ----------
class TestRemoveTrackedCompany:
"""DELETE /admin/tracked/{company_name}"""
def test_remove_tracked_company_success(self, client, mock_db):
"""Admin can remove a tracked company."""
mock_db.remove_tracked_company.return_value = True
response = client.delete(
"/admin/tracked/NVIDIA",
headers=_admin_header(),
)
assert response.status_code == 200
assert "Stopped tracking" in response.json()["message"]
mock_db.remove_tracked_company.assert_called_once_with("NVIDIA")
def test_remove_nonexistent_returns_404(self, client, mock_db):
"""Removing a non-tracked company returns 404."""
mock_db.remove_tracked_company.return_value = False
response = client.delete(
"/admin/tracked/UnknownCorp",
headers=_admin_header(),
)
assert response.status_code == 404
assert "not found" in response.json()["detail"].lower()
def test_remove_tracked_requires_admin(self, client, mock_db):
"""Regular user cannot remove tracked companies."""
mock_db.get_user_by_id.return_value = {
"id": 2,
"email": "user@test.com",
"role": "user",
"created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
response = client.delete(
"/admin/tracked/NVIDIA",
headers=_user_header(),
)
assert response.status_code == 403
# ---------- GET /admin/alerts ----------
class TestListAlerts:
"""GET /admin/alerts"""
def test_list_alerts_returns_data(self, client, mock_db):
"""Admin can list alerts."""
mock_db.list_alerts.return_value = [
{
"id": 1,
"company_name": "NVIDIA",
"alert_type": "patent_count_change",
"message": "Patent count increased by 25%",
"created_at": "2025-06-15T10:00:00Z",
},
]
response = client.get("/admin/alerts", headers=_admin_header())
assert response.status_code == 200
data = response.json()
assert len(data) == 1
assert data[0]["alert_type"] == "patent_count_change"
def test_list_alerts_with_limit(self, client, mock_db):
"""Custom limit parameter is passed to the database."""
mock_db.list_alerts.return_value = []
response = client.get("/admin/alerts?limit=10", headers=_admin_header())
assert response.status_code == 200
mock_db.list_alerts.assert_called_once_with(limit=10)
def test_list_alerts_requires_admin(self, client, mock_db):
"""Regular user cannot access alerts."""
mock_db.get_user_by_id.return_value = {
"id": 2,
"email": "user@test.com",
"role": "user",
"created_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
response = client.get("/admin/alerts", headers=_user_header())
assert response.status_code == 403
# ---------- Scheduler integration ----------
class TestSchedulerIntegration:
"""Tests for scheduler.run_scheduled_analysis()."""
def test_no_tracked_companies_skips_analysis(self):
"""Scheduler does nothing when no companies are tracked."""
mock_db = MagicMock()
mock_db.list_tracked_companies.return_value = []
with patch("SPARC.scheduler.get_db_client", return_value=mock_db), \
patch("SPARC.scheduler.CompanyAnalyzer") as mock_analyzer_cls:
from SPARC.scheduler import run_scheduled_analysis
run_scheduled_analysis()
mock_analyzer_cls.assert_not_called()
def test_scheduler_analyzes_each_tracked_company(self):
"""Scheduler runs analysis for every tracked company."""
mock_db = MagicMock()
mock_db.list_tracked_companies.return_value = [
{"company_name": "NVIDIA", "last_patent_count": 100},
{"company_name": "AMD", "last_patent_count": 50},
]
mock_result_nvidia = MagicMock(success=True, patent_count=110)
mock_result_amd = MagicMock(success=True, patent_count=55)
mock_analyzer = MagicMock()
mock_analyzer._analyze_company_safe.side_effect = [mock_result_nvidia, mock_result_amd]
with patch("SPARC.scheduler.get_db_client", return_value=mock_db), \
patch("SPARC.scheduler.CompanyAnalyzer", return_value=mock_analyzer):
from SPARC.scheduler import run_scheduled_analysis
run_scheduled_analysis()
assert mock_analyzer._analyze_company_safe.call_count == 2
mock_db.update_tracked_company.assert_any_call("NVIDIA", 110)
mock_db.update_tracked_company.assert_any_call("AMD", 55)
def test_scheduler_triggers_alert_on_significant_change(self):
"""Scheduler stores an alert when patent count changes significantly."""
mock_db = MagicMock()
mock_db.list_tracked_companies.return_value = [
{"company_name": "Tesla", "last_patent_count": 100},
]
mock_result = MagicMock(success=True, patent_count=130) # 30% increase
mock_analyzer = MagicMock()
mock_analyzer._analyze_company_safe.return_value = mock_result
with patch("SPARC.scheduler.get_db_client", return_value=mock_db), \
patch("SPARC.scheduler.CompanyAnalyzer", return_value=mock_analyzer):
from SPARC.scheduler import run_scheduled_analysis
run_scheduled_analysis()
mock_db.store_alert.assert_called_once()
alert_kwargs = mock_db.store_alert.call_args.kwargs
assert alert_kwargs["company_name"] == "Tesla"
assert alert_kwargs["alert_type"] == "patent_count_change"
assert alert_kwargs["old_value"] == 100
assert alert_kwargs["new_value"] == 130
def test_scheduler_no_alert_for_small_change(self):
"""Scheduler does not alert when change is below threshold."""
mock_db = MagicMock()
mock_db.list_tracked_companies.return_value = [
{"company_name": "Intel", "last_patent_count": 100},
]
mock_result = MagicMock(success=True, patent_count=105) # 5% increase
mock_analyzer = MagicMock()
mock_analyzer._analyze_company_safe.return_value = mock_result
with patch("SPARC.scheduler.get_db_client", return_value=mock_db), \
patch("SPARC.scheduler.CompanyAnalyzer", return_value=mock_analyzer):
from SPARC.scheduler import run_scheduled_analysis
run_scheduled_analysis()
mock_db.store_alert.assert_not_called()
def test_scheduler_handles_analysis_failure(self):
"""Scheduler continues when one company fails analysis."""
mock_db = MagicMock()
mock_db.list_tracked_companies.return_value = [
{"company_name": "FailCo", "last_patent_count": 50},
{"company_name": "SuccessCo", "last_patent_count": 30},
]
mock_fail_result = MagicMock(success=False, error="API timeout")
mock_ok_result = MagicMock(success=True, patent_count=35)
mock_analyzer = MagicMock()
mock_analyzer._analyze_company_safe.side_effect = [mock_fail_result, mock_ok_result]
with patch("SPARC.scheduler.get_db_client", return_value=mock_db), \
patch("SPARC.scheduler.CompanyAnalyzer", return_value=mock_analyzer):
from SPARC.scheduler import run_scheduled_analysis
run_scheduled_analysis()
# FailCo should not get updated, SuccessCo should
mock_db.update_tracked_company.assert_called_once_with("SuccessCo", 35)
def test_scheduler_handles_exception_in_analysis(self):
"""Scheduler continues even when analysis raises an exception."""
mock_db = MagicMock()
mock_db.list_tracked_companies.return_value = [
{"company_name": "CrashCo", "last_patent_count": 10},
{"company_name": "OKCo", "last_patent_count": 20},
]
mock_ok_result = MagicMock(success=True, patent_count=22)
mock_analyzer = MagicMock()
mock_analyzer._analyze_company_safe.side_effect = [
RuntimeError("unexpected error"),
mock_ok_result,
]
with patch("SPARC.scheduler.get_db_client", return_value=mock_db), \
patch("SPARC.scheduler.CompanyAnalyzer", return_value=mock_analyzer):
from SPARC.scheduler import run_scheduled_analysis
run_scheduled_analysis()
# OKCo should still be processed
mock_db.update_tracked_company.assert_called_once_with("OKCo", 22)
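The scheduler tests above imply a loop that skips failed or crashing analyses, updates the stored count on success, and raises an alert only for a significant relative change. A rough sketch of that control flow follows. The 20% threshold is an assumption (the tests only show that 30% alerts and 5% does not), and `run_once`, `significant_change`, and the callback parameters are hypothetical names, not the actual `run_scheduled_analysis` internals.

```python
ALERT_THRESHOLD = 0.20  # assumed; the tests only bound it between 5% and 30%


def significant_change(old_count, new_count, threshold=ALERT_THRESHOLD):
    """Return True when the relative change in patent count meets the threshold."""
    if old_count == 0:
        # Assumption: any growth from zero counts as significant.
        return new_count > 0
    return abs(new_count - old_count) / old_count >= threshold


def run_once(companies, analyze, on_update, on_alert):
    """Analyze each tracked company; one failure must not stop the rest."""
    for company in companies:
        name = company["company_name"]
        old = company["last_patent_count"]
        try:
            result = analyze(name)
        except Exception:
            continue  # crash in analysis: move on to the next company
        if not result["success"]:
            continue  # reported failure: leave the stored count untouched
        new = result["patent_count"]
        if significant_change(old, new):
            on_alert(name, old, new)
        on_update(name, new)
```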
+280
@@ -0,0 +1,280 @@
"""Tests for webhook notification system: retry logic and Slack/Discord payload format.
Covers issue #1657:
- Retry logic with exponential backoff in _send_with_retry
- Slack/Discord payload formatting in _build_payload
- Generic HTTP POST payload formatting
- notify() dispatching to multiple URLs
- notify_job_completed() and notify_alert() convenience helpers
"""
from datetime import datetime
from unittest.mock import MagicMock, patch, call
import pytest
import requests
from SPARC.webhooks import (
MAX_RETRIES,
_build_payload,
_is_slack_url,
_send_with_retry,
notify,
notify_alert,
notify_job_completed,
)
class TestIsSlackUrl:
"""Tests for Slack/Discord URL detection."""
def test_slack_webhook_url(self):
assert _is_slack_url("https://hooks.slack.com/services/T00/B00/xxx") is True
def test_discord_webhook_url(self):
assert _is_slack_url("https://discord.com/api/webhooks/123/abc") is True
def test_generic_url(self):
assert _is_slack_url("https://example.com/webhook") is False
def test_empty_url(self):
assert _is_slack_url("") is False
class TestBuildPayload:
"""Tests for payload construction."""
def test_generic_payload_structure(self):
"""Generic payload includes event type, timestamp, and data."""
payload = _build_payload("job_completed", {"job_id": "abc123"})
assert payload["event"] == "job_completed"
assert payload["job_id"] == "abc123"
assert "timestamp" in payload
# Timestamp should be ISO format ending with Z
assert payload["timestamp"].endswith("Z")
def test_slack_payload_wraps_in_text(self):
"""Slack payload wraps content in a 'text' field."""
payload = _build_payload("patent_alert", {"company_name": "NVIDIA"}, slack=True)
assert "text" in payload
assert "patent_alert" in payload["text"]
assert "NVIDIA" in payload["text"]
# Slack payload should NOT have the event/timestamp at top level
assert "event" not in payload
assert "timestamp" not in payload
def test_generic_payload_does_not_have_text_field(self):
"""Non-Slack payload does not wrap in text."""
payload = _build_payload("job_completed", {"status": "done"})
assert "text" not in payload
assert payload["status"] == "done"
def test_slack_payload_contains_bold_header(self):
"""Slack payload starts with bold event header using Slack markdown."""
payload = _build_payload("job_completed", {"count": 5}, slack=True)
assert payload["text"].startswith("*[SPARC] job_completed*")
def test_payload_merges_all_data_keys(self):
"""All data keys are included in the generic payload."""
data = {"key1": "val1", "key2": 42, "key3": True}
payload = _build_payload("test_event", data)
assert payload["key1"] == "val1"
assert payload["key2"] == 42
assert payload["key3"] is True
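The payload tests above describe two shapes: a generic dict that merges the data keys with `event` and a `Z`-suffixed timestamp, and a Slack-style dict with a single `text` field that starts with a bold header. A minimal sketch consistent with those assertions is below. `build_payload` is a hypothetical stand-in for `_build_payload`, and the `key: value` line layout of the Slack text is an assumption.

```python
from datetime import datetime, timezone


def build_payload(event, data, slack=False):
    """Build a generic JSON payload, or a Slack-style {'text': ...} body."""
    if slack:
        lines = [f"*[SPARC] {event}*"]
        # Assumed layout: one "key: value" line per data item.
        lines += [f"{key}: {value}" for key, value in data.items()]
        return {"text": "\n".join(lines)}
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"event": event, "timestamp": timestamp, **data}
```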
class TestSendWithRetry:
"""Tests for retry logic in _send_with_retry."""
@patch("SPARC.webhooks.time.sleep")
@patch("SPARC.webhooks.requests.post")
def test_success_on_first_attempt(self, mock_post, mock_sleep):
"""Successful delivery on first attempt, no retries."""
mock_post.return_value = MagicMock(status_code=200)
result = _send_with_retry("https://example.com/hook", {"event": "test"})
assert result is True
mock_post.assert_called_once()
mock_sleep.assert_not_called()
@patch("SPARC.webhooks.time.sleep")
@patch("SPARC.webhooks.requests.post")
def test_success_on_second_attempt(self, mock_post, mock_sleep):
"""Fails first, succeeds on retry."""
mock_post.side_effect = [
MagicMock(status_code=500),
MagicMock(status_code=200),
]
result = _send_with_retry("https://example.com/hook", {"event": "test"})
assert result is True
assert mock_post.call_count == 2
mock_sleep.assert_called_once()
@patch("SPARC.webhooks.time.sleep")
@patch("SPARC.webhooks.requests.post")
def test_all_retries_exhausted(self, mock_post, mock_sleep):
"""Returns False after all retries fail."""
mock_post.return_value = MagicMock(status_code=500)
result = _send_with_retry("https://example.com/hook", {"event": "test"})
assert result is False
assert mock_post.call_count == MAX_RETRIES
assert mock_sleep.call_count == MAX_RETRIES - 1
@patch("SPARC.webhooks.time.sleep")
@patch("SPARC.webhooks.requests.post")
def test_exponential_backoff_timing(self, mock_post, mock_sleep):
"""Backoff wait times follow exponential pattern (2^attempt)."""
mock_post.return_value = MagicMock(status_code=500)
_send_with_retry("https://example.com/hook", {"event": "test"})
# With BACKOFF_BASE=2: attempt 1 -> sleep(2), attempt 2 -> sleep(4)
expected_waits = [call(2 ** i) for i in range(1, MAX_RETRIES)]
assert mock_sleep.call_args_list == expected_waits
@patch("SPARC.webhooks.time.sleep")
@patch("SPARC.webhooks.requests.post")
def test_network_error_triggers_retry(self, mock_post, mock_sleep):
"""Network exceptions trigger retry, not immediate failure."""
mock_post.side_effect = [
requests.ConnectionError("Connection refused"),
MagicMock(status_code=200),
]
result = _send_with_retry("https://example.com/hook", {"event": "test"})
assert result is True
assert mock_post.call_count == 2
@patch("SPARC.webhooks.time.sleep")
@patch("SPARC.webhooks.requests.post")
def test_timeout_error_triggers_retry(self, mock_post, mock_sleep):
"""Timeout exceptions trigger retry."""
mock_post.side_effect = [
requests.Timeout("Request timed out"),
MagicMock(status_code=200),
]
result = _send_with_retry("https://example.com/hook", {"event": "test"})
assert result is True
assert mock_post.call_count == 2
@patch("SPARC.webhooks.time.sleep")
@patch("SPARC.webhooks.requests.post")
def test_2xx_status_codes_accepted(self, mock_post, mock_sleep):
"""Any 2xx status code is treated as success."""
mock_post.return_value = MagicMock(status_code=204)
result = _send_with_retry("https://example.com/hook", {"event": "test"})
assert result is True
mock_post.assert_called_once()
@patch("SPARC.webhooks.time.sleep")
@patch("SPARC.webhooks.requests.post")
def test_posts_json_payload(self, mock_post, mock_sleep):
"""Payload is sent as JSON with correct timeout."""
mock_post.return_value = MagicMock(status_code=200)
payload = {"event": "test", "data": "value"}
_send_with_retry("https://example.com/hook", payload)
mock_post.assert_called_once_with(
"https://example.com/hook", json=payload, timeout=10
)
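Taken together, the retry tests specify: stop on the first 2xx response, treat non-2xx responses and network errors alike as retryable, sleep `BACKOFF_BASE ** attempt` between attempts, and never sleep after the final attempt. The sketch below follows that specification with the HTTP call and sleep injected so it runs without `requests`. `MAX_RETRIES = 3` is an assumed value, and catching bare `Exception` is a simplification; the real code may catch `requests.RequestException` instead.

```python
import time

MAX_RETRIES = 3   # assumed; the tests import MAX_RETRIES without showing its value
BACKOFF_BASE = 2  # named in a test comment above


def send_with_retry(url, payload, post, sleep=time.sleep):
    """POST until a 2xx response arrives or MAX_RETRIES attempts are used up."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            response = post(url, json=payload, timeout=10)
            if 200 <= response.status_code < 300:
                return True
        except Exception:
            # Simplification: any error from post() is treated as retryable.
            pass
        if attempt < MAX_RETRIES:
            sleep(BACKOFF_BASE ** attempt)  # 2s, then 4s with the values above
    return False
```

Injecting `post` and `sleep` keeps the function testable without patching module globals, which is a different design from the `@patch`-based tests above.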
class TestNotify:
"""Tests for the notify() dispatcher."""
@patch("SPARC.webhooks._send_with_retry")
@patch("SPARC.webhooks.WEBHOOK_URLS", ["https://example.com/hook1", "https://example.com/hook2"])
def test_dispatches_to_all_urls(self, mock_send):
"""notify() sends to every configured webhook URL."""
mock_send.return_value = True
notify("job_completed", {"job_id": "test123"})
assert mock_send.call_count == 2
@patch("SPARC.webhooks._send_with_retry")
@patch("SPARC.webhooks.WEBHOOK_URLS", [])
def test_no_urls_configured_returns_immediately(self, mock_send):
"""No-op when no webhook URLs are configured."""
notify("job_completed", {"job_id": "test123"})
mock_send.assert_not_called()
@patch("SPARC.webhooks._send_with_retry")
@patch("SPARC.webhooks.WEBHOOK_URLS", [
"https://hooks.slack.com/services/T00/B00/xxx",
"https://example.com/generic",
])
def test_slack_url_gets_slack_payload(self, mock_send):
"""Slack URLs receive Slack-formatted payloads, others get generic."""
mock_send.return_value = True
notify("test_event", {"key": "val"})
# First call (Slack URL) should have "text" key
slack_payload = mock_send.call_args_list[0].args[1]
assert "text" in slack_payload
# Second call (generic URL) should have "event" key
generic_payload = mock_send.call_args_list[1].args[1]
assert "event" in generic_payload
assert generic_payload["event"] == "test_event"
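The dispatcher tests above show `notify()` choosing a payload format per URL and doing nothing when no URLs are configured. A small sketch of that dispatch is given below. `is_slack_style` and `dispatch` are hypothetical names, and the host substrings are assumptions taken from the URLs used in `TestIsSlackUrl`.

```python
def is_slack_style(url):
    """Heuristic check; substrings are assumptions based on the test URLs."""
    return "hooks.slack.com" in url or "discord.com/api/webhooks" in url


def dispatch(event, data, urls, build, send):
    """Send one payload per URL, choosing the format from the URL."""
    results = []
    for url in urls:
        payload = build(event, data, slack=is_slack_style(url))
        results.append(send(url, payload))
    return results
```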
class TestNotifyJobCompleted:
"""Tests for notify_job_completed() convenience function."""
@patch("SPARC.webhooks.notify")
def test_sends_correct_event_and_data(self, mock_notify):
"""Job completion sends proper event type and summary."""
notify_job_completed(
job_id="batch-001",
status="completed",
total_companies=10,
successful=8,
failed=2,
)
mock_notify.assert_called_once()
event, data = mock_notify.call_args[0]
assert event == "job_completed"
assert data["job_id"] == "batch-001"
assert data["successful"] == 8
assert data["failed"] == 2
assert "8/10" in data["summary"]
class TestNotifyAlert:
"""Tests for notify_alert() convenience function."""
@patch("SPARC.webhooks.notify")
def test_sends_correct_event_and_data(self, mock_notify):
"""Alert notification sends patent_alert event type."""
notify_alert(
company_name="NVIDIA",
alert_type="patent_count_change",
message="Patent count increased by 30%",
)
mock_notify.assert_called_once()
event, data = mock_notify.call_args[0]
assert event == "patent_alert"
assert data["company_name"] == "NVIDIA"
assert data["alert_type"] == "patent_count_change"
assert "30%" in data["message"]