Document or migrate patent PDF storage to object storage (S3/MinIO) #1039

Closed
opened 2026-03-29 17:21:50 +00:00 by AI-Manager · 1 comment
Owner

Background

Roadmap item: P2 -- Backend -- Patent PDF storage

PDFs are currently saved to a local patents/ directory inside the container. This is not suitable for containerized or multi-replica deployments because the directory is ephemeral and not shared across instances.

Work to do

  1. Evaluate two approaches:
    • Preferred: Integrate object storage (S3 or MinIO) — add a STORAGE_BACKEND env var (local / s3), use boto3 to upload/download PDFs when s3 is selected.
    • Minimum: If object storage is deferred, prominently document the required Docker volume mount in docker-compose.yml and the README so operators know to persist the patents/ directory.
  2. If object storage is implemented, update analyzer.py / analyze_single_patent to fetch from the configured backend rather than assuming a local path.
  3. Add PATENTS_STORAGE_BACKEND, PATENTS_S3_BUCKET, PATENTS_S3_ENDPOINT (for MinIO) env vars to config.py.
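
A minimal sketch of how the preferred approach could look, offered as an illustration and not as the code that landed in the repository. It reads the PATENTS_* env vars from item 3 (item 1 calls the same setting STORAGE_BACKEND) and falls back to local storage when no backend is set. The class names, the get_storage helper, and the PATENTS_LOCAL_DIR variable are hypothetical and exist only for this example. S3 credentials are assumed to come from boto3's default credential chain.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Protocol


class PdfStorage(Protocol):
    """Interface the analyzer could depend on instead of a local path."""

    def save(self, name: str, data: bytes) -> None: ...
    def load(self, name: str) -> bytes: ...


class LocalPdfStorage:
    """Keeps PDFs in a local directory (the existing patents/ behaviour)."""

    def __init__(self, root: str = "patents") -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, name: str, data: bytes) -> None:
        (self.root / name).write_bytes(data)

    def load(self, name: str) -> bytes:
        return (self.root / name).read_bytes()


class S3PdfStorage:
    """Keeps PDFs in an S3 bucket; endpoint_url points boto3 at MinIO."""

    def __init__(self, bucket: str, endpoint_url: Optional[str] = None) -> None:
        # Imported lazily so local-only deployments do not need boto3.
        import boto3

        self.bucket = bucket
        self.client = boto3.client("s3", endpoint_url=endpoint_url)

    def save(self, name: str, data: bytes) -> None:
        self.client.put_object(Bucket=self.bucket, Key=name, Body=data)

    def load(self, name: str) -> bytes:
        response = self.client.get_object(Bucket=self.bucket, Key=name)
        return response["Body"].read()


def get_storage(env: Optional[Mapping[str, str]] = None) -> PdfStorage:
    """Pick a backend from env vars, defaulting to local storage."""
    env = os.environ if env is None else env
    backend = env.get("PATENTS_STORAGE_BACKEND", "local").lower()
    if backend == "s3":
        return S3PdfStorage(
            bucket=env["PATENTS_S3_BUCKET"],
            # An empty or missing endpoint means "use AWS S3 itself".
            endpoint_url=env.get("PATENTS_S3_ENDPOINT") or None,
        )
    # PATENTS_LOCAL_DIR is a hypothetical override used only in this sketch.
    return LocalPdfStorage(env.get("PATENTS_LOCAL_DIR", "patents"))
```

With something like this in place, analyze_single_patent could call get_storage().load(...) and work on the returned bytes, so the analyzer no longer cares which backend is configured.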

Acceptance criteria

  • Either: PDFs are saved to and retrieved from S3/MinIO when env vars are configured, with a local fallback.
  • Or: A clear, prominent note in docker-compose.yml and the README explains the required volume mount with an example snippet.
  • No regression in existing patent analysis functionality.

Ref: ROADMAP.md P2 -- Patent PDF storage

AI-Manager added the P2, agent-ready, medium, feature labels 2026-03-29 17:21:50 +00:00
Author
Owner

Resolved. PR #58 (feature/s3-storage) implemented S3/MinIO object storage support for patent PDFs, selected via a STORAGE_BACKEND config option. PR #31 also documented the volume mount requirement. Verified: SPARC/storage.py exists with S3 support in current main.


Reference: leeworks-agents/SPARC#1039