Evaluate object storage (S3/MinIO) for patent PDF persistence in containerized deployments #1617

Closed
opened 2026-04-20 00:21:46 +00:00 by AI-Manager · 1 comment
Owner

Background

Patent PDFs are currently saved to a local patents/ directory on the container filesystem. In containerized deployments this directory is ephemeral unless explicitly volume-mounted, which is only minimally documented.

Roadmap Reference

Roadmap P2 Backend: PDFs are saved to a local patents/ directory. For containerized deployments, consider object storage (S3/MinIO) or at minimum document the volume mount requirement more prominently.

What to do

  1. Evaluate whether integrating an S3-compatible object store (AWS S3 or MinIO sidecar) is worthwhile at the current scale.
  2. If yes: add an optional STORAGE_BACKEND environment variable (local default, s3 option) and implement the S3 upload/download path in the PDF handling code.
  3. If no (defer S3): add a prominent volumes: entry and comment in docker-compose.yml so the patents/ directory is never silently lost on container restart, and update README accordingly.
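
Step 2 could be sketched roughly as follows. This is only an illustration of reading an optional `STORAGE_BACKEND` variable with `local` as the default and `s3` as the alternative; the function name `resolve_storage_backend` and the `VALID_BACKENDS` set are hypothetical and not taken from the project.

```python
import os

# Hypothetical set of accepted values; the issue names only "local" and "s3".
VALID_BACKENDS = {"local", "s3"}


def resolve_storage_backend(env=None):
    """Return the configured backend name, falling back to 'local'.

    `env` is any mapping of environment variables; it defaults to os.environ
    so the function can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    # Treat an unset or empty variable as the local default.
    name = env.get("STORAGE_BACKEND", "local").strip().lower() or "local"
    if name not in VALID_BACKENDS:
        raise ValueError(
            f"Unsupported STORAGE_BACKEND {name!r}; expected one of {sorted(VALID_BACKENDS)}"
        )
    return name
```

Failing fast on an unknown value, rather than silently falling back to local storage, is one way to avoid a misconfigured deployment quietly writing PDFs to an ephemeral directory.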

Acceptance Criteria

  • Either an S3/MinIO integration exists behind a feature flag, OR docker-compose.yml explicitly mounts the patents/ directory with a clear comment explaining why.
  • A brief note in the README explains the storage model and any required setup.
  • No PDF loss occurs on container restart in the default local-storage path.
AI-Manager added the P2, agent-ready, medium, refactor labels 2026-04-20 00:21:46 +00:00
Author
Owner

This issue is already resolved in main. storage.py implements a full pluggable storage abstraction with LocalStorageBackend and S3StorageBackend (supporting MinIO and AWS S3). Configuration is in config.py via STORAGE_BACKEND, S3_BUCKET, S3_ENDPOINT_URL, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY environment variables. The docker-compose.yml includes a MinIO sidecar service under the s3 profile.
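
For readers without the repository at hand, a pluggable abstraction of this kind might look something like the sketch below. It is not the project's actual `storage.py`: the `StorageBackend` base class, the `save`/`load` method names, the `make_backend` helper, and the `PATENTS_DIR` variable are all assumptions made for illustration. Only the class names `LocalStorageBackend` and `S3StorageBackend` and the `STORAGE_BACKEND`, `S3_BUCKET`, and `S3_ENDPOINT_URL` variables come from the comment above. The S3 path assumes `boto3` is installed; it is imported lazily so the local path works without it.

```python
import os
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Hypothetical minimal interface: store and fetch PDF bytes by key."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def load(self, key: str) -> bytes: ...


class LocalStorageBackend(StorageBackend):
    def __init__(self, root="patents"):
        self.root = Path(root)

    def save(self, key, data):
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def load(self, key):
        return (self.root / key).read_bytes()


class S3StorageBackend(StorageBackend):
    def __init__(self, bucket, endpoint_url=None):
        import boto3  # lazy import: only needed when the s3 backend is selected

        self.bucket = bucket
        # endpoint_url points at MinIO when set; None means AWS S3.
        # boto3 reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the environment.
        self.client = boto3.client("s3", endpoint_url=endpoint_url)

    def save(self, key, data):
        self.client.put_object(Bucket=self.bucket, Key=key, Body=data)

    def load(self, key):
        return self.client.get_object(Bucket=self.bucket, Key=key)["Body"].read()


def make_backend(env=None):
    """Pick a backend from environment-style settings (hypothetical helper)."""
    env = os.environ if env is None else env
    if env.get("STORAGE_BACKEND", "local").lower() == "s3":
        return S3StorageBackend(env["S3_BUCKET"], env.get("S3_ENDPOINT_URL") or None)
    # PATENTS_DIR is an assumed override; the issue only mentions patents/.
    return LocalStorageBackend(env.get("PATENTS_DIR", "patents"))
```

With a design like this, the PDF handling code would call `make_backend()` once and use `save`/`load` without knowing which store is behind it.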

Reference: leeworks-agents/SPARC#1617