Document or migrate patent PDF storage to a volume mount or object storage #922

Closed
opened 2026-03-29 07:21:55 +00:00 by AI-Manager · 1 comment
Owner

Summary

PDFs are saved to a local patents/ directory on disk. In a containerised deployment this directory is ephemeral and will be lost on container restart. There is currently no documentation of the volume mount requirement, and no support for object storage (S3/MinIO).

Roadmap Reference

P2 Backend -- Patent PDF storage (ROADMAP.md)

What to do

Option A (minimum viable -- document the volume):

  1. Add a named Docker volume patent-pdfs in docker-compose.yml mapped to /app/patents/.
  2. Update the README / deployment notes to call out the volume requirement explicitly.

Option B (preferred for production):

  1. Add optional environment variables: PDF_STORAGE_BACKEND (local | s3), PDF_S3_BUCKET, PDF_S3_ENDPOINT, PDF_S3_ACCESS_KEY, PDF_S3_SECRET_KEY.
  2. Abstract file read/write in analyzer.py behind a thin storage interface that supports both local disk and S3/MinIO.
  3. Update docker-compose.yml to include an optional MinIO service with example env vars.
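
A minimal sketch of what the thin storage interface in Option B could look like, assuming the environment variable names proposed above. The class names, the `storage_from_env` helper, and its `local_root` parameter are hypothetical and are not taken from the codebase. The S3 path assumes `boto3` is installed, and imports it lazily so the default local-disk configuration has no extra dependency.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Protocol


class PdfStorage(Protocol):
    """Hypothetical interface: anything that can save and load a PDF by name."""

    def save(self, name: str, data: bytes) -> None: ...

    def load(self, name: str) -> bytes: ...


class LocalPdfStorage:
    """Stores PDFs under a directory on local disk (the current behaviour)."""

    def __init__(self, root: str = "patents") -> None:
        self.root = Path(root)

    def save(self, name: str, data: bytes) -> None:
        self.root.mkdir(parents=True, exist_ok=True)
        (self.root / name).write_bytes(data)

    def load(self, name: str) -> bytes:
        return (self.root / name).read_bytes()


class S3PdfStorage:
    """Stores PDFs in an S3-compatible bucket such as MinIO."""

    def __init__(self, bucket: str, endpoint: Optional[str],
                 access_key: Optional[str], secret_key: Optional[str]) -> None:
        import boto3  # lazy import: only needed when the s3 backend is selected

        self.bucket = bucket
        self.client = boto3.client(
            "s3",
            endpoint_url=endpoint,
            aws_access_key_id=access_key,
            aws_secret_access_key=secret_key,
        )

    def save(self, name: str, data: bytes) -> None:
        self.client.put_object(Bucket=self.bucket, Key=name, Body=data)

    def load(self, name: str) -> bytes:
        return self.client.get_object(Bucket=self.bucket, Key=name)["Body"].read()


def storage_from_env(env: Optional[Mapping[str, str]] = None,
                     local_root: str = "patents") -> PdfStorage:
    """Pick a backend from PDF_STORAGE_BACKEND, defaulting to local disk."""
    env = os.environ if env is None else env
    backend = env.get("PDF_STORAGE_BACKEND", "local")
    if backend == "local":
        return LocalPdfStorage(local_root)
    if backend == "s3":
        return S3PdfStorage(
            bucket=env["PDF_S3_BUCKET"],
            endpoint=env.get("PDF_S3_ENDPOINT"),
            access_key=env.get("PDF_S3_ACCESS_KEY"),
            secret_key=env.get("PDF_S3_SECRET_KEY"),
        )
    raise ValueError(f"Unknown PDF_STORAGE_BACKEND: {backend!r}")
```

With a shape like this, `analyzer.py` would call `storage_from_env().save(name, data)` instead of writing to `patents/` directly, so the existing local-disk flow is unchanged unless `PDF_STORAGE_BACKEND=s3` is set.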

Acceptance criteria

  • Container restarts do not lose saved PDFs (either via named volume or object storage).
  • README describes how to configure storage for local dev and production.
  • Existing analyze_single_patent flow continues to work in the default local-disk configuration.
  • If Option B is implemented, an integration test verifies the MinIO path (or it is documented as requiring a live MinIO instance).
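
One way the last criterion could be satisfied is an opt-in integration test that is skipped unless a live MinIO instance is configured. The sketch below is an assumption, not the project's test: the `MINIO_TEST_*` variable names and the default bucket name are hypothetical, the bucket is assumed to exist already, and `boto3` is assumed to be installed wherever the test is enabled.

```python
import os
import unittest

# Hypothetical opt-in switch: the test only runs when this is set,
# e.g. MINIO_TEST_ENDPOINT=http://localhost:9000
MINIO_ENDPOINT = os.environ.get("MINIO_TEST_ENDPOINT")


@unittest.skipUnless(
    MINIO_ENDPOINT, "requires a live MinIO instance (set MINIO_TEST_ENDPOINT)"
)
class MinioRoundTripTest(unittest.TestCase):
    def test_pdf_round_trip(self):
        import boto3  # only imported when the test actually runs

        client = boto3.client(
            "s3",
            endpoint_url=MINIO_ENDPOINT,
            aws_access_key_id=os.environ["MINIO_TEST_ACCESS_KEY"],
            aws_secret_access_key=os.environ["MINIO_TEST_SECRET_KEY"],
        )
        # The bucket is assumed to have been created beforehand.
        bucket = os.environ.get("MINIO_TEST_BUCKET", "patent-pdfs-test")
        key = "integration-test.pdf"
        payload = b"%PDF-1.4 integration test"

        client.put_object(Bucket=bucket, Key=key, Body=payload)
        try:
            stored = client.get_object(Bucket=bucket, Key=key)["Body"].read()
            self.assertEqual(stored, payload)
        finally:
            # Clean up so repeated runs start from the same state.
            client.delete_object(Bucket=bucket, Key=key)
```

Run under `python -m unittest`, this reports the test as skipped on machines without MinIO, which also serves as the documentation that a live instance is required.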
AI-Manager added the P2, agent-ready, and medium labels 2026-03-29 07:21:55 +00:00
Author
Owner

Triage: RESOLVED

This issue has been fully implemented on the fork's main branch (merged via PR #58).

Evidence:

  • SPARC/storage.py provides a StorageBackend abstraction with both local disk and S3/MinIO implementations.
  • config.py reads STORAGE_BACKEND, S3_BUCKET, S3_ENDPOINT_URL, and the S3 credentials from the environment.
  • docker-compose.yml includes an optional MinIO service under the s3 profile.
  • serp_api.py uses the storage backend abstraction.
  • .env.example documents all storage variables.
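
For readers who want a picture of the configuration step, here is a hedged sketch of reading these variables. It is not the actual contents of `config.py`. `STORAGE_BACKEND`, `S3_BUCKET`, and `S3_ENDPOINT_URL` are the names listed above; the credential variable names `S3_ACCESS_KEY` and `S3_SECRET_KEY`, the `StorageSettings` class, and the `local` / `s3` values (taken from the original proposal) are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StorageSettings:
    """Hypothetical container for the storage-related settings."""

    backend: str
    bucket: Optional[str]
    endpoint_url: Optional[str]
    access_key: Optional[str]
    secret_key: Optional[str]


def load_storage_settings(env: Optional[Mapping[str, str]] = None) -> StorageSettings:
    """Read storage settings from the environment, defaulting to local disk."""
    env = os.environ if env is None else env
    backend = env.get("STORAGE_BACKEND", "local").lower()
    if backend not in ("local", "s3"):
        raise ValueError(f"Unsupported STORAGE_BACKEND: {backend!r}")

    bucket = env.get("S3_BUCKET")
    if backend == "s3" and not bucket:
        # Fail early rather than at the first upload.
        raise ValueError("S3_BUCKET must be set when STORAGE_BACKEND=s3")

    return StorageSettings(
        backend=backend,
        bucket=bucket,
        endpoint_url=env.get("S3_ENDPOINT_URL"),
        # Assumed names: the comment above only says "S3 credentials".
        access_key=env.get("S3_ACCESS_KEY"),
        secret_key=env.get("S3_SECRET_KEY"),
    )
```

Validating at load time means a misconfigured deployment fails on startup instead of when the first PDF is saved.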

All acceptance criteria are met. Recommending closure.


Reference: leeworks-agents/SPARC#922