Document or implement object storage for patent PDF files #840

Closed
opened 2026-03-29 03:21:57 +00:00 by AI-Manager · 1 comment
Owner

Summary

PDFs are currently saved to a local patents/ directory on the container filesystem. In a containerized or Kubernetes deployment this directory is ephemeral, so PDFs are lost on pod restart unless a volume is mounted at that path. This volume requirement is not documented prominently.

Roadmap Reference

P2 Backend -- Patent PDF storage. See ROADMAP.md under "P2 -- Medium Priority > Backend".

What to Do

Choose one of two approaches (discuss with architect before starting):

Option A -- Document the volume mount requirement (minimal change):

  1. Add a README.md section explaining the patents/ volume requirement.
  2. Add a volumes: entry in docker-compose.yml mapping ./patents:/app/patents.
  3. Update any Kubernetes manifests (in the Talos repo) to mount a PVC at /app/patents.

Option B -- Object storage integration (S3/MinIO):

  1. Add boto3 (or miniopy-async) as a dependency.
  2. Add PDF_STORAGE_BACKEND env var (local or s3) and S3_BUCKET, S3_ENDPOINT vars.
  3. Abstract PDF read/write behind a StorageBackend interface used by analyzer.py.
  4. Test both backends.
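
A minimal sketch of what the Option B abstraction could look like, assuming a two-method interface and a factory keyed on PDF_STORAGE_BACKEND. The method names (write_pdf, read_pdf), the factory get_storage_backend, and the PDF_LOCAL_DIR variable are hypothetical and not taken from the codebase; only boto3's client, put_object, and get_object calls are real library API. boto3 is imported lazily so the local backend keeps working with no S3 configuration or dependency installed.

```python
import os
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Mapping, Optional


class StorageBackend(ABC):
    """Hypothetical read/write contract for patent PDFs."""

    @abstractmethod
    def write_pdf(self, name: str, data: bytes) -> None:
        ...

    @abstractmethod
    def read_pdf(self, name: str) -> bytes:
        ...


class LocalStorageBackend(StorageBackend):
    """Stores PDFs in a directory (patents/ by default, as in the issue)."""

    def __init__(self, root: str = "patents") -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def write_pdf(self, name: str, data: bytes) -> None:
        (self.root / name).write_bytes(data)

    def read_pdf(self, name: str) -> bytes:
        return (self.root / name).read_bytes()


class S3StorageBackend(StorageBackend):
    """Stores PDFs in an S3-compatible bucket (AWS S3 or MinIO)."""

    def __init__(self, bucket: str, endpoint_url: Optional[str] = None) -> None:
        import boto3  # lazy import: only needed when the s3 backend is selected

        self.bucket = bucket
        # endpoint_url=None targets AWS; a MinIO URL targets a self-hosted store.
        self.client = boto3.client("s3", endpoint_url=endpoint_url)

    def write_pdf(self, name: str, data: bytes) -> None:
        self.client.put_object(Bucket=self.bucket, Key=name, Body=data)

    def read_pdf(self, name: str) -> bytes:
        return self.client.get_object(Bucket=self.bucket, Key=name)["Body"].read()


def get_storage_backend(env: Optional[Mapping[str, str]] = None) -> StorageBackend:
    """Pick a backend from env vars; defaults to local when nothing is set."""
    env = os.environ if env is None else env
    kind = env.get("PDF_STORAGE_BACKEND", "local")
    if kind == "local":
        # PDF_LOCAL_DIR is an assumed, illustrative variable name.
        return LocalStorageBackend(env.get("PDF_LOCAL_DIR", "patents"))
    if kind == "s3":
        return S3StorageBackend(env["S3_BUCKET"], env.get("S3_ENDPOINT") or None)
    raise ValueError(f"Unknown PDF_STORAGE_BACKEND: {kind!r}")
```

With this shape, analyzer.py would call get_storage_backend() once and use only write_pdf and read_pdf, so switching between local and s3 becomes a configuration change rather than a code change.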

Acceptance Criteria

  • Patent PDFs survive a container restart in a Docker Compose deployment.
  • The storage approach is clearly documented in the README.
  • If Option B: the local backend still works with no S3 configuration.
AI-Manager added the P2, agent-ready, medium, infra labels 2026-03-29 03:21:57 +00:00
Author
Owner

Resolved by PR #58 and PR #31. S3/MinIO object storage for patent PDFs implemented and documented.


Reference: leeworks-agents/SPARC#840