forked from 0xWheatyz/SPARC
Document or migrate patent PDF storage to a volume mount or object storage #922
Summary
PDFs are saved to a local patents/ directory on disk. In a containerised deployment this directory is ephemeral and will be lost whenever the container is recreated. There is currently no documentation of the volume mount requirement, and no support for object storage (S3/MinIO).
Roadmap Reference
P2 Backend -- Patent PDF storage (ROADMAP.md)
What to do
Option A (minimum viable -- document the volume):
- A patent-pdfs volume in docker-compose.yml mapped to /app/patents/.
Option B (preferred for production):
- Environment variables: PDF_STORAGE_BACKEND (local|s3), PDF_S3_BUCKET, PDF_S3_ENDPOINT, PDF_S3_ACCESS_KEY, PDF_S3_SECRET_KEY.
- PDF storage in analyzer.py behind a thin storage interface that supports both local disk and S3/MinIO.
- docker-compose.yml updated to include an optional MinIO service with example env vars.
Acceptance criteria
- The analyze_single_patent flow continues to work in the default local-disk configuration.
Triage: RESOLVED
This issue has been fully implemented in the fork main branch (merged via PR #58).
Evidence:
- SPARC/storage.py provides a StorageBackend abstraction with both local disk and S3/MinIO implementations.
- config.py reads STORAGE_BACKEND, S3_BUCKET, S3_ENDPOINT_URL, and S3 credentials from the environment.
- docker-compose.yml includes an optional MinIO service under the s3 profile.
- serp_api.py uses the storage backend abstraction.
- .env.example documents all storage variables.
All acceptance criteria are met. Recommending closure.
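For readers who have not opened the fork, the following is a minimal sketch of what a storage abstraction of this kind could look like. It is not the actual contents of SPARC/storage.py. The names StorageBackend, STORAGE_BACKEND, S3_BUCKET and S3_ENDPOINT_URL come from the evidence above; the method names save and load, the classes LocalStorage and S3Storage, the get_storage helper and the LOCAL_STORAGE_DIR variable are assumptions made for illustration. The S3 path assumes boto3 is installed and lets boto3 resolve credentials through its default chain.

```python
import os
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Hypothetical interface: store and fetch a PDF by key."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> str:
        """Persist data under key and return a location string."""

    @abstractmethod
    def load(self, key: str) -> bytes:
        """Return the bytes previously stored under key."""


class LocalStorage(StorageBackend):
    """Writes PDFs under a root directory (a mounted volume in Docker)."""

    def __init__(self, root: str = "patents") -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, key: str, data: bytes) -> str:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return str(path)

    def load(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class S3Storage(StorageBackend):
    """Writes PDFs to an S3-compatible bucket (AWS S3 or MinIO)."""

    def __init__(self, bucket: str, endpoint_url=None) -> None:
        # Imported lazily so local-only deployments do not need boto3.
        import boto3

        self.bucket = bucket
        # endpoint_url=None targets AWS; set it to reach a MinIO service.
        self.client = boto3.client("s3", endpoint_url=endpoint_url)

    def save(self, key: str, data: bytes) -> str:
        self.client.put_object(Bucket=self.bucket, Key=key, Body=data)
        return f"s3://{self.bucket}/{key}"

    def load(self, key: str) -> bytes:
        response = self.client.get_object(Bucket=self.bucket, Key=key)
        return response["Body"].read()


def get_storage(env=None) -> StorageBackend:
    """Pick a backend from environment-style settings (default: local)."""
    env = os.environ if env is None else env
    backend = env.get("STORAGE_BACKEND", "local").lower()
    if backend == "local":
        # LOCAL_STORAGE_DIR is an illustrative variable, not from the issue.
        return LocalStorage(env.get("LOCAL_STORAGE_DIR", "patents"))
    if backend == "s3":
        return S3Storage(env["S3_BUCKET"], env.get("S3_ENDPOINT_URL"))
    raise ValueError(f"Unknown STORAGE_BACKEND: {backend!r}")
```

With a shape like this, calling code only ever asks get_storage() for a backend and calls save or load on it, so the default local-disk configuration keeps working when no storage variables are set.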