forked from 0xWheatyz/SPARC
feat: add S3/MinIO object storage support for patent PDFs #58
Summary
- StorageBackend abstraction with LocalStorageBackend and S3StorageBackend implementations
- STORAGE_BACKEND=local (default) preserves identical behavior to current implementation
- STORAGE_BACKEND=s3 reads/writes PDFs to S3-compatible bucket via boto3
- serp_api.py updated to use storage abstraction for all PDF operations
- parse_patent_pdf handles both local paths and S3 URIs (reads to BytesIO)
- docker-compose.yml (activate with --profile s3)
- .env.example

Closes #38
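For readers who have not opened the diff, the shape of the abstraction described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the method names (save, load, exists), the LOCAL_STORAGE_DIR variable, and the env argument on get_storage_backend are assumptions made for the example, and the S3 branch is left out.

```python
import os
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Mapping, Optional


class StorageBackend(ABC):
    """Interface for storing PDF bytes. Method names are hypothetical."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> str:
        """Store data under key and return a location string."""

    @abstractmethod
    def load(self, key: str) -> bytes:
        """Return the bytes stored under key."""

    @abstractmethod
    def exists(self, key: str) -> bool:
        """Report whether key has been stored."""


class LocalStorageBackend(StorageBackend):
    """Stores each key as a file below a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, key: str, data: bytes) -> str:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return str(path)

    def load(self, key: str) -> bytes:
        return (self.root / key).read_bytes()

    def exists(self, key: str) -> bool:
        return (self.root / key).is_file()


def get_storage_backend(env: Optional[Mapping[str, str]] = None) -> StorageBackend:
    """Pick a backend from STORAGE_BACKEND, defaulting to local.

    The env parameter and LOCAL_STORAGE_DIR are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    name = env.get("STORAGE_BACKEND", "local")
    if name == "local":
        return LocalStorageBackend(env.get("LOCAL_STORAGE_DIR", "patents"))
    if name == "s3":
        # The S3 implementation is omitted from this sketch.
        raise NotImplementedError("S3 backend not included in this sketch")
    raise ValueError(f"Unknown STORAGE_BACKEND: {name!r}")
```

Callers that only depend on StorageBackend do not need to know which implementation is active, which is what lets STORAGE_BACKEND=local keep the existing behavior.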
Test plan
- STORAGE_BACKEND=local produces identical behavior
- STORAGE_BACKEND=s3 with MinIO reads/writes PDFs to bucket
- docker compose --profile s3 up -d minio starts MinIO
- STORAGE_BACKEND=local

Code Review: PASS -- Good storage abstraction: StorageBackend ABC with Local and S3 implementations is clean. Factory function get_storage_backend() reads from config. Module-level lazy init in serp_api.py avoids import-time side effects. S3 backend handles bucket auto-creation. MinIO docker-compose profile with --profile s3 is a good pattern. Note: boto3 added as required dependency in requirements.txt -- this is fine since it is only imported lazily in S3StorageBackend.__init__. Ready to merge. Closes #38.

AI-Manager referenced this pull request from 0xWheatyz/SPARC 2026-03-26 12:42:30 +00:00
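The two lazy patterns the review mentions (module-level lazy init, and importing boto3 only inside the S3 backend's constructor) could look something like the sketch below. It is an illustration under assumptions rather than the PR's code: get_storage, its factory argument, and the constructor parameters are hypothetical names, and the bucket check is one plausible way to do auto-creation.

```python
from typing import Callable, Optional

# Module-level cache; nothing is constructed at import time.
_storage: Optional[object] = None


def get_storage(factory: Callable[[], object]) -> object:
    """Build the backend on first use and reuse it afterwards.

    The factory argument is an assumption for this sketch; it keeps the
    example independent of any particular backend.
    """
    global _storage
    if _storage is None:
        _storage = factory()
    return _storage


class S3StorageBackend:
    """S3-compatible backend that imports boto3 only when instantiated."""

    def __init__(self, bucket: str, endpoint_url: Optional[str] = None) -> None:
        # Imported here, not at module top, so users of the local backend
        # never trigger the boto3 import.
        import boto3
        from botocore.exceptions import ClientError

        self.bucket = bucket
        self.client = boto3.client("s3", endpoint_url=endpoint_url)
        try:
            # head_bucket raises ClientError if the bucket is missing
            # or not accessible.
            self.client.head_bucket(Bucket=bucket)
        except ClientError:
            self.client.create_bucket(Bucket=bucket)
```

With this arrangement, importing the module has no side effects, and the first call to get_storage pays the construction cost once.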