feat: add S3/MinIO object storage support for patent PDFs

Introduce a StorageBackend abstraction (local filesystem and S3) for
patent PDF storage. When STORAGE_BACKEND=s3, PDFs are read/written via
boto3 to an S3-compatible bucket instead of the local filesystem.

- Add SPARC/storage.py with LocalStorageBackend and S3StorageBackend
- Update save_patents and parse_patent_pdf in serp_api.py to use the storage backend
- Add storage config vars to config.py and .env.example
- Add optional MinIO service to docker-compose.yml (--profile s3)
- Add boto3 to requirements.txt
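
For orientation, here is a minimal sketch of what the abstraction described above could look like. It is not the actual contents of SPARC/storage.py. The method names (write, read, exists), the get_storage_backend helper, and the S3_BUCKET, S3_ENDPOINT_URL and PATENTS_DIR variables are all assumptions made for illustration. Only STORAGE_BACKEND, the two class names, and the use of boto3 come from this commit.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Hypothetical interface: store and fetch PDF bytes by key."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def exists(self, key: str) -> bool: ...


class LocalStorageBackend(StorageBackend):
    """Keeps each object as a file under a root directory."""

    def __init__(self, root):
        self.root = Path(root)

    def write(self, key, data):
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, key):
        return (self.root / key).read_bytes()

    def exists(self, key):
        return (self.root / key).is_file()


class S3StorageBackend(StorageBackend):
    """Keeps each object in an S3-compatible bucket (AWS S3 or MinIO)."""

    def __init__(self, bucket, endpoint_url=None):
        import boto3  # imported lazily so local-only setups do not need it

        self.bucket = bucket
        # boto3 picks up AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the
        # environment; endpoint_url points the client at MinIO when set.
        self.client = boto3.client("s3", endpoint_url=endpoint_url)

    def write(self, key, data):
        self.client.put_object(Bucket=self.bucket, Key=key, Body=data)

    def read(self, key):
        return self.client.get_object(Bucket=self.bucket, Key=key)["Body"].read()

    def exists(self, key):
        from botocore.exceptions import ClientError

        try:
            self.client.head_object(Bucket=self.bucket, Key=key)
            return True
        except ClientError as exc:
            # head_object reports a missing key as error code "404";
            # anything else (e.g. access denied) is re-raised.
            if exc.response["Error"]["Code"] in ("404", "NoSuchKey"):
                return False
            raise


def get_storage_backend(env=None):
    """Pick a backend from STORAGE_BACKEND (other variable names are assumed)."""
    env = os.environ if env is None else env
    if env.get("STORAGE_BACKEND", "local").lower() == "s3":
        return S3StorageBackend(
            bucket=env["S3_BUCKET"],                  # hypothetical variable
            endpoint_url=env.get("S3_ENDPOINT_URL"),  # hypothetical variable
        )
    return LocalStorageBackend(env.get("PATENTS_DIR", "patents"))  # hypothetical


if __name__ == "__main__":
    # Round-trip a tiny payload through the local backend.
    with tempfile.TemporaryDirectory() as tmp:
        backend = get_storage_backend({"STORAGE_BACKEND": "local", "PATENTS_DIR": tmp})
        backend.write("US1234567/patent.pdf", b"%PDF-1.4")
        print(backend.exists("US1234567/patent.pdf"))
```

With this shape, callers such as save_patents would only depend on the StorageBackend interface, so switching between the filesystem and a bucket becomes a configuration change rather than a code change.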

Closes leeworks-agents/SPARC#38

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
agent-company
2026-03-26 10:17:24 +00:00
parent 55c131cb32
commit 9a43f85259
6 changed files with 258 additions and 12 deletions
+24
@@ -52,6 +52,29 @@ services:
      - ./patents:/app/patents
    restart: unless-stopped
  # Optional: MinIO for S3-compatible local object storage
  # Enable by setting STORAGE_BACKEND=s3 in .env and starting Compose with --profile s3
  minio:
    image: minio/minio:latest
    container_name: sparc-minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: ${AWS_ACCESS_KEY_ID:-minioadmin}
      MINIO_ROOT_PASSWORD: ${AWS_SECRET_ACCESS_KEY:-minioadmin}
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - minio_data:/data
    healthcheck:
      test: ["CMD", "mc", "ready", "local"]
      interval: 10s
      timeout: 5s
      retries: 3
    restart: unless-stopped
    profiles:
      - s3
  dashboard:
    build: ./frontend
    container_name: sparc-dashboard
@@ -63,3 +86,4 @@ services:
volumes:
  postgres_data:
  minio_data: