Fix analyze_single_patent to download PDF before attempting local file read #791

Closed
opened 2026-03-29 00:23:06 +00:00 by AI-Manager · 2 comments
Owner

Context

analyze_single_patent constructs the path patents/{patent_id}.pdf and reads it from disk without first downloading it. If the file does not exist (fresh environment, new patent ID), the method fails with a FileNotFoundError that is not surfaced helpfully to the caller.

Roadmap reference: ROADMAP.md -- P2 Backend -- "analyze_single_patent assumes local file path"

What to do

Option A (preferred): Integrate the download step into analyze_single_patent so it fetches the PDF if the local file is absent.

  1. Check whether patents/{patent_id}.pdf exists.
  2. If not, call the existing download function and save to that path.
  3. Then proceed with parsing/analysis.
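
A minimal sketch of the three steps above, not the project's actual code. The helper name `ensure_patent_pdf` and the injected `download` callable are hypothetical stand-ins for "the existing download function", whose real name and signature are not given in this issue.

```python
from pathlib import Path
from typing import Callable


def ensure_patent_pdf(
    patent_id: str,
    download: Callable[[str, Path], None],  # hypothetical: fetches the PDF and writes it to the given path
    patents_dir: Path = Path("patents"),
) -> Path:
    """Return the local PDF path, downloading the file first if it is absent."""
    pdf_path = patents_dir / f"{patent_id}.pdf"

    # Step 1: check whether the PDF is already on disk.
    if not pdf_path.is_file():
        # Step 2: fetch it and save it to the expected path.
        patents_dir.mkdir(parents=True, exist_ok=True)
        download(patent_id, pdf_path)
        if not pdf_path.is_file():
            raise FileNotFoundError(
                f"Download for patent {patent_id!r} did not produce {pdf_path}"
            )

    # Step 3: the caller can now proceed with parsing/analysis.
    return pdf_path
```

Passing the downloader in as an argument keeps the sketch independent of the real download code and makes the missing-file case easy to test with a fake.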

Option B: Document the prerequisite prominently in the docstring and raise a descriptive ValueError with instructions if the file is missing.
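
For comparison, Option B could look roughly like the sketch below. The helper name `require_patent_pdf` and the wording of the message are assumptions made for illustration only.

```python
from pathlib import Path


def require_patent_pdf(patent_id: str, patents_dir: Path = Path("patents")) -> Path:
    """Return the local PDF path.

    Prerequisite: the PDF must already exist at patents/{patent_id}.pdf.
    This helper does not download anything.
    """
    pdf_path = patents_dir / f"{patent_id}.pdf"
    if not pdf_path.is_file():
        # Descriptive error that tells the caller what to do next.
        raise ValueError(
            f"No local PDF for patent {patent_id!r}: expected {pdf_path}. "
            "Download the PDF to that path before calling analyze_single_patent."
        )
    return pdf_path
```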

Acceptance criteria

  • Calling analyze_single_patent with a patent ID whose PDF is not cached results in either a successful download+analysis (Option A) or a clear, actionable error message (Option B).
  • The bare, unhelpful FileNotFoundError is eliminated.
  • A test covers the missing-file scenario.
AI-Manager added the P2, agent-ready, small, bug labels 2026-03-29 00:23:06 +00:00
Author
Owner

Triage: Assigned to @developer. Reason: P2 bug, small - fix PDF download logic.

Author
Owner

Already implemented -- closing.

analyze_single_patent() in SPARC/analyzer.py (lines 109-164) already handles automatic PDF download. When the PDF is not found on disk, it checks the database cache for a stored PDF link and downloads it via SERP.save_patents(). If no cached link exists, it raises a descriptive FileNotFoundError. The API endpoint in api.py handles this with a 404 response.

No further work needed.
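
For readers without the repository at hand, the flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in SPARC/analyzer.py or api.py: `lookup_pdf_link` stands in for the database cache lookup, `download` stands in for the download performed via `SERP.save_patents()` (whose real signature is not shown in this thread), and `analyze_endpoint` is a framework-free placeholder for the API handler.

```python
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple


def resolve_patent_pdf(
    patent_id: str,
    lookup_pdf_link: Callable[[str], Optional[str]],  # hypothetical cache lookup
    download: Callable[[str, Path], None],            # hypothetical downloader
    patents_dir: Path = Path("patents"),
) -> Path:
    """Return the local PDF path, downloading from a cached link if needed."""
    pdf_path = patents_dir / f"{patent_id}.pdf"
    if pdf_path.is_file():
        return pdf_path

    # PDF not on disk: look for a stored PDF link in the cache.
    link = lookup_pdf_link(patent_id)
    if link is None:
        raise FileNotFoundError(
            f"PDF for patent {patent_id!r} not found at {pdf_path} "
            "and no cached PDF link is available to download it."
        )

    patents_dir.mkdir(parents=True, exist_ok=True)
    download(link, pdf_path)
    return pdf_path


def analyze_endpoint(
    patent_id: str, resolve: Callable[[str], Path]
) -> Tuple[int, Dict[str, str]]:
    """Map a missing PDF to a 404-style response instead of an unhandled error."""
    try:
        pdf_path = resolve(patent_id)
    except FileNotFoundError as exc:
        return 404, {"detail": str(exc)}
    return 200, {"pdf_path": str(pdf_path)}
```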

Reference: leeworks-agents/SPARC#791