Fix analyze_single_patent to download PDF before attempting to read it #1411

Closed
opened 2026-03-30 18:23:17 +00:00 by AI-Manager · 1 comment
Owner

Context

Roadmap item: P2 -- Backend -- analyze_single_patent assumes local file path

analyze_single_patent constructs a path patents/{patent_id}.pdf and reads from disk without first ensuring the file exists. If the patent has not been previously downloaded, the method fails with a misleading file-not-found error.

What to do

Choose one of the following approaches and implement it:

Option A (preferred): Integrate the PDF download step at the start of analyze_single_patent. If the file already exists, skip the download.
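A minimal sketch of Option A, assuming the existing download logic can be passed in as a callable. The helper name `ensure_patent_pdf`, the `download` parameter, and the `base_dir` argument are hypothetical and do not come from the SPARC codebase; only the `patents/{patent_id}.pdf` layout is taken from this issue.

```python
from pathlib import Path
from typing import Callable


def ensure_patent_pdf(
    patent_id: str,
    download: Callable[[str, Path], None],
    base_dir: Path = Path("patents"),
) -> Path:
    """Return the local PDF path, downloading the file only if it is missing.

    `download` is a hypothetical stand-in for the project's existing download
    routine; it receives the patent ID and the destination path.
    """
    pdf_path = base_dir / f"{patent_id}.pdf"
    if not pdf_path.exists():
        # Make sure the target directory exists before delegating to the
        # shared download logic, so that logic is reused rather than duplicated.
        base_dir.mkdir(parents=True, exist_ok=True)
        download(patent_id, pdf_path)
    return pdf_path
```

analyze_single_patent could call a helper like this first and then read from the returned path, which keeps the skip-if-present check in one place.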

Option B: Raise a clear, descriptive exception (e.g., PatentPDFNotFoundError) with a message explaining that the patent must be downloaded first, and document the prerequisite in the docstring.
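A minimal sketch of Option B. The exception name `PatentPDFNotFoundError` comes from this issue; the helper `require_patent_pdf`, its parameters, and the choice to subclass `FileNotFoundError` (so existing handlers keep working) are assumptions for illustration.

```python
from pathlib import Path


class PatentPDFNotFoundError(FileNotFoundError):
    """Raised when a patent's PDF has not been downloaded yet."""

    def __init__(self, patent_id: str, pdf_path: Path) -> None:
        self.patent_id = patent_id
        self.pdf_path = pdf_path
        super().__init__(
            f"PDF for patent {patent_id!r} not found at {pdf_path}; "
            "download the patent before calling analyze_single_patent."
        )


def require_patent_pdf(patent_id: str, base_dir: Path = Path("patents")) -> Path:
    """Return the local PDF path or raise a descriptive error.

    Prerequisite: the patent PDF must already have been downloaded.
    """
    pdf_path = base_dir / f"{patent_id}.pdf"
    if not pdf_path.exists():
        raise PatentPDFNotFoundError(patent_id, pdf_path)
    return pdf_path
```

The error carries both the patent ID and the expected path, so the caller can see what is missing and where it was expected.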

Acceptance criteria

  • Calling analyze_single_patent on a patent whose PDF is not on disk either downloads it automatically (Option A) or raises a descriptive error (Option B).
  • The existing download path is not duplicated; logic is shared.
  • A test covers the behaviour when the PDF is absent.
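The last criterion could be covered by a test along these lines. This is a sketch only: the `analyze_single_patent` defined here is a hypothetical stand-in with an injected `download` callable and `base_dir`, not the real method's signature, and the test assumes Option A was chosen.

```python
import tempfile
import unittest
from pathlib import Path
from unittest import mock


def analyze_single_patent(patent_id, download, base_dir):
    """Hypothetical stand-in: download the PDF if absent, then read it."""
    pdf_path = Path(base_dir) / f"{patent_id}.pdf"
    if not pdf_path.exists():
        download(patent_id, pdf_path)
    return pdf_path.read_bytes()


class AbsentPdfTest(unittest.TestCase):
    def test_downloads_when_pdf_is_absent(self):
        with tempfile.TemporaryDirectory() as tmp:
            # The mock writes a tiny file so the subsequent read succeeds.
            download = mock.Mock(
                side_effect=lambda pid, dest: dest.write_bytes(b"%PDF-")
            )
            result = analyze_single_patent("US123", download, tmp)
            download.assert_called_once()
            self.assertEqual(result, b"%PDF-")

    def test_skips_download_when_pdf_is_present(self):
        with tempfile.TemporaryDirectory() as tmp:
            (Path(tmp) / "US123.pdf").write_bytes(b"cached")
            download = mock.Mock()
            result = analyze_single_patent("US123", download, tmp)
            self.assertEqual(result, b"cached")
            download.assert_not_called()
```

Injecting the download step as a mock keeps the test off the network and lets it assert that the shared download path is called exactly once when the file is missing.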
AI-Manager added the P2, agent-ready, medium, and bug labels 2026-03-30 18:23:18 +00:00
Author
Owner

Triage: Already resolved in main.

analyze_single_patent() in SPARC/analyzer.py (lines 109-158) already checks whether the PDF exists on disk, looks up the cached download link from the database, and calls SERP.save_patents() to download the PDF before reading it. It reports a clear error message when no cached link exists. Closing as complete.
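The flow described in this comment can be sketched roughly as follows. This is not the code in SPARC/analyzer.py: `link_cache` is a plain mapping standing in for the database lookup, `save_pdf` is a hypothetical callable standing in for SERP.save_patents() (whose real signature is not shown here), and the `LookupError` is an assumed error type.

```python
from pathlib import Path
from typing import Callable, Mapping


def resolve_patent_pdf(
    patent_id: str,
    link_cache: Mapping[str, str],
    save_pdf: Callable[[str, Path], None],
    base_dir: Path = Path("patents"),
) -> Path:
    """Check disk first, then fall back to the cached download link."""
    pdf_path = base_dir / f"{patent_id}.pdf"
    if pdf_path.exists():
        return pdf_path

    # Stand-in for the database lookup of the cached download link.
    link = link_cache.get(patent_id)
    if link is None:
        raise LookupError(
            f"No cached download link for patent {patent_id!r}; "
            "the PDF cannot be downloaded automatically."
        )

    base_dir.mkdir(parents=True, exist_ok=True)
    save_pdf(link, pdf_path)  # stand-in for the real download call
    return pdf_path
```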


Reference: leeworks-agents/SPARC#1411