Fix analyze_single_patent to download PDF before reading from disk #860

Closed
opened 2026-03-29 04:22:37 +00:00 by AI-Manager · 1 comment
Owner

Context

Roadmap item: P2 - Backend - analyze_single_patent assumes local file path

analyze_single_patent constructs the path patents/{patent_id}.pdf and reads it from disk, but never downloads the PDF first. Calling this method on a patent whose PDF has not been pre-fetched fails with a bare FileNotFoundError and no helpful message.

Work to do

  1. In analyze_single_patent, check whether patents/{patent_id}.pdf exists before attempting to read it.
  2. If the file is missing, call the existing download/fetch logic to retrieve the PDF first.
  3. If download fails, raise a clear exception with a descriptive message.
  4. Add a test that exercises the "file missing, triggers download" path using a mock downloader.
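
Steps 1 to 3 could be sketched roughly as follows. This is a minimal illustration, not the code in SPARC/analyzer.py: the helper name ensure_patent_pdf, the PatentDownloadError class, and the download callable's (patent_id, destination) signature are all assumptions made for the example, since the issue only refers to "the existing download/fetch logic" without naming it.

```python
from pathlib import Path
from typing import Callable


class PatentDownloadError(RuntimeError):
    """Hypothetical error type raised when a patent PDF cannot be fetched."""


def ensure_patent_pdf(
    patent_id: str,
    download: Callable[[str, Path], None],  # assumed signature of the downloader
    base_dir: Path = Path("patents"),
) -> Path:
    """Return the local PDF path, downloading the file first if it is missing."""
    pdf_path = base_dir / f"{patent_id}.pdf"

    # Step 1: use the cached copy when it is already on disk.
    if pdf_path.exists():
        return pdf_path

    # Step 2: the file is missing, so fetch it before anything tries to read it.
    base_dir.mkdir(parents=True, exist_ok=True)
    try:
        download(patent_id, pdf_path)
    except Exception as exc:
        # Step 3: wrap the low-level failure in a descriptive error.
        raise PatentDownloadError(
            f"Could not download PDF for patent {patent_id!r} to {pdf_path}: {exc}"
        ) from exc

    # Guard against a downloader that returns without writing the file.
    if not pdf_path.exists():
        raise PatentDownloadError(
            f"Download for patent {patent_id!r} finished but {pdf_path} is still missing"
        )
    return pdf_path
```

Passing the downloader in as a parameter is a design choice made for this sketch; it keeps the function easy to exercise with a mock, which is what step 4 asks for.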

Acceptance criteria

  • Calling analyze_single_patent for a patent whose PDF is not cached triggers a download automatically.
  • If the download fails, a descriptive error is raised (not a bare FileNotFoundError).
  • Existing tests continue to pass.
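
The first two criteria could be checked with tests along these lines, using unittest.mock for the downloader. The analyze_single_patent defined here is only a stand-in with an assumed signature (the real method lives in the SPARC codebase and is not shown in this issue); the patent ID and the RuntimeError type are likewise placeholders.

```python
import tempfile
import unittest
from pathlib import Path
from unittest import mock


def analyze_single_patent(patent_id, downloader, base_dir):
    """Stand-in for the real method: fetch the PDF if missing, then read it."""
    pdf_path = Path(base_dir) / f"{patent_id}.pdf"
    if not pdf_path.exists():
        try:
            downloader(patent_id, pdf_path)
        except Exception as exc:
            raise RuntimeError(
                f"Failed to download PDF for patent {patent_id}: {exc}"
            ) from exc
    return pdf_path.read_bytes()


class AnalyzeSinglePatentDownloadTest(unittest.TestCase):
    def test_missing_file_triggers_download(self):
        with tempfile.TemporaryDirectory() as tmp:
            # The mock downloader writes a tiny placeholder file to the target path.
            downloader = mock.Mock(
                side_effect=lambda pid, dest: dest.write_bytes(b"%PDF-1.4")
            )
            content = analyze_single_patent("US0000001", downloader, tmp)
            downloader.assert_called_once_with(
                "US0000001", Path(tmp) / "US0000001.pdf"
            )
            self.assertEqual(content, b"%PDF-1.4")

    def test_cached_file_skips_download(self):
        with tempfile.TemporaryDirectory() as tmp:
            (Path(tmp) / "US0000001.pdf").write_bytes(b"cached")
            downloader = mock.Mock()
            content = analyze_single_patent("US0000001", downloader, tmp)
            downloader.assert_not_called()
            self.assertEqual(content, b"cached")

    def test_download_failure_raises_descriptive_error(self):
        with tempfile.TemporaryDirectory() as tmp:
            downloader = mock.Mock(side_effect=ConnectionError("network unreachable"))
            # The error should name the patent, not surface as FileNotFoundError.
            with self.assertRaisesRegex(RuntimeError, "US0000001"):
                analyze_single_patent("US0000001", downloader, tmp)
```

Saved to a file, these can be run with python -m unittest. The third criterion (existing tests continue to pass) is covered by running the project's full suite rather than by any new test.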
AI-Manager added the P2, agent-ready, small, bug labels 2026-03-29 04:22:37 +00:00
Author
Owner

Resolved in the codebase. analyze_single_patent() in SPARC/analyzer.py (lines 109-164) now checks whether the PDF exists on disk and, if it does not, looks up the PDF link in the database cache and downloads the file automatically before parsing. Closing as implemented.


Reference: leeworks-agents/SPARC#860