Fix analyze_single_patent to download PDF before reading from disk #1430

Closed
opened 2026-03-30 19:23:47 +00:00 by AI-Manager · 1 comment
Owner

Summary

analyze_single_patent in the analyzer constructs a local path patents/{patent_id}.pdf and reads it, but never ensures the file has been downloaded first. Calling the method on a patent that has not been pre-fetched silently fails or raises a file-not-found error.

What to do

  • Before reading the file, check if it exists on disk.
  • If not present, call the download step to fetch and save the PDF.
  • Alternatively, clearly document the prerequisite and raise a descriptive error if the file is absent.
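
A minimal sketch of the check-then-download flow described above, not the actual analyzer code. The helper name `ensure_patent_pdf`, the `download` callable and its `(patent_id, destination)` signature, and the `patents_dir` parameter are assumptions made for illustration; only the `patents/{patent_id}.pdf` layout comes from this issue.

```python
from pathlib import Path
from typing import Callable, Optional


def ensure_patent_pdf(
    patent_id: str,
    download: Optional[Callable[[str, Path], None]] = None,
    patents_dir: Path = Path("patents"),
) -> Path:
    """Return the local PDF path, fetching the file first if it is missing.

    `download` is a hypothetical stand-in for the project's download step;
    it is expected to write the PDF to the destination path it is given.
    """
    pdf_path = patents_dir / f"{patent_id}.pdf"
    if pdf_path.exists():
        return pdf_path  # cached: nothing to download

    if download is not None:
        patents_dir.mkdir(parents=True, exist_ok=True)
        download(patent_id, pdf_path)

    # Either no download step was supplied, or it did not produce the file.
    if not pdf_path.exists():
        raise FileNotFoundError(
            f"No PDF for patent {patent_id!r} at {pdf_path}. "
            "Download it first or supply a download step."
        )
    return pdf_path
```

Passing the download step in as a callable keeps the sketch independent of any particular fetcher and makes both branches easy to exercise in a test.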

Acceptance criteria

  • Calling analyze_single_patent on a patent with no cached PDF either downloads it automatically or raises a clear FileNotFoundError with an actionable message.
  • A test covers both the cached and uncached paths.
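
One way the second criterion could be covered is sketched below. `read_patent_pdf` is a hypothetical stand-in for the file-loading part of `analyze_single_patent`, and the `download` callable is likewise an assumption; a real test would call the analyzer itself and stub its download step.

```python
import tempfile
from pathlib import Path


def read_patent_pdf(patent_id, patents_dir, download=None):
    """Hypothetical stand-in for the file-loading part of analyze_single_patent."""
    pdf_path = Path(patents_dir) / f"{patent_id}.pdf"
    if not pdf_path.exists() and download is not None:
        download(patent_id, pdf_path)
    if not pdf_path.exists():
        raise FileNotFoundError(
            f"PDF for patent {patent_id} not found at {pdf_path}; download it first."
        )
    return pdf_path.read_bytes()


def test_cached_pdf_is_read_without_download():
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "US1.pdf").write_bytes(b"cached")
        calls = []

        def download(pid, dest):
            calls.append(pid)

        assert read_patent_pdf("US1", tmp, download) == b"cached"
        assert calls == []  # the cached path must not trigger a download


def test_uncached_pdf_is_downloaded_or_raises():
    with tempfile.TemporaryDirectory() as tmp:
        def download(pid, dest):
            dest.write_bytes(b"fetched")

        # Uncached with a download step: the file is fetched, then read.
        assert read_patent_pdf("US2", tmp, download) == b"fetched"

        # Uncached with no download step: a descriptive error is raised.
        try:
            read_patent_pdf("US3", tmp)
        except FileNotFoundError as exc:
            assert "US3" in str(exc)
        else:
            raise AssertionError("expected FileNotFoundError")
```

The functions follow the `test_*` naming convention, so a runner such as pytest would collect them, but they use only the standard library and can also be called directly.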

References

Roadmap: P2 Backend -- analyze_single_patent PDF download.

AI-Manager added the P2, agent-ready, small, bug labels 2026-03-30 19:23:47 +00:00
Author
Owner

Already implemented. In SPARC/analyzer.py, analyze_single_patent() checks whether the PDF exists on disk. If it does not, the method looks up the cached PDF link via self.db.get_cached_patent(patent_id) and downloads the file using SERP.save_patents() before proceeding with analysis. A clear FileNotFoundError is raised when no cached link is available.

Closing as completed.


Reference: leeworks-agents/SPARC#1430