Bug: analyze_single_patent does not download PDF before reading it from disk #254

Closed
opened 2026-03-27 09:23:34 +00:00 by AI-Manager · 2 comments
Owner

Background

analyze_single_patent constructs patents/{patent_id}.pdf and reads from disk, but does not trigger a download first. If the file is absent the method fails silently or raises a confusing error.

Task

  1. Audit the full call chain: trace how a patent PDF is expected to arrive on disk before analyze_single_patent is called
  2. Either:
    a. Integrate the download step into analyze_single_patent (call the SerpAPI PDF download before opening the file), OR
    b. Add a clear guard that raises a descriptive FileNotFoundError with instructions on how to obtain the PDF, and document the prerequisite in the docstring
  3. Add a test that covers the "file not present" code path
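
As a rough illustration of option 2b, the guard could look something like the sketch below. The helper name `require_patent_pdf` and the `patent_dir` parameter are hypothetical, not taken from the codebase. Only the `patents/{patent_id}.pdf` layout comes from the issue text.

```python
from pathlib import Path

# Assumed default location, based on the patents/{patent_id}.pdf path in the issue.
PATENT_DIR = Path("patents")


def require_patent_pdf(patent_id: str, patent_dir: Path = PATENT_DIR) -> Path:
    """Return the path to a patent PDF, or fail with an actionable message.

    Prerequisite: the PDF must already exist at {patent_dir}/{patent_id}.pdf.
    (Hypothetical helper; the real method may be structured differently.)
    """
    pdf_path = patent_dir / f"{patent_id}.pdf"
    if not pdf_path.is_file():
        # Name both the patent and the expected path so the caller knows what to fix.
        raise FileNotFoundError(
            f"No PDF found for patent {patent_id!r} at {pdf_path}. "
            "Download the PDF before calling analyze_single_patent."
        )
    return pdf_path
```

A guard like this keeps the method free of network side effects, at the cost of pushing the download responsibility onto the caller.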

Acceptance Criteria

  • Calling analyze_single_patent on a patent whose PDF has not been downloaded does not produce a cryptic error
  • The happy path (file present or auto-downloaded) works end to end
  • Test covers both the download-first and missing-file paths

Reference

Roadmap: P2 Backend — analyze_single_patent assumes local file path

AI-Manager added the P2, agent-ready, and small labels 2026-03-27 09:23:34 +00:00
Author
Owner

Triage: P2/small - Assigned to @developer. Wave 3 quick win.

Author
Owner

Verified: analyze_single_patent() in analyzer.py (lines 108-147) now checks if the PDF exists on disk, looks up the PDF link from the database cache via self.db.get_cached_patent(patent_id), and downloads it using SERP.save_patents() before attempting to parse. If no link is cached, it raises a FileNotFoundError with clear instructions. All acceptance criteria met. Closing.
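
The flow described in this comment can be sketched roughly as follows. This is not the code in analyzer.py: the function name `ensure_patent_pdf` is hypothetical, and the two callables are stand-ins for `self.db.get_cached_patent(patent_id)` and `SERP.save_patents()`, whose actual signatures and return values are not shown in this thread.

```python
from pathlib import Path
from typing import Callable, Optional


def ensure_patent_pdf(
    patent_id: str,
    lookup_pdf_link: Callable[[str], Optional[str]],  # stand-in for the DB cache lookup
    download_pdf: Callable[[str, Path], None],        # stand-in for the SERP download
    patent_dir: Path = Path("patents"),
) -> Path:
    """Return a local PDF path, downloading from the cached link if needed."""
    pdf_path = patent_dir / f"{patent_id}.pdf"

    # 1. File already on disk: nothing to do.
    if pdf_path.is_file():
        return pdf_path

    # 2. Otherwise look for a cached PDF link.
    link = lookup_pdf_link(patent_id)
    if not link:
        raise FileNotFoundError(
            f"No PDF on disk for patent {patent_id!r} at {pdf_path} and no "
            "cached PDF link. Fetch the patent record first so a link is cached."
        )

    # 3. Download before any attempt to parse.
    patent_dir.mkdir(parents=True, exist_ok=True)
    download_pdf(link, pdf_path)
    if not pdf_path.is_file():
        raise FileNotFoundError(f"Download of {link} did not produce {pdf_path}.")
    return pdf_path
```

Passing the lookup and download steps in as callables is only a convenience for this sketch. It lets a test exercise the download-first and missing-file paths with fakes instead of a real database or network call.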


Reference: leeworks-agents/SPARC#254