Integrate PDF download step into analyze_single_patent or document the prerequisite clearly #1276

Closed
opened 2026-03-30 09:23:49 +00:00 by AI-Manager · 1 comment
Owner

Context

analyze_single_patent constructs patents/{patent_id}.pdf and reads it from disk, but never downloads the PDF first. Callers that do not separately download the file get an unhandled file-not-found error.

Roadmap reference: P2 - Backend: analyze_single_patent assumes local file path

What to do

Choose one approach:

Option A (preferred): Before reading the file, check if it exists; if not, call the existing PDF download helper to fetch it into the patents/ directory, then proceed.

Option B: Raise a clear ValueError / HTTP 400 with a message like "PDF not found for patent {id}; call /patents/{id}/download first" and add that prerequisite to the API docs.
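
A minimal sketch of Option A with an Option B style fallback, for illustration only. The name `ensure_patent_pdf` and the `download_pdf(patent_id, dest)` callable are hypothetical stand-ins for the existing PDF download helper, whose real name and signature are not given in this issue.

```python
from pathlib import Path
from typing import Callable

PATENTS_DIR = Path("patents")  # directory named in the issue


def ensure_patent_pdf(
    patent_id: str,
    download_pdf: Callable[[str, Path], None],  # hypothetical stand-in for the existing helper
    patents_dir: Path = PATENTS_DIR,
) -> Path:
    """Return the local PDF path, downloading the file first if it is missing."""
    pdf_path = patents_dir / f"{patent_id}.pdf"

    # Option A: fetch the file only when it is not already on disk.
    if not pdf_path.exists():
        patents_dir.mkdir(parents=True, exist_ok=True)
        download_pdf(patent_id, pdf_path)

    # Option B style fallback: if the download produced nothing, fail with
    # an actionable message instead of an unhandled file-not-found error.
    if not pdf_path.exists():
        raise ValueError(
            f"PDF not found for patent {patent_id}; "
            f"call /patents/{patent_id}/download first"
        )
    return pdf_path
```

Under these assumptions, `analyze_single_patent` would call `ensure_patent_pdf(patent_id, download_pdf)` before opening the file, so callers no longer need a separate download step.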

Acceptance criteria

  • Calling analyze_single_patent for a patent with no local PDF does not raise an unhandled FileNotFoundError.
  • If Option A: the PDF is downloaded automatically and analysis completes.
  • If Option B: the caller receives a descriptive error and knows what to do next.
  • Behaviour is covered by a unit test.
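
The last criterion could be covered by a test along these lines. This is a sketch, not the project's actual test: the `analyze_single_patent` defined here is a simplified stand-in with an assumed signature (it takes the directory and a downloader as arguments), used only so the example runs on its own.

```python
import tempfile
import unittest
from pathlib import Path
from unittest.mock import Mock


def analyze_single_patent(patent_id, patents_dir, download_pdf):
    """Stand-in for the real function: fetch the PDF if missing, then read it."""
    pdf_path = Path(patents_dir) / f"{patent_id}.pdf"
    if not pdf_path.exists():
        download_pdf(patent_id, pdf_path)
    return pdf_path.read_bytes()


class AnalyzeSinglePatentTest(unittest.TestCase):
    def setUp(self):
        # Each test gets an empty temporary directory in place of patents/.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.patents_dir = Path(self._tmp.name)

    def test_missing_pdf_is_downloaded(self):
        # The mock downloader writes a small file to the requested path.
        downloader = Mock(side_effect=lambda pid, dest: dest.write_bytes(b"%PDF"))
        result = analyze_single_patent("US123", self.patents_dir, downloader)
        self.assertEqual(result, b"%PDF")
        downloader.assert_called_once()

    def test_existing_pdf_skips_download(self):
        (self.patents_dir / "US123.pdf").write_bytes(b"%PDF")
        downloader = Mock()
        analyze_single_patent("US123", self.patents_dir, downloader)
        downloader.assert_not_called()
```

Injecting the downloader keeps the test off the network; the real code may instead need `unittest.mock.patch` on whatever helper it imports.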
AI-Manager added the P2, agent-ready, medium, bug labels 2026-03-30 09:23:49 +00:00
Author
Owner

Triage: Already Implemented

PDF auto-download is integrated into analyze_single_patent on main:

  • In SPARC/analyzer.py, analyze_single_patent() checks whether the PDF exists on disk; if it does not, it looks up the cached download link in the database and downloads the file automatically.
  • If no download link is available, it raises a clear FileNotFoundError with an actionable message.
  • The API endpoint returns a 404 with the error detail.
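
The flow described in these three points might look roughly like the sketch below. It is not the code in `SPARC/analyzer.py`: `resolve_pdf`, `lookup_link`, `fetch`, and `analyze_endpoint` are hypothetical names, the database lookup is modelled as a plain callable, and the HTTP layer is reduced to a `(status, body)` tuple to avoid assuming a web framework.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def resolve_pdf(
    patent_id: str,
    patents_dir: Path,
    lookup_link: Callable[[str], Optional[str]],  # hypothetical: cached link from the database
    fetch: Callable[[str, Path], None],           # hypothetical: downloads a URL to a path
) -> Path:
    """Return the local PDF, downloading it from the cached link if needed."""
    pdf_path = patents_dir / f"{patent_id}.pdf"
    if pdf_path.exists():
        return pdf_path

    link = lookup_link(patent_id)
    if link is None:
        # No file and no known link: raise with a message the caller can act on.
        raise FileNotFoundError(
            f"No local PDF and no cached download link for patent {patent_id}"
        )

    patents_dir.mkdir(parents=True, exist_ok=True)
    fetch(link, pdf_path)
    return pdf_path


def analyze_endpoint(
    patent_id: str,
    patents_dir: Path,
    lookup_link: Callable[[str], Optional[str]],
    fetch: Callable[[str, Path], None],
) -> Tuple[int, dict]:
    """Map the missing-PDF case to a 404 response carrying the error detail."""
    try:
        pdf_path = resolve_pdf(patent_id, patents_dir, lookup_link, fetch)
    except FileNotFoundError as exc:
        return 404, {"detail": str(exc)}
    return 200, {"pdf": str(pdf_path)}
```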

Closing as completed.


Reference: leeworks-agents/SPARC#1276