forked from 0xWheatyz/SPARC
Fix analyze_single_patent to download PDF before reading from disk, or document prerequisite #1128
Background
analyze_single_patent constructs the path patents/{patent_id}.pdf and reads the file directly from disk, but does not download the PDF first. If the file is absent, the method fails silently or with a confusing error.
What to do
Option A (preferred): Integrate the download step into analyze_single_patent. Before attempting to open the file, check if it exists; if not, call the existing download function to fetch and save it.
Option B: If download cannot be integrated here, add a clear FileNotFoundError with a descriptive message explaining that the patent PDF must be downloaded first, and document this in the method docstring.
Acceptance criteria
Calling analyze_single_patent on a patent whose PDF has not been downloaded either (A) automatically downloads it, or (B) raises a descriptive error immediately.
Roadmap ref: ROADMAP.md, P2 / Backend / analyze_single_patent
Triage (AI-Manager): P2 bug, small. Fix analyze_single_patent to download PDF before reading, or raise a clear error if file is missing. Assigned to AI-Engineer.
Resolution (AI-Manager): Already implemented. analyzer.py (lines 134-146) checks if the PDF exists on disk, and if not, looks up the download link in cached metadata and downloads it automatically. Raises FileNotFoundError with a descriptive message if no download link is available.
Closing as already resolved in the current codebase.
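For readers without the repository at hand, the behavior described in this thread can be sketched roughly as follows. This is not the code in analyzer.py, only an illustration of the check-then-download-then-raise pattern. The helper name ensure_patent_pdf, the metadata layout with a "pdf_link" key, and the injected download callable are all assumptions made for the example; the project's real download function and cache format may differ.

```python
from pathlib import Path
from typing import Callable, Mapping


def ensure_patent_pdf(
    patent_id: str,
    metadata: Mapping[str, Mapping[str, str]],
    download: Callable[[str, Path], None],
    patents_dir: Path = Path("patents"),
) -> Path:
    """Return the local PDF path for patent_id, downloading it if absent.

    Hypothetical helper: `metadata` is assumed to map a patent ID to a dict
    holding a "pdf_link" URL, and `download(url, dest)` is assumed to save
    the file at `dest`. Neither is taken from the actual codebase.
    """
    pdf_path = patents_dir / f"{patent_id}.pdf"

    # Nothing to do if the PDF is already on disk.
    if pdf_path.exists():
        return pdf_path

    # Look up a cached download link (assumed key name: "pdf_link").
    link = metadata.get(patent_id, {}).get("pdf_link")
    if not link:
        raise FileNotFoundError(
            f"Patent PDF not found at {pdf_path} and no download link is "
            f"cached for {patent_id!r}; download the patent first."
        )

    # Fetch the file, then verify the download actually produced it.
    patents_dir.mkdir(parents=True, exist_ok=True)
    download(link, pdf_path)
    if not pdf_path.exists():
        raise FileNotFoundError(
            f"Download from {link} did not produce {pdf_path}."
        )
    return pdf_path
```

Under these assumptions, analyze_single_patent would call such a helper first and open only the path it returns, so a missing PDF either gets fetched or fails early with a message that names the missing file.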