forked from 0xWheatyz/SPARC
Bug: analyze_single_patent does not download PDF before reading it from disk #254
Background
analyze_single_patent constructs patents/{patent_id}.pdf and reads it from disk, but does not trigger a download first. If the file is absent, the method fails silently or raises a confusing error.
Task
Make sure a missing PDF is handled when analyze_single_patent is called. Either:
a. Integrate the download step into analyze_single_patent (call the SerpAPI PDF download before opening the file), OR
b. Add a clear guard that raises a descriptive FileNotFoundError with instructions on how to obtain the PDF, and document the prerequisite in the docstring
Acceptance Criteria
Calling analyze_single_patent on a patent whose PDF has not been downloaded does not produce a cryptic error
Reference
Roadmap: P2 Backend — analyze_single_patent assumes local file path
Triage: P2/small - Assigned to @developer. Wave 3 quick win.
Verified:
analyze_single_patent() in analyzer.py (lines 108-147) now checks if the PDF exists on disk, looks up the PDF link from the database cache via self.db.get_cached_patent(patent_id), and downloads it using SERP.save_patents() before attempting to parse. If no link is cached, it raises a FileNotFoundError with clear instructions. All acceptance criteria met. Closing.
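For anyone reading this later, the check-then-download flow described in the verification note can be sketched roughly as below. This is not the code in analyzer.py. The helper name ensure_patent_pdf and its two callable parameters are hypothetical stand-ins: lookup_pdf_link plays the role of the database cache lookup and download_pdf plays the role of the SerpAPI download, whose real signatures are not shown in this issue.

```python
from pathlib import Path
from typing import Callable, Optional


def ensure_patent_pdf(
    patent_id: str,
    lookup_pdf_link: Callable[[str], Optional[str]],  # hypothetical stand-in for the cache lookup
    download_pdf: Callable[[str, Path], None],        # hypothetical stand-in for the PDF download
    pdf_dir: Path = Path("patents"),
) -> Path:
    """Return the local PDF path, downloading the file first if it is missing."""
    pdf_path = pdf_dir / f"{patent_id}.pdf"

    # Nothing to do if the PDF is already on disk.
    if pdf_path.exists():
        return pdf_path

    # Without a cached link there is nothing to download, so fail loudly.
    link = lookup_pdf_link(patent_id)
    if not link:
        raise FileNotFoundError(
            f"No PDF at {pdf_path} and no cached PDF link for patent {patent_id!r}. "
            "Fetch the patent metadata first so the link is cached, "
            "or place the PDF at that path manually."
        )

    pdf_dir.mkdir(parents=True, exist_ok=True)
    download_pdf(link, pdf_path)

    # Guard against a download that returned without writing the file.
    if not pdf_path.exists():
        raise FileNotFoundError(f"Download of {link} did not produce {pdf_path}.")
    return pdf_path
```

Passing the lookup and download steps in as callables is only a convenience for this sketch: it keeps the example runnable without a database or network access, and makes the missing-link path easy to exercise in a test.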