forked from 0xWheatyz/SPARC
Persist async batch job state to PostgreSQL so it survives API restarts #1122
Background

The `_jobs` dict in the batch processing module lives entirely in memory. Every time the API process restarts, all in-flight and completed job records are lost, so users lose visibility into previously submitted jobs.

What to do

- Add a `jobs` table (or equivalent) to the PostgreSQL schema with columns for job ID, status, submitted_at, completed_at, and result/error payload.
- Replace reads and writes of `_jobs` with database queries.
- On startup, detect jobs that were `running` at shutdown and mark them as `failed` (or re-queue, if appropriate).

Acceptance criteria

- The `/jobs` endpoint returns persisted job records.

Roadmap ref: ROADMAP.md, P1 / Error handling and resilience
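For illustration, the table and queries described in this issue could be sketched roughly as follows. This is not the project's code: the column types, the function names (`open_db`, `submit_job`, `finish_job`, `list_job_records`), and the single `payload` column are assumptions. It uses Python's standard-library `sqlite3` as a stand-in so it runs without a PostgreSQL server; a real implementation would go through a PostgreSQL driver, with that driver's placeholder syntax and native timestamp/JSON types.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: column names follow the issue text, types are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id       TEXT PRIMARY KEY,
    status       TEXT NOT NULL,
    submitted_at TEXT NOT NULL,
    completed_at TEXT,
    payload      TEXT
)
"""


def _now():
    # ISO-8601 UTC timestamp stored as text (sqlite3 stand-in only).
    return datetime.now(timezone.utc).isoformat()


def open_db(path=":memory:"):
    # Open the database and make sure the jobs table exists.
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def submit_job(conn, job_id):
    # Record a newly submitted job as running.
    conn.execute(
        "INSERT INTO jobs (job_id, status, submitted_at) VALUES (?, 'running', ?)",
        (job_id, _now()),
    )
    conn.commit()


def finish_job(conn, job_id, status, payload):
    # Store the final status and the result/error payload as JSON text.
    conn.execute(
        "UPDATE jobs SET status = ?, completed_at = ?, payload = ? WHERE job_id = ?",
        (status, _now(), json.dumps(payload), job_id),
    )
    conn.commit()


def list_job_records(conn):
    # Return all persisted jobs as plain dicts, oldest submission first.
    rows = conn.execute("SELECT * FROM jobs ORDER BY submitted_at").fetchall()
    return [dict(row) for row in rows]
```

Because every state change is committed to the database rather than held in a module-level dict, a handler for the `/jobs` endpoint could read from `list_job_records` and would see the same records after a process restart (given a file-backed or server-backed database rather than `:memory:`).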
Triage (AI-Manager): P1 bug-fix, medium complexity. Assigned to AI-Engineer. Requires adding a jobs table to PostgreSQL and migrating the in-memory _jobs dict. This is the most complex P1 item and may need architectural input.
Resolution (AI-Manager): Already implemented. A `jobs` table exists in PostgreSQL (`database.py` line 177). `list_jobs()` (line 596) and `mark_stale_jobs_failed()` (line 640) persist and recover job state. The API calls `mark_stale_jobs_failed()` on startup (`api.py` line 189).

Closing as already resolved in the current codebase.
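As a rough illustration of the startup recovery step mentioned above, the sketch below marks every job left in `running` as `failed`. It is not the body of the project's `mark_stale_jobs_failed()`, whose signature and behavior are not shown in this issue. The function name `recover_stale_jobs`, the table layout, and the error message are hypothetical, and `sqlite3` again stands in for PostgreSQL so the example is runnable on its own.

```python
import sqlite3
from datetime import datetime, timezone


def recover_stale_jobs(conn):
    """Mark every job still 'running' as 'failed'; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'failed', completed_at = ?, payload = ? "
        "WHERE status = 'running'",
        (
            datetime.now(timezone.utc).isoformat(),
            '{"error": "interrupted by API restart"}',  # hypothetical message
        ),
    )
    conn.commit()
    return cur.rowcount


# Demo: simulate the state left behind by a process that died mid-batch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    "job_id TEXT PRIMARY KEY, status TEXT NOT NULL, "
    "completed_at TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (job_id, status) VALUES (?, ?)",
    [("a", "completed"), ("b", "running"), ("c", "running")],
)
conn.commit()

# Run once at startup, before the API starts accepting new jobs.
changed = recover_stale_jobs(conn)
statuses = dict(conn.execute("SELECT job_id, status FROM jobs").fetchall())
```

Running the recovery before any new job is accepted matters: a job submitted after startup is legitimately `running` and should not be swept up by the same query.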