forked from 0xWheatyz/SPARC
Fix: persist job state to PostgreSQL so async batch results survive API restarts #1313
Background
The `_jobs` dictionary in the API server is in-memory only. If the API process restarts for any reason (deploy, crash, OOM kill), all pending and completed job results are lost. Users cannot retrieve results for jobs submitted before the restart.
What to do
- Create a `jobs` table (or equivalent) in PostgreSQL to store job ID, status, created_at, updated_at, and result payload.
- Replace `_jobs` with database operations.
- On startup, find `in_progress` jobs and decide whether to re-queue or mark them as `failed`.
- Update the job endpoints accordingly (`GET /jobs/{job_id}`, `GET /jobs`).
Acceptance criteria
References
Roadmap: P1 Error handling and resilience — _jobs dict is in-memory only.
Already resolved. Job state is persisted to PostgreSQL via `DatabaseClient.create_job()`, `update_job()`, `get_job()`, and `list_jobs()` in `SPARC/database.py`. The API reads and writes jobs through this DB layer. `mark_stale_jobs_failed()` handles recovery on restart.
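For readers unfamiliar with the pattern, here is a minimal sketch of a DB-backed job store with restart recovery. It is not the code in `SPARC/database.py`: the `JobStore` class, the column layout, and the `completed` status name are assumptions made for illustration, and the standard-library `sqlite3` module stands in for PostgreSQL so the sketch runs without a server. Only the method names and the `in_progress` and `failed` statuses come from this issue.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


class JobStore:
    """Hypothetical stand-in for the job methods on DatabaseClient."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        # Assumed schema: the fields listed under "What to do".
        self.conn.execute(
            """
            CREATE TABLE IF NOT EXISTS jobs (
                job_id     TEXT PRIMARY KEY,
                status     TEXT NOT NULL,
                created_at TEXT NOT NULL,
                updated_at TEXT NOT NULL,
                result     TEXT
            )
            """
        )
        self.conn.commit()

    def create_job(self) -> str:
        """Insert a new job in the in_progress state and return its ID."""
        job_id = str(uuid.uuid4())
        now = _now()
        self.conn.execute(
            "INSERT INTO jobs (job_id, status, created_at, updated_at) "
            "VALUES (?, ?, ?, ?)",
            (job_id, "in_progress", now, now),
        )
        self.conn.commit()
        return job_id

    def update_job(self, job_id: str, status: str, result=None) -> None:
        """Set a job's status and (optionally) its JSON-serializable result."""
        payload = None if result is None else json.dumps(result)
        self.conn.execute(
            "UPDATE jobs SET status = ?, result = ?, updated_at = ? "
            "WHERE job_id = ?",
            (status, payload, _now(), job_id),
        )
        self.conn.commit()

    def get_job(self, job_id: str):
        """Return the job as a dict, or None if the ID is unknown."""
        row = self.conn.execute(
            "SELECT job_id, status, result FROM jobs WHERE job_id = ?",
            (job_id,),
        ).fetchone()
        if row is None:
            return None
        result = json.loads(row[2]) if row[2] is not None else None
        return {"job_id": row[0], "status": row[1], "result": result}

    def mark_stale_jobs_failed(self) -> int:
        """Mark every in_progress job as failed; return how many changed.

        Intended to run once at startup: any job still in_progress at that
        point was interrupted by the previous process exiting.
        """
        cur = self.conn.execute(
            "UPDATE jobs SET status = 'failed', updated_at = ? "
            "WHERE status = 'in_progress'",
            (_now(),),
        )
        self.conn.commit()
        return cur.rowcount
```

In this sketch the API process would construct the store and call `mark_stale_jobs_failed()` once at startup, before serving requests. Marking interrupted jobs as `failed` rather than re-queuing them is the simpler of the two options the issue raises; re-queuing would additionally require the original request payload to be stored with the job.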