forked from 0xWheatyz/SPARC
Persist async job state in PostgreSQL so jobs survive API restarts #1404
Context
Roadmap item: P1 -- Error handling and resilience
The `_jobs` dictionary in the API holds all batch job status in memory. When the API process restarts (e.g., due to a crash or redeploy), all in-progress and completed job results are lost. Users have no way to retrieve results after a restart.
What to do
- Create a `jobs` table in PostgreSQL with columns for `job_id`, `status`, `created_at`, `updated_at`, `result` (JSONB), and `error`.
- Replace `_jobs` with database queries.
- Handle jobs that were `running` when the process died (mark them as `failed` with an appropriate error message).
Acceptance criteria
- A job that was `running` at restart time is marked `failed` after the next startup.
Triage: Already resolved in main.
Job state is persisted in PostgreSQL via the `create_job()`, `update_job()`, `get_job()`, and `get_jobs()` methods in `SPARC/database.py`. On startup, `mark_stale_jobs_failed()` is called to handle jobs that were running when the API restarted (`api.py` line 189). The job listing endpoint uses database queries. Closing as complete.
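For anyone reading this later, here is a minimal sketch of the pattern described above. It is not the code in `SPARC/database.py`. The function names are borrowed from this issue, but their signatures, the `init_db()` helper, and the error message are assumptions made for illustration. SQLite (Python standard library) stands in for PostgreSQL so the sketch runs without a server. In PostgreSQL, `result` would be a JSONB column and the timestamps would likely be `TIMESTAMPTZ`.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# ASSUMPTION: SQLite stands in for PostgreSQL; `result` holds JSON as TEXT
# here, where the issue asks for a JSONB column.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id     TEXT PRIMARY KEY,
    status     TEXT NOT NULL,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL,
    result     TEXT,
    error      TEXT
)
"""
COLUMNS = ("job_id", "status", "created_at", "updated_at", "result", "error")


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def init_db(conn):
    """Create the jobs table if it does not exist (hypothetical helper)."""
    conn.execute(SCHEMA)
    conn.commit()


def create_job(conn):
    """Insert a new job in the 'running' state and return its id."""
    job_id = str(uuid.uuid4())
    now = _now()
    conn.execute(
        "INSERT INTO jobs (job_id, status, created_at, updated_at) "
        "VALUES (?, ?, ?, ?)",
        (job_id, "running", now, now),
    )
    conn.commit()
    return job_id


def update_job(conn, job_id, status, result=None, error=None):
    """Record a new status plus an optional result or error for a job."""
    encoded = json.dumps(result) if result is not None else None
    conn.execute(
        "UPDATE jobs SET status = ?, result = ?, error = ?, updated_at = ? "
        "WHERE job_id = ?",
        (status, encoded, error, _now(), job_id),
    )
    conn.commit()


def get_job(conn, job_id):
    """Return the job as a dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT job_id, status, created_at, updated_at, result, error "
        "FROM jobs WHERE job_id = ?",
        (job_id,),
    ).fetchone()
    if row is None:
        return None
    job = dict(zip(COLUMNS, row))
    if job["result"] is not None:
        job["result"] = json.loads(job["result"])
    return job


def mark_stale_jobs_failed(conn):
    """Mark every job still 'running' as 'failed'; call once at startup."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'failed', error = ?, updated_at = ? "
        "WHERE status = 'running'",
        ("API restarted while job was running", _now()),
    )
    conn.commit()
    return cur.rowcount  # number of jobs that were marked failed
```

Calling `mark_stale_jobs_failed()` once during startup, before the API accepts requests, means no worker can still own a `running` row at that point, so a single `UPDATE` is enough. This assumes a single API process. With several instances sharing one database, the startup sweep would also need a way to tell its own stale jobs from jobs another instance is still running.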