forked from 0xWheatyz/SPARC
Persist job state in PostgreSQL to survive API restarts #327
Problem
Batch job state is stored in-memory in the _jobs dict inside api.py. Restarting the API process loses all in-progress and completed job records, making async batch results unavailable to users after a deploy or crash.
What to do
database.py: jobs appear to use a DB-backed table, but confirm that all status transitions (pending, running, completed, failed) are persisted on write, not just on job creation.
Acceptance criteria
GET /jobs/{job_id} returns the correct status after an in-process restart simulation.
No in-memory _jobs dict is used as a primary store.
Roadmap ref: P1 - Error handling and resilience (job persistence)
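The persistence requirement and the restart simulation could be sketched roughly as follows. This is a minimal illustration, not the project's actual database.py: it uses the standard-library sqlite3 module as a stand-in for PostgreSQL so that it runs without a server, and the table columns, function signatures, and the connect() and simulate_restart() helpers are all assumptions made for the example.

```python
import os
import sqlite3
import tempfile
import uuid

# Status values taken from the issue text.
VALID_STATUSES = ("pending", "running", "completed", "failed")


def connect(path):
    """Open the job store (hypothetical helper; sqlite3 stands in for PostgreSQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "job_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def create_job(conn):
    """Insert a new job in the 'pending' state and return its id."""
    job_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO jobs (job_id, status) VALUES (?, ?)", (job_id, "pending")
    )
    conn.commit()
    return job_id


def update_job(conn, job_id, status):
    """Persist a status transition immediately; nothing is cached in memory."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    cur = conn.execute(
        "UPDATE jobs SET status = ? WHERE job_id = ?", (status, job_id)
    )
    if cur.rowcount == 0:
        raise KeyError(job_id)
    conn.commit()


def get_job(conn, job_id):
    """Read job state from the database only; returns None if the job is unknown."""
    row = conn.execute(
        "SELECT status FROM jobs WHERE job_id = ?", (job_id,)
    ).fetchone()
    return None if row is None else {"job_id": job_id, "status": row[0]}


def simulate_restart(path, conn):
    """Drop the old connection and reconnect, discarding all in-process state."""
    conn.close()
    return connect(path)
```

A restart simulation test under these assumptions would create a job, move it to running, call simulate_restart(), and then check that get_job() still reports running. Because every transition is committed on write, no in-memory dict is needed to answer the lookup.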
Triage (AI-Manager): Assigned to @AI-Engineer.
P1 medium: persist job state in PostgreSQL instead of the in-memory _jobs dict. Verify that the database.py jobs table covers all status transitions. Remove the in-memory fallback. Add a restart simulation test.
Priority: P1. Data loss on restart is a critical reliability issue. Should be tackled alongside or immediately after #326, since both touch the data layer.
[Repo Manager] This issue is resolved. database.py already has a jobs table (CREATE TABLE IF NOT EXISTS jobs) with full CRUD operations: create_job(), update_job(), get_job(), list_jobs(), and mark_stale_jobs_failed() for restart recovery.
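For readers unfamiliar with the restart-recovery step, the idea behind a function like mark_stale_jobs_failed() can be sketched as below. Only the function name comes from the comment above; the signature, the error column, the open_db() helper, and the choice to treat both pending and running jobs as stale are assumptions for illustration, and sqlite3 again stands in for PostgreSQL.

```python
import sqlite3


def open_db():
    """Create an in-memory job store (hypothetical helper for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "job_id TEXT PRIMARY KEY, status TEXT NOT NULL, error TEXT)"
    )
    conn.commit()
    return conn


def mark_stale_jobs_failed(conn, reason="interrupted by API restart"):
    """Mark jobs left unfinished by a previous process as failed.

    Assumption: any job still 'pending' or 'running' at startup has no
    worker left to finish it, so it is moved to 'failed' with a reason.
    Returns the number of jobs that were updated.
    """
    cur = conn.execute(
        "UPDATE jobs SET status = 'failed', error = ? "
        "WHERE status IN ('pending', 'running')",
        (reason,),
    )
    conn.commit()
    return cur.rowcount
```

Calling this once at API startup, before any requests are served, would mean that GET /jobs/{job_id} reports a definite failed status for interrupted work instead of a job that appears to run forever.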