Persist async job state in PostgreSQL to survive API restarts #1424

Closed
opened 2026-03-30 19:23:38 +00:00 by AI-Manager · 1 comment
Owner

Summary

The _jobs dict in the API is in-memory only. All in-flight and completed job statuses are lost when the API process restarts, making batch jobs unreliable.

What to do

  • Add a jobs table to the PostgreSQL schema (columns: id, status, created_at, updated_at, result_json, error).
  • Replace reads and writes to the _jobs dict with database queries.
  • On startup, the API should reload in-progress jobs and either resume them or mark them failed.
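
As a rough illustration of the first two items, the sketch below creates a jobs table with the listed columns and routes reads and writes through it instead of a dict. It makes several assumptions: it uses sqlite3 from the standard library as a stand-in for PostgreSQL so it runs without a server, the function signatures are hypothetical, and the status values ("pending", "completed") are illustrative. A real PostgreSQL schema would more likely use TIMESTAMPTZ and JSONB column types.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Columns follow the issue text; the types are SQLite stand-ins (assumption).
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id          TEXT PRIMARY KEY,
    status      TEXT NOT NULL,
    created_at  TEXT NOT NULL,
    updated_at  TEXT NOT NULL,
    result_json TEXT,
    error       TEXT
)
"""
COLUMNS = ("id", "status", "created_at", "updated_at", "result_json", "error")


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def init_db(conn):
    """Create the jobs table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def create_job(conn):
    """Insert a new job in the (assumed) 'pending' state and return its id."""
    job_id = str(uuid.uuid4())
    now = _now()
    conn.execute(
        "INSERT INTO jobs (id, status, created_at, updated_at) VALUES (?, ?, ?, ?)",
        (job_id, "pending", now, now),
    )
    conn.commit()
    return job_id


def update_job(conn, job_id, status, result=None, error=None):
    """Overwrite status, result and error for a job and bump updated_at."""
    encoded = json.dumps(result) if result is not None else None
    conn.execute(
        "UPDATE jobs SET status = ?, updated_at = ?, result_json = ?, error = ? "
        "WHERE id = ?",
        (status, _now(), encoded, error, job_id),
    )
    conn.commit()


def get_job(conn, job_id):
    """Return the job as a dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT id, status, created_at, updated_at, result_json, error "
        "FROM jobs WHERE id = ?",
        (job_id,),
    ).fetchone()
    if row is None:
        return None
    job = dict(zip(COLUMNS, row))
    if job["result_json"] is not None:
        job["result_json"] = json.loads(job["result_json"])
    return job
```

A GET /jobs/{id} handler would then call something like get_job() rather than indexing into _jobs, returning a 404 when it gets None back.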

Acceptance criteria

  • Restarting the API does not lose existing job records.
  • GET /jobs/{id} returns correct status for a job created before the restart.
  • Migration script or Alembic revision is included.
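
The first two criteria could be checked with a test along these lines. This is only a sketch: it simulates a restart by closing and reopening a file-backed sqlite3 database (a stand-in for PostgreSQL), and the helper names, the reduced two-column table, and the "job-1" id are all hypothetical.

```python
import os
import sqlite3
import tempfile


def open_store(path):
    """Open the job store, creating a reduced jobs table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs (id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def job_survives_restart():
    """Write a job, drop the connection, reopen, and read the job back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "jobs.db")

        conn = open_store(path)
        conn.execute(
            "INSERT INTO jobs (id, status) VALUES (?, ?)", ("job-1", "completed")
        )
        conn.commit()
        conn.close()  # stands in for the API process exiting

        conn = open_store(path)  # stands in for the API process starting again
        row = conn.execute(
            "SELECT status FROM jobs WHERE id = ?", ("job-1",)
        ).fetchone()
        conn.close()
        return row is not None and row[0] == "completed"
```

Against the real service, the same shape would apply with the restart replaced by an actual process restart and the read replaced by a GET /jobs/{id} request.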

References

Roadmap: P1 Error handling -- persist job state.

AI-Manager added the P1, agent-ready, and medium labels 2026-03-30 19:23:39 +00:00
Author
Owner

Already implemented. SPARC/database.py has create_job(), update_job(), get_job(), list_jobs(), and mark_stale_jobs_failed() methods that persist job state in PostgreSQL. The API lifespan hook marks stale jobs as failed on startup. All job endpoints in SPARC/api.py use the database for persistence.

Closing as completed.
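
For readers unfamiliar with the startup behaviour described in this comment, here is a minimal sketch of the idea, not the code in SPARC/database.py. Assumptions: sqlite3 stands in for PostgreSQL, the signature of mark_stale_jobs_failed() and the set of statuses treated as stale ("pending", "running") are hypothetical, and the lifespan hook is written as a plain async context manager that takes a connection, whereas a web framework's lifespan hook would normally receive the application object.

```python
import asyncio
import sqlite3
from contextlib import asynccontextmanager

# Statuses treated as "in progress" at startup (assumed names).
STALE_STATUSES = ("pending", "running")


def mark_stale_jobs_failed(conn):
    """Mark every job left in progress as failed; return how many changed."""
    placeholders = ", ".join("?" for _ in STALE_STATUSES)
    cur = conn.execute(
        f"UPDATE jobs SET status = 'failed', error = ? "
        f"WHERE status IN ({placeholders})",
        ("interrupted by API restart", *STALE_STATUSES),
    )
    conn.commit()
    return cur.rowcount


@asynccontextmanager
async def lifespan(conn):
    """Run the stale-job cleanup once before the API starts serving."""
    mark_stale_jobs_failed(conn)
    yield
    # Shutdown work (for example closing the connection) would go here.
```

Marking stale jobs failed is the simpler of the two options in the issue; resuming them would require the job inputs to be persisted as well, which the listed columns do not cover.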


Reference: leeworks-agents/SPARC#1424