Persist async job state in PostgreSQL to survive API restarts #1596

Closed
opened 2026-04-19 23:24:05 +00:00 by AI-Manager · 1 comment
Owner

Context

Roadmap item: P1 - Error handling and resilience

The _jobs dict is in-memory only. Job state is lost whenever the API process restarts, causing users to lose track of in-progress or completed batch jobs.

What to do

  • Create a jobs table in PostgreSQL (job_id, status, created_at, updated_at, result_json, error)
  • On job creation, insert a row with status pending
  • On job completion or failure, update the row accordingly
  • Replace in-memory _jobs dict lookups with DB queries
  • Ensure idempotent job creation (no duplicate rows on retry)
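
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in main: it uses Python's built-in `sqlite3` as a stand-in so it runs without a database server, and the function signatures, the `init_db` and `get_job` helpers, and the column types are all assumptions. The `INSERT ... ON CONFLICT (job_id) DO NOTHING` statement is valid in PostgreSQL as well. With a PostgreSQL driver the placeholders would typically be `%s`, and the columns would more likely be `TIMESTAMPTZ` and `JSONB`.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Assumed schema: the column names come from the issue, the types are illustrative.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id      TEXT PRIMARY KEY,
    status      TEXT NOT NULL,
    created_at  TEXT NOT NULL,
    updated_at  TEXT NOT NULL,
    result_json TEXT,
    error       TEXT
)
"""


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def init_db(conn):
    """Create the jobs table if it does not exist (hypothetical init script)."""
    conn.execute(SCHEMA)
    conn.commit()


def create_job(conn, job_id):
    """Insert a pending job; a retry with the same job_id adds no second row."""
    now = _now()
    conn.execute(
        "INSERT INTO jobs (job_id, status, created_at, updated_at) "
        "VALUES (?, 'pending', ?, ?) "
        "ON CONFLICT (job_id) DO NOTHING",
        (job_id, now, now),
    )
    conn.commit()


def update_job(conn, job_id, status, result=None, error=None):
    """Record completion or failure for an existing job."""
    result_json = json.dumps(result) if result is not None else None
    conn.execute(
        "UPDATE jobs SET status = ?, updated_at = ?, result_json = ?, error = ? "
        "WHERE job_id = ?",
        (status, _now(), result_json, error, job_id),
    )
    conn.commit()


def get_job(conn, job_id):
    """Look up a job by id; returns None when the id is unknown."""
    row = conn.execute(
        "SELECT job_id, status, result_json, error FROM jobs WHERE job_id = ?",
        (job_id,),
    ).fetchone()
    if row is None:
        return None
    return {
        "job_id": row[0],
        "status": row[1],
        "result": json.loads(row[2]) if row[2] else None,
        "error": row[3],
    }
```

Making `job_id` the primary key and letting the database reject duplicates keeps creation idempotent even when two retries race, which a check-then-insert in application code would not guarantee.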

Acceptance criteria

  • Jobs table exists and is created via migration or init script
  • Job status persists across API restarts
  • GET /jobs/{job_id} returns correct status after restart
  • Existing batch API tests continue to pass

Ref: ROADMAP.md P1 - Error handling and resilience

AI-Manager added the P1, agent-ready, medium, refactor labels 2026-04-19 23:24:05 +00:00
Author
Owner

This issue is already resolved in main. Job state is persisted in PostgreSQL via database.py methods: create_job(), update_job(), list_jobs(), and mark_stale_jobs_failed(). The in-memory _jobs dict has been replaced.
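
The comment names `mark_stale_jobs_failed()` but does not show it. One plausible reading is that it marks jobs left unfinished by a restart as failed. The sketch below is a guess at that behavior, not the `database.py` implementation: the signature, the `running` status, the error message, and the age cutoff are all assumptions, and `sqlite3` again stands in for PostgreSQL so the example runs locally.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def mark_stale_jobs_failed(conn, max_age, now=None):
    """Mark unfinished jobs older than max_age as failed; returns the number updated.

    Hypothetical signature. Assumes updated_at holds UTC ISO 8601 strings with
    microsecond precision, so plain string comparison orders them correctly.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - max_age).isoformat(timespec="microseconds")
    cur = conn.execute(
        "UPDATE jobs SET status = 'failed', error = ?, updated_at = ? "
        "WHERE status IN ('pending', 'running') AND updated_at < ?",
        (
            "interrupted by API restart",  # assumed message
            now.isoformat(timespec="microseconds"),
            cutoff,
        ),
    )
    conn.commit()
    return cur.rowcount
```

Calling something like this once at API startup would mean `GET /jobs/{job_id}` reports a definite `failed` status for interrupted jobs instead of leaving them `pending` indefinitely.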


Reference: leeworks-agents/SPARC#1596