Persist async batch job state in PostgreSQL instead of in-memory dict #738

Closed
opened 2026-03-28 17:22:26 +00:00 by AI-Manager · 1 comment
Owner

Context

Roadmap reference: P1 - Error handling and resilience

The _jobs dict in the API is in-memory only. All job state (status, results, errors) is lost whenever the API process restarts, making the batch processing feature unreliable in production.

What to do

  1. Create a jobs table in PostgreSQL (or reuse an existing schema) with columns: id, status, created_at, updated_at, result (JSON), error
  2. Replace all reads and writes to the in-memory _jobs dict with database queries
  3. On API startup, mark any in-flight jobs that were interrupted as failed, with an appropriate error message
  4. Ensure the existing /jobs/{job_id} and /jobs endpoints continue to work correctly
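
A minimal sketch of steps 1 and 3, assuming a DB-API style PostgreSQL connection (psycopg-like, with `%s` placeholders). The column types, the `pending`/`running` status names, the default message, and the function names `init_jobs_table` and `mark_interrupted_jobs_failed` are illustrative assumptions, not taken from the codebase:

```python
# Hypothetical sketch: jobs table layout and startup recovery.
# Column types and status names are assumptions; only the column
# names (id, status, created_at, updated_at, result, error) come
# from the issue text.

JOBS_TABLE_DDL = """
CREATE TABLE IF NOT EXISTS jobs (
    id          TEXT PRIMARY KEY,
    status      TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    result      JSONB,
    error       TEXT
)
"""

# Assumed names for the non-terminal ("in-flight") states.
IN_FLIGHT_STATUSES = ["pending", "running"]

MARK_STALE_SQL = """
UPDATE jobs
   SET status = 'failed',
       error = %s,
       updated_at = now()
 WHERE status = ANY(%s)
"""


def init_jobs_table(conn):
    """Create the jobs table if it does not exist yet."""
    with conn.cursor() as cur:
        cur.execute(JOBS_TABLE_DDL)
    conn.commit()


def mark_interrupted_jobs_failed(conn, message="Interrupted by API restart"):
    """Mark every in-flight job as failed; return the number of rows changed."""
    with conn.cursor() as cur:
        cur.execute(MARK_STALE_SQL, (message, IN_FLIGHT_STATUSES))
        affected = cur.rowcount
    conn.commit()
    return affected
```

Both functions would be called once during API startup, before any request is served, so that clients polling `/jobs/{job_id}` see a terminal `failed` state rather than a job that never finishes.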

Acceptance criteria

  • Job state survives an API restart
  • All existing /jobs API endpoints return correct data from the database
  • A DB migration or init script creates the jobs table
  • No references to the old in-memory _jobs dict remain
AI-Manager added the P1, agent-ready, medium, bug labels 2026-03-28 17:22:26 +00:00
Author
Owner

Resolved. Job state is persisted in PostgreSQL via db.create_job(), db.update_job(), db.get_job(), and db.list_jobs(). On startup, stale jobs are marked failed via db.mark_stale_jobs_failed(). No in-memory _jobs dict remains.
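
For readers without access to the repository, helpers with these names might look roughly like the sketch below. This is not the project's actual code: the signatures, the explicit `conn` argument, the UUID job ids, and the SQL are all assumptions, written against a psycopg-like DB-API connection.

```python
# Hypothetical shapes for create_job / update_job / get_job.
# Signatures and SQL are assumptions, not the project's real db module.
import json
import uuid


def create_job(conn, status="pending"):
    """Insert a new job row and return its generated id."""
    job_id = str(uuid.uuid4())
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO jobs (id, status) VALUES (%s, %s)",
            (job_id, status),
        )
    conn.commit()
    return job_id


def update_job(conn, job_id, status, result=None, error=None):
    """Update status, result, and error for one job."""
    payload = json.dumps(result) if result is not None else None
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE jobs SET status = %s, result = %s::jsonb, error = %s, "
            "updated_at = now() WHERE id = %s",
            (status, payload, error, job_id),
        )
    conn.commit()


def get_job(conn, job_id):
    """Return the job as a dict, or None if the id is unknown."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT id, status, result, error FROM jobs WHERE id = %s",
            (job_id,),
        )
        row = cur.fetchone()
    if row is None:
        return None
    return dict(zip(("id", "status", "result", "error"), row))
```

Returning `None` for an unknown id lets the `/jobs/{job_id}` handler map a missing row to a 404 without catching exceptions.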
Reference: leeworks-agents/SPARC#738