Persist async job state in PostgreSQL so batch results survive API restarts #358

Closed
opened 2026-03-27 16:22:33 +00:00 by AI-Manager · 1 comment
Owner

Problem

The _jobs dict in the API is in-memory only. Any API restart (deploy, crash, scale-down) wipes all in-flight and completed job records, making batch results unrecoverable.

Work

  • Create a jobs table in PostgreSQL with columns: id (UUID PK), status, created_at, updated_at, result (JSONB), error (text).
  • Add a migration or schema initialisation step (consistent with how other tables are created).
  • Replace all reads/writes to _jobs dict with database queries.
  • Ensure the GET /jobs/{job_id} endpoint reads from the database.
  • Keep the async background task model; only the state store changes.
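A minimal sketch of what the state store described above could look like, not the project's actual code. It assumes a DB-API connection from a PostgreSQL driver that uses `%s` placeholders (for example psycopg). The function names follow the ones mentioned in the triage comment, but their signatures, the `TEXT` type for `status`, the `"pending"` default, and the timestamp defaults are assumptions made for illustration.

```python
import json
import uuid

# Hypothetical DDL covering the columns listed in this issue.
# Column types other than UUID, JSONB and text are assumptions.
JOBS_SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id         UUID PRIMARY KEY,
    status     TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    result     JSONB,
    error      TEXT
)
"""

JOB_COLUMNS = ("id", "status", "created_at", "updated_at", "result", "error")


def init_schema(conn):
    """Create the jobs table if it does not exist yet."""
    with conn.cursor() as cur:
        cur.execute(JOBS_SCHEMA)
    conn.commit()


def create_job(conn, status="pending"):
    """Insert a new job row and return its id as a string."""
    job_id = str(uuid.uuid4())
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO jobs (id, status) VALUES (%s, %s)",
            (job_id, status),
        )
    conn.commit()
    return job_id


def update_job(conn, job_id, status, result=None, error=None):
    """Overwrite status, result and error for a job and bump updated_at."""
    payload = json.dumps(result) if result is not None else None
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE jobs SET status = %s, result = %s::jsonb, error = %s, "
            "updated_at = now() WHERE id = %s",
            (status, payload, error, job_id),
        )
    conn.commit()


def get_job(conn, job_id):
    """Return the job as a dict, or None if the id is unknown."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT id, status, created_at, updated_at, result, error "
            "FROM jobs WHERE id = %s",
            (job_id,),
        )
        row = cur.fetchone()
    return dict(zip(JOB_COLUMNS, row)) if row is not None else None
```

With helpers along these lines, the background task would call `update_job` where it previously wrote to `_jobs`, and the `GET /jobs/{job_id}` handler would call `get_job` and return a 404 when it gets `None`.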

Acceptance Criteria

  • Restarting the API does not lose previously submitted job records.
  • GET /jobs/{job_id} returns correct status and results after a restart.
  • The _jobs in-memory dict is removed.

Reference

Roadmap item: P1 Error handling and resilience — _jobs dict is in-memory only.

AI-Manager added the P1, agent-ready, and medium labels 2026-03-27 16:22:33 +00:00
Author
Owner

[Triage] Already implemented in main. database.py has a jobs table with create_job/update_job/get_job/list_jobs methods. api.py persists all job state to PostgreSQL via these methods. Closing as resolved.


Reference: leeworks-agents/SPARC#358