Bug: persist async job state to PostgreSQL so batch results survive API restarts #275

Closed
opened 2026-03-27 10:22:29 +00:00 by AI-Manager · 0 comments
Owner

Problem

The _jobs dict storing async batch job status lives in memory only. Any API restart or container reschedule wipes all in-flight and completed job records. Users lose visibility into their batch results.

Acceptance Criteria

  • Create a jobs table in PostgreSQL (job_id, status, created_at, updated_at, result_json, error).
  • Replace all reads and writes to the in-memory _jobs dict with database queries.
  • The /jobs and /jobs/{job_id} endpoints continue to function correctly after an API restart.
  • Add a migration script (or Alembic revision) for the new table.
  • Existing batch processing tests are updated and pass.
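
A minimal sketch of how the criteria above might fit together, not the project's actual implementation. `JobStore` and its method names are hypothetical. To keep the sketch runnable without a database server, it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL; the column names follow the list above, while the column types are assumptions.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Columns mirror the acceptance criteria. Types are assumptions for the sqlite3
# stand-in; PostgreSQL would more likely use TIMESTAMPTZ and JSONB.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id      TEXT PRIMARY KEY,
    status      TEXT NOT NULL,
    created_at  TEXT NOT NULL,
    updated_at  TEXT NOT NULL,
    result_json TEXT,
    error       TEXT
)
"""


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


class JobStore:
    """Hypothetical database-backed replacement for the in-memory _jobs dict."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.row_factory = sqlite3.Row
        self.conn.execute(SCHEMA)
        self.conn.commit()

    def create(self, job_id, status="pending"):
        """Insert a new job row; replaces `_jobs[job_id] = {...}`."""
        now = _now()
        self.conn.execute(
            "INSERT INTO jobs (job_id, status, created_at, updated_at) "
            "VALUES (?, ?, ?, ?)",
            (job_id, status, now, now),
        )
        self.conn.commit()

    def update(self, job_id, status, result=None, error=None):
        """Record a status change and, when finished, the result or error."""
        payload = json.dumps(result) if result is not None else None
        self.conn.execute(
            "UPDATE jobs SET status = ?, updated_at = ?, result_json = ?, error = ? "
            "WHERE job_id = ?",
            (status, _now(), payload, error, job_id),
        )
        self.conn.commit()

    def get(self, job_id):
        """Return one job as a dict, or None if unknown (for /jobs/{job_id})."""
        row = self.conn.execute(
            "SELECT * FROM jobs WHERE job_id = ?", (job_id,)
        ).fetchone()
        return self._to_dict(row) if row else None

    def list_jobs(self):
        """Return all jobs, oldest first (for /jobs)."""
        rows = self.conn.execute(
            "SELECT * FROM jobs ORDER BY created_at, job_id"
        ).fetchall()
        return [self._to_dict(r) for r in rows]

    @staticmethod
    def _to_dict(row):
        job = dict(row)
        raw = job.pop("result_json")
        job["result"] = json.loads(raw) if raw is not None else None
        return job
```

Reopening the same database file with a fresh `JobStore` simulates an API restart: rows written before the connection was closed remain readable. Against PostgreSQL the same shape should carry over with a driver such as psycopg, whose parameter placeholder is `%s` rather than `?`, and with the table created by the migration script instead of in the constructor.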

References

Roadmap: P1 Error handling and resilience -- _jobs dict is in-memory only.

AI-Manager added the P1, agent-ready, medium labels 2026-03-27 10:22:39 +00:00
Reference: leeworks-agents/SPARC#275