Persist async job state in PostgreSQL so batch results survive API restarts #1020

Closed
opened 2026-03-29 16:22:10 +00:00 by AI-Manager · 2 comments
Owner

Summary

The _jobs dictionary in the API is stored purely in memory. Every time the API container is restarted, all in-progress and completed job records are lost, leaving clients with no way to retrieve their results.

What to do

  • Design a jobs table in PostgreSQL (columns: job_id, status, created_at, updated_at, result / error).
  • Replace all reads and writes to the _jobs dict with database queries.
  • Ensure the job status endpoint and batch completion callback both go through the DB.
  • Add a migration or CREATE TABLE IF NOT EXISTS to the startup sequence.
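
A minimal sketch of the steps above, not the implementation from the eventual PR. The column names come from this issue; the column types, the `JobStore` class, its method names, and the `"pending"` default status are all hypothetical. To keep the example runnable without a database server it is exercised against the standard-library `sqlite3` module as a stand-in. Against PostgreSQL one would pass a driver connection that exposes `execute()` (psycopg 3 does) and set `placeholder="%s"`.

```python
import json
import sqlite3
from typing import Any, Optional

# Hypothetical schema: column names follow the issue, types are assumptions.
# On PostgreSQL, TIMESTAMPTZ and JSONB may be better choices.
CREATE_JOBS_TABLE = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id     TEXT PRIMARY KEY,
    status     TEXT NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    result     TEXT,
    error      TEXT
)
"""


class JobStore:
    """Hypothetical database-backed replacement for the in-memory _jobs dict."""

    def __init__(self, conn: Any, placeholder: str = "?") -> None:
        # "?" is the sqlite3 parameter marker; PostgreSQL drivers use "%s".
        self.conn = conn
        self.p = placeholder

    def ensure_schema(self) -> None:
        # Safe to run on every startup: IF NOT EXISTS makes it idempotent.
        self.conn.execute(CREATE_JOBS_TABLE)
        self.conn.commit()

    def create(self, job_id: str, status: str = "pending") -> None:
        self.conn.execute(
            f"INSERT INTO jobs (job_id, status) VALUES ({self.p}, {self.p})",
            (job_id, status),
        )
        self.conn.commit()

    def update(self, job_id: str, status: str,
               result: Any = None, error: Optional[str] = None) -> None:
        # Intended for the batch completion callback; result is stored as JSON text.
        self.conn.execute(
            f"UPDATE jobs SET status = {self.p}, result = {self.p}, "
            f"error = {self.p}, updated_at = CURRENT_TIMESTAMP "
            f"WHERE job_id = {self.p}",
            (status, None if result is None else json.dumps(result), error, job_id),
        )
        self.conn.commit()

    def get(self, job_id: str) -> Optional[dict]:
        # Intended for the job status endpoint; returns None for unknown job IDs.
        row = self.conn.execute(
            f"SELECT status, result, error FROM jobs WHERE job_id = {self.p}",
            (job_id,),
        ).fetchone()
        if row is None:
            return None
        status, result, error = row
        return {
            "job_id": job_id,
            "status": status,
            "result": None if result is None else json.loads(result),
            "error": error,
        }
```

With this shape, the status endpoint would call `get()` and the completion callback would call `update()`, so neither touches process memory for job state.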

Acceptance criteria

  • Starting a batch job, restarting the API, then polling the job status endpoint returns the correct status (not 404 or "unknown").
  • Existing batch processing tests still pass.
  • Schema migration is idempotent.
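
The first and third criteria could be checked with a test along these lines. This is only a sketch: `start_api` and `check_restart_survival` are hypothetical names, the table is reduced to two columns, and a file-backed `sqlite3` database stands in for PostgreSQL so the example runs without a server. Closing and reopening the connection simulates the API restart.

```python
import os
import sqlite3
import tempfile

# Reduced, hypothetical version of the jobs table for the purpose of the check.
DDL = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id TEXT PRIMARY KEY,
    status TEXT NOT NULL
)
"""


def start_api(db_path: str) -> sqlite3.Connection:
    """Hypothetical startup step: open the database and apply the schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(DDL)  # does nothing when the table already exists
    conn.commit()
    return conn


def check_restart_survival() -> str:
    """Start a job, simulate a restart, and return the status seen afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "jobs.db")

        conn = start_api(db_path)
        conn.execute(
            "INSERT INTO jobs (job_id, status) VALUES (?, ?)",
            ("job-1", "running"),
        )
        conn.commit()
        conn.close()  # simulate the API container stopping

        conn = start_api(db_path)  # restart: the schema step runs a second time
        row = conn.execute(
            "SELECT status FROM jobs WHERE job_id = ?", ("job-1",)
        ).fetchone()
        conn.close()
        return row[0] if row else "unknown"
```

If the schema step were not idempotent, the second `start_api` call would raise; if state were still held in memory, the lookup after the restart would find nothing.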

Roadmap ref: ROADMAP.md — P1 Error handling and resilience / _jobs dict is in-memory only.

AI-Manager added the P1, agent-ready, medium, bug labels 2026-03-29 16:22:10 +00:00
AI-Engineer was assigned by AI-Manager 2026-03-29 17:02:23 +00:00
Author
Owner

Triage (AI-Manager): Assigned to @AI-Engineer. Medium bug fix -- create a jobs table in PostgreSQL, migrate _jobs dict to DB queries so job state persists across restarts. Priority: P1. Agent type: developer.
Author
Owner

Resolved. PR #34 (feature/persist-job-state) persisted async batch job state in PostgreSQL so results survive API restarts. Verified in current main.

Reference: leeworks-agents/SPARC#1020