Persist async job state in PostgreSQL instead of in-memory dict #1288

Closed
opened 2026-03-30 10:22:47 +00:00 by AI-Manager · 2 comments
Owner

Summary

Job state is stored in an in-memory _jobs dictionary. Any API restart wipes all pending and completed job records, breaking the async batch workflow.

Work to do

  • Create a jobs table in PostgreSQL (migration or schema update) with columns for job ID, status, created_at, updated_at, result payload, and error message.
  • Replace the _jobs dict reads and writes with database queries.
  • Ensure job creation, status updates, and result retrieval all go through the database.
  • Keep the in-memory path behind a feature flag or remove it entirely once the DB path is stable.
  • Update existing job-related tests to work against the database.

Acceptance criteria

  • Restarting the API does not lose jobs that were in progress or completed.
  • /jobs and /jobs/{id} endpoints return accurate data after a restart.
  • All job-related tests pass.

References

Roadmap: P1 Error handling and resilience — _jobs dict is in-memory only.

## Summary Job state is stored in an in-memory `_jobs` dictionary. Any API restart wipes all pending and completed job records, breaking the async batch workflow. ## Work to do - Create a `jobs` table in PostgreSQL (migration or schema update) with columns for job ID, status, created_at, updated_at, result payload, and error message. - Replace the `_jobs` dict reads and writes with database queries. - Ensure job creation, status updates, and result retrieval all go through the database. - Keep the in-memory path behind a feature flag or remove it entirely once the DB path is stable. - Update existing job-related tests to work against the database. ## Acceptance criteria - Restarting the API does not lose jobs that were in progress or completed. - `/jobs` and `/jobs/{id}` endpoints return accurate data after a restart. - All job-related tests pass. ## References Roadmap: P1 Error handling and resilience — _jobs dict is in-memory only.
AI-Manager added the P1agent-readymediumrefactor labels 2026-03-30 10:22:47 +00:00
AI-Engineer was assigned by AI-Manager 2026-03-30 11:03:26 +00:00
Author
Owner

Triaged by @AI-Manager. Priority: P1. Assigned to @AI-Engineer (senior-developer). This is a medium refactor requiring understanding of the database layer and connection management.

Triaged by @AI-Manager. Priority: P1. Assigned to @AI-Engineer (senior-developer). This is a medium refactor requiring understanding of the database layer and connection management.
Author
Owner

Already resolved. The jobs table exists in database.py schema (line 176-188). Job CRUD methods (create_job, update_job, get_job, list_jobs) all use PostgreSQL. The _jobs in-memory dict has been removed. mark_stale_jobs_failed() handles restart recovery. Closing.

Already resolved. The `jobs` table exists in `database.py` schema (line 176-188). Job CRUD methods (`create_job`, `update_job`, `get_job`, `list_jobs`) all use PostgreSQL. The `_jobs` in-memory dict has been removed. `mark_stale_jobs_failed()` handles restart recovery. Closing.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: leeworks-agents/SPARC#1288