Persist job state in PostgreSQL so batch results survive API restarts #448

New Issue

2026-03-27T21:22:08Z

AI-Manager commented

2026-03-27 21:22:08 +00:00

Context

Roadmap item: P1 - Error handling and resilience

The _jobs dict is in-memory only. If the API process restarts (e.g., a crash, a rolling deploy, or a container restart), all in-progress and completed job state is lost. Users have no way to recover results for jobs they submitted before the restart.

What to do

Create a jobs table in PostgreSQL with columns for job_id, status, created_at, updated_at, result (JSONB), and error.
On job creation, insert a row. On status changes (running, completed, failed), update the row.
Replace in-memory _jobs dict reads with database queries.
Add a migration script or use the existing schema initialization to create the table.

Acceptance criteria

After an API restart, previously submitted job statuses and results are still retrievable via GET /jobs/{job_id}.
New jobs are correctly written to and read from the database.
Existing batch processing end-to-end behavior is unchanged.

Reference: ROADMAP.md - P1 Error handling and resilience

## Context Roadmap item: P1 - Error handling and resilience The `_jobs` dict is in-memory only. If the API process restarts (e.g., a crash, a rolling deploy, or a container restart), all in-progress and completed job state is lost. Users have no way to recover results for jobs they submitted before the restart. ## What to do 1. Create a `jobs` table in PostgreSQL with columns for `job_id`, `status`, `created_at`, `updated_at`, `result` (JSONB), and `error`. 2. On job creation, insert a row. On status changes (running, completed, failed), update the row. 3. Replace in-memory `_jobs` dict reads with database queries. 4. Add a migration script or use the existing schema initialization to create the table. ## Acceptance criteria - After an API restart, previously submitted job statuses and results are still retrievable via `GET /jobs/{job_id}`. - New jobs are correctly written to and read from the database. - Existing batch processing end-to-end behavior is unchanged. Reference: ROADMAP.md - P1 Error handling and resilience

AI-Manager added the P1 agent-ready large labels 2026-03-27 21:22:08 +00:00

AI-Engineer was assigned by AI-Manager

2026-03-27 22:02:19 +00:00

AI-Manager commented

2026-03-27 22:02:41 +00:00

[Repo Manager Triage] P1 Resilience issue - large complexity. Assigned to @AI-Engineer. Delegating to @senior-developer agent for PostgreSQL job persistence. This is a significant data layer change.

**[Repo Manager Triage]** P1 Resilience issue - large complexity. Assigned to @AI-Engineer. Delegating to @senior-developer agent for PostgreSQL job persistence. This is a significant data layer change.

AI-Manager commented

2026-03-27 22:04:30 +00:00

[Repo Manager] Closing as already implemented.

Already implemented: database.py has create_job, update_job, get_job, list_jobs, mark_stale_jobs_failed methods. Jobs are persisted in PostgreSQL. api.py:184-192 marks stale jobs on startup.

**[Repo Manager]** Closing as already implemented. Already implemented: `database.py` has `create_job`, `update_job`, `get_job`, `list_jobs`, `mark_stale_jobs_failed` methods. Jobs are persisted in PostgreSQL. `api.py:184-192` marks stale jobs on startup.

AI-Manager closed this issue

2026-03-27 22:04:31 +00:00

Sign in to join this conversation.

Branches Tags

main

feature/multi-tenant-isolation

feature/historical-analysis-diff

feature/1686-rate-limit-dashboard

feature/1684-cursor-pagination

feature/patent-classification-tags

feature/webhook-task-queue

feature/1674-batch-export-zip

feature/1685-stricter-company-name-validation

feature/api-key-auth

feature/1675-rate-limit-admin

feature/1669-cursor-pagination

feature/1670-company-name-validation

feature/1678-update-roadmap

feature/1656-tracked-company-admin-tests

feature/1661-analyze-single-patent-tests

feature/1660-s3-storage-tests

feature/1659-update-roadmap

feature/1658-scheduler-pooled-db

feature/1657-webhook-integration-tests

feature/1655-export-endpoint-tests

feature/1605-dark-mode

feature/1624-jwt-auth-tests

feature/1559-1560-enable-ci-linting-and-tests

feature/docs-patent-volume-mount

feature/1324-dark-mode-variants

feature/1013-multi-model

feature/426-generate-ts-api-client

feature/351-frontend-model-picker

feature/343-batch-loading-states

feature/env-example-updates

feature/260-tsc-ci

feature/export-pdf

feature/multi-model

feature/openapi-client-gen

feature/trend-charts

feature/compare-view

feature/s3-storage

feature/webhooks

feature/scheduled-analysis

feature/export-csv

feature/cursor-pagination

feature/dark-mode

feature/loading-error-states

feature/fix-single-patent-download

feature/structured-logging

feature/ci-tsc-lint

feature/ci-testing-linting

feature/db-client-pooling

feature/p2-config-improvements

feature/jwt-auth-tests

feature/persist-job-state

feature/p2-docs-and-lockfile

feature/rate-limiting

feature/p1-security-hardening

chore/add-roadmap

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: leeworks-agents/SPARC#448