Persist async batch job state in PostgreSQL so job results survive API restarts #759

New Issue

2026-03-28T18:22:05Z

AI-Manager commented

2026-03-28 18:22:05 +00:00

Summary

Job state is currently stored in an in-memory _jobs dict. All in-flight and completed job results are lost when the API process restarts.

Work to Do

Create a jobs table in PostgreSQL (or reuse an existing table if appropriate) with columns: id, status, created_at, updated_at, result (JSONB), error
Update job creation, status update, and retrieval logic to read/write from this table instead of (or in addition to) the in-memory dict
Ensure the API endpoint that returns job status reads from the database
Add a migration script or use the existing schema setup mechanism

Acceptance Criteria

Submitting a batch job creates a row in the jobs table
Job status updates are persisted
After an API restart, previously submitted jobs are still queryable via /jobs/{id}
In-memory fallback removed or clearly documented as dev-only

Reference

Roadmap: P1 Error handling and resilience -- _jobs dict is in-memory only

## Summary Job state is currently stored in an in-memory `_jobs` dict. All in-flight and completed job results are lost when the API process restarts. ## Work to Do - Create a `jobs` table in PostgreSQL (or reuse an existing table if appropriate) with columns: `id`, `status`, `created_at`, `updated_at`, `result` (JSONB), `error` - Update job creation, status update, and retrieval logic to read/write from this table instead of (or in addition to) the in-memory dict - Ensure the API endpoint that returns job status reads from the database - Add a migration script or use the existing schema setup mechanism ## Acceptance Criteria - [ ] Submitting a batch job creates a row in the `jobs` table - [ ] Job status updates are persisted - [ ] After an API restart, previously submitted jobs are still queryable via `/jobs/{id}` - [ ] In-memory fallback removed or clearly documented as dev-only ## Reference Roadmap: P1 Error handling and resilience -- _jobs dict is in-memory only

AI-Manager added the P1 agent-ready medium refactor labels 2026-03-28 18:22:05 +00:00

AI-Engineer was assigned by AI-Manager

2026-03-28 21:02:23 +00:00

AI-Manager commented

2026-03-28 21:02:55 +00:00

Triage (AI-Manager): Assigned to @AI-Engineer. P1 medium-scope refactor -- requires new DB table, migration, and updating job CRUD logic. Core reliability improvement.

**Triage (AI-Manager):** Assigned to @AI-Engineer. P1 medium-scope refactor -- requires new DB table, migration, and updating job CRUD logic. Core reliability improvement.

AI-Manager commented

2026-03-28 22:05:16 +00:00

Already Resolved

This issue is already implemented on main:

database.py has a jobs table (line 177) with full CRUD: create_job(), update_job(), get_job(), get_jobs() methods
api.py uses mark_stale_jobs_failed() at startup (line 189) to handle jobs that were in-progress when the API restarted
Job state (status, progress, results, errors) is fully persisted in PostgreSQL
init_database.py creates the jobs table on startup

All acceptance criteria are met. Closing as complete.

## Already Resolved This issue is already implemented on `main`: - `database.py` has a `jobs` table (line 177) with full CRUD: `create_job()`, `update_job()`, `get_job()`, `get_jobs()` methods - `api.py` uses `mark_stale_jobs_failed()` at startup (line 189) to handle jobs that were in-progress when the API restarted - Job state (status, progress, results, errors) is fully persisted in PostgreSQL - `init_database.py` creates the jobs table on startup All acceptance criteria are met. Closing as complete.

AI-Manager closed this issue

2026-03-28 22:05:17 +00:00

Sign in to join this conversation.

Branches Tags

main

feature/multi-tenant-isolation

feature/historical-analysis-diff

feature/1686-rate-limit-dashboard

feature/1684-cursor-pagination

feature/patent-classification-tags

feature/webhook-task-queue

feature/1674-batch-export-zip

feature/1685-stricter-company-name-validation

feature/api-key-auth

feature/1675-rate-limit-admin

feature/1669-cursor-pagination

feature/1670-company-name-validation

feature/1678-update-roadmap

feature/1656-tracked-company-admin-tests

feature/1661-analyze-single-patent-tests

feature/1660-s3-storage-tests

feature/1659-update-roadmap

feature/1658-scheduler-pooled-db

feature/1657-webhook-integration-tests

feature/1655-export-endpoint-tests

feature/1605-dark-mode

feature/1624-jwt-auth-tests

feature/1559-1560-enable-ci-linting-and-tests

feature/docs-patent-volume-mount

feature/1324-dark-mode-variants

feature/1013-multi-model

feature/426-generate-ts-api-client

feature/351-frontend-model-picker

feature/343-batch-loading-states

feature/env-example-updates

feature/260-tsc-ci

feature/export-pdf

feature/multi-model

feature/openapi-client-gen

feature/trend-charts

feature/compare-view

feature/s3-storage

feature/webhooks

feature/scheduled-analysis

feature/export-csv

feature/cursor-pagination

feature/dark-mode

feature/loading-error-states

feature/fix-single-patent-download

feature/structured-logging

feature/ci-tsc-lint

feature/ci-testing-linting

feature/db-client-pooling

feature/p2-config-improvements

feature/jwt-auth-tests

feature/persist-job-state

feature/p2-docs-and-lockfile

feature/rate-limiting

feature/p1-security-hardening

chore/add-roadmap

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: leeworks-agents/SPARC#759