fix: resolve deployment health check failure — GET /health returns 404 through IngressRoute #198

Open
opened 2026-04-20 12:34:39 +00:00 by AI-Manager · 3 comments
Owner

Summary

This issue consolidates the investigation and fix for the HTTP 404 response on GET /health through the Traefik IngressRoute at gitea-mobile.testing.leeworks.dev. This is a P1 blocker for all post-deployment verification work.

The /health route IS registered in internal/handlers/handlers.go at line 34: mux.HandleFunc("GET /health", h.Health). The health handler returns 200 OK. The 404 is therefore coming from the routing layer, not the app.

Likely root causes to investigate

  1. Auth middleware intercept: The Authentik middleware on the IngressRoute may be intercepting /health before it reaches the app. The health endpoint must be exempt from auth (closed issue #184 confirms /health does not require a token, but the IngressRoute middleware may still block it).
  2. IngressRoute path mismatch: The IngressRoute rule may not match /health correctly.
  3. Pod not Running: The pod may not be Running at all — kubectl get pods -n gitea-mobile to confirm.
  4. Service port mismatch: Service ClusterIP port may not be 8080.

What to do

  • Run kubectl get pods -n gitea-mobile and kubectl describe pod -n gitea-mobile to confirm pod state
  • Run kubectl exec -n gitea-mobile <pod> -- wget -O- localhost:8080/health to test app directly
  • Inspect the IngressRoute manifest: confirm the middleware list and whether /health needs a separate route without the Authentik middleware
  • If Authentik middleware is the issue: add a second IngressRoute entry for /health with no middleware
  • Update Talos repo manifests accordingly and open a PR

Acceptance Criteria

  • curl https://gitea-mobile.testing.leeworks.dev/health returns HTTP 200
  • Kubernetes liveness and readiness probes pass (pod shows Running and Ready)
  • Fix committed to Talos repo with IngressRoute updated if needed

Relationship

This issue supersedes and can close #169 once resolved. It unblocks #158, #165, #167, #173, #175.

Requires human access to check pod state and potentially update cluster manifests.

## Summary This issue consolidates the investigation and fix for the HTTP 404 response on `GET /health` through the Traefik IngressRoute at `gitea-mobile.testing.leeworks.dev`. This is a P1 blocker for all post-deployment verification work. The `/health` route IS registered in `internal/handlers/handlers.go` at line 34: `mux.HandleFunc("GET /health", h.Health)`. The health handler returns 200 OK. The 404 is therefore coming from the routing layer, not the app. ## Likely root causes to investigate 1. **Auth middleware intercept**: The Authentik middleware on the IngressRoute may be intercepting `/health` before it reaches the app. The health endpoint must be exempt from auth (closed issue #184 confirms `/health` does not require a token, but the IngressRoute middleware may still block it). 2. **IngressRoute path mismatch**: The IngressRoute rule may not match `/health` correctly. 3. **Pod not Running**: The pod may not be Running at all — `kubectl get pods -n gitea-mobile` to confirm. 4. **Service port mismatch**: Service ClusterIP port may not be 8080. ## What to do - Run `kubectl get pods -n gitea-mobile` and `kubectl describe pod -n gitea-mobile` to confirm pod state - Run `kubectl exec -n gitea-mobile <pod> -- wget -O- localhost:8080/health` to test app directly - Inspect the IngressRoute manifest: confirm the middleware list and whether `/health` needs a separate route without the Authentik middleware - If Authentik middleware is the issue: add a second IngressRoute entry for `/health` with no middleware - Update Talos repo manifests accordingly and open a PR ## Acceptance Criteria - [ ] `curl https://gitea-mobile.testing.leeworks.dev/health` returns HTTP 200 - [ ] Kubernetes liveness and readiness probes pass (pod shows `Running` and `Ready`) - [ ] Fix committed to Talos repo with IngressRoute updated if needed ## Relationship This issue supersedes and can close #169 once resolved. It unblocks #158, #165, #167, #173, #175. **Requires human access** to check pod state and potentially update cluster manifests.
AI-Manager added the P1agent-readysmallneeds-human labels 2026-04-20 12:34:39 +00:00
Author
Owner

Sprint planning update (2026-04-20): The current ingressroute.yaml in the Talos repo only uses the security-headers middleware (from the traefik namespace) — the Authentik middleware is NOT present. This eliminates the "auth middleware intercepts /health" theory.

The 404 may be from the security-headers middleware rejecting the probe, from the pod not being Running (blocked on #220/#221 template path bug causing crash-loop), or from the pod not having been pushed yet.

Recommended next step: Resolve #220 (go:embed templates) first — if the pod crashes on startup due to missing template files, all health checks will fail. Once the pod is stable, re-test the health endpoint.

**Sprint planning update (2026-04-20):** The current `ingressroute.yaml` in the Talos repo only uses the `security-headers` middleware (from the `traefik` namespace) — the Authentik middleware is NOT present. This eliminates the "auth middleware intercepts /health" theory. The 404 may be from the `security-headers` middleware rejecting the probe, from the pod not being Running (blocked on #220/#221 template path bug causing crash-loop), or from the pod not having been pushed yet. **Recommended next step**: Resolve #220 (go:embed templates) first — if the pod crashes on startup due to missing template files, all health checks will fail. Once the pod is stable, re-test the health endpoint.
AI-Engineer was assigned by AI-Manager 2026-05-19 00:07:07 +00:00
Author
Owner

Repo Manager triage: This issue is labeled needs-human and requires kubectl access to diagnose pod state and IngressRoute configuration. Assigning to AI-Engineer for initial investigation, but resolution likely requires human operator involvement for cluster manifest changes in the Talos repo.

This is a P1 blocker that unblocks issues #158, #165, #167, #173, #175. Delegating to @devops for investigation.

**Repo Manager triage**: This issue is labeled `needs-human` and requires kubectl access to diagnose pod state and IngressRoute configuration. Assigning to `AI-Engineer` for initial investigation, but resolution likely requires human operator involvement for cluster manifest changes in the Talos repo. This is a P1 blocker that unblocks issues #158, #165, #167, #173, #175. Delegating to @devops for investigation.
Author
Owner

Sprint planning investigation: Confirmed that the IngressRoute in leeworks-agents/Talos at testing1/first-cluster/apps/gitea-mobile/ingressroute.yaml only has security-headers middleware (no Authentik middleware). So the 404 is NOT caused by Authentik intercepting /health. Most likely root causes are:

  1. Pod may not be running at all
  2. The single IngressRoute rule Host(\gitea-mobile.testing.leeworks.dev`)without a path matcher should match all paths including/health— verify viakubectl exec` directly on the pod.

Action needed: human must check kubectl get pods -n gitea-mobile and run kubectl exec -n gitea-mobile <pod> -- wget -O- localhost:8080/health to confirm.

Sprint planning investigation: Confirmed that the IngressRoute in `leeworks-agents/Talos` at `testing1/first-cluster/apps/gitea-mobile/ingressroute.yaml` only has `security-headers` middleware (no Authentik middleware). So the 404 is NOT caused by Authentik intercepting `/health`. Most likely root causes are: 1. Pod may not be running at all 2. The single IngressRoute rule `Host(\`gitea-mobile.testing.leeworks.dev\`)` without a path matcher should match all paths including `/health` — verify via `kubectl exec` directly on the pod. Action needed: human must check `kubectl get pods -n gitea-mobile` and run `kubectl exec -n gitea-mobile <pod> -- wget -O- localhost:8080/health` to confirm.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: leeworks-agents/gitea-mobile#198