_analyze_company_safe was calling SERP.query directly, bypassing the
SERP query cache in analyze_company. Now delegates fully to
analyze_company() and reads patent_count from the serp_queries cache.
- Add TestSingleQueryBugFix: verify SERP.query called once per analysis
- Add TestPatentCaching: DB cache hit/miss, SERP query cache hit/miss
- Add TestDynamicDateRange: rolling window, days_back param
- Add TestFilesystemPDFCaching: skip download, redownload empty files
- Add autouse mock_db fixture to prevent real DB connections in all tests
Before querying SERP API, check serp_queries cache (24h TTL). Before
downloading/parsing each patent, check patents table for cached
minimized_content. Store results after processing so repeated analyses
skip all network I/O and PDF parsing entirely.
Replace the sequential per-patent loop with a ThreadPoolExecutor
(workers controlled by PATENT_THREAD_WORKERS config). Each patent is
processed independently in _process_single_patent, which is thread-safe
since SERP methods are stateless and operate on separate files.
The SERP query had a frozen date range (Oct-Nov 2025) that returned
stale patents. Now computes a rolling window from config
(PATENT_SEARCH_DAYS, default 90 days). Also adds filesystem-level PDF
caching to skip re-downloading existing patent PDFs, and adds
PATENT_THREAD_WORKERS config for upcoming parallel processing.
_analyze_company_safe called SERP.query() then passed the company name
to analyze_company() which called SERP.query() again — doubling API
usage. Now analyze_company() accepts an optional patents param so callers
can pass pre-fetched results through.
Add ROOT_PATH environment variable support so FastAPI generates correct
URLs for Swagger UI when served behind nginx at /api. This fixes the
"invalid version field" error when accessing /api/docs.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add trailing slash to proxy_pass directive so nginx strips the /api/
prefix before forwarding requests to the API container. This fixes
routes like /api/docs being passed as /api/docs instead of /docs.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Push images with versioned tags in format TIMESTAMP-COMMIT and
frontend-TIMESTAMP-COMMIT for better traceability and rollback support.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Push images with versioned tags in format TIMESTAMP-COMMIT and
frontend-TIMESTAMP-COMMIT for better traceability and rollback support.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Use nginx template support to allow API_URL to be passed at container
runtime, enabling the same image to be deployed to different environments.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add insecure-registries configuration to allow HTTP connections
to gitea.gitea.svc.cluster.local instead of HTTPS.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Split the single build job into two parallel jobs (build-api and
build-frontend) to enable simultaneous container builds when multiple
runners are available. Frontend images are tagged with frontend- prefix.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Update all documentation to reflect recent changes:
- Replace Streamlit dashboard references with React TypeScript dashboard
- Update dashboard port from 8501 to 8080
- Add auth.py and database.py to architecture section
- Change USE_DATABASE terminology to USE_CACHE
- Add JWT_SECRET to environment variables reference
- Document default admin credentials and user seeding
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Change TokenPayload.sub type from int to str per JWT RFC 7519
- Add user_id property to TokenPayload for int conversion
- Update token creation to serialize user_id as string
- Update token consumers to use payload.user_id
- Change dashboard port from 3000 to 8080
- Add pydantic[email] for email validation
- Update default admin email to admin@sparc.dev🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Generate a random 16-character password and create an admin user
(admin@sparc.local) during first database initialization. Credentials
are printed to stdout so they can be captured from container logs.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Rename database mode tests to cache mode to reflect new architecture:
- Replace USE_DATABASE with USE_CACHE references
- Update test assertions for cache behavior
- Maintain backward compatibility testing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Replace Streamlit dashboard service with React frontend:
- Build from frontend/ directory
- Serve on port 3000 via nginx
- Remove volume mount (now using built assets)
- Add JWT_SECRET env var to api service
- Replace USE_DATABASE with USE_CACHE
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add modern React frontend to replace Streamlit dashboard:
- Vite build system with TypeScript
- Tailwind CSS for styling
- Component structure in src/
- Production Dockerfile with nginx
- Development server on port 5173
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Protect all analysis endpoints with JWT authentication:
- Require valid access token for analysis operations
- Add CORS middleware for React frontend (localhost:3000, 5173)
Add auth endpoints:
- POST /auth/register - user registration (first user becomes admin)
- POST /auth/login - JWT token issuance
- POST /auth/refresh - token refresh
- GET /auth/me - current user info
Add admin endpoints:
- GET /admin/users - list all users
- PATCH /admin/users/{id}/role - update user role
- DELETE /admin/users/{id} - delete user
Add analytics endpoint:
- GET /analytics - usage statistics by company and type
Update .env.example with USE_CACHE and JWT_SECRET config
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Replace USE_DATABASE toggle with USE_CACHE for smarter LLM response handling:
- Add prompt hashing for efficient cache lookups
- Cache API responses in database to reduce token usage
- Always store responses for analytics (cache or fresh)
Add user authentication infrastructure:
- User table with bcrypt password hashing
- CRUD operations for user management
- Role-based access control (admin/user)
Dependencies: add bcrypt and PyJWT for auth
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Switch from Alpine to Debian slim for better package compatibility
- Add system dependencies for pdfplumber and psycopg2
- Configure separate services for API (port 8000) and dashboard (port 8501)
- Add automatic database initialization via init-db service
- Update documentation with simplified Docker setup
- Remove need for separate docker-compose.prod.yml
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add script to estimate token usage and costs for patent analysis.
Uses tiktoken with cl100k_base encoding to approximate Claude's
tokenizer. Includes cost calculations based on OpenRouter pricing
and supports both sample-based and actual patent content estimation.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Not all Google Patents results include PDF download links. Previously
this caused a KeyError when accessing patent["pdf"]. Now patents
without PDF links are gracefully skipped with documentation explaining
when this occurs (recent filings, international patents, restricted access).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Replace sidebar navigation with horizontal tabs and add comprehensive
CSS styling with dark theme, glassmorphism cards, gradient accents,
and improved visual hierarchy. Updates all page components with
consistent modern design language.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add numpy to requirements.txt and configure flake.nix with zlib and
stdenv.cc.cc.lib to support C extension packages. Sets LD_LIBRARY_PATH
for proper runtime linking.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Move CONTAINER_REGISTRY.md and DATABASE_MODE.md to docs/
- Add comprehensive DEPLOYMENT.md with full deployment instructions
- Update README.md with documentation section linking to docs/
- Keep README.md at root for GitHub visibility
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Create interactive dashboard with company analysis page
- Add batch analysis with progress tracking and charts
- Include analytics page for historical data visualization
- Add system status monitoring
- Update requirements.txt with streamlit, plotly, pandas
- Document dashboard usage in README
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Create REST API with endpoints for single and batch analysis
- Add async job support for long-running batch operations
- Implement job status tracking and listing endpoints
- Add 9 tests for API endpoints
- Update requirements.txt with fastapi, uvicorn, httpx
- Document API usage in README
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add CompanyAnalysisResult and BatchAnalysisResult dataclasses
- Implement analyze_companies() for concurrent batch analysis
- Implement analyze_companies_sequential() for rate-limited scenarios
- Add progress callback support for monitoring batch jobs
- Include 5 new tests for batch processing functionality
- Fix pre-existing test mock issue in test_llm.py
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implements a database mode that stores LLM prompts and responses in PostgreSQL
instead of making API calls. This enables:
- Testing without consuming API credits
- Collecting analytics on usage patterns
- Development and debugging workflows
Changes:
- Added DatabaseClient class for PostgreSQL operations
- Modified LLMAnalyzer to support database/API mode toggle
- Added USE_DATABASE config flag to switch between modes
- Included Docker Compose setup for PostgreSQL
- Added utility scripts for database init and analytics viewing
- Comprehensive documentation in DATABASE_MODE.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>