docs: comprehensive README update

Updated README.md with complete documentation:
- Project overview and features
- Architecture diagram
- Installation instructions (NixOS + manual)
- Configuration guide with API key setup
- Usage examples (basic + single patent)
- Testing instructions
- How it works explanation
- Updated roadmap with completed items
- Development guidelines

Makes the project immediately usable for other developers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
parent a91c3badab
commit b8566fc2af

README.md (176 lines changed)
@@ -1,28 +1,172 @@
 # SPARC

-## Name
-Semiconductor Patent & Analytics Report Core
+**Semiconductor Patent & Analytics Report Core**

-## Description
+A patent analysis system that estimates company performance by analyzing their patent portfolios using LLM-powered insights.

-## Installation
-### NixOS Installation
-`nix develop` to build and configure nix dev environment
+## Overview

-## Usage
-```bash
-docker compose up -d
+SPARC automatically collects, parses, and analyzes patents from companies to provide performance estimations. It uses Claude AI to evaluate innovation quality, strategic direction, and competitive positioning based on patent content.
+
+## Features
+
+- **Patent Retrieval**: Automated collection via SerpAPI's Google Patents engine
+- **Intelligent Parsing**: Extracts key sections (abstract, claims, summary) from patent PDFs
+- **Content Minimization**: Removes verbose descriptions to reduce LLM token usage
+- **AI Analysis**: Uses Claude 3.5 Sonnet to analyze innovation quality and market potential
+- **Portfolio Analysis**: Evaluates multiple patents holistically for comprehensive insights
+- **Robust Testing**: 26 tests covering all major functionality
+
+## Architecture
+
+```
+SPARC/
+├── serp_api.py    # Patent retrieval and PDF parsing
+├── llm.py         # Claude AI integration for analysis
+├── analyzer.py    # High-level orchestration
+├── types.py       # Data models
+└── config.py      # Environment configuration
 ```

-## Roadmap
-- [X] Retrive `publicationID` from SERP API
-- [ ] Retrive data from Google's patent API based on those `publicationID`'s
-- This may not be needed, looking to parse the patents based soley on the pdf retrived from SERP
-- [ ] Wrap this into a python fastAPI, then bundle with docker
+## Installation

+### NixOS (Recommended)
+
+```bash
+nix develop
+```
+
+This automatically creates a virtual environment and installs all dependencies.
+
+### Manual Installation
+
+```bash
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+```
+
+## Configuration
+
+Create a `.env` file in the project root:
+
+```bash
+# SerpAPI key for patent search
+API_KEY=your_serpapi_key_here
+
+# Anthropic API key for Claude AI analysis
+ANTHROPIC_API_KEY=your_anthropic_key_here
+```
+
+Get your API keys:
+- SerpAPI: https://serpapi.com/
+- Anthropic: https://console.anthropic.com/
+
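A quick pre-flight check can catch a missing key before any API call is made. The sketch below is not part of SPARC: `parse_env_text` and `missing_keys` are hypothetical helpers, and only the two variable names come from the `.env` example above. It uses the standard library only, so it assumes nothing about how `config.py` actually loads the file.

```python
from typing import Dict, List

# Key names taken from the README's .env example.
REQUIRED_KEYS = ("API_KEY", "ANTHROPIC_API_KEY")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


example = """
# SerpAPI key for patent search
API_KEY=your_serpapi_key_here

# Anthropic API key for Claude AI analysis
ANTHROPIC_API_KEY=
"""
print(missing_keys(parse_env_text(example)))  # ['ANTHROPIC_API_KEY']
```

To check a real file, pass `Path(".env").read_text()` (from `pathlib`) to `parse_env_text` instead of the inline example.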
+## Usage
+
+### Basic Usage
+
+```python
+from SPARC.analyzer import CompanyAnalyzer
+
+# Initialize the analyzer
+analyzer = CompanyAnalyzer()
+
+# Analyze a company's patent portfolio
+analysis = analyzer.analyze_company("nvidia")
+print(analysis)
+```
+
+### Run the Example
+
+```bash
+python main.py
+```
+
+This will:
+1. Retrieve recent NVIDIA patents
+2. Parse and minimize content
+3. Analyze with Claude AI
+4. Print comprehensive performance assessment
+
+### Single Patent Analysis
+
+```python
+# Analyze a specific patent
+result = analyzer.analyze_single_patent(
+    patent_id="US11322171B1",
+    company_name="nvidia"
+)
+```
+
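For more than one company, the same call can be wrapped in a loop that keeps going when a single lookup fails. This is a minimal sketch, assuming only that `analyze_company` takes a company name and returns something printable. `analyze_many` and `fake_analyze` are hypothetical names, and the stand-in function exists so the example runs without API keys.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def analyze_many(
    analyze: Callable[[str], Any],
    companies: Iterable[str],
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Run `analyze` per company; collect results and error messages separately."""
    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for name in companies:
        try:
            results[name] = analyze(name)
        except Exception as exc:  # one failure should not stop the batch
            errors[name] = f"{type(exc).__name__}: {exc}"
    return results, errors


def fake_analyze(company: str) -> str:
    """Stand-in for analyzer.analyze_company (assumption, not SPARC code)."""
    if company == "unknown":
        raise ValueError("no patents found")
    return f"analysis for {company}"


results, errors = analyze_many(fake_analyze, ["nvidia", "unknown"])
print(results)  # {'nvidia': 'analysis for nvidia'}
print(errors)   # {'unknown': 'ValueError: no patents found'}
```

With the real analyzer, `analyzer.analyze_company` would be passed as the first argument.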
+## Running Tests
+
+```bash
+# Run all tests
+pytest tests/ -v
+
+# Run specific test modules
+pytest tests/test_analyzer.py -v
+pytest tests/test_llm.py -v
+pytest tests/test_serp_api.py -v
+
+# Run with coverage
+pytest tests/ --cov=SPARC --cov-report=term-missing
+```
+
+## How It Works
+
+1. **Patent Collection**: Queries SerpAPI for company patents
+2. **PDF Download**: Retrieves patent PDF files
+3. **Section Extraction**: Parses abstract, claims, summary, and description
+4. **Content Minimization**: Keeps essential sections, removes bloated descriptions
+5. **LLM Analysis**: Sends minimized content to Claude for analysis
+6. **Performance Estimation**: Returns insights on innovation quality and outlook
+
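Step 4 is where most of the token savings come from. The sketch below illustrates the idea, assuming the parser yields a dict keyed by section name. `minimize_sections`, the key names, and the sample text are all hypothetical, and the real logic in `serp_api.py` may differ.

```python
from typing import Dict, List

# Sections the README lists as kept; "description" is the part it drops.
KEPT_SECTIONS = ("abstract", "claims", "summary")


def minimize_sections(sections: Dict[str, str]) -> str:
    """Join the kept sections into one prompt-ready string, skipping empty ones."""
    parts: List[str] = []
    for name in KEPT_SECTIONS:
        text = sections.get(name, "").strip()
        if text:
            parts.append(f"{name.upper()}:\n{text}")
    return "\n\n".join(parts)


# Placeholder patent text, made up for illustration only.
patent = {
    "abstract": "A method for scheduling GPU workloads.",
    "claims": "1. A method comprising ...",
    "summary": "",
    "description": "Long detailed description ... " * 100,
}
minimized = minimize_sections(patent)
print(len(minimized) < len(patent["description"]))  # True
```

Filtering by an explicit allow-list, rather than deleting the description, means any unexpected extra section is also left out of the prompt.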
+## Roadmap
+
+- [X] Retrieve `publicationID` from SERP API
+- [X] Parse patents from PDFs (no need for Google Patent API)
+- [X] Extract and minimize patent content
+- [X] LLM integration for analysis
+- [X] Company performance estimation
+- [ ] Multi-company batch processing
+- [ ] FastAPI web service wrapper
+- [ ] Docker containerization
+- [ ] Results persistence (database)
+- [ ] Visualization dashboard
+
+## Development
+
+### Code Style
+
+- Type hints throughout
+- Comprehensive docstrings
+- Small, testable functions
+- Conventional commits
+
+### Testing Philosophy
+
+- Unit tests for core logic
+- Integration tests for orchestration
+- Mock external APIs
+- Aim for high coverage
+
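Mocking the external API keeps tests fast and free of network calls or API-key requirements. The following is a minimal sketch using `unittest.mock` from the standard library. `publication_ids`, the `client.search` method, and the response keys are hypothetical and do not reflect SPARC's code or SerpAPI's actual response schema.

```python
from typing import Any, List
from unittest.mock import Mock


def publication_ids(client: Any, company: str) -> List[str]:
    """Hypothetical helper: query a search client and pull out patent IDs."""
    response = client.search(company)
    return [item["publication_id"] for item in response.get("results", [])]


def test_publication_ids_uses_mocked_client() -> None:
    # The Mock stands in for the real API client, so no request is sent.
    client = Mock()
    client.search.return_value = {
        "results": [{"publication_id": "US11322171B1"}],
    }
    assert publication_ids(client, "nvidia") == ["US11322171B1"]
    client.search.assert_called_once_with("nvidia")


# pytest would collect the test by name; calling it directly also works.
test_publication_ids_uses_mocked_client()
```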
+### Making Changes
+
+1. Write tests first
+2. Implement feature
+3. Verify all tests pass
+4. Commit with conventional format: `type: description`
+
+Types: `feat`, `fix`, `docs`, `test`, `refactor`, `chore`
+
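The `type: description` rule is simple enough to check mechanically, for example from a commit hook. This is a sketch only: `is_conventional` is a hypothetical helper, and it enforces just the six types listed above, not the full Conventional Commits grammar (scopes, `!` markers, footers).

```python
import re

# The six commit types listed in the README.
ALLOWED_TYPES = ("feat", "fix", "docs", "test", "refactor", "chore")

# "<type>: <description>", with a non-empty description after the space.
SUBJECT_RE = re.compile("^(" + "|".join(ALLOWED_TYPES) + r"): \S.*$")


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject line matches `type: description`."""
    return SUBJECT_RE.match(subject) is not None


print(is_conventional("docs: comprehensive README update"))  # True
print(is_conventional("update readme"))                      # False
```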
 ## License

 For open source projects, say how it is licensed.

-## Project status
-Heavy development for the limited time available to me
+## Project Status
+
+Core functionality complete. Ready for production use with API keys configured.
+
+Next steps: API wrapper, containerization, and multi-company support.