# Website Change Detection Monitor - Claude Context

## Project Overview

This is a **Website Change Detection Monitor SaaS** application. The core value proposition is helping users track changes on web pages they care about, with intelligent noise filtering to ensure only meaningful changes trigger alerts.

**Updated Tagline (2026-01-18)**: "Less noise. More signal. Proof included."

**Previous**: "I watch pages so you don't have to" (too generic, doesn't communicate value)

**Target Market**: SEO & Growth Teams (SMB → Mid-Market) who monitor competitor pages, SERP changes, and policy updates

---

## Key Differentiators (Market-Validated)

1. **Superior Noise Filtering** 🔥
   - Automatically filter cookie banners, timestamps, rotating ads, session IDs
   - Custom ignore rules (CSS selectors, regex, text patterns)
   - **Market Evidence**: Distill & Fluxguard emphasize this as core differentiator

2. **Keyword-Based Alerts** 🔥
   - Alert when specific words appear/disappear (e.g., "sold out", "hiring", "$99")
   - Threshold-based triggers, regex support
   - **Market Evidence**: High-value feature across all competitors

3. **Workflow Integrations** 🔥 NEW PRIORITY
   - Webhooks (MVP), Slack (V1), Teams/Discord (V2)
   - Alerts in your existing tools, not just email
   - **Market Evidence**: Shown prominently by Visualping, Wachete, ChangeDetection

4. **Proof & History** 🔥
   - Compare versions, audit-proof snapshots, full history
   - Messaging: "Prove changes" not just "see changes"
   - **Market Evidence**: Sken & Fluxguard highlight "versions kept"

5. **Use-Case-Focused Marketing**
   - Primary: SEO Monitoring, Competitor Tracking, Policy/Compliance
   - Secondary: Stock/Restock, Job Postings
   - **Market Evidence**: All competitors segment by use case

---

## Architecture Overview

### Tech Stack (Recommended)

**Frontend**:
- Next.js 14+ (App Router)
- TypeScript
- Tailwind CSS + shadcn/ui components
- React Query for state management
- Zod for validation

**Backend**:
- Node.js + Express OR Python + FastAPI
- PostgreSQL for relational data
- Redis + Bull/BullMQ for job queuing
- Puppeteer/Playwright for JS-heavy sites

**Infrastructure**:
- Vercel/Railway for frontend hosting
- Render/Railway/AWS for backend
- AWS S3 or Cloudflare R2 for snapshot storage
- Upstash Redis or managed Redis

**Third-Party Services**:
- Stripe for billing
- SendGrid/Postmark for emails
- Sentry for error tracking
- PostHog/Mixpanel for analytics

---

## Project Structure

```
/website-monitor
├── /frontend (Next.js)
│   ├── /app
│   │   ├── /dashboard
│   │   ├── /monitors
│   │   ├── /settings
│   │   └── /auth
│   ├── /components
│   │   ├── /ui (shadcn components)
│   │   ├── /monitors
│   │   └── /diff-viewer
│   ├── /lib
│   │   ├── api-client.ts
│   │   ├── auth.ts
│   │   └── utils.ts
│   └── /public
├── /backend
│   ├── /src
│   │   ├── /routes
│   │   ├── /controllers
│   │   ├── /models
│   │   ├── /services
│   │   │   ├── fetcher.ts
│   │   │   ├── differ.ts
│   │   │   ├── scheduler.ts
│   │   │   └── alerter.ts
│   │   ├── /jobs
│   │   └── /utils
│   ├── /db
│   │   └── /migrations
│   └── /tests
├── /docs
│   ├── spec.md
│   ├── task.md
│   ├── actions.md
│   └── claude.md (this file)
└── README.md
```

---

## Core Entities & Data Models

### User
```typescript
interface User {
  id: string
  email: string
  passwordHash: string
  plan: 'free' | 'pro' | 'business' | 'enterprise'
  stripeCustomerId: string
  createdAt: Date
  lastLoginAt: Date
}
```

### Monitor
```typescript
interface Monitor {
  id: string
  userId: string
  url: string
  name: string
  frequency: number // minutes
  status: 'active' | 'paused' | 'error'

  // Advanced features
  elementSelector?: string
  ignoreRules?: {
    type: 'css' | 'regex' | 'text'
    value: string
  }[]
  keywordRules?: {
    keyword: string
    type: 'appears' | 'disappears' | 'count'
    threshold?: number
  }[]

  // Metadata
  lastCheckedAt?: Date
  lastChangedAt?: Date
  consecutiveErrors: number
  createdAt: Date
}
```

### Snapshot
```typescript
interface Snapshot {
  id: string
  monitorId: string
  htmlContent: string
  contentHash: string
  screenshotUrl?: string

  // Status
  httpStatus: number
  responseTime: number
  changed: boolean
  changePercentage?: number

  // Errors
  errorMessage?: string

  // Metadata
  createdAt: Date
}
```

### Alert
```typescript
interface Alert {
  id: string
  monitorId: string
  snapshotId: string
  userId: string

  // Alert details
  type: 'change' | 'error' | 'keyword'
  title: string
  summary?: string

  // Delivery
  channels: ('email' | 'slack' | 'webhook')[]
  deliveredAt?: Date
  readAt?: Date

  createdAt: Date
}
```

---

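## Key Algorithms & Logic

### Change Detection
```typescript
import { diffLines } from 'diff'

// Simple hash comparison for binary change detection
const changed = previousHash !== currentHash

// Text diff for detailed comparison
const diff = diffLines(previousText, currentText)

// Count added and removed lines; a modified line appears as one removal plus one addition
const changedLines = diff
  .filter(part => part.added || part.removed)
  .reduce((sum, part) => sum + (part.count ?? 0), 0)
const totalLines = Math.max(previousText.split('\n').length, 1)
// Cap at 100 because modified lines are counted twice
const changePercentage = Math.min((changedLines / totalLines) * 100, 100)

// Severity calculation
const severity =
  changePercentage > 50 ? 'major' :
  changePercentage > 10 ? 'medium' : 'minor'
```
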
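The percentage and severity thresholds above can be tried without any dependency. The following is a minimal sketch, not the project's differ: it swaps `diffLines` for a plain longest-common-subsequence table, and the names `lcsLength`, `summarizeChange`, and `ChangeSummary` are hypothetical. It assumes a modified line should count once, so the percentage never exceeds 100.

```typescript
type Severity = 'minor' | 'medium' | 'major'

interface ChangeSummary {
  changedLines: number
  totalLines: number
  changePercentage: number
  severity: Severity
}

// Length of the longest common subsequence of two line arrays (classic DP table).
function lcsLength(a: string[], b: string[]): number {
  const table: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0)
  )
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      table[i][j] =
        a[i - 1] === b[j - 1]
          ? table[i - 1][j - 1] + 1
          : Math.max(table[i - 1][j], table[i][j - 1])
    }
  }
  return table[a.length][b.length]
}

function summarizeChange(previousText: string, currentText: string): ChangeSummary {
  const prev = previousText.split('\n')
  const curr = currentText.split('\n')
  // Lines outside the common subsequence were added, removed, or modified.
  const totalLines = Math.max(prev.length, curr.length)
  const changedLines = totalLines - lcsLength(prev, curr)
  const changePercentage = (changedLines / totalLines) * 100
  // Same thresholds as the severity calculation above
  const severity: Severity =
    changePercentage > 50 ? 'major' : changePercentage > 10 ? 'medium' : 'minor'
  return { changedLines, totalLines, changePercentage, severity }
}
```

With a four-line page where one line changes, `summarizeChange` reports 25% and `'medium'`. The table is quadratic in the number of lines, so a real differ would still rely on a library for large pages.
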
### Noise Filtering
```typescript
import * as cheerio from 'cheerio'

// Remove common noise patterns
function filterNoise(html: string): string {
  // Remove ISO 8601-style timestamps (e.g. 2026-01-18T09:30:00)
  html = html.replace(/\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}/g, '')

  // Remove cookie banners (common selectors)
  const noisySelectors = [
    '.cookie-banner',
    '#cookie-notice',
    '[class*="consent"]',
    // ... more patterns
  ]

  // Parse and remove elements
  const $ = cheerio.load(html)
  noisySelectors.forEach(sel => $(sel).remove())

  return $.html()
}
```

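`filterNoise` covers the built-in patterns; the per-monitor `ignoreRules` from the Monitor model still need to be applied somewhere. A minimal sketch of one way to do that on extracted text, assuming `regex` and `text` rules run after HTML parsing while `css` rules are handled at the DOM stage. The function name `applyTextIgnoreRules` is hypothetical.

```typescript
type IgnoreRule = { type: 'css' | 'regex' | 'text'; value: string }

function applyTextIgnoreRules(text: string, rules: IgnoreRule[]): string {
  let result = text
  for (const rule of rules) {
    if (rule.type === 'text' && rule.value.length > 0) {
      // Literal match: remove every occurrence of the exact string
      result = result.split(rule.value).join('')
    } else if (rule.type === 'regex') {
      try {
        result = result.replace(new RegExp(rule.value, 'g'), '')
      } catch {
        // Invalid user-supplied pattern: skip it rather than fail the whole check
      }
    }
    // 'css' rules need a parsed DOM, so they are left to the HTML stage
  }
  return result
}
```

Applying the rules to both the previous and the current text before diffing keeps an ignored session ID or "Updated" label from ever reaching the change percentage.
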
### Keyword Detection
```typescript
// Shapes inferred from the Monitor model and the matches pushed below
type KeywordRule = {
  keyword: string
  type: 'appears' | 'disappears' | 'count'
  threshold?: number
}
type KeywordMatch = { rule: KeywordRule; type: 'appeared' | 'disappeared' }

function checkKeywords(
  previousText: string,
  currentText: string,
  rules: KeywordRule[]
): KeywordMatch[] {
  const matches: KeywordMatch[] = []

  for (const rule of rules) {
    const prevMatch = previousText.includes(rule.keyword)
    const currMatch = currentText.includes(rule.keyword)

    if (rule.type === 'appears' && !prevMatch && currMatch) {
      matches.push({ rule, type: 'appeared' })
    }
    if (rule.type === 'disappears' && prevMatch && !currMatch) {
      matches.push({ rule, type: 'disappeared' })
    }

    // Count logic...
  }

  return matches
}
```

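The `// Count logic...` placeholder above is left open in the source. One possible reading, sketched here under explicit assumptions: a `count` rule fires on the check where the number of non-overlapping occurrences first reaches `threshold` (defaulting to 1). The helper names `countOccurrences` and `countRuleTriggered` are hypothetical, and the crossing semantics are an assumption rather than the specified behavior.

```typescript
type KeywordRule = {
  keyword: string
  type: 'appears' | 'disappears' | 'count'
  threshold?: number
}

// Count non-overlapping occurrences of keyword in text.
function countOccurrences(text: string, keyword: string): number {
  if (keyword.length === 0) return 0
  let count = 0
  let index = text.indexOf(keyword)
  while (index !== -1) {
    count++
    index = text.indexOf(keyword, index + keyword.length)
  }
  return count
}

// Assumed semantics: trigger only when the count crosses the threshold upward,
// so a page that stays above the threshold does not alert on every check.
function countRuleTriggered(
  previousText: string,
  currentText: string,
  rule: KeywordRule
): boolean {
  if (rule.type !== 'count') return false
  const threshold = rule.threshold ?? 1
  const prev = countOccurrences(previousText, rule.keyword)
  const curr = countOccurrences(currentText, rule.keyword)
  return prev < threshold && curr >= threshold
}
```
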
---

## Development Guidelines

### When Working on This Project

1. **Prioritize MVP**: Focus on core features before adding complexity
2. **Performance matters**: Diffing and fetching should be fast (<2s)
3. **Noise reduction is key**: This is our competitive advantage
4. **User feedback loop**: Build in ways to learn from false positives
5. **Security first**: Never store credentials in plain text, sanitize all URLs

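For the "sanitize all URLs" point, a minimal sketch of what validation of a user-submitted monitor URL might look like. The function name `validateMonitorUrl`, the `UrlCheck` type, and the exact rejection rules are assumptions for illustration. It only inspects the URL string, so it does not resolve DNS and is not a complete defense against requests to internal hosts.

```typescript
type UrlCheck = { ok: true; url: string } | { ok: false; reason: string }

// Loopback, RFC 1918 private ranges, and link-local IPv4 literals
const PRIVATE_IPV4 =
  /^(?:127\.\d+|10\.\d+|192\.168|169\.254|172\.(?:1[6-9]|2\d|3[01]))\.\d+\.\d+$/

function validateMonitorUrl(input: string): UrlCheck {
  let parsed: URL
  try {
    parsed = new URL(input)
  } catch {
    return { ok: false, reason: 'not a valid absolute URL' }
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return { ok: false, reason: 'only http and https are allowed' }
  }
  if (parsed.username !== '' || parsed.password !== '') {
    return { ok: false, reason: 'embedded credentials are not allowed' }
  }
  const host = parsed.hostname
  if (host === 'localhost' || host === '[::1]' || PRIVATE_IPV4.test(host)) {
    return { ok: false, reason: 'internal addresses are not allowed' }
  }
  // Store the normalized form so equivalent inputs map to one monitor URL
  return { ok: true, url: parsed.toString() }
}
```

Returning a tagged result instead of throwing lets the API layer turn the `reason` into a 4xx response directly.
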
### Code Style

- Use TypeScript strict mode
- Write unit tests for core algorithms (differ, filter, keyword)
- Use async/await, avoid callbacks
- Prefer functional programming patterns
- Comment complex logic, especially regex patterns

### API Design Principles

- RESTful endpoints
- Use proper HTTP status codes
- Return consistent error format:
  ```json
  {
    "error": "monitor_not_found",
    "message": "Monitor with id 123 not found",
    "details": {}
  }
  ```
- Paginate list endpoints (monitors, snapshots, alerts)
- Version API if breaking changes needed (/v1/monitors)

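The error format and pagination principles above could be captured in shared types so every endpoint returns the same shapes. A minimal sketch, assuming page-number pagination; the names `ApiError`, `Page`, `notFound`, and `paginate` are hypothetical and not part of the spec.

```typescript
// Mirrors the JSON error format shown above
interface ApiError {
  error: string
  message: string
  details: Record<string, unknown>
}

interface Page<T> {
  items: T[]
  page: number // 1-based
  pageSize: number
  total: number
  hasMore: boolean
}

// Build a 404 payload such as { error: "monitor_not_found", ... }
function notFound(resource: string, id: string): { status: number; body: ApiError } {
  const label = resource.charAt(0).toUpperCase() + resource.slice(1)
  return {
    status: 404,
    body: {
      error: `${resource}_not_found`,
      message: `${label} with id ${id} not found`,
      details: {},
    },
  }
}

// In-memory pagination; a real endpoint would push limit/offset into the query
function paginate<T>(all: T[], page: number, pageSize: number): Page<T> {
  const start = (page - 1) * pageSize
  const items = all.slice(start, start + pageSize)
  return {
    items,
    page,
    pageSize,
    total: all.length,
    hasMore: start + items.length < all.length,
  }
}
```
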
---

## Common Tasks & Commands

### When Starting Development
```bash
# Clone and setup
git clone <repo>
cd website-monitor

# Install dependencies (subshells keep the working directory at the repo root)
(cd frontend && npm install)
(cd backend && npm install)

# Setup environment
cp .env.example .env
# Edit .env with your values

# Start database
docker-compose up -d postgres redis

# Run migrations
(cd backend && npm run migrate)

# Start dev servers (each command blocks, so run them in separate terminals)
(cd frontend && npm run dev)
(cd backend && npm run dev)
```

### Running Tests
```bash
# Frontend tests
(cd frontend && npm test)

# Backend tests
(cd backend && npm test)

# E2E tests
npm run test:e2e
```

### Deployment
```bash
# Build frontend
cd frontend && npm run build

# Deploy frontend (Vercel)
vercel deploy --prod

# Deploy backend (assumes the Dockerfile lives in /backend)
cd ../backend
# Tag the image with the registry so the push below can find it
docker build -t <registry>/monitor-api .
docker push <registry>/monitor-api
# Deploy to Railway/Render/AWS
```

---

## Key User Flows to Support

When building features, always consider these primary use cases:

1. **Job seeker monitoring career pages** (most common)
   - Needs: Fast frequency (5 min), keyword alerts, instant notifications

2. **Price tracking for e-commerce** (high value)
   - Needs: Element selection, numeric comparison, reliable alerts

3. **Competitor monitoring** (B2B focus)
   - Needs: Multiple monitors, digest mode, AI summaries

4. **Stock/availability tracking** (urgent)
   - Needs: Fastest frequency (1 min), SMS alerts, auto-pause

5. **Policy/regulation monitoring** (professional)
   - Needs: Long-term history, team sharing, AI summaries

---

## Integration Points

### Email Service (SendGrid/Postmark)
```typescript
async function sendChangeAlert(monitor: Monitor, snapshot: Snapshot) {
  const diffUrl = `https://app.example.com/monitors/${monitor.id}/diff/${snapshot.id}`

  await emailService.send({
    to: monitor.user.email,
    subject: `Change detected: ${monitor.name}`,
    template: 'change-alert',
    data: {
      monitorName: monitor.name,
      url: monitor.url,
      timestamp: snapshot.createdAt,
      diffUrl,
      changePercentage: snapshot.changePercentage
    }
  })
}
```

### Stripe Billing
```typescript
async function handleSubscription(userId: string, plan: string) {
  const user = await db.users.findById(userId)

  // Create a new subscription (changing an existing one would go through stripe.subscriptions.update)
  const subscription = await stripe.subscriptions.create({
    customer: user.stripeCustomerId,
    items: [{ price: PRICE_IDS[plan] }]
  })

  // Update user plan
  await db.users.update(userId, {
    plan,
    subscriptionId: subscription.id
  })
}
```

### Job Queue (Bull)
```typescript
// Schedule monitor checks
async function scheduleMonitor(monitor: Monitor) {
  await monitorQueue.add(
    'check-monitor',
    { monitorId: monitor.id },
    {
      repeat: {
        every: monitor.frequency * 60 * 1000 // convert to ms
      },
      jobId: `monitor-${monitor.id}`
    }
  )
}

// Process checks
monitorQueue.process('check-monitor', async (job) => {
  const { monitorId } = job.data
  await checkMonitor(monitorId)
})
```

---

## Testing Strategy

### Unit Tests
- Diff algorithms
- Noise filtering
- Keyword matching
- Ignore rules application

### Integration Tests
- API endpoints
- Database operations
- Job queue processing

### E2E Tests
- User registration & login
- Monitor creation & management
- Alert delivery
- Subscription changes

### Performance Tests
- Fetch speed with various page sizes
- Diff calculation speed
- Concurrent monitor checks
- Database query performance

---

## Deployment Checklist

Before deploying to production:

- [ ] Environment variables configured
- [ ] Database migrations run
- [ ] SSL certificates configured
- [ ] Email deliverability tested
- [ ] Payment processing tested (Stripe test mode → live mode)
- [ ] Error tracking configured (Sentry)
- [ ] Monitoring & alerts set up (uptime, error rate, queue health)
- [ ] Backup strategy implemented
- [ ] Rate limiting configured
- [ ] GDPR compliance (privacy policy, data export/deletion)
- [ ] Security headers configured
- [ ] API documentation updated

---

## Troubleshooting Common Issues

### "Monitor keeps triggering false alerts"
- Check if noise filtering is working
- Review ignore rules for the monitor
- Look at diff to identify changing element
- Add custom ignore rule for that element

### "Some pages aren't being monitored correctly"
- Check if page requires JavaScript rendering
- Try enabling headless browser mode
- Check if page requires authentication
- Look for CAPTCHA or bot detection

### "Alerts aren't being delivered"
- Check email service status
- Verify email isn't going to spam
- Check alert queue for errors
- Verify user's alert settings

### "System is slow/overloaded"
- Check Redis queue health
- Look for monitors with very high frequency
- Check database query performance
- Consider scaling workers horizontally

---

## Metrics to Track

### Technical Metrics
- Average check duration
- Diff calculation time
- Check success rate
- Alert delivery rate
- Queue processing lag

### Product Metrics
- Active monitors per user
- Alerts sent per day
- False positive rate (from user feedback)
- Feature adoption (keywords, elements, integrations)

### Business Metrics
- Free → Paid conversion rate
- Monthly churn rate
- Average revenue per user (ARPU)
- Customer acquisition cost (CAC)
- Lifetime value (LTV)

---

## Resources & Documentation

### External Documentation
- [Next.js Docs](https://nextjs.org/docs)
- [Tailwind CSS](https://tailwindcss.com/docs)
- [Playwright Docs](https://playwright.dev)
- [Bull Queue](https://github.com/OptimalBits/bull)
- [Stripe API](https://stripe.com/docs/api)

### Internal Documentation
- See `spec.md` for complete feature specifications
- See `task.md` for development roadmap
- See `actions.md` for user workflows and use cases

---

## Future Considerations

### Potential Enhancements
- Mobile app (React Native or Progressive Web App)
- Browser extension for quick monitor addition
- AI-powered change importance scoring
- Collaborative features (team annotations, approval workflows)
- Marketplace for monitor templates
- Affiliate program for power users

### Scaling Considerations
- Distributed workers across multiple regions
- Caching layer for frequently accessed pages
- Database sharding by user
- Separate queue for high-frequency monitors
- CDN for snapshot storage

---

## Notes for Claude

When working on this project:

1. **Always reference these docs**: spec.md, task.md, actions.md, and this file
2. **MVP mindset**: Implement the simplest solution that works first
3. **User-centric**: Consider the user workflows in actions.md when building features
4. **Security-conscious**: Validate URLs, sanitize inputs, encrypt sensitive data
5. **Performance-aware**: Optimize for speed, especially diff calculation
6. **Ask clarifying questions**: If requirements are ambiguous, ask before implementing
7. **Test as you go**: Write tests for core functionality
8. **Document decisions**: Update these docs when making architectural decisions

### Common Questions & Answers

**Q: Should we support authenticated pages in MVP?**
A: No, save for V2. Focus on public pages first.

**Q: What diff library should we use?**
A: `diff` on npm (the jsdiff library) for JavaScript, `difflib` for Python.

**Q: How do we handle CAPTCHA?**
A: For MVP, just alert the user. For V2, consider residential proxies or browser fingerprinting.

**Q: Should we store full HTML or just text?**
A: Store both: full HTML for accuracy, extracted text for diffing performance.

**Q: What's the minimum viable frequency?**
A: 5 minutes for paid users, 1 hour for free tier.

---

## Quick Reference

### Key Files
- `spec.md` - Feature specifications
- `task.md` - Development tasks and roadmap
- `actions.md` - User workflows and use cases
- `claude.md` - This file (project context)

### Key Concepts
- **Noise reduction** - Core differentiator
- **Keyword alerts** - High-value feature
- **Element selection** - Monitor specific parts
- **Change severity** - Classify importance

### Pricing Tiers (Under Review - See findings_market.md)
- **Free**: 5 monitors, 1hr frequency
- **Pro**: 50 monitors, 5min frequency, $19-29/mo
- **Business**: 200 monitors, 1min frequency, teams, $99-149/mo
- **Enterprise**: Unlimited, custom pricing

**Note:** Considering switch to "checks/month" model instead of "monitors + frequency" for fairer pricing

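To compare the two pricing models, the current tiers can be converted into the maximum number of checks they allow per month. A minimal sketch, assuming a 30-day month and every monitor running at the tier's fastest frequency; the function name `checksPerMonth` is hypothetical.

```typescript
// Maximum checks a plan allows per month if every monitor runs at full frequency.
// daysPerMonth = 30 is an assumption for round numbers.
function checksPerMonth(
  monitors: number,
  frequencyMinutes: number,
  daysPerMonth: number = 30
): number {
  const minutesPerMonth = daysPerMonth * 24 * 60
  return monitors * Math.floor(minutesPerMonth / frequencyMinutes)
}

// Tier limits taken from the pricing list above
const free = checksPerMonth(5, 60)       // 3,600 checks/month
const pro = checksPerMonth(50, 5)        // 432,000 checks/month
const business = checksPerMonth(200, 1)  // 8,640,000 checks/month
```
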
---

## Competitive Positioning (Updated 2026-01-18)

### Market Landscape
We compete with established players (Visualping, Distill, Fluxguard) and budget options (Sken.io, ChangeDetection.io).

### vs. Visualping
- **Their Strength**: Enterprise trust ("85% Fortune 500"), broad features
- **Our Angle**: "Better noise control + fairer pricing – without the enterprise bloat"
- **Messaging**: "Built for teams who need results, not demos"

### vs. Distill.io
- **Their Strength**: Conditions/filters, established user base
- **Our Angle**: "Team features built-in + modern UX – not stuck in 2015"
- **Messaging**: "Collaboration-first, not an afterthought"

### vs. Fluxguard
- **Their Strength**: AI summaries, enterprise focus, sales-led
- **Our Angle**: "Self-serve pricing + instant setup – no demo calls required"
- **Messaging**: "AI-powered intelligence without the enterprise tax"

### vs. ChangeDetection.io / Sken.io
- **Their Strength**: Low price ($3-9/mo), simple
- **Our Angle**: "Advanced features (keywords, integrations, teams) without complexity"
- **Messaging**: "Powerful, but still simple"

### How We Win
1. **Superior noise control** (multi-layer filtering)
2. **Workflow integrations** (Slack/Teams/Webhooks early)
3. **Use-case marketing** (SEO, Competitor, Policy segments)
4. **Modern UX** (not stuck in legacy design)
5. **Fair pricing** (considering checks/month model)

---

*Last updated: 2026-01-18*