# Website Change Detection Monitor - Technical Specification

## Product Overview

A SaaS platform that monitors web pages for changes and alerts users when meaningful updates occur. The core value proposition is "signal over noise": detecting real changes while filtering out irrelevant updates.

## Core Value Propositions

- **Easy to understand**: "I watch pages so you don't have to"
- **Smart filtering**: Automatically ignores timestamps, cookie banners, and noise
- **Keyword intelligence**: Alert on specific content appearing/disappearing
- **SEO-friendly**: Captures long-tail keywords (e.g., "monitor job posting changes")

---

## Feature Specification by Phase

### MVP Features (Launch Fast)

#### 1. URL Monitoring

- **Track URLs**: Add any public web page by URL
- **Frequency options**: 5min / 30min / 6hr / 24hr intervals
- **Change detection methods**:
  - Content hash comparison (fast, binary change detection)
  - Text diff (character/line level differences)
- **Storage**: Last 5-10 snapshots per URL

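
The two detection methods above can be combined: compare hashes first, and compute a diff only when they differ. The following is a minimal sketch under stated assumptions, not a definitive implementation. SHA-256 and the standard-library `difflib` module are choices made for illustration (the spec does not name a hash), and `content_hash` and `detect_change` are hypothetical names.

```python
import difflib
import hashlib


def content_hash(text: str) -> str:
    """Return a SHA-256 hex digest of the page text (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_change(previous: str, current: str) -> list[str]:
    """Return unified-diff lines, or an empty list when the hashes match."""
    if content_hash(previous) == content_hash(current):
        return []  # fast path: identical content, so the diff is skipped
    return list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )
```

In practice the stored hash of the previous snapshot would be reused rather than recomputed on every check; it is recomputed here only to keep the sketch self-contained.
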
#### 2. Alert System

- **Email notifications**: Send alert when change detected
- **Alert content**:
  - Timestamp of change
  - Link to diff view
  - Change severity indicator
- **Basic throttling**: Max 1 alert per check interval

#### 3. Change Viewing

- **History timeline**: Chronological list of all checks
  - Status (changed/unchanged/error)
  - Timestamp
  - Response code
- **Diff viewer**:
  - Side-by-side or unified view
  - Highlighted additions/deletions
  - Character-level diff for precision
- **Snapshot storage**: Full HTML snapshots for history

#### 4. Reliability

- **Retry logic**: 2-3 attempts on timeout/5xx errors
- **Error alerts**: Notify if page becomes unavailable
- **Status tracking**:
  - HTTP response codes
  - Timeout detection
  - Robot/CAPTCHA blocking detection
- **Run logs**: Detailed history of each check attempt

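
The retry and run-log behaviour described above might look roughly like the sketch below. It assumes a caller-supplied `fetch` callable that returns an HTTP status code or raises `TimeoutError`. The exponential backoff between attempts, the status labels, and the name `check_with_retry` are illustrative assumptions, not part of the spec.

```python
import time
from typing import Callable

RETRYABLE_STATUS = range(500, 600)  # 5xx responses are retried


def check_with_retry(
    fetch: Callable[[], int],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, list[str]]:
    """Call fetch() up to max_attempts times and return (status, run_log)."""
    log: list[str] = []
    for attempt in range(1, max_attempts + 1):
        try:
            status = fetch()
        except TimeoutError:
            log.append(f"attempt {attempt}: timeout")
        else:
            log.append(f"attempt {attempt}: HTTP {status}")
            if status not in RETRYABLE_STATUS:
                # Non-5xx responses are final: success below 400, error otherwise.
                return ("ok" if status < 400 else "error"), log
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # assumed exponential backoff
    return "unavailable", log
```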
---

### V1 Features (People Pay)

#### 5. Noise Reduction (Differentiator)

- **Automatic filtering**:
  - Cookie consent banners
  - Timestamps and "last updated" text
  - Current date/time displays
  - Session IDs in URLs
  - Rotating content (recommendations, ads)
- **Custom ignore rules**:
  - Text pattern matching (regex)
  - CSS selector exclusion
  - Ignore numeric changes except in specific areas
  - Whitelist/blacklist mode

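
One way to approach the text-based part of this filtering is to normalize each snapshot before hashing or diffing it. The sketch below is an assumption-laden illustration: the default patterns, the session parameter names, and the helpers `normalize_text` and `strip_session_params` are all hypothetical, and a real rule set would need tuning per site. Cookie banners and rotating content would need selector-based handling that is not shown here.

```python
import re
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical default ignore patterns (regex), applied in order.
DEFAULT_IGNORE_PATTERNS = [
    r"(?i)last updated:?[^\n]*",      # "Last updated ..." to end of line
    r"\b\d{4}-\d{2}-\d{2}\b",         # ISO dates such as 2024-05-01
    r"\b\d{1,2}:\d{2}(:\d{2})?\b",    # clock times such as 14:32 or 14:32:10
]
# Hypothetical query parameter names that carry session IDs.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid"}


def normalize_text(text: str, extra_patterns: Optional[list[str]] = None) -> str:
    """Remove ignored patterns, then collapse whitespace and drop empty lines."""
    for pattern in DEFAULT_IGNORE_PATTERNS + (extra_patterns or []):
        text = re.sub(pattern, "", text)
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def strip_session_params(url: str) -> str:
    """Drop session-like query parameters so equivalent URLs compare equal."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Custom ignore rules from the user would be passed as `extra_patterns`, so two snapshots that differ only in ignored text normalize to the same string and produce the same hash.
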
#### 6. Selective Monitoring

- **Element monitoring**: Track only specific page sections
  - Visual point-and-click selector
  - CSS selector input (power users)
  - XPath support
- **Multiple elements**: Track different sections separately
- **Element naming**: Label tracked elements for clarity

#### 7. Keyword-Based Alerts (High Value)

- **Keyword rules**:
  - Alert when keyword appears
  - Alert when keyword disappears
  - Alert when keyword count changes
  - Threshold-based alerts ("fewer than 5 items")
- **Regex support**: Advanced pattern matching
- **Multiple keywords**: AND/OR logic combinations
- **Case sensitivity options**

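
The rule types above could be evaluated by counting matches in the previous and current snapshot text. This is a sketch only: the `KeywordRule` fields, the mode names (`appears`, `disappears`, `count_changes`, `below`), and the function names are hypothetical and chosen to mirror the bullets.

```python
import re
from dataclasses import dataclass


@dataclass
class KeywordRule:
    pattern: str                  # literal keyword, or a regex when is_regex is True
    mode: str                     # "appears", "disappears", "count_changes", or "below"
    is_regex: bool = False
    case_sensitive: bool = False
    threshold: int = 0            # only used by mode "below"


def count_matches(rule: KeywordRule, text: str) -> int:
    """Count non-overlapping occurrences of the rule's pattern in text."""
    flags = 0 if rule.case_sensitive else re.IGNORECASE
    pattern = rule.pattern if rule.is_regex else re.escape(rule.pattern)
    return len(re.findall(pattern, text, flags))


def rule_fires(rule: KeywordRule, previous: str, current: str) -> bool:
    """Decide whether a single rule should trigger an alert."""
    before = count_matches(rule, previous)
    after = count_matches(rule, current)
    if rule.mode == "appears":
        return before == 0 and after > 0
    if rule.mode == "disappears":
        return before > 0 and after == 0
    if rule.mode == "count_changes":
        return before != after
    if rule.mode == "below":
        return after < rule.threshold
    raise ValueError(f"unknown mode: {rule.mode}")


def any_rule_fires(rules: list[KeywordRule], previous: str, current: str) -> bool:
    """OR combination of several rules."""
    return any(rule_fires(r, previous, current) for r in rules)


def all_rules_fire(rules: list[KeywordRule], previous: str, current: str) -> bool:
    """AND combination of several rules."""
    return all(rule_fires(r, previous, current) for r in rules)
```

For a job board, a rule such as `KeywordRule("closed", "appears")` would fire the first time the word shows up on a page that previously did not contain it.
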
#### 8. Advanced Alerting

- **Digest mode**: Daily/weekly summary of all changes
- **Quiet hours**: No alerts during specified times
- **Alert throttling**: Configurable limits
  - Max alerts per hour/day
  - Cooldown periods
- **Severity filtering**: Only alert on major changes
- **Multiple channels per monitor**: Email + Slack, etc.

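
Quiet hours and cooldown periods can be expressed as a single gate that runs before an alert is sent. The sketch below assumes naive local datetimes and a quiet window that may cross midnight. The names `in_quiet_hours` and `should_send` are hypothetical, and per-hour or per-day caps are not covered.

```python
from datetime import datetime, time, timedelta
from typing import Optional


def in_quiet_hours(now: datetime, start: time, end: time) -> bool:
    """True if now falls in [start, end); handles windows that cross midnight."""
    current = now.time()
    if start <= end:
        return start <= current < end
    return current >= start or current < end


def should_send(
    now: datetime,
    last_sent: Optional[datetime],
    cooldown: timedelta,
    quiet_start: time,
    quiet_end: time,
) -> bool:
    """Suppress alerts during quiet hours or while the cooldown is active."""
    if in_quiet_hours(now, quiet_start, quiet_end):
        return False
    if last_sent is not None and now - last_sent < cooldown:
        return False
    return True
```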
---

### V2 Features (Market Winner)

#### 9. Visual Change Detection

- **Screenshot capture**: Full-page screenshots
- **Image diff**: Visual highlighting of changed areas
- **Pixel-perfect detection**: Layout shift detection
- **Before/After carousel**: Easy visual comparison
- **Screenshot retention**: Based on plan tier

#### 10. AI-Powered Summaries

- **Change summarization**: Natural language description
  - "Price changed from $29.99 to $24.99"
  - "New 'Out of Stock' banner added"
- **Change classification**:
  - Price change
  - Availability change
  - Policy/text update
  - Layout-only change
- **Smart alerts**: Only notify for meaningful changes
- **Summary in alert**: No need to view diff for simple changes

#### 11. Complex Page Support

- **JavaScript rendering**: Headless browser mode
- **Authentication**:
  - Basic auth
  - Session cookies
  - Login flow automation
- **Dynamic content**: Wait for AJAX/lazy loading
- **SPA support**: Monitor client-side rendered apps

#### 12. Integrations

- **Slack**: Channel notifications
- **Discord**: Webhook alerts
- **Microsoft Teams**: Connector integration
- **Webhook**: Generic POST for automation tools
- **RSS feed**: Per-monitor or global feed
- **Zapier/Make**: Pre-built integrations
- **API**: Programmatic access to monitors and history

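
For the generic webhook, a change event could be serialized as JSON and sent as a POST request. The sketch below only builds the request with the standard library and does not send it. The payload field names, the `change_detected` event name, and `build_webhook_request` are hypothetical, since the spec does not define a payload format.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_webhook_request(
    webhook_url: str,
    monitor_name: str,
    page_url: str,
    summary: str,
    changed_at: datetime,
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST describing a detected change."""
    payload = {
        "event": "change_detected",  # hypothetical event name
        "monitor": monitor_name,
        "url": page_url,
        "summary": summary,
        "changed_at": changed_at.isoformat(),
    }
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```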
---

### Power User Features (Teams & Scale)

#### 13. Organization

- **Folders/Projects**: Hierarchical organization
- **Tags**: Multi-dimensional categorization
- **Search**: Full-text search across monitors and history
- **Filters**: Status, tags, frequency, last changed
- **Bulk operations**:
  - Import URLs from CSV
  - Export history
  - Bulk pause/resume
  - Bulk delete

#### 14. Collaboration

- **Team workspaces**: Shared monitor collections
- **Role-based access**:
  - Admin: Full control
  - Editor: Create/edit monitors
  - Viewer: Read-only access
- **Assignment**: Assign monitors to team members
- **Comments**: Annotate changes
- **Audit log**: Track team actions

#### 15. Advanced Scheduling

- **Custom schedules**: Cron-like expressions
- **Business hours only**: Skip nights/weekends
- **Timezone-aware**: Different times per monitor
- **Geo-distributed checks**: Monitor from multiple regions
- **Adaptive frequency**: Check more often during active periods

---

## Technical Architecture

### Core Components

#### Frontend

- **Tech stack**: React/Next.js + TypeScript
- **UI components**: Tailwind CSS + shadcn/ui
- **State management**: React Query for API data
- **Real-time**: WebSocket for live updates (optional)

#### Backend

- **API**: Node.js/Express or Python/FastAPI
- **Database**: PostgreSQL for relational data
- **Queue system**: Redis + Bull/BullMQ for job scheduling
- **Storage**: S3-compatible for snapshots and screenshots

#### Monitoring Engine

- **Fetcher**: Axios/Got for simple pages
- **Browser**: Puppeteer/Playwright for JS-heavy sites
- **Differ**: jsdiff or custom algorithm
- **Scheduler**: Distributed job queue with priority
- **Rate limiting**: Per-domain backoff

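
Per-domain backoff can be kept as a small in-memory table of "next allowed fetch time" per hostname. The sketch below is an illustration under assumptions: the exponential growth, the default delays, and the `DomainBackoff` class are hypothetical, and a distributed deployment would presumably keep this state in a shared store such as the Redis instance listed above.

```python
from urllib.parse import urlsplit


class DomainBackoff:
    """Tracks the earliest time each domain may be fetched again."""

    def __init__(self, base_delay: float = 1.0, max_delay: float = 300.0) -> None:
        self.base_delay = base_delay
        self.max_delay = max_delay
        self._failures: dict[str, int] = {}
        self._next_allowed: dict[str, float] = {}

    @staticmethod
    def domain_of(url: str) -> str:
        return urlsplit(url).hostname or ""

    def can_fetch(self, url: str, now: float) -> bool:
        """True once the domain's wait period has elapsed."""
        return now >= self._next_allowed.get(self.domain_of(url), 0.0)

    def record(self, url: str, now: float, ok: bool) -> None:
        """Update the domain's wait period after a fetch attempt."""
        domain = self.domain_of(url)
        if ok:
            # Success resets the failure count; keep a minimal gap between fetches.
            self._failures.pop(domain, None)
            self._next_allowed[domain] = now + self.base_delay
        else:
            # Each consecutive failure doubles the wait, capped at max_delay.
            failures = self._failures.get(domain, 0) + 1
            self._failures[domain] = failures
            delay = min(self.base_delay * 2 ** failures, self.max_delay)
            self._next_allowed[domain] = now + delay
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the class easy to test and independent of wall-clock changes.
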
#### Alert System

- **Email**: SendGrid/Postmark
- **Queue**: Separate alert queue for reliability
- **Templates**: Customizable alert formats
- **Delivery tracking**: Open/click tracking

### Data Models

#### Monitor

```
id, user_id, url, name, frequency, element_selector,
ignore_rules, keyword_rules, alert_settings, status,
created_at, last_checked_at, last_changed_at
```

#### Snapshot

```
id, monitor_id, html_content, screenshot_url,
content_hash, http_status, error_message,
created_at, changed_from_previous
```

#### Alert

```
id, monitor_id, snapshot_id, alert_type,
delivered_at, delivery_status, channels
```

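
The field lists above could be typed roughly as follows. The field names come from the models above, but every type, default value, and unit (for example, `frequency` in seconds) is an assumption made for illustration, since the spec lists names only.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Monitor:
    id: int
    user_id: int
    url: str
    name: str
    frequency: int                                   # assumed: seconds between checks
    element_selector: Optional[str] = None           # assumed: CSS selector or XPath
    ignore_rules: list[str] = field(default_factory=list)
    keyword_rules: list[str] = field(default_factory=list)
    alert_settings: dict = field(default_factory=dict)
    status: str = "active"                           # assumed default value
    created_at: Optional[datetime] = None
    last_checked_at: Optional[datetime] = None
    last_changed_at: Optional[datetime] = None


@dataclass
class Snapshot:
    id: int
    monitor_id: int
    html_content: str
    screenshot_url: Optional[str] = None
    content_hash: str = ""
    http_status: Optional[int] = None
    error_message: Optional[str] = None
    created_at: Optional[datetime] = None
    changed_from_previous: bool = False


@dataclass
class Alert:
    id: int
    monitor_id: int
    snapshot_id: int
    alert_type: str
    delivered_at: Optional[datetime] = None
    delivery_status: str = "pending"                 # assumed default value
    channels: list[str] = field(default_factory=list)
```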
---

## Monetization & Plan Gating

### Free Tier

- 5 monitors
- 1-hour minimum frequency
- 7-day history retention
- Email alerts only
- Basic noise filtering

### Pro Tier ($19-29/month)

- 50 monitors
- 5-minute frequency
- 90-day history
- All alert channels
- Advanced filtering + keywords
- Screenshot snapshots

### Business Tier ($99-149/month)

- 200 monitors
- 1-minute frequency
- 1-year history
- API access
- Team collaboration (5 seats)
- Priority support
- JS rendering included

### Enterprise Tier (Custom)

- Unlimited monitors
- Custom frequency
- Unlimited history
- Dedicated infrastructure
- SLA guarantees
- SSO/SAML
- Custom integrations

### Add-ons

- Extra monitors: $5 per 10
- Extended history: $10/month
- Additional team seats: $15/seat
- JS rendering credits: $20/100 pages

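
Plan gating could be driven by a small table of limits checked when a monitor is created. The numbers below are taken from the tiers above, with 1 year written as 365 days. The `Plan` and `can_create_monitor` names are hypothetical, as is treating the Enterprise "custom" frequency as having no enforced minimum.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    max_monitors: Optional[int]          # None means unlimited
    min_interval_seconds: Optional[int]  # None means no enforced minimum (assumption)
    history_days: Optional[int]          # None means unlimited


PLANS = {
    "free": Plan(max_monitors=5, min_interval_seconds=3600, history_days=7),
    "pro": Plan(max_monitors=50, min_interval_seconds=300, history_days=90),
    "business": Plan(max_monitors=200, min_interval_seconds=60, history_days=365),
    "enterprise": Plan(max_monitors=None, min_interval_seconds=None, history_days=None),
}


def can_create_monitor(
    plan_name: str,
    current_count: int,
    interval_seconds: int,
    extra_monitors: int = 0,
) -> bool:
    """Check the monitor quota (plus purchased add-ons) and the minimum interval."""
    plan = PLANS[plan_name]
    if plan.max_monitors is not None:
        if current_count >= plan.max_monitors + extra_monitors:
            return False
    if plan.min_interval_seconds is not None:
        if interval_seconds < plan.min_interval_seconds:
            return False
    return True
```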
---

## Success Metrics

### Product KPIs

- Monitors created per user
- Check success rate (>99%)
- False positive rate (<5%)
- Alert open rate (>40%)
- User retention (D7, D30, M3)

### Business KPIs

- Free → Paid conversion (target >10%)
- Churn rate (target <5% monthly)
- Average monitors per paid user (target >15)
- CAC < 3x MRR per customer (acquisition cost recovered in roughly 3 months)
- Net promoter score (target >50)

---

## Security & Compliance

### Security

- **Authentication**: JWT + refresh tokens
- **2FA**: TOTP support
- **Encryption**: At rest (database) and in transit (TLS)
- **API keys**: Scoped, revocable
- **Rate limiting**: Per user and IP

### Privacy

- **GDPR**: Data export, deletion, consent management
- **Data retention**: Configurable, automatic cleanup
- **No tracking**: Don't store personal data from monitored pages
- **Anonymization**: Strip cookies/sessions from snapshots

### Reliability

- **Uptime SLA**: 99.9% for paid plans
- **Status page**: Public incident tracking
- **Backups**: Daily encrypted backups
- **Disaster recovery**: 24-hour recovery time objective (RTO)

---

## Competitive Differentiation

### vs. Visualping/ChangeTower

- **Better noise filtering**: AI-powered content classification
- **Smarter alerts**: Keyword-based + summarization
- **Better UX**: Cleaner UI, faster setup

### vs. Distill.io

- **Team features**: Built for collaboration from day one
- **More integrations**: Wider ecosystem support
- **Better pricing**: More generous free tier

### vs. Wachete

- **Modern tech**: Faster, more reliable
- **Visual diff**: Screenshot comparison
- **API-first**: Better for automation