# Website Change Detection Monitor - Technical Specification ## Product Overview A SaaS platform that monitors web pages for changes and alerts users when meaningful updates occur. The core value proposition is "signal over noise" - detecting real changes while filtering out irrelevant updates. ## Core Value Propositions - **Easy to understand**: "I watch pages so you don't have to" - **Smart filtering**: Automatically ignores timestamps, cookie banners, and noise - **Keyword intelligence**: Alert on specific content appearing/disappearing - **SEO-friendly**: Captures long-tail keywords (e.g., "monitor job posting changes") --- ## Feature Specification by Phase ### MVP Features (Launch Fast) #### 1. URL Monitoring - **Track URLs**: Add any public web page by URL - **Frequency options**: 5min / 30min / 6hr / 24hr intervals - **Change detection methods**: - Content hash comparison (fast, binary change detection) - Text diff (character/line level differences) - **Storage**: Last 5-10 snapshots per URL #### 2. Alert System - **Email notifications**: Send alert when change detected - **Alert content**: - Timestamp of change - Link to diff view - Change severity indicator - **Basic throttling**: Max 1 alert per check interval #### 3. Change Viewing - **History timeline**: Chronological list of all checks - Status (changed/unchanged/error) - Timestamp - Response code - **Diff viewer**: - Side-by-side or unified view - Highlighted additions/deletions - Character-level diff for precision - **Snapshot storage**: Full HTML snapshots for history #### 4. Reliability - **Retry logic**: 2-3 attempts on timeout/5xx errors - **Error alerts**: Notify if page becomes unavailable - **Status tracking**: - HTTP response codes - Timeout detection - Robot/CAPTCHA blocking detection - **Run logs**: Detailed history of each check attempt --- ### V1 Features (People Pay) #### 5. Noise Reduction (Differentiator) - **Automatic filtering**: - Cookie consent banners - Timestamps and "last updated" text - Current date/time displays - Session IDs in URLs - Rotating content (recommendations, ads) - **Custom ignore rules**: - Text pattern matching (regex) - CSS selector exclusion - Ignore numeric changes except in specific areas - Whitelist/blacklist mode #### 6. Selective Monitoring - **Element monitoring**: Track only specific page sections - Visual point-and-click selector - CSS selector input (power users) - XPath support - **Multiple elements**: Track different sections separately - **Element naming**: Label tracked elements for clarity #### 7. Keyword-Based Alerts (High Value) - **Keyword rules**: - Alert when keyword appears - Alert when keyword disappears - Alert when keyword count changes - Threshold-based alerts ("less than 5 items") - **Regex support**: Advanced pattern matching - **Multiple keywords**: AND/OR logic combinations - **Case sensitivity options** #### 8. Advanced Alerting - **Digest mode**: Daily/weekly summary of all changes - **Quiet hours**: No alerts during specified times - **Alert throttling**: Configurable limits - Max alerts per hour/day - Cooldown periods - **Severity filtering**: Only alert on major changes - **Multiple channels per monitor**: Email + Slack, etc. --- ### V2 Features (Market Winner) #### 9. Visual Change Detection - **Screenshot capture**: Full-page screenshots - **Image diff**: Visual highlighting of changed areas - **Pixel-perfect detection**: Layout shift detection - **Before/After carousel**: Easy visual comparison - **Screenshot retention**: Based on plan tier #### 10. AI-Powered Summaries - **Change summarization**: Natural language description - "Price changed from $29.99 to $24.99" - "New 'Out of Stock' banner added" - **Change classification**: - Price change - Availability change - Policy/text update - Layout-only change - **Smart alerts**: Only notify for meaningful changes - **Summary in alert**: No need to view diff for simple changes #### 11. Complex Page Support - **JavaScript rendering**: Headless browser mode - **Authentication**: - Basic auth - Session cookies - Login flow automation - **Dynamic content**: Wait for AJAX/lazy loading - **SPA support**: Monitor client-side rendered apps #### 12. Integrations - **Slack**: Channel notifications - **Discord**: Webhook alerts - **Microsoft Teams**: Connector integration - **Webhook**: Generic POST for automation tools - **RSS feed**: Per-monitor or global feed - **Zapier/Make**: Pre-built integrations - **API**: Programmatic access to monitors and history --- ### Power User Features (Teams & Scale) #### 13. Organization - **Folders/Projects**: Hierarchical organization - **Tags**: Multi-dimensional categorization - **Search**: Full-text search across monitors and history - **Filters**: Status, tags, frequency, last changed - **Bulk operations**: - Import URLs from CSV - Export history - Bulk pause/resume - Bulk delete #### 14. Collaboration - **Team workspaces**: Shared monitor collections - **Role-based access**: - Admin: Full control - Editor: Create/edit monitors - Viewer: Read-only access - **Assignment**: Assign monitors to team members - **Comments**: Annotate changes - **Audit log**: Track team actions #### 15. Advanced Scheduling - **Custom schedules**: Cron-like expressions - **Business hours only**: Skip nights/weekends - **Timezone-aware**: Different times per monitor - **Geo-distributed checks**: Monitor from multiple regions - **Adaptive frequency**: Check more often during active periods --- ## Technical Architecture ### Core Components #### Frontend - **Tech stack**: React/Next.js + TypeScript - **UI components**: Tailwind CSS + shadcn/ui - **State management**: React Query for API data - **Real-time**: WebSocket for live updates (optional) #### Backend - **API**: Node.js/Express or Python/FastAPI - **Database**: PostgreSQL for relational data - **Queue system**: Redis + Bull/BullMQ for job scheduling - **Storage**: S3-compatible for snapshots and screenshots #### Monitoring Engine - **Fetcher**: Axios/Got for simple pages - **Browser**: Puppeteer/Playwright for JS-heavy sites - **Differ**: jsdiff or custom algorithm - **Scheduler**: Distributed job queue with priority - **Rate limiting**: Per-domain backoff #### Alert System - **Email**: SendGrid/Postmark - **Queue**: Separate alert queue for reliability - **Templates**: Customizable alert formats - **Delivery tracking**: Open/click tracking ### Data Models #### Monitor ``` id, user_id, url, name, frequency, element_selector, ignore_rules, keyword_rules, alert_settings, status, created_at, last_checked_at, last_changed_at ``` #### Snapshot ``` id, monitor_id, html_content, screenshot_url, content_hash, http_status, error_message, created_at, changed_from_previous ``` #### Alert ``` id, monitor_id, snapshot_id, alert_type, delivered_at, delivery_status, channels ``` --- ## Monetization & Plan Gating ### Free Tier - 5 monitors - 1-hour minimum frequency - 7-day history retention - Email alerts only - Basic noise filtering ### Pro Tier ($19-29/month) - 50 monitors - 5-minute frequency - 90-day history - All alert channels - Advanced filtering + keywords - Screenshot snapshots ### Business Tier ($99-149/month) - 200 monitors - 1-minute frequency - 1-year history - API access - Team collaboration (5 seats) - Priority support - JS rendering included ### Enterprise Tier (Custom) - Unlimited monitors - Custom frequency - Unlimited history - Dedicated infrastructure - SLA guarantees - SSO/SAML - Custom integrations ### Add-ons - Extra monitors: $5 per 10 - Extended history: $10/month - Additional team seats: $15/seat - JS rendering credits: $20/100 pages --- ## Success Metrics ### Product KPIs - Monitors created per user - Check success rate (>99%) - False positive rate (<5%) - Alert open rate (>40%) - User retention (D7, D30, M3) ### Business KPIs - Free → Paid conversion (target >10%) - Churn rate (target <5% monthly) - Average monitors per paid user (target >15) - CAC < 3x MRR - Net promoter score (target >50) --- ## Security & Compliance ### Security - **Authentication**: JWT + refresh tokens - **2FA**: TOTP support - **Encryption**: At rest (database) and in transit (TLS) - **API keys**: Scoped, revocable - **Rate limiting**: Per user and IP ### Privacy - **GDPR**: Data export, deletion, consent management - **Data retention**: Configurable, automatic cleanup - **No tracking**: Don't store personal data from monitored pages - **Anonymization**: Strip cookies/sessions from snapshots ### Reliability - **Uptime SLA**: 99.9% for paid plans - **Status page**: Public incident tracking - **Backups**: Daily encrypted backups - **Disaster recovery**: 24-hour RTO --- ## Competitive Differentiation ### vs. Visualping/ChangeTower - **Better noise filtering**: AI-powered content classification - **Smarter alerts**: Keyword-based + summarization - **Better UX**: Cleaner UI, faster setup ### vs. Distill.io - **Team features**: Built for collaboration from day one - **More integrations**: Wider ecosystem support - **Better pricing**: More generous free tier ### vs. Wachete - **Modern tech**: Faster, more reliable - **Visual diff**: Screenshot comparison - **API-first**: Better for automation