# Unified Email Worker (Modular Version)

Multi-domain email processing worker for AWS SES/S3/SQS with bounce handling, auto-replies, forwarding, and sender blocking.

## 🏗️ Architecture

```
email-worker/
├── config.py                 # Configuration management
├── logger.py                 # Structured logging
├── aws/                      # AWS service handlers
│   ├── s3_handler.py         # S3 operations (download, metadata)
│   ├── sqs_handler.py        # SQS polling
│   ├── ses_handler.py        # SES email sending
│   └── dynamodb_handler.py   # DynamoDB (rules, bounces, blocklist)
├── email/                    # Email processing
│   ├── parser.py             # Email parsing utilities
│   ├── bounce_handler.py     # Bounce detection & rewriting
│   ├── rules_processor.py    # OOO & forwarding logic
│   └── blocklist.py          # Sender blocking with wildcards
├── smtp/                     # SMTP delivery
│   ├── pool.py               # Connection pooling
│   └── delivery.py           # SMTP/LMTP delivery with retry
├── metrics/                  # Monitoring
│   └── prometheus.py         # Prometheus metrics
├── worker.py                 # Message processing logic
├── domain_poller.py          # Domain queue poller
├── unified_worker.py         # Main worker coordinator
├── health_server.py          # Health check HTTP server
└── main.py                   # Entry point
```

## ✨ Features

- ✅ **Multi-Domain Processing**: Parallel processing of multiple domains via thread pool
- ✅ **Bounce Detection**: Automatic SES bounce notification rewriting
- ✅ **Auto-Reply/OOO**: Out-of-office automatic replies
- ✅ **Email Forwarding**: Rule-based forwarding to internal/external addresses
- ✅ **Sender Blocking**: Wildcard-based sender blocklist per recipient
- ✅ **SMTP Connection Pooling**: Efficient reuse of connections
- ✅ **LMTP Support**: Direct delivery to Dovecot (bypasses Postfix transport_maps)
- ✅ **Prometheus Metrics**: Comprehensive monitoring
- ✅ **Health Checks**: HTTP health endpoint for container orchestration
- ✅ **Graceful Shutdown**: Proper cleanup on SIGTERM/SIGINT

## 🔧 Configuration

All configuration is done via environment variables:

### AWS Settings
```bash
AWS_REGION=us-east-2
```

### Domains
```bash
# Option 1: Comma-separated list
DOMAINS=example.com,another.com

# Option 2: File with one domain per line
DOMAINS_FILE=/etc/email-worker/domains.txt
```

### Worker Settings
```bash
WORKER_THREADS=10
POLL_INTERVAL=20          # SQS long polling (seconds)
MAX_MESSAGES=10           # Max messages per poll
VISIBILITY_TIMEOUT=300    # Message visibility timeout (seconds)
```

### SMTP Delivery
```bash
SMTP_HOST=localhost
SMTP_PORT=25
SMTP_USE_TLS=false
SMTP_USER=
SMTP_PASS=
SMTP_POOL_SIZE=5
INTERNAL_SMTP_PORT=2525   # Port for internal delivery (bypasses transport_maps)
```

### LMTP (Direct Dovecot Delivery)
```bash
LMTP_ENABLED=false        # Set to 'true' to use LMTP
LMTP_HOST=localhost
LMTP_PORT=24
```

### DynamoDB Tables
```bash
DYNAMODB_RULES_TABLE=email-rules
DYNAMODB_MESSAGES_TABLE=ses-outbound-messages
DYNAMODB_BLOCKED_TABLE=email-blocked-senders
```

### Bounce Handling
```bash
BOUNCE_LOOKUP_RETRIES=3
BOUNCE_LOOKUP_DELAY=1.0
```

### Monitoring
```bash
METRICS_PORT=8000         # Prometheus metrics
HEALTH_PORT=8080          # Health check endpoint
```

## 📊 DynamoDB Schemas

### email-rules
```json
{
  "email_address": "user@example.com",    // Partition Key
  "ooo_active": true,
  "ooo_message": "I am currently out of office...",
  "ooo_content_type": "text",             // "text" or "html"
  "forwards": ["other@example.com", "external@gmail.com"]
}
```

### ses-outbound-messages
```json
{
  "MessageId": "abc123...",               // Partition Key (SES Message-ID)
  "original_source": "sender@example.com",
  "recipients": ["recipient@other.com"],
  "timestamp": "2025-01-01T12:00:00Z",
  "bounceType": "Permanent",
  "bounceSubType": "General",
  "bouncedRecipients": ["recipient@other.com"]
}
```

### email-blocked-senders
```json
{
  "email_address": "user@example.com",    // Partition Key
  "blocked_patterns": [
    "spam@*.com",                         // Wildcard support
    "noreply@badsite.com",
    "*@malicious.org"
  ]
}
```

## 🚀 Usage

### Installation
```bash
cd email-worker
pip install -r requirements.txt
```

### Run
```bash
python3 main.py
```

### Docker
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY . /app
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python3", "main.py"]
```

## 📈 Metrics

Available at `http://localhost:8000/metrics`:

- `emails_processed_total{domain, status}` - Total emails processed
- `emails_in_flight` - Emails currently being processed
- `email_processing_seconds{domain}` - Processing time histogram
- `queue_messages_available{domain}` - Queue size gauge
- `bounces_processed_total{domain, type}` - Bounce notifications
- `autoreplies_sent_total{domain}` - Auto-replies sent
- `forwards_sent_total{domain}` - Forwards sent
- `blocked_senders_total{domain}` - Blocked emails

## 🏥 Health Checks

Available at `http://localhost:8080/health`:

```json
{
  "status": "healthy",
  "domains": 2,
  "domain_list": ["example.com", "another.com"],
  "dynamodb": true,
  "features": {
    "bounce_rewriting": true,
    "auto_reply": true,
    "forwarding": true,
    "blocklist": true,
    "lmtp": false
  },
  "timestamp": "2025-01-22T10:00:00.000000"
}
```

## 🔍 Key Improvements in Modular Version

### 1. **Fixed Critical Bugs**
- ✅ Fixed `signal.SIGINT` typo (was `signalIGINT`)
- ✅ S3 metadata is now written before deletion (audit trail)
- ✅ Batch DynamoDB calls for blocklist (performance)
- ✅ Error handling for S3 delete failures

### 2. **Better Architecture**
- **Separation of Concerns**: Each component has a single responsibility
- **Testability**: Easy to unit test individual components
- **Maintainability**: Changes are isolated to specific modules
- **Extensibility**: Easy to add new features

### 3. **Performance**
- **Batch Blocklist Checks**: One DynamoDB call for all recipients
- **Connection Pooling**: Reusable SMTP connections
- **Efficient Metrics**: Optional Prometheus integration

### 4. **Reliability**
- **Proper Error Handling**: Each component handles its own errors
- **Graceful Degradation**: Works even if DynamoDB is unavailable
- **Audit Trail**: All actions are logged to S3 metadata

## 🔐 Security Features

1. **Domain Validation**: Workers only process their assigned domains
2. **Loop Prevention**: Detects and skips already-processed emails
3. **Blocklist Support**: Wildcard-based sender blocking
4. **Internal vs External**: Separate handling prevents loops

## 📝 Example Usage

### Enable OOO for user
```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('email-rules')

# ooo_active must be a real boolean, not the string "true"
table.put_item(Item={
    'email_address': 'john@example.com',
    'ooo_active': True,
    'ooo_message': 'I am out of office until Feb 1st.',
    'ooo_content_type': 'html'
})
```

### Block spam senders
```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('email-blocked-senders')

# Patterns are stored in lowercase; '*' is a wildcard
table.put_item(Item={
    'email_address': 'john@example.com',
    'blocked_patterns': [
        '*@spam.com',
        'noreply@*.marketing.com',
        'newsletter@*'
    ]
})
```

### Forward emails
```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('email-rules')

# Note: put_item replaces the whole item, so any existing OOO
# attributes for this address are overwritten by this call.
table.put_item(Item={
    'email_address': 'support@example.com',
    'forwards': [
        'john@example.com',
        'jane@example.com',
        'external@gmail.com'
    ]
})
```

## 🐛 Troubleshooting

### Worker not processing emails
1. Check queue URLs: `curl http://localhost:8080/domains`
2. Check logs for SQS errors
3. Verify IAM permissions for SQS/S3/SES/DynamoDB

### Bounces not rewritten
1. Check the DynamoDB table name in `DYNAMODB_MESSAGES_TABLE`
2. Verify that the Lambda function is writing bounce records
3. Check logs for DynamoDB lookup errors

### Auto-replies not sent
1. Verify that the DynamoDB rules table is accessible
2. Check that `ooo_active` is `true` (boolean, not string)
3. Review logs for SES send errors

### Blocked emails still delivered
1. Verify that the blocklist table exists and is accessible
2. Check that wildcard patterns are lowercase
3. Review logs for blocklist check errors

## 📄 License

MIT License - See LICENSE file for details
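
### Note: how wildcard blocklist patterns could be evaluated

The `blocked_patterns` entries shown in the `email-blocked-senders` schema and in the "Block spam senders" example use `*` as a wildcard, and the troubleshooting steps ask for lowercase patterns. The sketch below illustrates one way such patterns might be matched, using Python's standard `fnmatch` module. The function name `is_sender_blocked` is hypothetical, and the actual logic in `email/blocklist.py` is not shown in this README and may differ.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def is_sender_blocked(sender: str, blocked_patterns: Iterable[str]) -> bool:
    """Return True if the sender matches any wildcard pattern.

    Assumption: matching is case-insensitive, done by lowercasing both
    the sender address and each pattern before comparing.
    """
    normalized = sender.strip().lower()
    return any(
        fnmatchcase(normalized, pattern.lower())
        for pattern in blocked_patterns
    )


# Patterns taken from the email-blocked-senders schema example
patterns = ["spam@*.com", "noreply@badsite.com", "*@malicious.org"]

print(is_sender_blocked("Spam@Offers.com", patterns))    # mixed-case sender
print(is_sender_blocked("alice@example.com", patterns))  # no matching pattern
```

Note that `fnmatch` also treats `?` and `[...]` as special characters, so this sketch is only equivalent to a plain `*` wildcard scheme when patterns avoid those characters.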