# Unified Email Worker (Modular Version)

Multi-domain email processing worker for AWS SES/S3/SQS with bounce handling, auto-replies, forwarding, and sender blocking.

## 🏗️ Architecture

```
email-worker/
├── config.py               # Configuration management
├── logger.py               # Structured logging
├── aws/                    # AWS service handlers
│   ├── s3_handler.py       # S3 operations (download, metadata)
│   ├── sqs_handler.py      # SQS polling
│   ├── ses_handler.py      # SES email sending
│   └── dynamodb_handler.py # DynamoDB (rules, bounces, blocklist)
├── email/                  # Email processing
│   ├── parser.py           # Email parsing utilities
│   ├── bounce_handler.py   # Bounce detection & rewriting
│   ├── rules_processor.py  # OOO & forwarding logic
│   └── blocklist.py        # Sender blocking with wildcards
├── smtp/                   # SMTP delivery
│   ├── pool.py             # Connection pooling
│   └── delivery.py         # SMTP/LMTP delivery with retry
├── metrics/                # Monitoring
│   └── prometheus.py       # Prometheus metrics
├── worker.py               # Message processing logic
├── domain_poller.py        # Domain queue poller
├── unified_worker.py       # Main worker coordinator
├── health_server.py        # Health check HTTP server
└── main.py                 # Entry point
```

## ✨ Features

- ✅ **Multi-Domain Processing**: Parallel processing of multiple domains via thread pool
- ✅ **Bounce Detection**: Automatic SES bounce notification rewriting
- ✅ **Auto-Reply/OOO**: Out-of-office automatic replies
- ✅ **Email Forwarding**: Rule-based forwarding to internal/external addresses
- ✅ **Sender Blocking**: Wildcard-based sender blocklist per recipient
- ✅ **SMTP Connection Pooling**: Efficient reuse of connections
- ✅ **LMTP Support**: Direct delivery to Dovecot (bypasses Postfix transport_maps)
- ✅ **Prometheus Metrics**: Comprehensive monitoring
- ✅ **Health Checks**: HTTP health endpoint for container orchestration
- ✅ **Graceful Shutdown**: Proper cleanup on SIGTERM/SIGINT

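As an illustration of the graceful-shutdown feature, the following is a minimal sketch of how SIGTERM/SIGINT handling is commonly wired up in a polling worker. The names `shutdown_event`, `install_signal_handlers`, and `run_until_shutdown` are hypothetical and are not taken from the worker's source; the real implementation may differ.

```python
import signal
import threading

# Hypothetical shutdown flag shared by the main loop and any poller threads.
shutdown_event = threading.Event()


def _request_shutdown(signum, frame):
    """Signal handler: only set the flag; cleanup happens in the main loop."""
    shutdown_event.set()


def install_signal_handlers():
    """Register one handler for SIGTERM (orchestrator stop) and SIGINT (Ctrl+C).

    Must be called from the main thread, as required by signal.signal().
    """
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, _request_shutdown)


def run_until_shutdown(poll_once, idle_seconds=1.0):
    """Call poll_once() repeatedly until a shutdown signal has been received."""
    while not shutdown_event.is_set():
        poll_once()
        # wait() returns early as soon as the event is set.
        shutdown_event.wait(idle_seconds)
```

Keeping the handler down to setting a flag means in-flight messages can finish before connections are closed, which is usually what a container orchestrator expects during its termination grace period.
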
## 🔧 Configuration

All configuration via environment variables:

### AWS Settings
```bash
AWS_REGION=us-east-2
```

### Domains
```bash
# Option 1: Comma-separated list
DOMAINS=example.com,another.com

# Option 2: File with one domain per line
DOMAINS_FILE=/etc/email-worker/domains.txt
```

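The two domain options could be resolved roughly as follows. This is a sketch, not the worker's actual `config.py`: the function name `load_domains` is hypothetical, and merging both sources, lowercasing, and skipping `#` comment lines are assumptions.

```python
import os


def load_domains(env=None):
    """Return the configured domains, lowercased and de-duplicated in order."""
    env = os.environ if env is None else env
    raw = []

    # Option 1: comma-separated list in DOMAINS.
    raw.extend(env.get("DOMAINS", "").split(","))

    # Option 2: file with one domain per line in DOMAINS_FILE.
    path = env.get("DOMAINS_FILE")
    if path and os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            raw.extend(fh.read().splitlines())

    domains = []
    for entry in raw:
        domain = entry.strip().lower()
        # Skip blanks and '#' comment lines (comment support is an assumption).
        if domain and not domain.startswith("#") and domain not in domains:
            domains.append(domain)
    return domains
```

Passing a plain dict as `env` makes the function easy to unit test without touching the process environment.
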
### Worker Settings
```bash
WORKER_THREADS=10
POLL_INTERVAL=20      # SQS long polling (seconds)
MAX_MESSAGES=10       # Max messages per poll
VISIBILITY_TIMEOUT=300  # Message visibility timeout (seconds)
```

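To show how these settings relate to SQS, here is a sketch that maps them onto the parameters of the SQS `ReceiveMessage` call. The helper name `build_receive_kwargs` is hypothetical, and the mapping of `POLL_INTERVAL` to `WaitTimeSeconds` is an assumption based on the "long polling" comment above. SQS itself limits `MaxNumberOfMessages` to 10 and `WaitTimeSeconds` to 20, so the defaults shown are already the service maximums.

```python
import os


def build_receive_kwargs(queue_url, env=None):
    """Translate worker settings into keyword arguments for receive_message()."""
    env = os.environ if env is None else env
    return {
        "QueueUrl": queue_url,
        # SQS caps MaxNumberOfMessages at 10 and WaitTimeSeconds at 20.
        "MaxNumberOfMessages": min(int(env.get("MAX_MESSAGES", "10")), 10),
        "WaitTimeSeconds": min(int(env.get("POLL_INTERVAL", "20")), 20),
        "VisibilityTimeout": int(env.get("VISIBILITY_TIMEOUT", "300")),
    }


# Usage with boto3 (not executed here):
#   sqs = boto3.client("sqs")
#   response = sqs.receive_message(**build_receive_kwargs(queue_url))
#   messages = response.get("Messages", [])
```

`VISIBILITY_TIMEOUT` should comfortably exceed the longest expected processing time for one message, otherwise SQS may hand the same message to another poller while it is still being delivered.
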
### SMTP Delivery
```bash
SMTP_HOST=localhost
SMTP_PORT=25
SMTP_USE_TLS=false
SMTP_USER=
SMTP_PASS=
SMTP_POOL_SIZE=5
INTERNAL_SMTP_PORT=2525  # Port for internal delivery (bypasses transport_maps)
```

### LMTP (Direct Dovecot Delivery)
```bash
LMTP_ENABLED=false    # Set to 'true' to use LMTP
LMTP_HOST=localhost
LMTP_PORT=24
```

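The choice between SMTP and LMTP delivery can be sketched with the standard library's `smtplib`, which provides both `SMTP` and `LMTP` classes. The helpers `delivery_target` and `open_connection` are hypothetical names, and TLS, authentication, pooling, and `INTERNAL_SMTP_PORT` are deliberately left out; the worker's `smtp/delivery.py` may behave differently.

```python
import os
import smtplib


def delivery_target(env=None):
    """Return (protocol, host, port) for local delivery based on the environment."""
    env = os.environ if env is None else env
    if env.get("LMTP_ENABLED", "false").strip().lower() == "true":
        return ("lmtp", env.get("LMTP_HOST", "localhost"), int(env.get("LMTP_PORT", "24")))
    return ("smtp", env.get("SMTP_HOST", "localhost"), int(env.get("SMTP_PORT", "25")))


def open_connection(env=None):
    """Open an smtplib connection matching delivery_target()."""
    protocol, host, port = delivery_target(env)
    if protocol == "lmtp":
        # smtplib.LMTP greets with LHLO instead of EHLO but shares sendmail().
        return smtplib.LMTP(host, port)
    return smtplib.SMTP(host, port, timeout=30)
```

Because both classes expose the same `sendmail()` interface, the rest of the delivery code does not need to know which protocol is in use.
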
### DynamoDB Tables
```bash
DYNAMODB_RULES_TABLE=email-rules
DYNAMODB_MESSAGES_TABLE=ses-outbound-messages
DYNAMODB_BLOCKED_TABLE=email-blocked-senders
```

### Bounce Handling
```bash
BOUNCE_LOOKUP_RETRIES=3
BOUNCE_LOOKUP_DELAY=1.0
```

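These two settings suggest a retry loop around the bounce record lookup, presumably to cover the window in which a bounce notification reaches the worker before its record has been written to DynamoDB. A minimal sketch of such a loop, where `lookup_with_retries` and the `fetch` callable are hypothetical and `delay` is assumed to be in seconds:

```python
import time


def lookup_with_retries(fetch, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch() up to `retries` times, sleeping `delay` seconds between misses.

    fetch() should return the bounce record, or None if it is not there yet.
    """
    for attempt in range(1, retries + 1):
        record = fetch()
        if record is not None:
            return record
        # Do not sleep after the final attempt.
        if attempt < retries:
            sleep(delay)
    return None
```

With the defaults above, a missing record costs at most two seconds of waiting before the worker gives up on rewriting the bounce.
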
### Monitoring
```bash
METRICS_PORT=8000     # Prometheus metrics
HEALTH_PORT=8080      # Health check endpoint
```

## 📊 DynamoDB Schemas

### email-rules
```json
{
  "email_address": "user@example.com",  // Partition Key
  "ooo_active": true,
  "ooo_message": "I am currently out of office...",
  "ooo_content_type": "text",  // "text" or "html"
  "forwards": ["other@example.com", "external@gmail.com"]
}
```

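To make the schema concrete, the sketch below turns one `email-rules` item into the actions a rules processor might take. The function `plan_rule_actions` is hypothetical, and classifying a forward as internal when its domain is one of the worker's configured domains is an assumption about how "internal/external" is decided.

```python
def plan_rule_actions(rule, managed_domains):
    """Split a rule item into an auto-reply flag plus internal/external forwards."""
    managed = {d.lower() for d in managed_domains}
    internal, external = [], []
    for address in rule.get("forwards", []):
        # Everything after the last '@' is treated as the domain.
        domain = address.rsplit("@", 1)[-1].lower()
        (internal if domain in managed else external).append(address)
    return {
        # Require a real boolean True and a non-empty message.
        "send_auto_reply": rule.get("ooo_active") is True and bool(rule.get("ooo_message")),
        "internal_forwards": internal,
        "external_forwards": external,
    }
```

For the example item above with `example.com` as the only managed domain, `other@example.com` would be treated as internal and `external@gmail.com` as external.
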
### ses-outbound-messages
```json
{
  "MessageId": "abc123...",  // Partition Key (SES Message-ID)
  "original_source": "sender@example.com",
  "recipients": ["recipient@other.com"],
  "timestamp": "2025-01-01T12:00:00Z",
  "bounceType": "Permanent",
  "bounceSubType": "General",
  "bouncedRecipients": ["recipient@other.com"]
}
```

### email-blocked-senders
```json
{
  "email_address": "user@example.com",  // Partition Key
  "blocked_patterns": [
    "spam@*.com",           // Wildcard support
    "noreply@badsite.com",
    "*@malicious.org"
  ]
}
```

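Wildcard patterns like these can be matched with the standard library's `fnmatch` module. The sketch below assumes shell-style globbing and case-insensitive comparison; `is_blocked` is a hypothetical name, and the worker's `blocklist.py` may use different matching rules.

```python
from fnmatch import fnmatchcase


def is_blocked(sender, patterns):
    """Return True if the sender address matches any blocklist pattern."""
    sender = sender.strip().lower()
    # Lowercase both sides so matching does not depend on address casing.
    return any(fnmatchcase(sender, pattern.strip().lower()) for pattern in patterns)
```

Note that `fnmatch` also treats `?` and `[seq]` as wildcards, so patterns containing those characters would need escaping if they are meant literally.
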
## 🚀 Usage

### Installation
```bash
cd email-worker
pip install -r requirements.txt
```

### Run
```bash
python3 main.py
```

### Docker
```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . /app

RUN pip install --no-cache-dir -r requirements.txt

CMD ["python3", "main.py"]
```

## 📈 Metrics

Available at `http://localhost:8000/metrics`:

- `emails_processed_total{domain, status}` - Total emails processed
- `emails_in_flight` - Currently processing emails
- `email_processing_seconds{domain}` - Processing time histogram
- `queue_messages_available{domain}` - Queue size gauge
- `bounces_processed_total{domain, type}` - Bounce notifications
- `autoreplies_sent_total{domain}` - Auto-replies sent
- `forwards_sent_total{domain}` - Forwards sent
- `blocked_senders_total{domain}` - Blocked emails

## 🏥 Health Checks

Available at `http://localhost:8080/health`:

```json
{
  "status": "healthy",
  "domains": 2,
  "domain_list": ["example.com", "another.com"],
  "dynamodb": true,
  "features": {
    "bounce_rewriting": true,
    "auto_reply": true,
    "forwarding": true,
    "blocklist": true,
    "lmtp": false
  },
  "timestamp": "2025-01-22T10:00:00.000000"
}
```

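A probe script for container orchestration could interpret this response as sketched below. The functions `is_healthy` and `probe` are hypothetical, and requiring at least one configured domain is an assumption about what "healthy" should mean for a deployment.

```python
import json
from urllib.request import urlopen


def is_healthy(body):
    """Return True when a /health response body reports a healthy worker."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False
    if not isinstance(payload, dict):
        return False
    domains = payload.get("domains")
    # Assumption: a worker with zero domains is treated as unhealthy.
    return payload.get("status") == "healthy" and isinstance(domains, int) and domains > 0


def probe(url="http://localhost:8080/health", timeout=5):
    """Fetch the endpoint and evaluate it; any network error counts as unhealthy."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200 and is_healthy(response.read())
    except OSError:
        # URLError and HTTPError are subclasses of OSError.
        return False
```

A wrapper script can call `probe()` and exit with a non-zero status when it returns `False`.
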
## 🔍 Key Improvements in Modular Version

### 1. **Fixed Critical Bugs**
- ✅ Fixed `signal.SIGINT` typo (was `signalIGINT`)
- ✅ S3 metadata is now written before deletion (audit trail)
- ✅ Batch DynamoDB calls for blocklist (performance)
- ✅ Error handling for S3 delete failures

### 2. **Better Architecture**
- **Separation of Concerns**: Each component has a single responsibility
- **Testability**: Individual components are easy to unit test
- **Maintainability**: Changes are isolated to specific modules
- **Extensibility**: New features are easy to add

### 3. **Performance**
- **Batch Blocklist Checks**: One DynamoDB call for all recipients
- **Connection Pooling**: Reusable SMTP connections
- **Efficient Metrics**: Optional Prometheus integration

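The batch blocklist check can be illustrated with DynamoDB's `BatchGetItem` operation, which boto3 exposes as `batch_get_item` on the DynamoDB service resource. This is a sketch rather than the worker's code: `fetch_blocklists` is a hypothetical name, and it assumes addresses are stored lowercase under the `email_address` key.

```python
def fetch_blocklists(dynamodb, recipients, table_name="email-blocked-senders"):
    """Return {email_address: blocked_patterns} for all recipients.

    `dynamodb` is expected to behave like boto3.resource("dynamodb").
    """
    # De-duplicate first: BatchGetItem rejects requests with duplicate keys.
    addresses = sorted({r.strip().lower() for r in recipients})
    result = {address: [] for address in addresses}

    # BatchGetItem accepts at most 100 keys per request.
    for start in range(0, len(addresses), 100):
        chunk = addresses[start:start + 100]
        response = dynamodb.batch_get_item(
            RequestItems={table_name: {"Keys": [{"email_address": a} for a in chunk]}}
        )
        for item in response.get("Responses", {}).get(table_name, []):
            result[item["email_address"]] = list(item.get("blocked_patterns", []))
        # A production version should also retry response["UnprocessedKeys"].
    return result
```

For a message with a handful of recipients this results in a single round trip instead of one `get_item` call per recipient.
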
### 4. **Reliability**
- **Proper Error Handling**: Each component handles its own errors
- **Graceful Degradation**: Keeps working even if DynamoDB is unavailable
- **Audit Trail**: All actions are logged to S3 metadata

## 🔐 Security Features

1. **Domain Validation**: Workers only process their assigned domains
2. **Loop Prevention**: Detects and skips already-processed emails
3. **Blocklist Support**: Wildcard-based sender blocking
4. **Internal vs External**: Separate handling prevents loops

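Loop prevention is often implemented with a marker header plus a check of the standard `Auto-Submitted` header (RFC 3834). The sketch below shows that general technique only: the header name `X-Email-Worker-Processed` and all function names are hypothetical, and this README does not state which mechanism the worker actually uses.

```python
from email import message_from_string

# Hypothetical marker header; the worker's real header name is not documented here.
PROCESSED_HEADER = "X-Email-Worker-Processed"


def already_processed(raw_email):
    """Return True if the message already carries the marker header."""
    return message_from_string(raw_email).get(PROCESSED_HEADER) is not None


def is_auto_submitted(raw_email):
    """Return True for auto-generated mail, which should not get an auto-reply."""
    value = message_from_string(raw_email).get("Auto-Submitted", "no")
    return value.strip().lower() != "no"


def mark_processed(raw_email, worker_id="email-worker"):
    """Return the message with the marker header added (idempotent)."""
    message = message_from_string(raw_email)
    if PROCESSED_HEADER not in message:
        message[PROCESSED_HEADER] = worker_id
    return message.as_string()
```

Checking `Auto-Submitted` before sending an out-of-office reply avoids two auto-responders answering each other indefinitely.
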
## 📝 Example Usage

### Enable OOO for user
```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('email-rules')

# Note: put_item replaces the whole item, so an existing 'forwards'
# list for this address is dropped unless it is included here.
table.put_item(Item={
    'email_address': 'john@example.com',
    'ooo_active': True,
    'ooo_message': 'I am out of office until Feb 1st.',
    'ooo_content_type': 'html'
})
```

### Block spam senders
```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('email-blocked-senders')

table.put_item(Item={
    'email_address': 'john@example.com',
    'blocked_patterns': [
        '*@spam.com',
        'noreply@*.marketing.com',
        'newsletter@*'
    ]
})
```

### Forward emails
```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('email-rules')

# Note: put_item replaces the whole item, including any OOO settings
# already stored for this address.
table.put_item(Item={
    'email_address': 'support@example.com',
    'forwards': [
        'john@example.com',
        'jane@example.com',
        'external@gmail.com'
    ]
})
```

## 🐛 Troubleshooting

### Worker not processing emails
1. Check queue URLs: `curl http://localhost:8080/domains`
2. Check logs for SQS errors
3. Verify IAM permissions for SQS/S3/SES/DynamoDB

### Bounces not rewritten
1. Check the DynamoDB table name in `DYNAMODB_MESSAGES_TABLE`
2. Verify the Lambda function is writing bounce records
3. Check logs for DynamoDB lookup errors

### Auto-replies not sent
1. Verify the DynamoDB rules table is accessible
2. Check `ooo_active` is `true` (boolean, not string)
3. Review logs for SES send errors

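The second check is easy to get wrong when items are written by hand, because the string `"true"` looks correct in the DynamoDB console. A small diagnostic along these lines can be run against an item fetched with `table.get_item(...)`; the function `ooo_problems` is a hypothetical helper, and its rules are inferred from the `email-rules` schema in this README.

```python
def ooo_problems(rule):
    """Return reasons why a rule item would not trigger an auto-reply."""
    problems = []
    value = rule.get("ooo_active")
    if isinstance(value, str):
        problems.append(f"ooo_active is the string {value!r}; store a boolean instead")
    elif value is not True:
        problems.append("ooo_active is missing or false")
    if not rule.get("ooo_message"):
        problems.append("ooo_message is empty")
    if rule.get("ooo_content_type", "text") not in ("text", "html"):
        problems.append("ooo_content_type must be 'text' or 'html'")
    return problems
```

An empty list means the item itself looks fine, so the next place to look is the SES send errors in the logs.
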
### Blocked emails still delivered
1. Verify the blocklist table exists and is accessible
2. Check that wildcard patterns are lowercase
3. Review logs for blocklist check errors

## 📄 License

MIT License - See LICENSE file for details