PagerDuty Integration: On-Call Alerts for Critical Failures
Connect supaguard to PagerDuty to alert your on-call engineers when critical checks fail. This guide covers integration setup, escalation best practices, and reducing alert fatigue.
When to Use PagerDuty vs Slack
| Scenario | Channel |
|---|---|
| Critical outage (site down, checkout broken) | PagerDuty |
| Performance degradation | Slack |
| Informational updates | Slack or Email |
| Recovery notifications | Slack |
PagerDuty should wake people up. Slack should inform. Use both together for effective incident response.
Quick Setup
Step 1: Create a supaguard Service in PagerDuty
- Log in to PagerDuty
- Go to Services → Service Directory
- Click + New Service
- Configure the service:
- Name: "supaguard Synthetic Monitoring"
- Description: "Alerts from supaguard synthetic checks"
- Escalation Policy: Select your existing policy or create a new one
- Click Next
- On Integrations, select Events API V2
- Click Create Service
- Copy the Integration Key (also called Routing Key)
Your integration key looks like:

```
a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6
```

Step 2: Add PagerDuty to supaguard
- Go to supaguard dashboard → Settings → Communications
- Click Add Channel
- Select PagerDuty
- Enter your integration key
- Name it (e.g., "On-Call Alerts")
- Click Save
Step 3: Create an Alert Policy for Critical Failures
- Go to Alert Policies → Create Policy
- Configure:
- Name: "Critical - Page On-Call"
- Trigger: On failure
- Severity Filter: Critical only
- Channels: Select your PagerDuty channel
- Click Save
Step 4: Assign to Checks
- Edit each check that should page on-call
- Go to the Alerting tab
- Select your "Critical - Page On-Call" policy
- Save
Best Practice: Escalation Flow
Don't page immediately for every failure. Use escalation:
Recommended Flow
```
Failure Detected (3:00 AM)
│
├── Immediate: Slack notification (#alerts)
│
├── Wait 5 minutes...
│
├── Still failing? → PagerDuty alert
│
└── Recovery: Slack notification + PagerDuty auto-resolve
```

Implementation in supaguard
Create two alert policies:
Policy 1: Slack Immediate
- Trigger: On failure
- Delay: None
- Channel: Slack
Policy 2: PagerDuty Escalation
- Trigger: On failure
- Delay: 5 minutes
- Severity: Critical only
- Channel: PagerDuty
Assign both policies to critical checks.
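The two-policy flow above can be sketched as a small decision function. This is a hypothetical illustration, not supaguard internals; the function and parameter names are made up, but the 5-minute delay and critical-only filter mirror the policy settings above:

```python
from datetime import datetime, timedelta

# Hypothetical sketch of the two-policy escalation flow.
# Policy 1 notifies Slack immediately; Policy 2 pages PagerDuty only if
# the check is still failing 5 minutes after the first failure.

ESCALATION_DELAY = timedelta(minutes=5)

def channels_to_notify(first_failure_at, now, still_failing, severity):
    """Return which channels should fire at `now` for a failure that
    started at `first_failure_at`."""
    channels = ["slack"]  # Policy 1: immediate, no delay
    if (
        still_failing
        and severity == "critical"  # Policy 2: critical severity only
        and now - first_failure_at >= ESCALATION_DELAY
    ):
        channels.append("pagerduty")
    return channels

start = datetime(2024, 1, 1, 3, 0)  # failure detected at 3:00 AM
print(channels_to_notify(start, start, True, "critical"))
# Slack only at first detection
print(channels_to_notify(start, start + timedelta(minutes=6), True, "critical"))
# Slack + PagerDuty once the failure persists past 5 minutes
```

The key property: a transient failure that recovers within the delay window never pages anyone, while a sustained critical failure reaches on-call within minutes.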
PagerDuty Alert Details
supaguard sends rich incident data:
```json
{
  "routing_key": "your-integration-key",
  "event_action": "trigger",
  "dedup_key": "supaguard-check-abc123",
  "payload": {
    "summary": "CRITICAL: Checkout Flow - site checkout is broken",
    "severity": "critical",
    "source": "supaguard",
    "custom_details": {
      "check_name": "Checkout Flow",
      "url": "https://shop.example.com/checkout",
      "location": "San Francisco",
      "error": "Button 'Pay Now' not clickable",
      "duration_ms": 30000,
      "trace_url": "https://app.supaguard.com/trace/..."
    }
  },
  "links": [{
    "href": "https://app.supaguard.com/checks/abc123",
    "text": "View in supaguard"
  }]
}
```

Auto-Resolution
When a check recovers, supaguard automatically resolves the PagerDuty incident:
```json
{
  "routing_key": "your-integration-key",
  "event_action": "resolve",
  "dedup_key": "supaguard-check-abc123"
}
```

This ensures incidents don't stay open after the issue is fixed.
Deduplication
supaguard uses consistent dedup_key values per check. This means:
- Multiple failures of the same check = one PagerDuty incident
- No duplicate pages for the same issue
- Clean incident timeline
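To make the grouping behavior concrete, here is a sketch that builds events the way the payloads above do, assuming the `supaguard-check-<id>` dedup_key format shown earlier. PagerDuty folds any events sharing a dedup_key into a single incident:

```python
# Sketch: repeated failures of one check reuse the same dedup_key, so
# PagerDuty groups them into a single incident instead of paging twice.

def make_event(check_id, action):
    """Build a minimal Events API v2 event for a supaguard check."""
    return {
        "routing_key": "your-integration-key",
        "event_action": action,  # "trigger" or "resolve"
        "dedup_key": f"supaguard-check-{check_id}",
    }

first = make_event("abc123", "trigger")
second = make_event("abc123", "trigger")   # same check fails again
resolve = make_event("abc123", "resolve")  # recovery closes the incident

# Same dedup_key on every event for this check -> one PagerDuty incident.
assert first["dedup_key"] == second["dedup_key"] == resolve["dedup_key"]
```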
PagerDuty Service Configuration Tips
Set Appropriate Urgency
Configure your PagerDuty service urgency:
- High Urgency: For production-critical checks (checkout, login)
- Low Urgency: For less critical monitoring (docs, marketing pages)
Configure Intelligent Grouping
Enable PagerDuty's Intelligent Alert Grouping to combine related supaguard alerts during widespread outages.
Set Support Hours
If you don't need 3 AM pages for certain checks, configure support hours on the PagerDuty service or use supaguard's scheduling features.
Multiple PagerDuty Services
You might want different escalation paths:
| Check Type | PagerDuty Service |
|---|---|
| Payment flows | Payments On-Call |
| Authentication | Platform On-Call |
| Marketing site | Marketing Team |
Create multiple PagerDuty integrations in supaguard and assign appropriate policies to each check.
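With multiple integrations, routing reduces to a lookup from check type to the matching service's routing key. A minimal sketch mirroring the table above; the key values and the fallback choice are hypothetical:

```python
# Hypothetical routing keys; each maps to a separate PagerDuty service,
# mirroring the table above (payments / platform / marketing).
ROUTING = {
    "payments": "payments-oncall-routing-key",
    "auth": "platform-oncall-routing-key",
    "marketing": "marketing-team-routing-key",
}

def routing_key_for(check_type):
    # Fall back to the platform service for unclassified checks (assumption).
    return ROUTING.get(check_type, ROUTING["auth"])

print(routing_key_for("payments"))  # payments-oncall-routing-key
print(routing_key_for("unknown"))   # platform-oncall-routing-key
```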
Troubleshooting
Incidents Not Creating
- Verify integration key — Test with PagerDuty's API directly
- Check service status — Ensure the PagerDuty service is active
- Verify escalation policy — Must have at least one target
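To test a key outside supaguard, you can post a minimal event straight to the Events API v2 endpoint (`https://events.pagerduty.com/v2/enqueue`). A sketch using only the Python standard library; substitute your real routing key before running, and expect a new (info-severity) incident on the service:

```python
import json
import urllib.request

EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

def build_test_event(routing_key):
    """Minimal Events API v2 trigger payload for smoke-testing a key."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": "supaguard-integration-test",
        "payload": {
            "summary": "supaguard integration test (safe to resolve)",
            "severity": "info",
            "source": "supaguard",
        },
    }

def send_event(event):
    """POST the event to PagerDuty and return the parsed JSON response."""
    req = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # contains "status": "success" on a valid key

if __name__ == "__main__":
    print(send_event(build_test_event("your-integration-key")))
```

If this call succeeds but supaguard incidents still don't appear, the problem is on the supaguard side (channel or policy configuration) rather than the key itself.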
Duplicate Incidents
This shouldn't happen with supaguard's deduplication. If you see duplicates:
- Check for multiple alert policies assigned
- Verify you're using a single PagerDuty channel
Not Auto-Resolving
Ensure:
- The PagerDuty integration supports Events API V2
- supaguard recovery notifications are enabled
Security Best Practices
- Use dedicated service — Don't share with unrelated integrations
- Restrict integration key access — Only admins should see it
- Enable PagerDuty audit logs — Track who acknowledges/resolves
Related Resources
- Slack Integration — For non-urgent notifications
- Configuring Alerts — Alert policy setup
- Smart Retries — How we reduce false alarms before they page