Troubleshooting Synthetic Monitoring Checks
Fix common synthetic monitoring issues. Solutions for timeout errors, authentication failures, flaky tests, selector problems, and network errors in supaguard.
A comprehensive guide to diagnosing and resolving the most common issues with synthetic monitoring checks. For each issue, we provide the symptom, likely cause, and recommended fix.
Script Errors
Element Not Found / Locator Timeout
Symptom: Timeout 30000ms exceeded waiting for locator
Common causes:
- The element's selector no longer matches the current DOM
- A popup, modal, or cookie banner is covering the element
- The page didn't load completely
- The element is off-screen and requires scrolling
Fixes:
// Fix 1: Dismiss cookie banner first
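const banner = page.getByRole("button", { name: "Accept Cookies" });
// isVisible() returns immediately and ignores its timeout option, so wait explicitly
const bannerShown = await banner
  .waitFor({ state: "visible", timeout: 3000 })
  .then(() => true)
  .catch(() => false);
if (bannerShown) {
  await banner.click();
}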
// Fix 2: Use more resilient selectors
// Bad: fragile CSS
await page.click("div.header > nav > ul > li:nth-child(3) > a");
// Good: semantic selector
await page.getByRole("link", { name: "Dashboard" }).click();
// Fix 3: Scroll element into view
await page.getByTestId("footer-link").scrollIntoViewIfNeeded();
await page.getByTestId("footer-link").click();
Timeout Exceeded
Symptom: page.goto: Timeout 30000ms exceeded or navigation timeout
Common causes:
- The server is slow or under heavy load
- A large asset (image, video) is blocking the load event
- A third-party script is hanging (analytics, chat widget)
- DNS resolution failure
Fixes:
// Fix 1: Use domcontentloaded instead of load
await page.goto("https://app.example.com", {
  waitUntil: "domcontentloaded",
});
// Fix 2: Increase navigation timeout for known slow pages
await page.goto("https://app.example.com/reports", {
  timeout: 60000,
});
// Fix 3: Block heavy third-party resources
await page.route("**/analytics.js", (route) => route.abort());
await page.route("**/chat-widget.js", (route) => route.abort());
Assertion Failed
Symptom: expect(received).toContainText(expected) or similar mismatch
Common causes:
- The application text changed (legitimate UI update)
- The feature is genuinely broken (real bug)
- Dynamic content loaded differently (A/B test, cached version)
Diagnosis:
- Watch the video recording — does the page look correct?
- If the text changed intentionally → update the assertion
- If the page shows an error → this is a real application bug
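When the mismatch comes from dynamic content such as an A/B test, one option is to assert against a pattern that accepts every known variant. The sketch below is a minimal illustration; the helper name `anyOf` and the example headings are hypothetical and not part of supaguard or Playwright.

```typescript
// Hypothetical helper: builds a RegExp that matches any one of the given
// text variants, escaping regex metacharacters so they match literally.
function anyOf(variants: string[]): RegExp {
  const escaped = variants.map((v) => v.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  return new RegExp(escaped.join("|"));
}

// Example variants (assumed for illustration).
const welcomeHeading = anyOf(["Welcome back", "Hello again"]);

// In a Playwright script, the pattern can be passed to a text assertion:
//   await expect(page.getByRole("heading", { level: 1 })).toContainText(welcomeHeading);
```

Keep the variant list short and explicit. A pattern that matches almost anything would hide the real bugs this check exists to catch.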
Authentication Issues
Login Fails in Monitoring but Works Locally
Common causes:
- Test credentials were rotated or expired
- IP-based rate limiting blocks monitoring IP addresses
- MFA/2FA is enabled on the test account
- CAPTCHA or bot detection is blocking automated access
Fixes:
- Verify credentials in Organization Variables
- Allowlist supaguard IPs via Firewall Allowlisting
- Disable MFA on the dedicated test account
- Exempt the test account from CAPTCHA checks
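Rotated or missing credentials often surface as a confusing login failure deep in the script. A small fail-fast check at the top of the script can make the cause obvious. This is a sketch: `requireEnv` is a hypothetical helper, and it assumes Organization Variables are exposed to the script as environment variables, as the `process.env.TEST_USER_EMAIL` usage in this guide suggests.

```typescript
// Hypothetical helper: reads a required variable and fails with a clear
// message if it is missing or blank, instead of failing later at login.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required variable: ${name}`);
  }
  return value;
}

// Usage at the top of a check (variable names taken from this guide's login example):
//   const email = requireEnv("TEST_USER_EMAIL");
//   const password = requireEnv("TEST_USER_PASSWORD");
```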
Session Expired During Check
Symptom: The check starts successfully but fails midway with a redirect to the login page
Fix: Perform login at the start of every check. Don't rely on session persistence between runs, because each run starts with a clean browser context.
import { test, expect } from "@playwright/test";

test("dashboard access", async ({ page }) => {
  // Always log in first; sessions don't persist between runs
  await page.goto("https://app.example.com/login");
  await page.getByLabel("Email").fill(process.env.TEST_USER_EMAIL!);
  await page.getByLabel("Password").fill(process.env.TEST_USER_PASSWORD!);
  await page.getByRole("button", { name: "Sign In" }).click();
  // Now test the actual flow
  await page.goto("https://app.example.com/dashboard");
  await expect(page.getByText("Welcome")).toBeVisible();
});
Network and Infrastructure Issues
SSL / Certificate Errors
Symptom: ERR_CERT_DATE_INVALID or NET::ERR_CERT_AUTHORITY_INVALID
This is typically a real problem. Your SSL certificate may be expired or misconfigured.
Action: Alert your DevOps team immediately. → SSL Certificate Monitoring
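To confirm an expiry problem independently of the browser, the certificate's validity date can be read directly. The following is a minimal sketch using Node's built-in `tls` module; the function names and host are illustrative assumptions, and it runs from your machine rather than from a monitoring region.

```typescript
import * as tls from "node:tls";

// Whole days from `now` until the certificate's "valid_to" date.
// Negative values mean the certificate has already expired.
function daysUntil(validTo: string, now: Date = new Date()): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / msPerDay);
}

// Connects to the host and resolves with the days left on its certificate.
// rejectUnauthorized is disabled only so an expired or untrusted certificate
// can still be inspected; do not reuse this setting for real traffic.
function certDaysRemaining(host: string, port = 443): Promise<number> {
  return new Promise((resolve, reject) => {
    const socket = tls.connect(
      { host, port, servername: host, rejectUnauthorized: false },
      () => {
        const cert = socket.getPeerCertificate();
        socket.end();
        resolve(daysUntil(cert.valid_to));
      },
    );
    socket.on("error", reject);
  });
}

// Example (hypothetical host):
//   certDaysRemaining("app.example.com").then((d) => console.log(`${d} days left`));
```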
DNS Resolution Failure
Symptom: ERR_NAME_NOT_RESOLVED
Common causes:
- DNS propagation delay after a domain change
- DNS provider outage
- Typo in the URL
Fix: Verify that the URL is correct and that the hostname resolves from the monitoring region.
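A quick local check can separate a typo from a genuine DNS problem. The sketch below uses Node's built-in `dns` module; the helper names are hypothetical, and a successful lookup from your machine does not guarantee resolution from every monitoring region.

```typescript
import { lookup } from "node:dns/promises";

// Extracts the hostname from a URL; throws if the URL is malformed,
// which catches many typos before any DNS query is made.
function hostnameOf(url: string): string {
  return new URL(url).hostname;
}

// Resolves to true if the URL's hostname can be resolved to an address.
async function resolves(url: string): Promise<boolean> {
  try {
    await lookup(hostnameOf(url));
    return true;
  } catch {
    return false;
  }
}

// Example:
//   resolves("https://app.example.com").then((ok) => console.log(ok));
```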
Firewall Blocking Monitoring Traffic
Symptom: ERR_CONNECTION_TIMED_OUT or ERR_CONNECTION_REFUSED
Fix: Add supaguard's IP addresses to your firewall allowlist. → Firewall Allowlisting
Flaky Tests
Check Passes Sometimes, Fails Other Times
Common causes:
- Timing issues: Element not ready when the assertion runs
- Dynamic content: Content changes between runs (ads, recommendations)
- Race conditions: API response arrives after the assertion
- Regional differences: CDN serves different content to different regions
Fixes:
// Fix 1: Wait for specific content to load
await page.getByTestId("data-table").waitFor({ state: "visible" });
// Fix 2: Use more forgiving assertions
await expect(page.getByTestId("price")).toContainText("$");
// Instead of exact match:
// await expect(page.getByTestId("price")).toHaveText("$49.99");
// Fix 3: Wait for network to settle (use sparingly; Playwright discourages relying on networkidle)
await page.waitForLoadState("networkidle");
[!TIP] If a check is consistently flaky despite fixes, consider whether it's testing something inherently variable. Smart Retries help by re-running from different regions, but the script itself should be deterministic.
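For the race-condition cause listed above, it can help to wait for the specific API response that feeds the assertion rather than for general network quiet. This is a sketch under assumptions: the `/api/orders` endpoint, the `orders-table` test ID, and the `isOrdersResponse` helper are hypothetical, while `page.waitForResponse` is Playwright's own API.

```typescript
// Minimal shape of the response object the predicate needs; Playwright's
// Response satisfies this interface.
interface ResponseLike {
  url(): string;
  status(): number;
}

// Hypothetical predicate: true for a successful response from the
// (assumed) orders endpoint.
function isOrdersResponse(res: ResponseLike): boolean {
  return new URL(res.url()).pathname.startsWith("/api/orders") && res.status() === 200;
}

// In a Playwright script, start waiting before triggering the request:
//   const ordersLoaded = page.waitForResponse(isOrdersResponse);
//   await page.goto("https://app.example.com/orders");
//   await ordersLoaded;
//   await expect(page.getByTestId("orders-table")).toBeVisible();
```

Registering the wait before navigation matters. If the wait is set up after the request has already completed, the response is missed and the wait times out.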
Alert and Notification Issues
Not Receiving Alerts
- Check channel configuration — Verify the webhook URL, Slack channel, or email is correct
- Send a test notification — Use the "Test" button in Alert Policies
- Check spam folder — Email alerts may be filtered
- Verify policy assignment — Make sure the alert policy is assigned to the check
- Check webhook logs — For custom webhooks, inspect the response from your server
Too Many Alerts (Alert Fatigue)
- Change the trigger from "Immediate" to "After 2 consecutive failures"
- Enable Smart Retries to filter transient issues
- Use escalation rules to delay PagerDuty alerts
- Review and fix frequently-failing checks rather than muting them
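To see why a consecutive-failure trigger reduces noise, consider the following sketch. It illustrates the general idea only and is not supaguard's implementation; `shouldAlert` and its parameters are hypothetical.

```typescript
// results: check outcomes in chronological order (true = passed).
// Returns true only if the most recent `threshold` runs all failed.
function shouldAlert(results: boolean[], threshold: number): boolean {
  if (threshold < 1 || results.length < threshold) return false;
  return results.slice(-threshold).every((passed) => !passed);
}

// A single transient failure does not alert with a threshold of 2.
const transient = shouldAlert([true, false], 2);
// Two failures in a row do.
const sustained = shouldAlert([true, false, false], 2);
```

The trade-off is detection delay: with a threshold of 2, a real outage is reported one check interval later than with an immediate trigger.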
Still Stuck?
If none of the above solutions help:
- Review the Debugging Failures guide for detailed diagnostic steps
- Check Failure Classification to understand the severity
- Contact support at support@supaguard.com with your check ID and the error message
Next Steps
- Debugging Failures — Video, network, and trace debugging
- Writing Playwright Tests — Best practices for reliable scripts
- Smart Retries — How false alarms are eliminated
- Firewall Allowlisting — Allow monitoring traffic through your firewall