
Troubleshooting Synthetic Monitoring Checks

This guide covers how to diagnose and fix the most common problems with synthetic monitoring checks in supaguard: timeout errors, authentication failures, flaky tests, selector problems, and network errors. Each issue lists the symptom, the likely causes, and the recommended fix.

Script Errors

Element Not Found / Locator Timeout

Symptom: Timeout 30000ms exceeded waiting for locator

Common causes:

  1. The element's selector no longer matches the current DOM
  2. A popup, modal, or cookie banner is covering the element
  3. The page didn't load completely
  4. The element is off-screen and requires scrolling

Fixes:

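// Fix 1: Dismiss cookie banner first
// isVisible() returns immediately and ignores its timeout option,
// so wait up to 3 seconds explicitly for the banner to appear
const banner = page.getByRole("button", { name: "Accept Cookies" });
const bannerShown = await banner
  .waitFor({ state: "visible", timeout: 3000 })
  .then(() => true)
  .catch(() => false);
if (bannerShown) {
  await banner.click();
}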

// Fix 2: Use more resilient selectors
// Bad: fragile CSS
await page.click("div.header > nav > ul > li:nth-child(3) > a");
// Good: semantic selector
await page.getByRole("link", { name: "Dashboard" }).click();

// Fix 3: Scroll element into view
await page.getByTestId("footer-link").scrollIntoViewIfNeeded();
await page.getByTestId("footer-link").click();

→ Resilient Locators

Timeout Exceeded

Symptom: page.goto: Timeout 30000ms exceeded or navigation timeout

Common causes:

  1. The server is slow or under heavy load
  2. A large asset (image, video) is blocking the load event
  3. A third-party script is hanging (analytics, chat widget)
  4. DNS resolution is slow or failing (a hard failure usually surfaces as ERR_NAME_NOT_RESOLVED instead)

Fixes:

// Fix 1: Use domcontentloaded instead of load
await page.goto("https://app.example.com", {
  waitUntil: "domcontentloaded",
});

// Fix 2: Increase navigation timeout for known slow pages
await page.goto("https://app.example.com/reports", {
  timeout: 60000,
});

// Fix 3: Block heavy third-party resources
// Register routes before calling page.goto so they apply to the navigation
await page.route("**/analytics.js", (route) => route.abort());
await page.route("**/chat-widget.js", (route) => route.abort());
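
When several third-party scripts need blocking, one glob pattern per file can get tedious. Below is a minimal sketch of matching by hostname instead. The hosts `analytics.example.net` and `chat.example.net` are hypothetical placeholders, and `shouldBlock` is an illustrative helper, not a supaguard or Playwright API.

```typescript
// Hypothetical third-party hosts; replace with the ones your pages actually load.
const BLOCKED_HOSTS: readonly string[] = ["analytics.example.net", "chat.example.net"];

// Returns true when the request URL's host is a blocked host or a subdomain of one.
function shouldBlock(
  requestUrl: string,
  blockedHosts: readonly string[] = BLOCKED_HOSTS,
): boolean {
  let host: string;
  try {
    host = new URL(requestUrl).hostname;
  } catch {
    return false; // not a parseable absolute URL, so let it through
  }
  return blockedHosts.some(
    (blocked) => host === blocked || host.endsWith("." + blocked),
  );
}
```

Inside a check, this could be registered before navigation with `await page.route("**/*", (route) => shouldBlock(route.request().url()) ? route.abort() : route.continue());`.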

Assertion Failed

Symptom: expect(received).toContainText(expected) or similar mismatch

Common causes:

  1. The application text changed (legitimate UI update)
  2. The feature is genuinely broken (real bug)
  3. Dynamic content loaded differently (A/B test, cached version)

Diagnosis:

  1. Watch the video recording — does the page look correct?
  2. If the text changed intentionally → update the assertion
  3. If the page shows an error → this is a real application bug

Authentication Issues

Login Fails in Monitoring but Works Locally

Common causes:

  1. Test credentials were rotated or expired
  2. IP-based rate limiting blocks monitoring IP addresses
  3. MFA/2FA is enabled on the test account
  4. CAPTCHA or bot detection is blocking automated access

Fixes:

  1. Verify credentials in Organization Variables
  2. Allowlist supaguard IPs via Firewall Allowlisting
  3. Disable MFA on the dedicated test account
  4. Exempt the test account from CAPTCHA checks
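
A missing credential variable can look like a login failure. A minimal sketch of a pre-flight check follows, assuming Organization Variables reach the script as environment variables (as the `process.env.TEST_USER_EMAIL` usage elsewhere in this guide suggests). The helper names are hypothetical, and the check only detects missing or blank values, not rotated or expired ones.

```typescript
// Minimal shape of an environment map such as process.env.
type Env = Record<string, string | undefined>;

// Returns the names that are missing or blank in `env`.
function missingVariables(names: readonly string[], env: Env): string[] {
  return names.filter((name) => (env[name] ?? "").trim() === "");
}

// Throws one descriptive error listing every missing variable.
function requireVariables(names: readonly string[], env: Env): void {
  const missing = missingVariables(names, env);
  if (missing.length > 0) {
    throw new Error(`Missing test credentials: ${missing.join(", ")}`);
  }
}
```

Calling `requireVariables(["TEST_USER_EMAIL", "TEST_USER_PASSWORD"], process.env)` at the top of a check makes a configuration problem fail fast with a clear message instead of timing out on the login form.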

Session Expired During Check

Symptom: The check starts successfully but fails midway with a redirect to the login page

Fix: Log in at the start of every check. Don't rely on session persistence between runs, because each run starts with a clean browser context.

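import { test, expect } from "@playwright/test";

test("dashboard access", async ({ page }) => {
  // Always log in first: sessions don't persist between runs
  await page.goto("https://app.example.com/login");
  await page.getByLabel("Email").fill(process.env.TEST_USER_EMAIL!);
  await page.getByLabel("Password").fill(process.env.TEST_USER_PASSWORD!);
  await page.getByRole("button", { name: "Sign In" }).click();
  // Wait for the login to finish so the next goto doesn't interrupt it
  // (assumes a successful login redirects away from /login)
  await page.waitForURL((url) => !url.pathname.endsWith("/login"));

  // Now test the actual flow
  await page.goto("https://app.example.com/dashboard");
  await expect(page.getByText("Welcome")).toBeVisible();
});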

Network and Infrastructure Issues

SSL / Certificate Errors

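Symptom: ERR_CERT_DATE_INVALID or ERR_CERT_AUTHORITY_INVALID

These errors usually indicate a real problem, not a monitoring artifact. ERR_CERT_DATE_INVALID means the certificate has expired or is not yet valid. ERR_CERT_AUTHORITY_INVALID means the certificate chain is not trusted, for example because the certificate is self-signed or an intermediate certificate is missing.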

Action: Alert your DevOps team immediately. → SSL Certificate Monitoring

DNS Resolution Failure

Symptom: ERR_NAME_NOT_RESOLVED

Common causes:

  1. DNS propagation delay after a domain change
  2. DNS provider outage
  3. Typo in the URL

Fix: Verify that the URL is spelled correctly and that the hostname resolves from the monitoring region.

Firewall Blocking Monitoring Traffic

Symptom: ERR_CONNECTION_TIMED_OUT or ERR_CONNECTION_REFUSED

Fix: Add supaguard's IP addresses to your firewall allowlist. → Firewall Allowlisting
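
The error codes in this section map to a small set of next steps. As a rough sketch, that mapping could be written as a lookup on the error message. `diagnoseNetworkError` is a hypothetical helper for illustration only and is not how supaguard's Failure Classification works.

```typescript
type NetworkDiagnosis = {
  category: "certificate" | "dns" | "connection" | "unknown";
  nextStep: string;
};

// Maps a browser network error message to the guidance given in this section.
function diagnoseNetworkError(message: string): NetworkDiagnosis {
  if (message.includes("ERR_CERT_")) {
    return {
      category: "certificate",
      nextStep: "Check certificate expiry and chain, then alert your DevOps team",
    };
  }
  if (message.includes("ERR_NAME_NOT_RESOLVED")) {
    return {
      category: "dns",
      nextStep: "Verify the URL and DNS resolution from the monitoring region",
    };
  }
  if (
    message.includes("ERR_CONNECTION_TIMED_OUT") ||
    message.includes("ERR_CONNECTION_REFUSED")
  ) {
    return {
      category: "connection",
      nextStep: "Add the monitoring IP addresses to your firewall allowlist",
    };
  }
  return { category: "unknown", nextStep: "Review the full error output" };
}
```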

Flaky Tests

Check Passes Sometimes, Fails Other Times

Common causes:

  1. Timing issues: Element not ready when the assertion runs
  2. Dynamic content: Content changes between runs (ads, recommendations)
  3. Race conditions: API response arrives after the assertion
  4. Regional differences: CDN serves different content to different regions

Fixes:

// Fix 1: Wait for specific content to load
await page.getByTestId("data-table").waitFor({ state: "visible" });

// Fix 2: Use more forgiving assertions
await expect(page.getByTestId("price")).toContainText("$");
// Instead of exact match:
// await expect(page.getByTestId("price")).toHaveText("$49.99");

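// Fix 3: Wait for network to settle
// Use sparingly: Playwright discourages "networkidle" for tests.
// Prefer waiting for specific content, as in Fix 1.
await page.waitForLoadState("networkidle");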
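
A `toContainText("$")` check would also pass on text such as "$NaN" or an error message that happens to contain a dollar sign. A middle ground between that and an exact match is a format check. This is a sketch that assumes prices are rendered as US dollars with two decimals and that the element contains only the price; `PRICE_PATTERN` is an illustrative name.

```typescript
// Matches a dollar amount with two decimals, such as "$49.99" or "$1,299.00".
const PRICE_PATTERN = /^\$\d+(,\d{3})*\.\d{2}$/;

// Convenience wrapper for checking a string outside of an assertion.
function looksLikePrice(text: string): boolean {
  return PRICE_PATTERN.test(text.trim());
}
```

In a check, the pattern can be passed directly to the assertion: `await expect(page.getByTestId("price")).toHaveText(PRICE_PATTERN);`.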

Tip: If a check stays flaky after these fixes, consider whether it's testing something inherently variable. Smart Retries help by re-running from different regions, but the script itself should be deterministic.

Alert and Notification Issues

Not Receiving Alerts

  1. Check channel configuration — Verify the webhook URL, Slack channel, or email is correct
  2. Send a test notification — Use the "Test" button in Alert Policies
  3. Check spam folder — Email alerts may be filtered
  4. Verify policy assignment — Make sure the alert policy is assigned to the check
  5. Check webhook logs — For custom webhooks, inspect the response from your server

→ Configuring Alerts

Too Many Alerts (Alert Fatigue)

  1. Change the trigger from "Immediate" to "After 2 consecutive failures"
  2. Enable Smart Retries to filter transient issues
  3. Use escalation rules to delay PagerDuty alerts
  4. Review and fix frequently-failing checks rather than muting them
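
To see why a consecutive-failure trigger reduces noise, consider this minimal sketch of the rule. It only illustrates the idea and is not supaguard's implementation; `shouldAlert` and its parameters are hypothetical.

```typescript
// `results` holds run outcomes, oldest first: true = passed, false = failed.
// Returns true only when the most recent `threshold` runs all failed.
function shouldAlert(results: readonly boolean[], threshold = 2): boolean {
  if (threshold < 1 || results.length < threshold) {
    return false;
  }
  return results.slice(-threshold).every((passed) => !passed);
}
```

With a threshold of 2, a single failed run followed by a pass never alerts, while two failures in a row do. An "Immediate" trigger corresponds to a threshold of 1.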

Still Stuck?

If none of the above solutions help:

  1. Review the Debugging Failures guide for detailed diagnostic steps
  2. Check Failure Classification to understand the severity
  3. Contact support at support@supaguard.com with your check ID and the error message
