Correlating Playwright E2E Test Traces with Backend Services on AWS Using OpenTelemetry


The CI pipeline for our primary customer-facing application was becoming a source of constant friction. End-to-end tests, written in Playwright, were exhibiting unpredictable performance, with runtimes varying from five to fifteen minutes. Flaky tests were quarantined daily, but the root cause was always elusive. The typical failure analysis involved a painful, manual process: sifting through ephemeral CodeBuild logs, trying to correlate timestamps with CloudWatch logs from the backend services running on Fargate, and guessing whether the latency originated in the test runner, the browser rendering, the network, or a slow API dependency. This lack of visibility was untenable.

The initial concept was to stop treating the test runner, the browser, and the backend as separate domains. An entire E2E test run—from the npx playwright test command to the final database query on the backend—should be treated as a single, cohesive transaction. This required a unified observability fabric capable of propagating context across process and network boundaries. OpenTelemetry was the logical choice for this, providing a vendor-neutral standard for distributed tracing. The goal was to generate a single, unified trace for each Playwright test, visualizing the entire lifecycle and pinpointing the exact source of any latency or error.

Our stack was fairly standard: a Next.js frontend with Shadcn UI for componentry, a Node.js/Express backend, all containerized and deployed on AWS Fargate, with CI/CD managed by AWS CodePipeline and CodeBuild. Integrating OpenTelemetry would require instrumenting three distinct layers: the backend service, the frontend application running in the browser, and, most critically, the Playwright test runner itself. AWS X-Ray was selected as the trace backend due to its seamless integration with other AWS services like Fargate and API Gateway, minimizing operational overhead. A self-hosted Jaeger or SigNoz was considered but deferred to avoid initial infrastructure management costs.

The implementation began with the backend service, as it was the most straightforward. The OpenTelemetry SDK for Node.js offers robust auto-instrumentation for common frameworks and libraries.

// file: backend/src/tracing.js

const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { AwsXRayPropagator } = require('@opentelemetry/propagator-aws-xray');
const { AwsXRayIdGenerator } = require('@opentelemetry/id-generator-aws-xray');
const { diag, DiagConsoleLogger, DiagLogLevel } = require('@opentelemetry/api');

// For troubleshooting, raise this to DiagLogLevel.DEBUG.
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.INFO);

// The OTLPTraceExporter sends data to an OpenTelemetry Collector. In an AWS
// environment like Fargate or EC2, this typically points to the ADOT collector
// sidecar or agent endpoint.
const traceExporter = new OTLPTraceExporter({
  url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'http://localhost:4317',
});

const sdk = new NodeSDK({
  traceExporter, // NodeSDK wraps this in a BatchSpanProcessor by default.
  instrumentations: [getNodeAutoInstrumentations()],
  idGenerator: new AwsXRayIdGenerator(),
  textMapPropagator: new AwsXRayPropagator(),
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: process.env.OTEL_SERVICE_NAME || 'my-backend-service',
  }),
});

// Graceful shutdown
process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('Tracing terminated'))
    .catch((error) => console.error('Error terminating tracing', error))
    .finally(() => process.exit(0));
});

try {
  sdk.start();
  console.log('OpenTelemetry SDK started successfully for backend...');
} catch (error) {
  console.error('Error starting OpenTelemetry SDK', error);
  process.exit(1);
}

module.exports = sdk;

This tracer configuration is initialized at the very start of the application’s lifecycle. In our Express app, it’s the first import.

// file: backend/src/server.js
require('./tracing'); // Must be the first import

const express = require('express');
const app = express();
const port = process.env.PORT || 8080;

app.use(express.json());

// A sample API endpoint to be called by the frontend
app.get('/api/data', async (req, res) => {
  // In a real application, this would involve a database call or other service call,
  // which would be automatically instrumented by OpenTelemetry.
  console.log('Received request for /api/data');
  await new Promise(resolve => setTimeout(resolve, Math.random() * 200 + 50)); // Simulate work
  res.status(200).json({ id: 1, name: 'Sample Data', timestamp: new Date().toISOString() });
});

app.listen(port, () => {
  console.log(`Backend server listening at http://localhost:${port}`);
});

With this setup, any inbound HTTP request to the Express server will automatically generate a trace span, and the AWS X-Ray propagator ensures the X-Amzn-Trace-Id header is correctly handled.
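Auto-instrumentation covers the HTTP layer, but finer-grained timing of business logic requires manual spans. A minimal sketch, assuming the work inside /api/data is refactored into a helper (the file path and span name here are illustrative):

// file: backend/src/data-service.js (illustrative)
const { trace } = require('@opentelemetry/api');

// The tracer name is conventionally the instrumented module or service.
const tracer = trace.getTracer('my-backend-service');

async function loadData() {
  // startActiveSpan creates a child of the current auto-instrumented
  // HTTP span and activates it for the duration of the callback.
  return tracer.startActiveSpan('loadData', async (span) => {
    try {
      await new Promise((resolve) => setTimeout(resolve, Math.random() * 200 + 50)); // Simulate work
      return { id: 1, name: 'Sample Data', timestamp: new Date().toISOString() };
    } finally {
      span.end();
    }
  });
}

module.exports = { loadData };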

Next was instrumenting the Shadcn UI frontend. The goal here is to capture fetch requests made by the browser and ensure they carry the trace context initiated by the Playwright test. This required the OpenTelemetry Web SDK.

// file: frontend/lib/tracing.js
'use client';

import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { WebTracerProvider, BatchSpanProcessor } from '@opentelemetry/sdk-trace-web';
import { ZoneContextManager } from '@opentelemetry/context-zone';
import { Resource } from '@opentelemetry/resources';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { getWebAutoInstrumentations } from '@opentelemetry/auto-instrumentations-web';
import { B3Propagator, B3InjectEncoding } from '@opentelemetry/propagator-b3';

const provider = new WebTracerProvider({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: 'shadcn-ui-frontend',
    [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
  }),
});

// In a real-world scenario, this endpoint points to an OpenTelemetry Collector.
// It's not advisable to expose a trace exporter directly to public clients without an aggregator.
const exporter = new OTLPTraceExporter({
  url: process.env.NEXT_PUBLIC_OTEL_EXPORTER_OTLP_URL || 'http://localhost:4318/v1/traces',
});

provider.addSpanProcessor(new BatchSpanProcessor(exporter));

// We use ZoneContextManager to automatically propagate context in the browser.
provider.register({
  contextManager: new ZoneContextManager(),
  // For simplicity here we use B3, but this needs to match the backend propagator.
  // In our real setup, a custom propagator or logic handles both X-Ray and W3C formats.
  propagator: new B3Propagator({
    injectEncoding: B3InjectEncoding.MULTI_HEADER,
  }),
});

registerInstrumentations({
  instrumentations: [
    getWebAutoInstrumentations({
      // XHR instrumentation is disabled; this app issues requests via fetch.
      '@opentelemetry/instrumentation-xml-http-request': {
        enabled: false,
      },
      '@opentelemetry/instrumentation-document-load': {
        enabled: true,
      },
      '@opentelemetry/instrumentation-fetch': {
        enabled: true,
        // Prevent certain requests from being traced, for example,
        // requests to the OTel collector itself.
        ignoreUrls: [/localhost:4318/],
        clearTimingResources: true,
        propagateTraceHeaderCorsUrls: [
          new RegExp(process.env.NEXT_PUBLIC_API_ENDPOINT || 'http://localhost:8080'),
        ],
      },
    }),
  ],
});

This instrumentation is loaded in the root layout of the Next.js application, ensuring it’s active on every page.
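For completeness, that wiring looks roughly like the following; the component name is our own convention, and the dynamic import keeps the Web SDK out of the server bundle:

// file: frontend/components/telemetry-provider.jsx (illustrative)
'use client';

import { useEffect } from 'react';

export function TelemetryProvider() {
  useEffect(() => {
    // Load the tracing module only in the browser; it has no server-side role.
    import('../lib/tracing');
  }, []);
  return null;
}

// file: frontend/app/layout.jsx (illustrative)
import { TelemetryProvider } from '../components/telemetry-provider';

export default function RootLayout({ children }) {
  return (
    <html lang="en">
      <body>
        <TelemetryProvider />
        {children}
      </body>
    </html>
  );
}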

The most complex part was instrumenting the Playwright test runner itself and bridging its context with the browser’s. The test runner is a Node.js process, while the browser it controls is a separate environment. A trace initiated in the test runner must have its context (traceId, spanId) propagated into the browser so that subsequent fetch calls made by the frontend code include the correct traceparent header.

First, a dedicated tracing setup for the Playwright process was created. This is similar to the backend’s but tailored for short-lived test scripts.

// file: playwright-tests/otel-setup.js
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
const { SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base');
const { AwsXRayIdGenerator } = require('@opentelemetry/id-generator-aws-xray');
const { AwsXRayPropagator } = require('@opentelemetry/propagator-aws-xray');
const { diag, DiagConsoleLogger, DiagLogLevel } = require('@opentelemetry/api');

diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.INFO);

let sdk;

function initializeTracer(serviceName) {
  if (sdk) {
    return;
  }
  const traceExporter = new OTLPTraceExporter(); // Assumes collector is at default localhost:4317

  sdk = new NodeSDK({
    idGenerator: new AwsXRayIdGenerator(),
    textMapPropagator: new AwsXRayPropagator(),
    // SimpleSpanProcessor exports each span as it ends, which suits a
    // short-lived test process better than batching.
    spanProcessor: new SimpleSpanProcessor(traceExporter),
    resource: new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: serviceName,
    }),
  });

  sdk.start();
  console.log(`OpenTelemetry SDK started for ${serviceName}`);
  return sdk;
}

async function shutdownTracer() {
  if (sdk) {
    await sdk.shutdown();
    console.log('OpenTelemetry SDK shut down.');
    sdk = null;
  }
}

module.exports = { initializeTracer, shutdownTracer };

This utility is then used within a custom test fixture or a global setup file in Playwright to wrap test execution in spans. The key challenge is context propagation. We solved this by using page.addInitScript. This Playwright function injects a script that runs in the browser context before any page scripts. We used it to seed the browser’s window object with the trace context from the Node.js test runner process.

// file: playwright-tests/trace-fixture.js
const base = require('@playwright/test');
const { trace, context, SpanStatusCode } = require('@opentelemetry/api');
const { initializeTracer, shutdownTracer } = require('./otel-setup');

// Initialize the tracer once for the entire test run.
initializeTracer('playwright-test-runner');
process.on('beforeExit', shutdownTracer); // 'exit' handlers cannot await async work; a Playwright globalTeardown would be more robust.

// A shared tracer instance for all tests.
const tracer = trace.getTracer('playwright-test-tracer');

exports.test = base.test.extend({
  page: async ({ page }, use, testInfo) => {
    // Create a root span for the entire test case.
    const testSpan = tracer.startSpan(`E2E Test: ${testInfo.title}`);

    // Activate the context for this span. All child spans will be nested under it.
    await context.with(trace.setSpan(context.active(), testSpan), async () => {
      // The core of the solution: propagate the current active span's context
      // into the browser's window object before the page loads.
      const spanContext = testSpan.spanContext();
      const traceParent = `00-${spanContext.traceId}-${spanContext.spanId}-01`;

      await page.addInitScript(traceParentForBrowser => {
        window.__TRACE_PARENT__ = traceParentForBrowser;
      }, traceParent);

      // The frontend's fetch instrumentation must pick up this context.
      // This is a critical piece of glue: the frontend `tracing.js` needs to
      // read `window.__TRACE_PARENT__` and use it as the parent for its spans
      // (see the sketch just after this fixture).
      
      // Wrap specific actions in child spans for granular timing.
      const originalClick = page.click.bind(page);
      page.click = async (selector, options) => {
        const actionSpan = tracer.startSpan(`click: ${selector}`);
        try {
          return await originalClick(selector, options);
        } finally {
          actionSpan.end(); // End the span even if the action throws.
        }
      };

      const originalFill = page.fill.bind(page);
      page.fill = async (selector, value, options) => {
        const actionSpan = tracer.startSpan(`fill: ${selector}`);
        try {
          return await originalFill(selector, value, options);
        } finally {
          actionSpan.end();
        }
      };

      // Run the actual test logic.
      await use(page);

      // End the root test span with a status derived from the test result.
      testSpan.setStatus({
        code: testInfo.status === 'passed' ? SpanStatusCode.OK : SpanStatusCode.ERROR,
      });
      testSpan.setAttribute('test.status', testInfo.status);
      testSpan.setAttribute('test.duration', testInfo.duration);
      testSpan.end();
    });
  },
});
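The frontend half of that glue, referenced in the fixture's comments, could look something like the following. This is a sketch under our own conventions (the __TRACE_PARENT__ global and the page-session span name are ours, not an OTel standard): the injected traceparent is parsed with the W3C propagator and used as the parent context for a page-level span.

// file: frontend/lib/runner-context.js (illustrative)
import { context, defaultTextMapGetter } from '@opentelemetry/api';
import { W3CTraceContextPropagator } from '@opentelemetry/core';

const w3cPropagator = new W3CTraceContextPropagator();

// Extract the context seeded by Playwright's addInitScript, if present.
export function getRunnerParentContext() {
  if (typeof window === 'undefined' || !window.__TRACE_PARENT__) {
    return context.active();
  }
  const carrier = { traceparent: window.__TRACE_PARENT__ };
  return w3cPropagator.extract(context.active(), carrier, defaultTextMapGetter);
}

// Start a page-level span parented to the test runner's trace. With the
// ZoneContextManager registered, spans created while this context is
// active (including fetch spans) become descendants of the test span.
export function startPageSession(tracer) {
  return tracer.startSpan('page-session', undefined, getRunnerParentContext());
}

A call such as startPageSession(provider.getTracer('frontend-tracer')) early in page bootstrap ties the browser's spans to the runner's trace.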

Here’s an example of a test using this fixture:

// file: playwright-tests/example.spec.js
const { test } = require('./trace-fixture');
const { expect } = require('@playwright/test');

test('should load data from backend after button click', async ({ page }) => {
  await page.goto('http://localhost:3000');

  // The custom `page.click` method from our fixture will create a span for this action.
  await page.click('button:has-text("Fetch Data")');

  const dataElement = page.locator('#data-container');
  await expect(dataElement).toContainText('Sample Data');
});

This entire setup was integrated into an AWS CodeBuild buildspec.yml. The build process now includes a stage to run an OpenTelemetry Collector, which receives traces from both the Playwright test runner and the backend service, then forwards them to AWS X-Ray.

# file: buildspec.yml
version: 0.2

phases:
  install:
    runtime-versions:
      nodejs: 18
  pre_build:
    commands:
      - echo "Starting OpenTelemetry Collector..."
      # In a real scenario, you'd pull the otel-collector-contrib image and run it
      # with a config file pointing at the AWS X-Ray exporter (sketched below).
      # Note: the CodeBuild project must have privileged mode enabled to run Docker.
      - docker run -d -p 4317:4317 -p 4318:4318 --name otel-collector otel/opentelemetry-collector-contrib:latest
      - export OTEL_EXPORTER_OTLP_ENDPOINT="grpc://localhost:4317"
      - export OTEL_SERVICE_NAME="e2e-pipeline"
  build:
    commands:
      - echo "Building backend..."
      - (cd backend && npm install && npm run build)
      - echo "Building frontend..."
      - (cd frontend && npm install && npm run build)
      - echo "Starting services (in background)..."
      - (cd backend && node dist/server.js) &
      - (cd frontend && npm start) &
      - sleep 10 # Wait for services to be ready
      - echo "Running Playwright tests..."
      - (cd playwright-tests && npm install && npx playwright install --with-deps && npx playwright test)
  post_build:
    commands:
      - echo "Stopping OpenTelemetry Collector..."
      - docker stop otel-collector
      - docker rm otel-collector
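The docker run above omits the collector configuration for brevity. A minimal config for the OTLP-in, X-Ray-out path might look like the following (the file name and region are assumptions; mount it into the container and pass it via --config):

# file: otel-collector-config.yaml (illustrative)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}

exporters:
  awsxray:
    region: us-east-1  # assumption: set to your deployment region

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [awsxray]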

The final result was transformative. We could now visualize an entire E2E test failure in AWS X-Ray’s trace map.

graph TD
    A[E2E Test: login.spec.ts] --> B(click: #login-button);
    A --> C(fill: #username);
    B --> D{fetch /api/auth};
    D --> E[Backend: POST /api/auth];
    E --> F[Auth Service DB Query];

A trace would clearly show a long duration on the “Auth Service DB Query” span, immediately identifying a slow database as the cause of a test timeout. Conversely, if there was a large gap between the “click: #login-button” span and the “fetch /api/auth” span, it pointed directly to a client-side performance issue in our Shadcn UI application—perhaps an expensive re-render or a blocking script. The ambiguity was gone, replaced by actionable data. The mean time to resolve flaky tests dropped from hours to minutes.

This solution is not without its limitations. The context propagation from the Playwright Node.js process into the browser via addInitScript and a global window variable is a workaround; it’s functional but feels brittle. A more robust solution would involve deeper integration with Playwright’s APIs or a standardized way for test runners to propagate W3C Trace Context to the browser environment they control. Furthermore, we are currently tracing every test run, which can be cost-prohibitive at scale. A future iteration must implement a more intelligent sampling strategy, perhaps using a tail-based sampler in the OpenTelemetry Collector to only store traces for failed or excessively slow test runs, thus balancing visibility with cost.
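As a sketch of that direction, the contrib collector's tail_sampling processor can keep only error traces and unusually slow ones; the thresholds below are illustrative:

# file: otel-collector-config.yaml (tail-sampling excerpt, illustrative)
processors:
  tail_sampling:
    decision_wait: 30s  # how long to buffer a trace before deciding
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: keep-slow-tests
        type: latency
        latency:
          threshold_ms: 5000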

