Architecting a WebRTC Client with Dynamic Stream Permissions Driven by AWS IAM Policies

The initial architecture for our internal broadcasting platform relied on a central backend service to authorize every single WebRTC connection, creating a significant performance bottleneck and a single point of failure. Each time a client attempted to view a media stream, our monolith had to validate the user’s session, query a permissions database, and proxy the connection. This design failed to scale during all-hands meetings, where thousands of clients connected simultaneously. The core problem was clear: our application logic was trying to reinvent a robust, scalable permissions system, and the stateful nature of the proxy introduced unacceptable latency.

Our first major pivot was to offload authorization entirely. The concept was to transform our backend from a gatekeeper into a trusted broker. Instead of managing permissions itself, it would ask a dedicated identity and access management system to vend short-lived, narrowly scoped credentials, which it then hands to the client. The client uses these credentials to connect directly to the media source. This decentralizes the connection logic and leverages battle-tested infrastructure for authorization. AWS IAM, combined with the Security Token Service (STS), was the natural choice, especially since our media streams were already hosted on Amazon Kinesis Video Streams.

This architectural shift, however, moved significant complexity to the front-end. The client was no longer a dumb terminal; it was now responsible for managing temporary credentials, handling authorization failures on a per-stream basis, and reflecting permission states in the UI. This required a rigorous approach to front-end engineering, treating the client-side codebase with the same discipline as a backend service: robust unit testing with Jest, proactive security and style enforcement with ESLint, and a resilient styling system with PostCSS to handle the dynamic nature of the user interface.

The Credential Brokering Backend

The first piece of the puzzle is a backend service whose sole responsibility is to exchange a user’s long-lived application session for short-lived AWS credentials. This service acts as an STS broker. It authenticates the user, determines their entitlements (e.g., which department they belong to), constructs an appropriate IAM policy in memory, and then calls sts:AssumeRole to generate temporary credentials restricted by that policy.

In a real-world project, this endpoint must be secured behind your primary authentication system. For this implementation, we’ll simulate this with a simple Express server that accepts a userId to determine permissions.
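As a rough illustration of what sitting behind the primary authentication system could look like, the sketch below resolves the user from a server-side session instead of trusting a userId sent in the request body. The `sessionStore` map, the `x-session-id` header, and the `requireSession` name are assumptions made for this example; a real deployment would plug in its existing cookie or token verification.

```javascript
// Hypothetical in-memory session store: session ID -> user ID.
// A real system would verify a signed cookie or token with its auth provider.
const sessionStore = new Map([['sess-abc', 'user-eng-123']]);

// Express-style middleware factory. It resolves the caller's identity from the
// session header and attaches it to req.userId, or rejects the request with 401.
function requireSession(store) {
  return (req, res, next) => {
    const sessionId = req.headers['x-session-id'];
    const userId = sessionId ? store.get(sessionId) : undefined;
    if (!userId) {
      res.status(401).json({ error: 'Unauthorized' });
      return;
    }
    req.userId = userId; // downstream handlers read this, never req.body.userId
    next();
  };
}
```

With Express, this would be mounted as `app.post('/api/v1/sts/credentials', requireSession(sessionStore), handler)`, and the handler would read `req.userId`.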

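Here is the core of the STS broker. Note the construction of the IAM policy document: it grants only the Kinesis Video signaling actions the client needs (kinesisvideo:GetSignalingChannelEndpoint, kinesisvideo:GetIceServerConfig, and a Connect action), and only on a specific, dynamically determined list of ARNs. This is the principle of least privilege in action. One caveat when adapting the example: these WebRTC actions are authorized against Kinesis Video signaling channel resources, whose ARNs use a channel/ prefix rather than stream/, so a real entitlement list would hold channel ARNs.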

// src/sts-broker/server.js
const express = require('express');
const { STSClient, AssumeRoleCommand } = require('@aws-sdk/client-sts');
const { fromNodeProviderChain } = require('@aws-sdk/credential-providers');

const app = express();
app.use(express.json());

const PORT = process.env.PORT || 3001;
const AWS_REGION = process.env.AWS_REGION || 'us-east-1';
// ARN of the IAM role the broker assumes on behalf of each user. The credentials
// vended to clients belong to scoped-down sessions of this role.
// Its trust policy MUST allow the broker's own identity (e.g., its EC2 instance
// profile or ECS task role) to assume it, and that identity must be permitted
// to call sts:AssumeRole on this ARN.
const BROKER_ROLE_TO_ASSUME_ARN = process.env.BROKER_ROLE_TO_ASSUME_ARN;
if (!BROKER_ROLE_TO_ASSUME_ARN) {
  throw new Error('BROKER_ROLE_TO_ASSUME_ARN must be set');
}

// A mock database of users and their corresponding stream entitlements.
const userEntitlements = {
  'user-eng-123': [
    'arn:aws:kinesisvideo:us-east-1:123456789012:stream/engineering-all-hands/1668789300',
    'arn:aws:kinesisvideo:us-east-1:123456789012:stream/project-phoenix-standup/1668789350',
  ],
  'user-sales-456': [
    'arn:aws:kinesisvideo:us-east-1:123456789012:stream/sales-q4-kickoff/1668789400',
  ],
  'user-guest-789': [], // This user has no entitlements
};

const stsClient = new STSClient({
    region: AWS_REGION,
    // In a production environment, the credentials for the STS broker itself
    // should be sourced securely, e.g., from an EC2 instance profile or ECS task role.
    // fromNodeProviderChain() resolves them through the SDK's default Node.js provider chain.
    credentials: fromNodeProviderChain(),
});

// Middleware for basic logging
app.use((req, res, next) => {
    console.log(`[${new Date().toISOString()}] ${req.method} ${req.url}`);
    next();
});

app.post('/api/v1/sts/credentials', async (req, res) => {
  const { userId } = req.body;

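  // Require a string key and avoid calling hasOwnProperty on the object itself
  // (flagged by ESLint's no-prototype-builtins under eslint:recommended).
  if (typeof userId !== 'string' || !Object.prototype.hasOwnProperty.call(userEntitlements, userId)) {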
    console.error(`Authentication failed: Invalid or missing userId: ${userId}`);
    return res.status(401).json({ error: 'Unauthorized: Invalid user identifier.' });
  }

  const allowedStreamARNs = userEntitlements[userId];
  const sessionName = `web-webrtc-session-${userId}-${Date.now()}`;

  // Dynamically generate the session policy from the user's entitlements.
  // A user with no entitlements gets a placeholder ARN that matches no real
  // resource (rather than an empty Resource list), effectively denying access
  // to all streams.
  const policyDocument = {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Action: [
          'kinesisvideo:GetSignalingChannelEndpoint',
          'kinesisvideo:GetIceServerConfig',
          // The browser client joins with Role: 'VIEWER'; a broadcaster would need ConnectAsMaster instead.
          'kinesisvideo:ConnectAsViewer',
        ],
        Resource: allowedStreamARNs.length > 0 ? allowedStreamARNs : ['arn:aws:kinesisvideo:*:*:stream/null/0'], // Grant nothing if no entitlements
      },
    ],
  };

  const command = new AssumeRoleCommand({
    RoleArn: BROKER_ROLE_TO_ASSUME_ARN,
    RoleSessionName: sessionName,
    Policy: JSON.stringify(policyDocument),
    DurationSeconds: 3600, // 1 hour lifetime for credentials
  });

  try {
    const { Credentials } = await stsClient.send(command);
    console.log(`Successfully vended credentials for session: ${sessionName}`);
    res.json({
      accessKeyId: Credentials.AccessKeyId,
      secretAccessKey: Credentials.SecretAccessKey,
      sessionToken: Credentials.SessionToken,
      expiration: Credentials.Expiration,
    });
  } catch (err) {
    console.error(`STS AssumeRole failed for user ${userId}:`, err);
    // This is a critical server-side error. Avoid leaking detailed AWS errors to the client.
    res.status(500).json({ error: 'Internal server error: Could not retrieve temporary credentials.' });
  }
});

// Generic error handler
app.use((err, req, res, next) => {
    console.error(err.stack);
    res.status(500).send('Something broke!');
});


app.listen(PORT, () => {
  console.log(`STS Broker listening on port ${PORT}`);
});

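The critical part here is the Policy parameter in the AssumeRoleCommand. We are not just assuming a role; we are attaching an “inline session policy” that further restricts the permissions of the assumed role for this specific session only. The session’s effective permissions are the intersection of the role’s own policies and this session policy, so the session policy can narrow access but never widen it. This ensures that even if the BROKER_ROLE_TO_ASSUME_ARN has broad permissions, the credentials vended to the client are scoped down to the bare minimum required. One practical limit to keep in mind: the plaintext of an inline session policy may not exceed 2,048 characters, which caps how many ARNs a single policy can list.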
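The policy construction can also be pulled out into a small pure function, which makes it easy to test and gives one place to enforce the 2,048-character plaintext limit that STS applies to inline session policies. The sketch below is one possible shape; `buildSessionPolicy` and its choice to return null for users without entitlements are assumptions of this example, not part of the broker shown above.

```javascript
// STS limits the plaintext of an inline session policy to 2,048 characters.
const MAX_SESSION_POLICY_CHARS = 2048;

// The signaling actions a viewer needs; nothing else is granted.
const VIEWER_ACTIONS = [
  'kinesisvideo:GetSignalingChannelEndpoint',
  'kinesisvideo:GetIceServerConfig',
  'kinesisvideo:ConnectAsViewer',
];

// Returns the policy as a JSON string, or null when there are no entitlements
// (the caller can then answer 403 instead of vending useless credentials).
function buildSessionPolicy(resourceArns) {
  if (!Array.isArray(resourceArns) || resourceArns.length === 0) return null;

  const policy = JSON.stringify({
    Version: '2012-10-17',
    Statement: [{ Effect: 'Allow', Action: VIEWER_ACTIONS, Resource: resourceArns }],
  });

  if (policy.length > MAX_SESSION_POLICY_CHARS) {
    throw new RangeError(
      `Session policy is ${policy.length} characters; the limit is ${MAX_SESSION_POLICY_CHARS}.`
    );
  }
  return policy;
}
```

The returned string would be passed as the Policy parameter of AssumeRoleCommand. Users entitled to many resources would hit the size limit, at which point wildcard ARNs or a naming convention per department become worth considering.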

Client-Side Orchestration and State Management

The front-end now needs to manage the entire lifecycle: fetch credentials, instantiate the AWS Kinesis Video client, attempt to connect to multiple streams, and handle both success and failure states gracefully for each stream independently.

We’ll encapsulate this logic in a service class. This makes it testable and decouples it from the UI framework.

// src/services/WebRTCAuthClient.js
import { KinesisVideoClient, GetSignalingChannelEndpointCommand } from '@aws-sdk/client-kinesis-video';
import { KinesisVideoSignalingClient, GetIceServerConfigCommand } from '@aws-sdk/client-kinesis-video-signaling';

// A simple event emitter to notify the UI of state changes.
// In a real app, this would be a more robust state management solution (Redux, MobX, etc.).
class EventEmitter {
  constructor() {
    this.callbacks = {};
  }
  on(event, cb) {
    if (!this.callbacks[event]) this.callbacks[event] = [];
    this.callbacks[event].push(cb);
  }
  emit(event, data) {
    const cbs = this.callbacks[event];
    if (cbs) {
      cbs.forEach(cb => cb(data));
    }
  }
}

// Error names the SDK reports when IAM (or the session policy) rejects a call.
const AUTH_ERROR_NAMES = ['AccessDeniedException', 'NotAuthorizedException'];

export class WebRTCAuthClient extends EventEmitter {
  constructor(userId, brokerUrl, region = 'us-east-1') {
    super();
    this.userId = userId;
    this.brokerUrl = brokerUrl;
    this.region = region;
    this.credentials = null;
    this.kinesisVideoClient = null;
    // Map<streamArn, { status: 'pending' | 'authorized' | 'denied', peerConnection?: RTCPeerConnection, error?: Error }>
    this.streamStates = new Map();
  }

  async initialize() {
    try {
      this.emit('auth:pending');
      const response = await fetch(`${this.brokerUrl}/api/v1/sts/credentials`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ userId: this.userId }),
      });

      if (!response.ok) {
        // The error body may not be JSON (e.g., a proxy error page), so parse defensively.
        const errorBody = await response.json().catch(() => ({}));
        throw new Error(errorBody.error || `HTTP error! status: ${response.status}`);
      }

      const creds = await response.json();
      this.credentials = {
        accessKeyId: creds.accessKeyId,
        secretAccessKey: creds.secretAccessKey,
        sessionToken: creds.sessionToken,
      };

      this.kinesisVideoClient = new KinesisVideoClient({
        region: this.region,
        credentials: this.credentials,
      });
      this.emit('auth:success');
      console.log('Successfully initialized with temporary credentials.');
    } catch (error) {
      this.emit('auth:failed', error);
      console.error('Failed to initialize WebRTCAuthClient:', error);
      throw error;
    }
  }

  async connectToStream(streamArn) {
    if (!this.kinesisVideoClient) {
      console.error('Client not initialized. Call initialize() first.');
      this.emit('stream:status', { streamArn, status: 'denied', error: new Error('Client not initialized') });
      return;
    }

    this.streamStates.set(streamArn, { status: 'pending', peerConnection: null });
    this.emit('stream:status', { streamArn, status: 'pending' });

    try {
      // The bare v3 client exposes only send(); each operation is a Command object.
      const endpointResponse = await this.kinesisVideoClient.send(
        new GetSignalingChannelEndpointCommand({
          ChannelARN: streamArn,
          SingleMasterChannelEndpointConfiguration: {
            Protocols: ['WSS', 'HTTPS'],
            Role: 'VIEWER',
          },
        })
      );

      // Index the endpoints by protocol. HTTPS serves the signaling REST API
      // (GetIceServerConfig); WSS carries the signaling WebSocket itself.
      const endpoints = Object.fromEntries(
        (endpointResponse.ResourceEndpointList ?? []).map(e => [e.Protocol, e.ResourceEndpoint])
      );
      if (!endpoints.HTTPS || !endpoints.WSS) {
        throw new Error(`Missing signaling endpoint for stream: ${streamArn}`);
      }

      const signalingClient = new KinesisVideoSignalingClient({
        region: this.region,
        credentials: this.credentials,
        endpoint: endpoints.HTTPS,
      });

      const iceServerConfigResponse = await signalingClient.send(new GetIceServerConfigCommand({ ChannelARN: streamArn }));

      // ... WebRTC peer connection setup logic would go here ...
      // This is where you would use iceServerConfigResponse.IceServerList and endpoints.WSS.
      // For this example, we simulate success based on the API calls above succeeding.
      // A real implementation would involve SDP offer/answer exchange.

      this.streamStates.set(streamArn, { status: 'authorized', peerConnection: {} /* a real RTCPeerConnection */ });
      this.emit('stream:status', { streamArn, status: 'authorized' });
      console.log(`Authorization successful for stream: ${streamArn}`);

    } catch (error) {
      // This is the critical part. IAM denials manifest as exceptions here, and
      // AWS SDK v3 errors carry a `name` property. Other failures (network issues,
      // resource not found) also leave the stream unavailable, so both cases map
      // to 'denied'; only the log level differs.
      this.streamStates.set(streamArn, { status: 'denied', error });
      this.emit('stream:status', { streamArn, status: 'denied', error });
      if (AUTH_ERROR_NAMES.includes(error.name)) {
        console.warn(`Authorization denied for stream: ${streamArn}`);
      } else {
        console.error(`An unexpected error occurred for stream ${streamArn}:`, error);
      }
    }
  }
}
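To show how the per-stream independence plays out in practice, here is a small helper that connects to several streams concurrently and collects one outcome per stream. It is a sketch: `connectAll` is a name invented for this example, and it assumes only that the client offers `connectToStream(arn)` and a `streamStates` Map, as the class above does.

```javascript
// Connects to every stream concurrently and reports a per-stream outcome.
// `client` is anything with connectToStream(arn) and a streamStates Map.
async function connectAll(client, streamArns) {
  // allSettled keeps one stream's failure from short-circuiting the others.
  const results = await Promise.allSettled(
    streamArns.map(arn => client.connectToStream(arn))
  );

  const summary = {};
  streamArns.forEach((arn, i) => {
    const state = client.streamStates.get(arn);
    // Fail closed: an unexpected rejection or a missing state counts as 'denied'.
    summary[arn] = results[i].status === 'fulfilled' && state ? state.status : 'denied';
  });
  return summary;
}
```

A caller would run `await client.initialize()` first and then `await connectAll(client, arns)`; the UI can keep reacting to the individual 'stream:status' events while the summary is useful for logging or metrics.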

Rigorous Unit Testing with Jest

The logic in WebRTCAuthClient is complex and security-sensitive. It must be thoroughly tested. A common mistake is to neglect testing failure paths. In our case, the AccessDeniedException path is just as important as the success path. We use Jest and its mocking capabilities to simulate the entire credential-fetching and connection lifecycle without making real network requests.

// src/services/WebRTCAuthClient.test.js
import { WebRTCAuthClient } from './WebRTCAuthClient';
import { KinesisVideoClient, GetSignalingChannelEndpointCommand } from '@aws-sdk/client-kinesis-video';
import { KinesisVideoSignalingClient } from '@aws-sdk/client-kinesis-video-signaling';

// Mock the AWS SDK clients.
// This is crucial for isolating our code from external services during tests.
jest.mock('@aws-sdk/client-kinesis-video');
jest.mock('@aws-sdk/client-kinesis-video-signaling');

// Mock `fetch` globally for all tests in this file.
global.fetch = jest.fn();

describe('WebRTCAuthClient', () => {
  const mockBrokerUrl = 'http://localhost:3001';
  const mockUserId = 'user-eng-123';
  const authorizedStreamArn = 'arn:aws:kinesisvideo:us-east-1:123456789012:stream/engineering-all-hands/1668789300';
  const deniedStreamArn = 'arn:aws:kinesisvideo:us-east-1:123456789012:stream/sales-q4-kickoff/1668789400';

  let client;

  beforeEach(() => {
    // Reset mocks before each test to ensure isolation.
    fetch.mockClear();
    KinesisVideoClient.mockClear();
    KinesisVideoSignalingClient.mockClear();
    GetSignalingChannelEndpointCommand.mockClear();

    // Mock the `send` method that all command executions go through.
    KinesisVideoClient.prototype.send = jest.fn();
    KinesisVideoSignalingClient.prototype.send = jest.fn();

    client = new WebRTCAuthClient(mockUserId, mockBrokerUrl);
  });

  describe('Initialization', () => {
    it('should fetch credentials and initialize clients on successful initialization', async () => {
      const mockCredentials = {
        accessKeyId: 'ASIA...',
        secretAccessKey: 'SECRET...',
        sessionToken: 'TOKEN...',
      };
      fetch.mockResolvedValueOnce({
        ok: true,
        json: async () => mockCredentials,
      });

      await client.initialize();

      expect(fetch).toHaveBeenCalledWith(`${mockBrokerUrl}/api/v1/sts/credentials`, expect.any(Object));
      expect(client.credentials).toEqual({
        accessKeyId: mockCredentials.accessKeyId,
        secretAccessKey: mockCredentials.secretAccessKey,
        sessionToken: mockCredentials.sessionToken,
      });
      expect(KinesisVideoClient).toHaveBeenCalledWith(expect.objectContaining({
        credentials: client.credentials,
      }));
    });

    it('should throw an error if the credential broker returns a non-ok response', async () => {
      fetch.mockResolvedValueOnce({
        ok: false,
        status: 401,
        json: async () => ({ error: 'Unauthorized' }),
      });

      await expect(client.initialize()).rejects.toThrow('Unauthorized');
    });
  });

  describe('Stream Connection', () => {
    beforeEach(async () => {
      // Successfully initialize the client for connection tests.
      fetch.mockResolvedValueOnce({
        ok: true,
        json: async () => ({ accessKeyId: 'id', secretAccessKey: 'secret', sessionToken: 'token' }),
      });
      await client.initialize();
    });

    it('should successfully connect to an authorized stream', async () => {
      // Mock a successful response from Kinesis Video APIs.
      KinesisVideoClient.prototype.send.mockResolvedValue({
          ResourceEndpointList: [
            { Protocol: 'WSS', ResourceEndpoint: 'wss://kinesis.video' },
            { Protocol: 'HTTPS', ResourceEndpoint: 'https://kinesis.video' },
          ]
      });
      KinesisVideoSignalingClient.prototype.send.mockResolvedValue({
          IceServerList: []
      });

      const statusHandler = jest.fn();
      client.on('stream:status', statusHandler);

      await client.connectToStream(authorizedStreamArn);

      expect(statusHandler).toHaveBeenCalledWith({ streamArn: authorizedStreamArn, status: 'pending' });
      expect(GetSignalingChannelEndpointCommand).toHaveBeenCalledWith(expect.objectContaining({ ChannelARN: authorizedStreamArn }));
      // GetIceServerConfig must be sent to the HTTPS endpoint, not the WebSocket one.
      expect(KinesisVideoSignalingClient).toHaveBeenCalledWith(expect.objectContaining({ endpoint: 'https://kinesis.video' }));
      expect(statusHandler).toHaveBeenCalledWith({ streamArn: authorizedStreamArn, status: 'authorized' });
      expect(client.streamStates.get(authorizedStreamArn).status).toBe('authorized');
    });

    it('should handle AccessDeniedException when connecting to an unauthorized stream', async () => {
      // Mock the AWS SDK throwing a specific, named error. This is key.
      const accessDeniedError = new Error('User is not authorized');
      accessDeniedError.name = 'AccessDeniedException';
      KinesisVideoClient.prototype.send.mockRejectedValue(accessDeniedError);

      const statusHandler = jest.fn();
      client.on('stream:status', statusHandler);

      await client.connectToStream(deniedStreamArn);

      expect(statusHandler).toHaveBeenCalledWith({ streamArn: deniedStreamArn, status: 'pending' });
      expect(statusHandler).toHaveBeenCalledWith({ streamArn: deniedStreamArn, status: 'denied', error: accessDeniedError });
      expect(client.streamStates.get(deniedStreamArn).status).toBe('denied');
    });
  });
});

Enforcing Security with Custom ESLint Rules

A common vulnerability in this pattern is accidentally hardcoding sensitive information, like ARNs, in the client-side code. While our current architecture fetches them dynamically, a future developer might mistakenly hardcode one for testing and forget to remove it. We can proactively prevent this entire class of error with a custom ESLint rule.

This rule will scan for string literals that match the AWS ARN pattern for Kinesis Video Streams and flag them.

First, the rule definition:

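// .eslint-rules/no-hardcoded-kinesis-arns.js

// Matches Kinesis Video ARNs for streams and for WebRTC signaling channels,
// which use a channel/ prefix instead of stream/.
const KINESIS_ARN_PATTERN = /arn:aws:kinesisvideo:[\w-]+:\d{12}:(?:stream|channel)\/.+/;
const MESSAGE = 'Do not hardcode Kinesis Video Stream ARNs. Fetch them from a configuration or service.';

module.exports = {
  meta: {
    type: 'problem',
    docs: {
      description: 'Disallow hardcoding AWS Kinesis Video Stream ARNs in the codebase.',
      category: 'Security',
      recommended: true,
    },
    fixable: null,
    schema: [], // no options
  },
  create(context) {
    // Reports the node if the given text contains a Kinesis Video ARN.
    const check = (node, value) => {
      if (typeof value === 'string' && KINESIS_ARN_PATTERN.test(value)) {
        context.report({ node, message: MESSAGE });
      }
    };
    return {
      Literal(node) {
        check(node, node.value);
      },
      // Template literals keep their text in TemplateElement nodes, not Literal nodes.
      TemplateElement(node) {
        check(node, node.value.cooked);
      },
    };
  },
};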

To enable this rule, we update our ESLint configuration:

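// .eslintrc.js
// eslint-plugin-rulesdir is pointed at the rules directory programmatically,
// so the configuration has to be a JavaScript file rather than JSON.
const rulesDirPlugin = require('eslint-plugin-rulesdir');
rulesDirPlugin.RULES_DIR = '.eslint-rules';

module.exports = {
  root: true,
  plugins: [
    'rulesdir',
    // ... other plugins
  ],
  extends: [
    'eslint:recommended',
    // ... other extends
  ],
  parserOptions: {
    ecmaVersion: 'latest',
    sourceType: 'module',
  },
  rules: {
    // Rules loaded by eslint-plugin-rulesdir are namespaced under "rulesdir/".
    'rulesdir/no-hardcoded-kinesis-arns': 'error',
  },
  overrides: [
    {
      // Test fixtures legitimately contain sample ARNs.
      files: ['**/*.test.js'],
      rules: { 'rulesdir/no-hardcoded-kinesis-arns': 'off' },
    },
  ],
};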

We also need the eslint-plugin-rulesdir package to load our local rule. Now, if a developer writes const stream = 'arn:aws:kinesisvideo:us-east-1:123456789012:stream/test-stream/123', ESLint will immediately flag it as an error during development and in the CI pipeline, preventing it from ever reaching production.
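Before wiring the rule into CI, it can help to sanity-check the regular expression on its own. The snippet below mirrors the rule's pattern, widened (as an assumption of this example) to also cover `channel/` ARNs used by signaling channels; `isHardcodedArn` is a helper name made up for the illustration.

```javascript
// Pattern modelled on the lint rule, widened to cover channel/ ARNs as well.
const KINESIS_ARN_PATTERN = /arn:aws:kinesisvideo:[\w-]+:\d{12}:(?:stream|channel)\/.+/;

// True when the value is a string containing a Kinesis Video ARN.
function isHardcodedArn(value) {
  return typeof value === 'string' && KINESIS_ARN_PATTERN.test(value);
}

const samples = [
  'arn:aws:kinesisvideo:us-east-1:123456789012:stream/test-stream/123', // flagged
  'arn:aws:kinesisvideo:eu-west-1:123456789012:channel/demo/1',         // flagged
  'arn:aws:s3:::my-bucket',                                             // different service
  'arn:aws:kinesisvideo:us-east-1:1234:stream/x/1',                     // account ID is not 12 digits
];

for (const sample of samples) {
  console.log(isHardcodedArn(sample) ? 'FLAGGED ' : 'ok      ', sample);
}
```

One limitation is worth knowing: a check that looks at individual literals cannot see an ARN assembled by concatenation, such as `'arn:aws:kinesisvideo:' + region + ...`, because no single fragment matches the pattern.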

Reflecting Authorization State with PostCSS

The UI must visually communicate a user’s access level for each stream. A stream they can’t access should be clearly disabled. We can manage these visual states elegantly using PostCSS and data attributes.

Our component might render a list of video containers like this:

<div class="video-grid">
  <div class="video-container" data-stream-arn="..." data-auth-status="pending">
    <!-- Spinner placeholder -->
  </div>
  <div class="video-container" data-stream-arn="..." data-auth-status="authorized">
    <video></video>
  </div>
  <div class="video-container" data-stream-arn="..." data-auth-status="denied">
    <div class="denied-overlay">Access Denied</div>
  </div>
</div>

Our JavaScript logic, listening to events from WebRTCAuthClient, will update the data-auth-status attribute. PostCSS then lets us style each state with attribute selectors nested under a single .video-container rule, instead of repeating long, flat selectors. Nesting support comes from a plugin such as postcss-nesting or the nesting-rules feature of postcss-preset-env.

/* src/styles/components/video-container.css */

.video-container {
  position: relative;
  background-color: #000;
  border: 1px solid #333;
  aspect-ratio: 16 / 9;

  /* Pending state styles */
  &[data-auth-status='pending']::after {
    content: 'Connecting...';
    position: absolute;
    inset: 0;
    display: flex;
    align-items: center;
    justify-content: center;
    background: rgba(0, 0, 0, 0.7);
    color: #fff;
  }

  /* Denied state styles */
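  &[data-auth-status='denied'] {
    cursor: not-allowed;

    /* Filter the media only: a filter on the container would blur the overlay text too. */
    & > :not(.denied-overlay) {
      filter: grayscale(1) blur(4px);
    }
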
    .denied-overlay {
      position: absolute;
      inset: 0;
      display: flex;
      align-items: center;
      justify-content: center;
      background: rgba(255, 0, 0, 0.3);
      color: #fff;
      font-weight: bold;
      text-transform: uppercase;
    }
  }
  
  /* Authorized state */
  &[data-auth-status='authorized'] {
    border-color: #0c0;
  }
}
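The JavaScript side of this contract is small. As a sketch of how the 'stream:status' events could drive the data-auth-status attribute, the helper below maps each container to its ARN once and then updates the attribute on every event. `bindStreamStatus` is a name invented here, and the function assumes only that the client has an `on(event, callback)` method and that each element exposes `dataset`.

```javascript
// Wires 'stream:status' events to the data-auth-status attribute.
// `containers` is any iterable of elements carrying data-stream-arn,
// for example document.querySelectorAll('.video-container').
function bindStreamStatus(client, containers) {
  // ARNs contain ':' and '/', so look elements up in a Map rather than
  // building attribute selectors that would need escaping.
  const byArn = new Map();
  for (const el of containers) {
    byArn.set(el.dataset.streamArn, el);
  }

  client.on('stream:status', ({ streamArn, status }) => {
    const el = byArn.get(streamArn);
    // dataset.authStatus is the data-auth-status attribute the CSS selects on.
    if (el) el.dataset.authStatus = status;
  });
}
```

In the browser this would be called once after rendering the grid, for instance `bindStreamStatus(client, document.querySelectorAll('.video-container'))`, before any call to `connectToStream`.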

Our PostCSS configuration would look something like this:

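// postcss.config.js
module.exports = {
  plugins: {
    'postcss-import': {},
    // The two Tailwind entries apply only if the project also uses Tailwind CSS.
    // Without Tailwind, drop them and set 'nesting-rules' to true below.
    'tailwindcss/nesting': {}, // Or postcss-nesting
    'tailwindcss': {},
    'postcss-preset-env': {
      // Nesting is already handled by the nesting plugin above, so it is
      // disabled here to avoid processing it twice.
      features: { 'nesting-rules': false },
    },
    // postcss-preset-env already runs Autoprefixer, so no separate entry is needed.
  },
};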

This approach keeps the CSS clean and declarative. The styles are directly tied to the state machine managed by our JavaScript, making the UI robust and easy to reason about.

The sequence below traces the whole flow for a user who is entitled to the engineering stream but not the sales stream:

sequenceDiagram
    participant C as Client (Browser)
    participant B as STS Broker (Backend)
    participant STS as AWS STS
    participant KVS as AWS Kinesis Video

    C->>B: POST /api/v1/sts/credentials (userId: 'user-eng-123')
    B->>STS: AssumeRole(RoleArn, PolicyForEngStreams, SessionName)
    STS-->>B: Temporary Credentials (AccessKey, SecretKey, Token)
    B-->>C: 200 OK { accessKeyId, ... }
    C->>KVS: GetSignalingChannelEndpoint(Stream: Eng-ARN, Credentials)
    KVS-->>C: 200 OK { Endpoint }
    C->>KVS: Connect to Signaling Endpoint...
    Note right of C: Connection successful.
    
    C->>KVS: GetSignalingChannelEndpoint(Stream: Sales-ARN, Credentials)
    KVS-->>C: 403 AccessDeniedException
    Note right of C: Connection fails, UI shows 'Access Denied'.

While this STS-brokering architecture effectively offloads authorization to IAM and removes our backend from the critical path of media delivery, it is not without its own complexities. The client-side logic is now more intricate and stateful, demanding higher standards for testing and maintenance. The lifetime of the temporary credentials must be managed carefully: the client needs a mechanism to refresh them before they expire, or active streams will be disrupted. The STS broker itself, while simple, remains a critical piece of infrastructure that must be highly available. Finally, for systems with extremely high connection churn, the latency of the AssumeRole API call, while typically low, could become a measurable factor. That would point towards federation models built on OIDC, which are more complex but can provide longer-lived sessions.
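The refresh mechanism mentioned above can be quite small. The sketch below computes when to refresh from the `expiration` value the broker returns and re-arms a timer after each refresh. The function names, the five-minute safety margin, and the one-second minimum delay are assumptions made for this example, and error handling is reduced to a log line.

```javascript
// Milliseconds to wait so that a refresh happens `skewMs` before expiry.
// Never negative; an already-expired credential yields 0.
function msUntilRefresh(expiration, nowMs = Date.now(), skewMs = 5 * 60 * 1000) {
  const expiresAtMs = new Date(expiration).getTime();
  if (Number.isNaN(expiresAtMs)) throw new TypeError('expiration is not a valid date');
  return Math.max(0, expiresAtMs - skewMs - nowMs);
}

// Calls `refresh` shortly before each expiry. `refresh` must resolve to an
// object with an `expiration` field, like the broker's JSON response.
// Returns a function that stops the loop.
function startRefreshLoop(refresh, firstExpiration, { skewMs, minDelayMs = 1000 } = {}) {
  let timer = null;
  let stopped = false;

  const schedule = (expiration) => {
    // The floor prevents a tight loop if the broker returns near-expired credentials.
    const delay = Math.max(minDelayMs, msUntilRefresh(expiration, Date.now(), skewMs));
    timer = setTimeout(async () => {
      if (stopped) return;
      try {
        const creds = await refresh();
        if (!stopped) schedule(creds.expiration);
      } catch (err) {
        // Sketch only: a real client would retry with backoff and tell the UI.
        console.error('Credential refresh failed:', err);
      }
    }, delay);
  };

  schedule(firstExpiration);
  return () => {
    stopped = true;
    clearTimeout(timer);
  };
}
```

Here `refresh` would repeat the POST to the broker and rebuild the SDK clients with the new credentials. Connections that are already established keep their own signaling state, so how a refresh interacts with an active stream is something to verify against the Kinesis Video documentation rather than assume.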
