The mandate was seemingly straightforward: build an event-driven data enrichment pipeline. A new Google Cloud Function would trigger on a Pub/Sub message, fetch supplementary data from our core user-profile
service, and push the enriched result to a BigQuery table. The complication arose from our infrastructure’s security posture. The user-profile
service lives inside a hardened Google Kubernetes Engine (GKE) cluster, governed by an Istio service mesh with a default-deny policy. All inter-service communication requires strict mTLS, and no service is exposed directly to the public internet.
Our initial discussions revolved around two non-starters. The first, exposing the user-profile
service via a public GKE Ingress, was immediately vetoed. It would create a hole in our security perimeter, nullifying the entire purpose of the service mesh. The second, a complex VPN or Interconnect setup, was deemed operational overkill for a single serverless function’s needs. It would introduce significant latency and maintenance overhead.
The core problem remained: how to allow a managed, serverless entity running on Google’s infrastructure to securely and efficiently communicate with a private workload inside our Istio mesh, without compromising our Zero Trust principles. The solution required bridging two distinct networking and identity domains. This led us to an architecture based on three key components: a Serverless VPC Access connector to bridge the network layer, an internal-facing Istio Ingress Gateway to act as a policy enforcement point, and a JWT-based authentication mechanism to bridge the identity gap. To ensure performance and minimize cold start latency, a critical factor in event-driven systems, we would leverage esbuild
to create a minimal, optimized function bundle.
The Architectural Foundation: Network and Service Mesh Configuration
Before writing a single line of function code, the infrastructure must be correctly configured. The goal is to create a private network path from the Cloud Function environment into the GKE cluster’s VPC, terminating at a specific, controlled entry point in the service mesh.
1. The GKE Cluster and Target Service
Assume a standard GKE cluster with Istio installed and automatic sidecar injection enabled for the default
namespace. Our target is a simple user-profile
service.
# user-profile-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: user-profile
  labels:
    app: user-profile
spec:
  ports:
    - port: 80
      name: http
      targetPort: 8080
  selector:
    app: user-profile
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-profile-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: user-profile
  template:
    metadata:
      labels:
        app: user-profile
    spec:
      containers:
        - name: user-profile
          # A simple echo server for demonstration purposes.
          # In a real-world project, this would be your business logic container.
          image: hashicorp/http-echo
          args:
            - "-text={\"userId\": \"123\", \"email\": \"[email protected]\"}"
            - "-listen=:8080"
          ports:
            - containerPort: 8080
Deploying this creates the internal service, but by default, it’s unreachable from outside the mesh.
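For reference, the default-deny, strict-mTLS posture described earlier typically comes from a pair of mesh-wide policies along these lines. This is a sketch, not our exact cluster configuration; note that with a mesh-wide deny-all in place, the internal gateway introduced below also needs its own ALLOW policy.
# mesh-defaults.yaml (illustrative)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system # Root namespace: applies mesh-wide
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: istio-system # Root namespace: applies mesh-wide
spec: {} # An empty spec matches no requests, so everything is denied unless explicitly allowed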
2. Bridging the VPC with a Serverless Connector
Google Cloud Functions do not run inside a user’s VPC by default. To grant them access to internal resources like a GKE cluster’s internal IP range, we must create a Serverless VPC Access connector. This is a non-trivial resource; it reserves a dedicated subnet within the target VPC and acts as a network proxy.
# Ensure you are using the correct project and region
gcloud config set project YOUR_PROJECT_ID
gcloud config set compute/region YOUR_REGION
# Create the VPC Access Connector
# It requires a /28 IP range within your VPC that doesn't overlap with other subnets.
gcloud compute networks vpc-access connectors create gcf-to-gke-connector \
  --network YOUR_VPC_NAME \
  --region YOUR_REGION \
  --range 10.8.0.0/28
A common mistake here is under-provisioning the connector or choosing an IP range that will later conflict with GKE pod or service CIDR ranges. Careful VPC planning is essential.
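Before choosing that /28, it is worth checking what the VPC and cluster already consume. Commands roughly like the following help; substitute your own cluster and network names, and note that output formats can vary by gcloud version.
# List existing subnet ranges in the VPC
gcloud compute networks subnets list \
  --filter="network:YOUR_VPC_NAME" \
  --format="table(name, region, ipCidrRange)"
# Check the GKE cluster's pod and service CIDRs to avoid overlap
gcloud container clusters describe YOUR_CLUSTER_NAME \
  --region YOUR_REGION \
  --format="value(clusterIpv4Cidr, servicesIpv4Cidr)"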
3. Istio Ingress Gateway for Internal Traffic
The standard Istio ingress-gateway
is designed for public-facing traffic. We need a dedicated gateway for internal traffic originating from our VPC, such as our Cloud Function. We achieve this by deploying a new gateway instance and annotating its service to request an internal TCP/UDP load balancer from Google Cloud.
# internal-gateway.yaml
apiVersion: v1
kind: Service
metadata:
  name: istio-internal-gateway
  namespace: istio-system
  annotations:
    # This is the critical annotation for GKE: it provisions an internal
    # load balancer instead of a public one.
    networking.gke.io/load-balancer-type: "Internal"
  labels:
    istio: internal-gateway
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
      name: http2
  selector:
    istio: internal-gateway
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: istio-internal-gateway
  namespace: istio-system
spec:
  replicas: 1 # Adjust for production needs
  selector:
    matchLabels:
      istio: internal-gateway
  template:
    metadata:
      labels:
        # Match the service selector
        istio: internal-gateway
      annotations:
        # Use Istio's gateway injection template so the proxy runs as a
        # gateway (not a sidecar) and the "auto" image is resolved.
        inject.istio.io/templates: gateway
        sidecar.istio.io/inject: "true"
    spec:
      # This ServiceAccount must exist in istio-system and must match the
      # principal referenced later in the AuthorizationPolicy.
      serviceAccountName: istio-internal-gateway-service-account
      containers:
        - name: istio-proxy
          image: auto # Istio will inject the correct proxy image
          ports:
            - containerPort: 8080
After applying this, kubectl get svc -n istio-system istio-internal-gateway
will show an EXTERNAL-IP
which is actually an internal IP address within our VPC. This IP is the target our Cloud Function will call.
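You can capture that address directly for use in later steps, for example:
# Grab the internal load balancer IP assigned to the gateway Service
INTERNAL_GATEWAY_IP=$(kubectl get svc -n istio-system istio-internal-gateway \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "${INTERNAL_GATEWAY_IP}"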
4. Exposing the Service via the Gateway
Now we wire the internal gateway to our user-profile
service using Istio’s Gateway
and VirtualService
resources.
# user-profile-routing.yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: internal-traffic-gateway
spec:
  selector:
    istio: internal-gateway # Binds to our internal gateway deployment
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "user-profile.internal" # A virtual hostname for routing
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: user-profile-vs
spec:
  hosts:
    - "user-profile.internal"
  gateways:
    - internal-traffic-gateway
  http:
    - route:
        - destination:
            host: user-profile.default.svc.cluster.local
            port:
              number: 80
This configuration tells the istio-internal-gateway
that any request for user-profile.internal
on port 80 should be routed to the user-profile
service.
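A quick way to exercise the routing is a curl from a VM inside the VPC, using the IP captured earlier. With a default-deny mesh policy the request will be rejected until the JWT policies in the next section are applied, but the response still tells you whether routing works.
# From a VM in the same VPC; the Host header must match the VirtualService host
curl -v -H "Host: user-profile.internal" "http://${INTERNAL_GATEWAY_IP}/profile/123"
# A 403 (RBAC) at this stage means routing works but authorization is pending;
# a connection error or 404 points to a gateway or VirtualService problem instead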
Bridging Identity with JWT and Istio Policies
Network connectivity is only half the battle. The service mesh still needs to authenticate and authorize the incoming request. Since the Cloud Function cannot participate in Istio’s mTLS identity fabric (SPIFFE), we must use an alternative identity token: a JSON Web Token (JWT).
The flow is as follows:
- The Cloud Function will generate a signed JWT.
- The JWT will be included in the Authorization header of the request to the internal gateway.
- The Istio gateway will be configured with a RequestAuthentication policy to validate the JWT's signature and issuer.
- An AuthorizationPolicy will grant access only if the request contains a valid, validated JWT.
For this example, we’ll use a simple asymmetric key pair for signing and verification. In a production environment, the public key would be exposed via a JWKS (JSON Web Key Set) endpoint.
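One way to produce such a key pair and make the private half available to the function (names here are illustrative) is with openssl and Secret Manager:
# Generate an EC P-256 key pair for ES256 signing
openssl ecparam -name prime256v1 -genkey -noout -out jwt-signing-key.pem
# Convert the private key to PKCS#8 PEM, the format the function will import
openssl pkcs8 -topk8 -nocrypt -in jwt-signing-key.pem -out jwt-signing-key-pkcs8.pem
# Extract the public key; its x and y coordinates populate the JWKS below
openssl ec -in jwt-signing-key.pem -pubout -out jwt-signing-public.pem
# Store the private key in Secret Manager for the Cloud Function to fetch
gcloud secrets create jwt-signing-key --data-file=jwt-signing-key-pkcs8.pem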
# user-profile-jwt-policy.yaml
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-for-user-profile
  namespace: istio-system # Apply in the gateway's namespace
spec:
  selector:
    matchLabels:
      istio: internal-gateway
  jwtRules:
    - issuer: "[email protected]"
      # For a real project, use a jwksUri. For this demo, we embed the key.
      # The public key must be in JWKS format; the x and y values below are
      # placeholders for the Base64Url-encoded coordinates of your public key.
      jwks: |
        {
          "keys": [
            {
              "kty": "EC",
              "crv": "P-256",
              "x": "...",
              "y": "...",
              "kid": "jwt-signing-key-v1"
            }
          ]
        }
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: require-jwt-for-user-profile
  namespace: default # Apply in the target service's namespace
spec:
  selector:
    matchLabels:
      app: user-profile
  action: ALLOW
  rules:
    - from:
        - source:
            # The principal is the Istio (SPIFFE) identity of the internal gateway
            principals: ["cluster.local/ns/istio-system/sa/istio-internal-gateway-service-account"]
      to:
        - operation:
            methods: ["GET"]
            paths: ["/profile/*"]
      when:
        # This condition requires a valid JWT principal from the RequestAuthentication
        - key: request.auth.claims[iss]
          values: ["[email protected]"]
The pitfall here is policy placement. The RequestAuthentication policy must be applied to the workload that first receives the request and inspects its headers: in this case, the istio-internal-gateway in the istio-system namespace. The AuthorizationPolicy, however, must be applied in the namespace of the target service (default) to protect the user-profile workload itself.
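A related gotcha: the principal string in the AuthorizationPolicy must match the Kubernetes service account the gateway pods actually run as. It is worth verifying rather than assuming:
# Confirm the service account used by the internal gateway pods; it must match
# the principal listed in the AuthorizationPolicy above
kubectl get pods -n istio-system -l istio=internal-gateway \
  -o jsonpath='{.items[0].spec.serviceAccountName}'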
The Cloud Function: Optimized for Performance with esbuild
With the infrastructure ready, we can build the Cloud Function. Our focus is not just on functionality but also on minimizing the cold start time. A large, dependency-heavy function bundle can add hundreds of milliseconds to the initial invocation latency.
1. Project Structure and Dependencies
.
├── src/
│   └── index.ts
├── build.mjs
├── package.json
└── tsconfig.json
We’ll use TypeScript for type safety. The key dependencies are axios
for HTTP requests, jose
for robust JWT signing, and @google-cloud/secret-manager
to securely fetch our private signing key.
// package.json
{
  "name": "gke-egress-function",
  "version": "1.0.0",
  "main": "dist/index.js",
  "scripts": {
    "build": "node build.mjs"
  },
  "dependencies": {
    "@google-cloud/functions-framework": "^3.3.0",
    "@google-cloud/secret-manager": "^5.2.0",
    "axios": "^1.6.0",
    "jose": "^5.1.0"
  },
  "devDependencies": {
    "@types/node": "^20.8.10",
    "esbuild": "^0.19.5",
    "typescript": "^5.2.2"
  }
}
2. esbuild Configuration
Instead of a complex webpack.config.js
, we use a simple build script for esbuild
. Its speed is transformative for quick iteration cycles.
// build.mjs
import * as esbuild from 'esbuild';

const sharedConfig = {
  entryPoints: ['src/index.ts'],
  bundle: true,
  platform: 'node',
  target: 'node18',
  // Google Cloud Functions provides this package at runtime,
  // so we mark it as external to avoid bundling it.
  // This is a critical optimization for bundle size.
  external: ['@google-cloud/functions-framework'],
  minify: true,
  sourcemap: true,
};

await esbuild.build({
  ...sharedConfig,
  outfile: 'dist/index.js',
  format: 'cjs', // CommonJS format required by older GCF runtimes
});

console.log('Build finished successfully.');
The most important line is external: ['@google-cloud/functions-framework']
. Omitting this would bundle the entire Functions Framework, unnecessarily bloating our deployment package. Real-world projects often have several such runtime-provided dependencies that should be excluded.
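A quick check after building shows exactly what ends up in the deployment artifact:
npm run build
# Inspect the bundle and sourcemap sizes to verify nothing unexpected was included
ls -lh dist/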
3. Core Function Logic
The TypeScript code implements the JWT signing and the internal API call. It’s structured to cache the fetched private key in a global variable to avoid refetching it from Secret Manager on every “warm” invocation.
// src/index.ts
import { HttpFunction } from '@google-cloud/functions-framework';
import { SecretManagerServiceClient } from '@google-cloud/secret-manager';
import axios, { isAxiosError } from 'axios';
import * as jose from 'jose';

// Configuration from environment variables
const GKE_GATEWAY_URL = process.env.GKE_GATEWAY_URL; // e.g., http://10.128.0.5/profile/123
const JWT_ISSUER = process.env.JWT_ISSUER;
const SIGNING_KEY_SECRET_ID = process.env.SIGNING_KEY_SECRET_ID; // e.g., projects/123/secrets/jwt-key/versions/latest

// Cache for the signing key to avoid repeated fetches on warm starts
let signingKey: jose.KeyLike | null = null;
const secretManager = new SecretManagerServiceClient();

async function getSigningKey(): Promise<jose.KeyLike> {
  if (signingKey) {
    return signingKey;
  }
  try {
    const [version] = await secretManager.accessSecretVersion({
      name: SIGNING_KEY_SECRET_ID,
    });
    const keyData = version.payload?.data?.toString();
    if (!keyData) {
      throw new Error('Private key not found in Secret Manager.');
    }
    // This logic assumes the private key is stored in PKCS#8 PEM format.
    // (importSPKI handles public keys only and cannot be used for signing.)
    const importedKey = await jose.importPKCS8(keyData, 'ES256');
    signingKey = importedKey;
    return signingKey;
  } catch (error) {
    console.error('Failed to fetch or import signing key:', error);
    // In a real system, you'd have more robust error handling/alerting
    throw new Error('Internal configuration error: could not load signing key.');
  }
}

export const callInternalService: HttpFunction = async (req, res) => {
  if (!GKE_GATEWAY_URL || !JWT_ISSUER || !SIGNING_KEY_SECRET_ID) {
    console.error('Missing required environment variables.');
    res.status(500).send('Server configuration error.');
    return;
  }

  try {
    const key = await getSigningKey();

    // Generate the JWT for this request
    const jwt = await new jose.SignJWT({ 'scope': 'read:user-profile' })
      .setProtectedHeader({ alg: 'ES256' })
      .setIssuedAt()
      .setIssuer(JWT_ISSUER)
      .setAudience('user-profile-service')
      .setExpirationTime('5m')
      .sign(key);

    // Make the authenticated call to the internal gateway
    const response = await axios.get(GKE_GATEWAY_URL, {
      headers: {
        'Authorization': `Bearer ${jwt}`,
        // Pass the Host header required by the Istio VirtualService
        'Host': 'user-profile.internal',
      },
      timeout: 3000, // Important to set a timeout
    });

    res.status(200).json(response.data);
  } catch (error) {
    console.error('Error calling internal service:', error);
    if (isAxiosError(error) && error.response) {
      // Forward the error from the downstream service if available
      res.status(error.response.status).send(error.response.data);
    } else {
      res.status(500).send('An unexpected error occurred.');
    }
  }
};
A subtle but critical detail is setting the Host
header. The axios
call is to an IP address, but the Istio VirtualService
routes based on the hostname (user-profile.internal
). We must explicitly provide this header for the routing rule to match.
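For local iteration, the bundled function can be run with the Functions Framework before deploying. The downstream call will only succeed from inside the VPC, and Secret Manager access requires local Application Default Credentials, but this exercises the configuration and JWT paths. The values shown are placeholders.
# Run the built bundle locally on port 8080
GKE_GATEWAY_URL="http://INTERNAL_GATEWAY_IP/profile/123" \
JWT_ISSUER="gcp-function-issuer@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
SIGNING_KEY_SECRET_ID="projects/YOUR_PROJECT_ID/secrets/jwt-signing-key/versions/latest" \
npx @google-cloud/functions-framework --source=dist --target=callInternalService
# In another terminal:
curl http://localhost:8080/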
4. Deployment Script
Finally, a shell script automates the build and deploy process, ensuring all required flags are set correctly.
#!/bin/bash
set -e # Exit immediately if a command exits with a non-zero status.

# --- Configuration ---
PROJECT_ID="your-project-id"
REGION="your-region"
FUNCTION_NAME="gke-egress-proxy"
VPC_CONNECTOR="gcf-to-gke-connector"
GKE_GATEWAY_URL="http://INTERNAL_GATEWAY_IP/profile/123"
JWT_ISSUER="gcp-function-issuer@${PROJECT_ID}.iam.gserviceaccount.com"
SIGNING_KEY_SECRET_ID="projects/${PROJECT_ID}/secrets/jwt-signing-key/versions/latest"
# This service account needs roles/secretmanager.secretAccessor
FUNCTION_SERVICE_ACCOUNT="${FUNCTION_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# --- Build Step ---
echo "Building function with esbuild..."
npm run build

# --- Deploy Step ---
echo "Deploying function to Google Cloud..."
gcloud functions deploy "${FUNCTION_NAME}" \
  --gen2 \
  --runtime=nodejs18 \
  --region="${REGION}" \
  --source=./dist \
  --entry-point=callInternalService \
  --trigger-http \
  --allow-unauthenticated \
  --vpc-connector="${VPC_CONNECTOR}" \
  --service-account="${FUNCTION_SERVICE_ACCOUNT}" \
  --set-env-vars="GKE_GATEWAY_URL=${GKE_GATEWAY_URL},JWT_ISSUER=${JWT_ISSUER},SIGNING_KEY_SECRET_ID=${SIGNING_KEY_SECRET_ID}"

echo "Deployment complete."
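Once deployed, a simple end-to-end smoke test is to call the function's HTTPS endpoint and confirm that the profile data comes back through the mesh. Gen2 functions expose their URL via serviceConfig.uri; adjust if your setup differs.
# Fetch the function's URL and invoke it
FUNCTION_URL=$(gcloud functions describe "${FUNCTION_NAME}" \
  --gen2 --region="${REGION}" --format="value(serviceConfig.uri)")
curl -s "${FUNCTION_URL}"
# Expected: the JSON payload returned by the user-profile service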
The resulting architecture is robust and secure.
graph TD
    A[External Event e.g., Pub/Sub] --> B{Google Cloud Function};
    B -- 1. Fetch Key --> C[Secret Manager];
    B -- 2. Generate JWT --> B;
    B -- 3. HTTPS Request w/ JWT --> D[VPC Network];
    subgraph VPC Network
        D -- 4. Via VPC Connector --> E[Internal Load Balancer IP];
    end
    subgraph GKE Cluster / Istio Mesh
        E -- 5. --> F[istio-internal-gateway];
        F -- 6. Validate JWT --> F;
        F -- 7. Route based on Host Header --> G[user-profile sidecar];
        G -- 8. mTLS --> H[user-profile app];
    end
The solution successfully bridges the two environments. The Cloud Function operates in its managed environment, while the GKE cluster maintains its strict security perimeter. The Istio gateway acts as a trusted intermediary, translating the JWT-based identity from the “outside” world into an authorized request within the mesh. The use of esbuild
ensures that our bridge component remains lightweight and responsive, minimizing the performance penalty of a serverless architecture.
This pattern, however, is not without its own complexities and trade-offs. The management of JWT signing keys, including rotation and revocation, becomes a critical security function that must be managed outside this specific implementation. Furthermore, the VPC Access connector is a stateful, always-on resource, which introduces a fixed cost component to an otherwise “serverless” design. For scenarios requiring extremely low network latency, the overhead of traversing the VPC connector and gateway might be too high, potentially favoring a solution like Cloud Run on GKE, which would place the compute directly inside the cluster, albeit at the cost of the operational simplicity of Cloud Functions.