Building a Dynamic Service Endpoint Provider for iOS Using Consul and Memcached


The connection logic in our iOS application, which handles real-time data processing, was becoming a significant bottleneck. Initially, it pointed to a single, statically configured load balancer responsible for distributing traffic across a fleet of backend worker nodes. This architecture, while simple, exposed two critical failure modes in production. First, the load balancer became a single point of failure and a performance chokepoint. Second, it offered no mechanism for the client to intelligently select a worker, such as connecting to the least-loaded node or a geographically closer one. We needed to push service discovery capabilities closer to the client without burdening the iOS application with the complexity and security risks of a full service mesh client.

Our solution was to develop an intermediary microservice—an “Endpoint Provider”—that acts as a secure bridge. This service queries our Consul catalog for healthy, available worker nodes, caches the results aggressively in Memcached to handle high request volumes from thousands of clients, and presents a simple, digestible list of endpoints to the iOS application. This allows the mobile client to retain control over its connection strategy (e.g., round-robin, random selection with retries) while the backend infrastructure remains dynamic and scalable.

The architecture follows a clear request flow:

sequenceDiagram
    participant App as iOS App
    participant Provider as Endpoint Provider (Go)
    participant Memcached
    participant Consul
    participant Workers as Worker Nodes

    App->>+Provider: GET /v1/endpoints/worker-service
    Provider->>+Memcached: GET endpoints:worker-service
    Memcached-->>-Provider: Cache Miss
    Provider->>+Consul: Query healthy instances of 'worker-service'
    Consul-->>-Provider: List of healthy nodes (IP:Port)
    Provider->>+Memcached: SET endpoints:worker-service with TTL
    Memcached-->>-Provider: OK
    Provider-->>-App: 200 OK - [{"host":"10.0.1.10", "port":8080}, ...]

    Note over App, Workers: Client now connects directly to a chosen Worker Node.

    %% Subsequent Request (Cache Hit)
    App->>+Provider: GET /v1/endpoints/worker-service
    Provider->>+Memcached: GET endpoints:worker-service
    Memcached-->>-Provider: Cached JSON blob
    Provider-->>-App: 200 OK - [{"host":"10.0.1.10", "port":8080}, ...]

Consul Service and Health Check Configuration

The foundation of this system is Consul’s ability to track the health of our worker services. For this to be effective, each worker node must register itself with a meaningful health check. A simple TCP dial check is insufficient; a real-world project requires a check that reflects the actual application’s health, perhaps by measuring queue depth or current processing load.
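
As a concrete illustration, the check command could be a small Go binary that inspects the worker's own metrics and maps them onto Consul's script-check exit codes, instead of the simulated exit 2 used in the definition below. The stats endpoint, field name, and thresholds here are assumptions for this sketch; adapt them to whatever load signal your worker actually exposes.

// A hypothetical health-check binary the Consul agent could invoke.
// Exit codes follow Consul's script-check convention:
// 0 = passing, 1 = warning, any other = critical.
package main

import (
	"encoding/json"
	"net/http"
	"os"
	"time"
)

func main() {
	client := &http.Client{Timeout: 1 * time.Second}
	// Assumes the worker exposes a local stats endpoint; adjust to your service.
	resp, err := client.Get("http://127.0.0.1:8080/internal/stats")
	if err != nil {
		os.Exit(2) // worker unreachable -> critical
	}
	defer resp.Body.Close()

	var stats struct {
		QueueDepth int `json:"queue_depth"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&stats); err != nil {
		os.Exit(2) // malformed stats -> critical
	}

	switch {
	case stats.QueueDepth > 1000:
		os.Exit(2) // overloaded -> critical, node is removed from discovery
	case stats.QueueDepth > 500:
		os.Exit(1) // warning
	default:
		os.Exit(0) // passing
	}
}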

Here is a representative service definition for a worker node, worker-service.hcl. We’ll register this with a local Consul agent. The critical part is the check, which executes a script.

// File: /etc/consul.d/worker-service.hcl
service {
  name = "worker-service"
  id = "worker-1"
  port = 8080
  address = "10.0.1.10"

  tags = ["realtime", "v1.2"]

  check {
    id = "worker-load-check"
    name = "Worker Process Load Check"
    // In a real system, this script would check CPU, memory, or job queue length.
    // Exit code 0 = passing, 1 = warning, any other exit code = critical.
    // We simulate a failing (critical) state for demonstration.
    args = ["/bin/sh", "-c", "exit 2"]
    interval = "10s"
    timeout = "2s"
  }
}

To run a development Consul agent and register this service:

  1. Save the above HCL configuration.
  2. Start Consul with local script checks enabled: consul agent -dev -enable-local-script-checks -config-dir=/etc/consul.d (script checks are disabled by default in recent Consul releases, so this flag is required for the check above to execute).

With this setup, Consul’s health API will now correctly exclude this failing node from any queries for healthy instances of worker-service. This dynamic health status is what our Endpoint Provider will consume.
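
Before wiring up the provider, it is worth confirming that filtering directly. This small sketch uses the official Go API client against the local dev agent (the default address assumed by consulapi.DefaultConfig) and should report zero healthy instances while the check above is exiting with code 2.

// A quick verification that Consul filters out the failing node.
package main

import (
	"fmt"
	"log"

	consulapi "github.com/hashicorp/consul/api"
)

func main() {
	client, err := consulapi.NewClient(consulapi.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// PassingOnly=true mirrors what the Endpoint Provider will do later.
	entries, _, err := client.Health().Service("worker-service", "", true, nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("healthy instances of worker-service: %d\n", len(entries)) // expect 0
}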

The Golang Endpoint Provider Implementation

We chose Go for the Endpoint Provider due to its excellent concurrency model, performance, and robust ecosystem for building networked services. The service has three primary responsibilities: handle incoming HTTP requests, communicate with Memcached, and query the Consul API.

The project structure is straightforward:

endpoint-provider/
├── go.mod
├── go.sum
└── main.go

The core logic resides in main.go. We’ll build it piece by piece, focusing on configuration, dependency management, and the HTTP handler itself.

// File: main.go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"os"
	"os/signal"
	"strings"
	"syscall"
	"time"

	"github.comcom/bradfitz/gomemcache/memcache"
	consulapi "github.com/hashicorp/consul/api"
)

// ServiceConfig holds all external configuration for the application.
// In a real-world project, this would be populated from environment variables or a config file.
type ServiceConfig struct {
	ConsulAddress  string
	MemcachedServers []string
	ListenAddress  string
	CacheTTL       time.Duration
}

// Endpoint represents a single, connectable backend service instance.
type Endpoint struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// ServiceLocator is the core application struct, holding clients for external services.
type ServiceLocator struct {
	consulClient *consulapi.Client
	memcacheClient *memcache.Client
	config       ServiceConfig
}

// NewServiceLocator initializes all clients and configurations.
func NewServiceLocator(config ServiceConfig) (*ServiceLocator, error) {
	// Configure and create the Consul client
	consulConfig := consulapi.DefaultConfig()
	consulConfig.Address = config.ConsulAddress
	consul, err := consulapi.NewClient(consulConfig)
	if err != nil {
		return nil, fmt.Errorf("failed to create consul client: %w", err)
	}

	// Configure and create the Memcached client
	mc := memcache.New(config.MemcachedServers...)
	// A quick ping to ensure Memcached is reachable on startup.
	if err := mc.Ping(); err != nil {
		return nil, fmt.Errorf("failed to ping memcached servers: %w", err)
	}

	return &ServiceLocator{
		consulClient: consul,
		memcacheClient: mc,
		config:       config,
	}, nil
}

// getEndpointsHandler is the HTTP handler that serves the list of available endpoints.
func (s *ServiceLocator) getEndpointsHandler(w http.ResponseWriter, r *http.Request) {
	serviceName := strings.TrimPrefix(r.URL.Path, "/v1/endpoints/")
	if serviceName == "" {
		http.Error(w, "Service name is required", http.StatusBadRequest)
		return
	}

	log.Printf("Request received for service: %s", serviceName)
	cacheKey := fmt.Sprintf("endpoints:%s", serviceName)

	// 1. Attempt to fetch from cache first.
	if item, err := s.memcacheClient.Get(cacheKey); err == nil {
		log.Printf("Cache hit for key: %s", cacheKey)
		w.Header().Set("Content-Type", "application/json")
		w.Header().Set("X-Cache-Status", "HIT")
		w.Write(item.Value)
		return
	} else if err != memcache.ErrCacheMiss {
		// A common mistake is to ignore errors other than cache miss.
		// This could indicate a serious connectivity issue with Memcached.
		log.Printf("ERROR: Memcached GET failed for key %s: %v", cacheKey, err)
	}

	log.Printf("Cache miss for key: %s. Querying Consul.", cacheKey)

	// 2. On cache miss, query Consul for healthy services.
	// The `PassingOnly` flag is crucial here.
	serviceEntries, _, err := s.consulClient.Health().Service(serviceName, "", true, nil)
	if err != nil {
		log.Printf("ERROR: Failed to query Consul for service %s: %v", serviceName, err)
		http.Error(w, "Internal server error: could not query service registry", http.StatusInternalServerError)
		return
	}

	if len(serviceEntries) == 0 {
		log.Printf("WARN: No healthy instances found for service %s", serviceName)
		http.Error(w, "No healthy service instances available", http.StatusServiceUnavailable)
		return
	}

	// 3. Format the response. We extract only the necessary information for the client.
	endpoints := make([]Endpoint, 0, len(serviceEntries))
	for _, entry := range serviceEntries {
		// The service address can be in Service.Address or Node.Address.
		// A robust implementation checks both.
		address := entry.Service.Address
		if address == "" {
			address = entry.Node.Address
		}
		endpoints = append(endpoints, Endpoint{
			Host: address,
			Port: entry.Service.Port,
		})
	}

	responseBody, err := json.Marshal(endpoints)
	if err != nil {
		log.Printf("ERROR: Failed to marshal endpoints to JSON for service %s: %v", serviceName, err)
		http.Error(w, "Internal server error: could not format response", http.StatusInternalServerError)
		return
	}

	// 4. Store the result in Memcached before returning to the client.
	err = s.memcacheClient.Set(&memcache.Item{
		Key:        cacheKey,
		Value:      responseBody,
		Expiration: int32(s.config.CacheTTL.Seconds()),
	})
	if err != nil {
		// Failing to set the cache is not a critical error. We should log it
		// but still serve the response to the client. The system gracefully degrades.
		log.Printf("ERROR: Failed to set cache for key %s: %v", cacheKey, err)
	}

	w.Header().Set("Content-Type", "application/json")
	w.Header().Set("X-Cache-Status", "MISS")
	w.Write(responseBody)
}

func main() {
	// Production-grade configuration should come from a more robust source.
	config := ServiceConfig{
		ConsulAddress:  "localhost:8500",
		MemcachedServers: []string{"localhost:11211"},
		ListenAddress:  ":9090",
		CacheTTL:       10 * time.Second, // A pragmatic TTL value.
	}

	locator, err := NewServiceLocator(config)
	if err != nil {
		log.Fatalf("Failed to initialize service locator: %v", err)
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/v1/endpoints/", locator.getEndpointsHandler)
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		fmt.Fprintln(w, "OK")
	})

	server := &http.Server{
		Addr:    config.ListenAddress,
		Handler: mux,
	}

	// Graceful shutdown handling.
	go func() {
		log.Printf("Endpoint Provider listening on %s", config.ListenAddress)
		if err := server.ListenAndServe(); err != http.ErrServerClosed {
			log.Fatalf("HTTP server ListenAndServe error: %v", err)
		}
	}()

	quit := make(chan os.Signal, 1)
	signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
	<-quit

	log.Println("Shutting down server...")
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	if err := server.Shutdown(ctx); err != nil {
		log.Fatalf("HTTP server shutdown error: %v", err)
	}
	log.Println("Server gracefully stopped.")
}

This Go application demonstrates several production-ready practices:

  • Configuration Management: A ServiceConfig struct centralizes configuration; an environment-based loader is sketched after this list.
  • Dependency Injection: The ServiceLocator struct holds client dependencies, making handlers testable.
  • Robust Error Handling: It differentiates between a cache miss and a true Memcached connection error. It also handles failures in querying Consul or marshalling JSON.
  • Graceful Degradation: If setting the cache fails, the service still returns data to the client, prioritizing availability.
  • Graceful Shutdown: It listens for termination signals to finish in-flight requests before exiting.
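
To make the configuration point concrete, here is a minimal sketch of how ServiceConfig could be populated from environment variables instead of the hard-coded values in main(). The variable names and defaults are assumptions for illustration; the function relies on the os, strings, and time packages already imported above.

// configFromEnv builds a ServiceConfig from environment variables, falling back
// to the development defaults used in main(). Variable names are illustrative.
func configFromEnv() ServiceConfig {
	getenv := func(key, fallback string) string {
		if v := os.Getenv(key); v != "" {
			return v
		}
		return fallback
	}

	ttl := 10 * time.Second
	if raw := os.Getenv("CACHE_TTL"); raw != "" {
		if parsed, err := time.ParseDuration(raw); err == nil {
			ttl = parsed
		}
	}

	return ServiceConfig{
		ConsulAddress:    getenv("CONSUL_ADDRESS", "localhost:8500"),
		MemcachedServers: strings.Split(getenv("MEMCACHED_SERVERS", "localhost:11211"), ","),
		ListenAddress:    getenv("LISTEN_ADDRESS", ":9090"),
		CacheTTL:         ttl,
	}
}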

The iOS Client Implementation

On the iOS side, the implementation requires a service layer to fetch, parse, and manage the list of endpoints. We’ll use Swift’s async/await for clean asynchronous code and Codable for parsing.

// File: EndpointService.swift
import Foundation

// A Codable struct that must match the JSON structure from our Go service.
struct Endpoint: Codable, Hashable {
    let host: String
    let port: Int
}

// Custom error types provide more context than generic errors.
enum EndpointError: Error {
    case invalidURL
    case networkError(Error)
    case decodingError(Error)
    case serverError(statusCode: Int)
    case noEndpointsAvailable
}

// The EndpointManager is the core component for the client.
// It can be used as a singleton or injected as a dependency.
@MainActor
class EndpointManager: ObservableObject {

    // The list of available endpoints is published to the UI.
    @Published private(set) var availableEndpoints: [Endpoint] = []
    
    // A simple index for round-robin selection.
    private var currentIndex = 0
    
    // In a real project, this base URL would come from a configuration file.
    private let providerBaseURL = "http://localhost:9090/v1/endpoints/"
    private let urlSession: URLSession

    init(session: URLSession = .shared) {
        self.urlSession = session
    }

    // Fetches and updates the list of endpoints for a given service.
    func refreshEndpoints(for serviceName: String) async throws {
        guard let url = URL(string: "\(providerBaseURL)\(serviceName)") else {
            throw EndpointError.invalidURL
        }

        var request = URLRequest(url: url)
        request.timeoutInterval = 5.0 // A sensible timeout for a critical path.

        do {
            let (data, response) = try await urlSession.data(for: request)

            guard let httpResponse = response as? HTTPURLResponse else {
                throw EndpointError.networkError(URLError(.badServerResponse))
            }

            guard (200...299).contains(httpResponse.statusCode) else {
                // If the provider returns a 4xx or 5xx, we handle it explicitly.
                throw EndpointError.serverError(statusCode: httpResponse.statusCode)
            }

            let decoder = JSONDecoder()
            let endpoints = try decoder.decode([Endpoint].self, from: data)

            if endpoints.isEmpty {
                throw EndpointError.noEndpointsAvailable
            }

            // Update the internal list and reset the index.
            self.availableEndpoints = endpoints
            self.currentIndex = 0
            
            print("Successfully refreshed endpoints: \(endpoints)")

        } catch let error as DecodingError {
            throw EndpointError.decodingError(error)
        } catch {
            throw EndpointError.networkError(error)
        }
    }

    // Provides the next available endpoint using a round-robin strategy.
    // A pitfall here is not handling an empty list. We must guard against it.
    func getNextEndpoint() -> Endpoint? {
        guard !availableEndpoints.isEmpty else {
            return nil
        }
        
        let endpoint = availableEndpoints[currentIndex]
        currentIndex = (currentIndex + 1) % availableEndpoints.count
        return endpoint
    }
}

A SwiftUI view could use this manager as follows:

// File: ContentView.swift
import SwiftUI

struct ContentView: View {
    @StateObject private var endpointManager = EndpointManager()
    @State private var connectionTarget: String = ""
    @State private var statusMessage: String = "Ready"

    var body: some View {
        VStack(spacing: 20) {
            Text("Endpoint Discovery Client")
                .font(.largeTitle)
            
            Button("Refresh Worker Endpoints") {
                Task {
                    await refresh(service: "worker-service")
                }
            }
            .buttonStyle(.borderedProminent)

            Button("Get Next Worker") {
                if let endpoint = endpointManager.getNextEndpoint() {
                    self.connectionTarget = "Connecting to \(endpoint.host):\(endpoint.port)"
                } else {
                    self.connectionTarget = "No endpoints available. Please refresh."
                }
            }
            .buttonStyle(.bordered)
            
            Text(connectionTarget)
                .padding()
            
            Text("Status: \(statusMessage)")
                .font(.footnote)
                .foregroundColor(.gray)

        }
        .padding()
        .task {
            // Initial refresh on view appearance
            await refresh(service: "worker-service")
        }
    }
    
    private func refresh(service: String) async {
        do {
            statusMessage = "Refreshing..."
            try await endpointManager.refreshEndpoints(for: service)
            statusMessage = "Endpoints updated successfully."
        } catch let error as EndpointError {
            statusMessage = "Error: \(error)"
        } catch {
            statusMessage = "An unexpected error occurred: \(error.localizedDescription)"
        }
    }
}

This client-side implementation correctly encapsulates the logic for fetching and cycling through endpoints. The calling code doesn’t need to know about JSON, HTTP, or caching; it just asks for the next available connection target.

Limitations and Future Iterations

This architecture, while robust, is not without its limitations. The Endpoint Provider service itself, while stateless, represents a potential single point of failure. In a true production environment, multiple instances of the Go service would be deployed behind a load balancer. This might seem circular, but the key difference is that this load balancer handles traffic for a simple, high-performance, stateless service, whereas the original problem involved load balancing for stateful or computationally heavy workers.

The client-side selection strategy is a basic round-robin. A more advanced implementation could involve the Endpoint Provider annotating the list with metadata from Consul (e.g., tags indicating geographic region or a custom load metric from a KV store). The iOS client could then use this metadata to make a more intelligent decision, such as preferring endpoints with the lowest latency or load.
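
As a sketch of what that could look like on the provider side, the payload might carry Consul tags and node metadata alongside the address. The extra fields and the "region" node-meta key below are illustrative assumptions, not part of the current implementation; the helper would slot into main.go next to the existing Endpoint type.

// AnnotatedEndpoint extends the plain Endpoint payload with selection hints.
type AnnotatedEndpoint struct {
	Host   string   `json:"host"`
	Port   int      `json:"port"`
	Tags   []string `json:"tags,omitempty"`   // e.g. ["realtime", "v1.2"] from the service definition
	Region string   `json:"region,omitempty"` // hypothetical "region" key in node metadata
}

func annotate(entry *consulapi.ServiceEntry) AnnotatedEndpoint {
	// Same address fallback as getEndpointsHandler.
	address := entry.Service.Address
	if address == "" {
		address = entry.Node.Address
	}
	return AnnotatedEndpoint{
		Host:   address,
		Port:   entry.Service.Port,
		Tags:   entry.Service.Tags,
		Region: entry.Node.Meta["region"],
	}
}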

Finally, the cache invalidation is purely TTL-based. While sufficient for many use cases, a 10-second TTL means a failed node might still be served to clients for up to 10 seconds after Consul marks it unhealthy, on top of the health check interval it takes Consul to notice the failure. For systems requiring near-instant failover, a more complex solution involving Consul watches and a messaging bus to proactively invalidate the Memcached key could be engineered, though this adds significant operational complexity. The current design strikes a pragmatic balance between simplicity, performance, and freshness.
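
For completeness, here is a rough sketch of that proactive path using a Consul blocking query loop (the primitive underlying Consul watches), invalidating the cache directly rather than going through a messaging bus. The hard-coded service name, cache key, and Memcached address mirror the examples above and are assumptions for illustration.

// A standalone watcher: block on Consul until the healthy set of
// worker-service changes, then drop the cached endpoint list.
package main

import (
	"log"
	"time"

	"github.com/bradfitz/gomemcache/memcache"
	consulapi "github.com/hashicorp/consul/api"
)

func main() {
	consul, err := consulapi.NewClient(consulapi.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	mc := memcache.New("localhost:11211")

	var lastIndex uint64
	for {
		// Blocking query: Consul holds the request until the result changes
		// or WaitTime elapses.
		opts := &consulapi.QueryOptions{WaitIndex: lastIndex, WaitTime: 5 * time.Minute}
		_, meta, err := consul.Health().Service("worker-service", "", true, opts)
		if err != nil {
			log.Printf("watch error: %v", err)
			time.Sleep(time.Second) // simple backoff before retrying
			continue
		}
		if meta.LastIndex != lastIndex {
			lastIndex = meta.LastIndex
			// Membership changed: drop the cached list so the next request
			// repopulates it from Consul.
			if err := mc.Delete("endpoints:worker-service"); err != nil && err != memcache.ErrCacheMiss {
				log.Printf("cache invalidation failed: %v", err)
			}
		}
	}
}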

