The intermittent failures started subtly. A user’s dashboard would occasionally fail to load after a period of inactivity, requiring a manual refresh. GraphQL operations, dispatched in quick succession from different parts of the application, would result in a cascade of 401 Unauthorized errors. Our client-side authentication logic, a patchwork of boolean flags (isLoading, isAuthenticated, isRefreshingToken) and React effects, was riddled with race conditions. When a token expired, multiple concurrent GraphQL requests would each trigger a separate refresh attempt. The first refresh would succeed, invalidating the refresh token used by the others, leading to a state of complete desynchronization. The application state became unpredictable and debugging was a nightmare.
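The failure mode is easy to reproduce in isolation. The sketch below is a simplified, synchronous simulation rather than our production code: FakeIdentityProvider, its token names, and simulateUncoordinatedRefresh are hypothetical, and it assumes an identity provider that rotates refresh tokens on every use.

```typescript
// Hypothetical stand-in for an identity provider that rotates refresh tokens:
// every successful redemption invalidates the token that was just used.
class FakeIdentityProvider {
  private currentRefreshToken = 'rt-0';
  private issued = 0;

  redeem(refreshToken: string): { accessToken: string; refreshToken: string } {
    if (refreshToken !== this.currentRefreshToken) {
      throw new Error('invalid_grant');
    }
    this.issued += 1;
    this.currentRefreshToken = `rt-${this.issued}`;
    return { accessToken: `at-${this.issued}`, refreshToken: this.currentRefreshToken };
  }
}

// Each "request" saw a 401 and captured the same stored refresh token before
// any refresh completed, then redeems it on its own.
function simulateUncoordinatedRefresh(
  concurrentRequests: number
): { succeeded: number; failed: number } {
  const idp = new FakeIdentityProvider();
  const storedRefreshToken = 'rt-0';
  const captured = Array.from({ length: concurrentRequests }, () => storedRefreshToken);

  let succeeded = 0;
  let failed = 0;
  for (const token of captured) {
    try {
      idp.redeem(token);
      succeeded += 1;
    } catch {
      failed += 1;
    }
  }
  return { succeeded, failed };
}

console.log(simulateUncoordinatedRefresh(3)); // { succeeded: 1, failed: 2 }
```

Only the first redemption wins; every other request is left holding a token the provider no longer accepts.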
Our initial imperative approach was a clear anti-pattern. We were trying to manage a complex, multi-step, asynchronous process—the OAuth 2.0 Authorization Code Flow with PKCE and refresh tokens—using a primitive set of disconnected boolean flags. Authentication is not a binary state; it is a finite state machine. This realization led us away from makeshift solutions towards a formal, declarative model. We decided to refactor the entire client-side security layer around a state machine, orchestrated by XState, to provide a single, predictable source of truth that would govern the behavior of our GraphQL client.
The core problem was that our GraphQL client, in this case urql, was unaware of the intricate lifecycle of our authentication tokens. It would naively send requests, and only upon receiving a 401 would we reactively attempt a fix. This is fundamentally flawed. The GraphQL client’s operations must be governed by the current and transitioning states of authentication.
Our technology selection was deliberate:
- oidc-client-ts: A robust, standards-compliant library to handle the low-level mechanics of the OIDC protocol (e.g., creating authorization requests, handling redirects, validating tokens). We want to abstract away the protocol complexity, not reinvent it.
- urql: A lightweight and extensible GraphQL client. Its concept of “Exchanges”, a middleware pipeline for operations, provides the perfect seam for injecting our security logic.
- XState: The centerpiece of our new architecture. It allows us to formally define the authentication lifecycle as a state machine, eliminating implicit states and making every transition explicit and testable.
The goal was to build a self-contained AuthService that exposes the state machine and provides an urql exchange. This exchange would act as a gatekeeper for all GraphQL operations, intelligently queuing, retrying, or failing them based on the precise state of the authentication machine.
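To make the "middleware pipeline" idea concrete, here is a deliberately simplified, dependency-free sketch. Real urql exchanges transform streams of operations; this model handles one operation at a time, and every name in it (SimpleExchange, composeExchanges, attachToken, fakeFetch) is hypothetical.

```typescript
// Simplified model of an exchange pipeline: each step receives the next
// handler ("forward") and returns a handler of its own.
interface Op {
  query: string;
  headers: Record<string, string>;
}
interface Result {
  status: number;
  seenHeaders: Record<string, string>;
}

type Handler = (op: Op) => Result;
type SimpleExchange = (forward: Handler) => Handler;

// Compose so that the first exchange in the array sees the operation first.
function composeExchanges(exchanges: SimpleExchange[], terminal: Handler): Handler {
  return exchanges.reduceRight<Handler>((next, exchange) => exchange(next), terminal);
}

// Hypothetical auth step: attach a bearer token, then hand off to the next step.
const attachToken =
  (getToken: () => string): SimpleExchange =>
  (forward) =>
  (op) =>
    forward({ ...op, headers: { ...op.headers, Authorization: `Bearer ${getToken()}` } });

// Hypothetical terminal step standing in for the network call.
const fakeFetch: Handler = (op) => ({
  status: op.headers.Authorization ? 200 : 401,
  seenHeaders: op.headers,
});

const handle = composeExchanges([attachToken(() => 'demo-token')], fakeFetch);
const result = handle({ query: '{ me { id } }', headers: {} });
console.log(result.status); // 200
```

Because the auth step sits in front of the network step, it is the natural place to decide whether an operation may proceed at all.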
Defining the Authentication State Machine
The first step is to map out all possible states and transitions. A user’s session can be in many states: checking for an existing session on load, being fully unauthenticated, in the middle of a login redirect, actively authenticated, refreshing an expired token, or in a failed state.
Here is the complete XState machine definition. It’s verbose, but every line represents an explicit business rule, making the system’s behavior transparent.
// src/lib/auth/authMachine.ts
import { createMachine, assign } from 'xstate';
import { UserManager, User } from 'oidc-client-ts';

// OIDC configuration details - should be in environment variables
const oidcConfig = {
  authority: "https://your-identity-provider.com",
  client_id: "your-client-id",
  redirect_uri: "http://localhost:3000/callback",
  post_logout_redirect_uri: "http://localhost:3000/",
  response_type: "code",
  scope: "openid profile email api.read",
};

interface AuthContext {
  userManager: UserManager;
  user?: User | null;
  error?: Error | null;
}

// All possible events that can be sent to the machine
type AuthEvent =
  | { type: 'LOGIN' }
  | { type: 'LOGOUT' }
  | { type: 'CALLBACK_SUCCESS'; user: User }
  | { type: 'CALLBACK_ERROR'; error: Error }
  | { type: 'SESSION_VALID'; user: User }
  | { type: 'SESSION_INVALID' }
  | { type: 'TOKEN_EXPIRED' }
  | { type: 'TOKEN_REFRESHED'; user: User }
  | { type: 'REFRESH_ERROR'; error: Error };

// Using generics for type safety
export const authMachine = createMachine<AuthContext, AuthEvent>({
  id: 'authentication',
  initial: 'initializing',
  context: {
    userManager: new UserManager(oidcConfig),
    user: null,
    error: null,
  },
  states: {
    initializing: {
      invoke: {
        id: 'checkSession',
        src: (context) => async (send) => {
          try {
            // Check for an existing user session silently on startup
            const user = await context.userManager.getUser();
            if (user && !user.expired) {
              send({ type: 'SESSION_VALID', user });
            } else {
              send({ type: 'SESSION_INVALID' });
            }
          } catch (error) {
            console.error("Error checking session:", error);
            send({ type: 'SESSION_INVALID' });
          }
        },
      },
      on: {
        SESSION_VALID: {
          target: 'authenticated',
          actions: assign({ user: (_, event) => event.user }),
        },
        SESSION_INVALID: 'unauthenticated',
      },
    },
    unauthenticated: {
      on: {
        LOGIN: 'redirectingToLogin',
      },
    },
    redirectingToLogin: {
      // The browser navigates away on success, so only the error path is handled
      invoke: {
        id: 'redirectToLogin',
        src: (context) => context.userManager.signinRedirect(),
        onError: {
          target: 'failure',
          actions: assign({ error: (_, event) => event.data }),
        },
      },
    },
    // Note: no transition in this machine targets handlingCallback. The diagram
    // treats it as a second entry point, so the callback route has to enter it
    // by some means outside this definition.
    handlingCallback: {
      invoke: {
        id: 'handleCallback',
        src: (context) => async (send) => {
          try {
            const user = await context.userManager.signinRedirectCallback();
            send({ type: 'CALLBACK_SUCCESS', user });
          } catch (error) {
            send({ type: 'CALLBACK_ERROR', error: error as Error });
          }
        },
      },
      on: {
        CALLBACK_SUCCESS: {
          target: 'authenticated',
          actions: assign({ user: (_, event) => event.user }),
        },
        CALLBACK_ERROR: {
          target: 'failure',
          actions: assign({ error: (_, event) => event.error }),
        },
      },
    },
    authenticated: {
      on: {
        LOGOUT: 'loggingOut',
        TOKEN_EXPIRED: 'refreshingToken',
      },
    },
    refreshingToken: {
      // No TOKEN_EXPIRED handler here: duplicate expiry events are dropped
      invoke: {
        id: 'refreshToken',
        src: (context) => async (send) => {
          try {
            // This attempts a silent sign-in to get a new token
            const user = await context.userManager.signinSilent();
            if (user) {
              send({ type: 'TOKEN_REFRESHED', user });
            } else {
              // This case can happen if the refresh token is also expired
              send({ type: 'REFRESH_ERROR', error: new Error('Silent sign-in failed.') });
            }
          } catch (error) {
            send({ type: 'REFRESH_ERROR', error: error as Error });
          }
        },
      },
      on: {
        TOKEN_REFRESHED: {
          target: 'authenticated',
          actions: assign({ user: (_, event) => event.user }),
        },
        REFRESH_ERROR: {
          // If refresh fails, the session is truly over.
          target: 'unauthenticated',
          actions: [
            assign({ user: (_) => null, error: (_, event) => event.error }),
            // Optional: clear stale storage
            (context) => context.userManager.removeUser(),
          ],
        },
      },
    },
    loggingOut: {
      invoke: {
        id: 'logout',
        src: (context) => context.userManager.signoutRedirect(),
        onError: {
          target: 'failure',
          actions: assign({ error: (_, event) => event.data }),
        },
      },
    },
    failure: {
      // A terminal state for unrecoverable errors during the auth process
      type: 'final',
    },
  },
});
This state machine is now the single source of truth. Notice how error handling is a first-class citizen; states like failure and transitions like REFRESH_ERROR are explicit. There are no race conditions here because the machine can only be in one state at a time. A TOKEN_EXPIRED event arriving while in the refreshingToken state will be ignored, solving our primary problem.
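The "ignored event" behavior can be illustrated without XState at all. The following is a minimal sketch covering only the refresh-related slice of the machine as a lookup table; the transitions table, transition, and countRefreshes are hypothetical helpers written for this illustration.

```typescript
type AuthState = 'authenticated' | 'refreshingToken' | 'unauthenticated';
type AuthEventType = 'TOKEN_EXPIRED' | 'TOKEN_REFRESHED' | 'REFRESH_ERROR';

// Only the refresh-related slice of the machine, expressed as a lookup table.
const transitions: Record<AuthState, Partial<Record<AuthEventType, AuthState>>> = {
  authenticated: { TOKEN_EXPIRED: 'refreshingToken' },
  refreshingToken: { TOKEN_REFRESHED: 'authenticated', REFRESH_ERROR: 'unauthenticated' },
  unauthenticated: {},
};

// An event with no entry for the current state leaves the state unchanged.
function transition(state: AuthState, event: AuthEventType): AuthState {
  return transitions[state][event] ?? state;
}

// Replay a sequence of events and count how many refreshes were actually started.
function countRefreshes(
  events: AuthEventType[],
  start: AuthState = 'authenticated'
): { final: AuthState; refreshesStarted: number } {
  let state = start;
  let refreshesStarted = 0;
  for (const event of events) {
    const next = transition(state, event);
    if (next === 'refreshingToken' && state !== 'refreshingToken') {
      refreshesStarted += 1;
    }
    state = next;
  }
  return { final: state, refreshesStarted };
}

console.log(
  countRefreshes(['TOKEN_EXPIRED', 'TOKEN_EXPIRED', 'TOKEN_EXPIRED', 'TOKEN_REFRESHED'])
); // { final: 'authenticated', refreshesStarted: 1 }
```

Three expiry signals arriving back to back start exactly one refresh, which is the property the flag-based code could not guarantee.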
The Gatekeeper: urql Auth Exchange
With the state machine defined, we need to connect it to our GraphQL client. The urql exchange is the perfect place. It will intercept every operation, consult the state machine, and act accordingly.
The logic is non-trivial:
- On receiving an operation, check the auth machine’s state.
- If authenticated, add the Authorization header and forward the operation.
- If unauthenticated (and the operation is not public), we could either fail it immediately or queue it pending a login. For simplicity, we’ll fail it.
- If the state is refreshingToken, we must queue the operation. It cannot proceed until the token is refreshed.
- If an operation returns a 401 error, we must dispatch the TOKEN_EXPIRED event to our machine and hold onto the failed operation for a potential retry.
This requires careful management of asynchronous flows and operation buffers.
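Before dealing with streams, the rules listed above can be condensed into two pure functions. This is a sketch under assumptions: decide and shouldRequestRefresh are hypothetical names, and queuing during initializing is an extra choice not spelled out in the list.

```typescript
type GateState =
  | 'initializing'
  | 'authenticated'
  | 'refreshingToken'
  | 'unauthenticated'
  | 'failure';
type Decision = 'forward' | 'forwardWithToken' | 'queue' | 'fail';

// What to do with an incoming operation, given the machine's current state.
function decide(state: GateState, requiresAuth: boolean): Decision {
  if (!requiresAuth) return 'forward'; // public operations bypass the gate
  switch (state) {
    case 'authenticated':
      return 'forwardWithToken';
    case 'initializing': // assumption: wait for the session check to finish
    case 'refreshingToken':
      return 'queue';
    case 'unauthenticated':
    case 'failure':
      return 'fail';
  }
}

// Whether a response should make us send TOKEN_EXPIRED to the machine.
function shouldRequestRefresh(httpStatus: number, state: GateState): boolean {
  return httpStatus === 401 && state !== 'refreshingToken';
}

console.log(decide('refreshingToken', true)); // queue
console.log(shouldRequestRefresh(401, 'authenticated')); // true
```

Keeping the decision separate from the stream plumbing makes the policy itself trivially unit-testable.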
// src/lib/graphql/authExchange.ts
import { CombinedError, Exchange, Operation, OperationResult, makeOperation } from '@urql/core';
import { filter, makeSubject, map, merge, pipe } from 'wonka';
import { InterpreterFrom } from 'xstate';
import { authMachine } from '../auth/authMachine';

// Type of the running auth service (XState v4 interpreter)
type AuthActor = InterpreterFrom<typeof authMachine>;

// Operations opt out of authentication with `context: { auth: false }`
const requiresAuth = (operation: Operation): boolean =>
  operation.kind !== 'teardown' && operation.context.auth !== false;

// A 401 response or an UNAUTHENTICATED GraphQL error counts as an auth failure
const isAuthError = (error?: CombinedError): boolean =>
  !!error &&
  (error.response?.status === 401 ||
    error.graphQLErrors.some((e) => e.extensions?.code === 'UNAUTHENTICATED'));

// Error result for an operation that never reached the network
const unauthenticatedResult = (operation: Operation): OperationResult => ({
  operation,
  data: undefined,
  error: new CombinedError({ networkError: new Error('Not authenticated') }),
  stale: false,
  hasNext: false,
});

export const authExchange = (authActor: AuthActor): Exchange => {
  return ({ forward }) => {
    // Operations waiting for the machine to settle (session check or token refresh)
    let operationBuffer: Operation[] = [];
    // Keys of operations already retried once, so a second 401 is surfaced instead of looping
    const retriedKeys = new Set<number>();
    const retry = makeSubject<Operation>();
    const failed = makeSubject<OperationResult>();

    const takeBuffer = (): Operation[] => {
      const pending = operationBuffer;
      operationBuffer = [];
      return pending;
    };

    // Attach the current access token to the operation's fetch options
    const withToken = (operation: Operation): Operation => {
      const token = authActor.getSnapshot().context.user?.access_token;
      const fetchOptions =
        typeof operation.context.fetchOptions === 'function'
          ? operation.context.fetchOptions()
          : operation.context.fetchOptions || {};
      return makeOperation(operation.kind, operation, {
        ...operation.context,
        fetchOptions: {
          ...fetchOptions,
          headers: { ...fetchOptions.headers, Authorization: `Bearer ${token}` },
        },
      });
    };

    // React to state changes of the auth machine
    authActor.subscribe((state) => {
      if (state.matches('authenticated')) {
        // A token is available (login, session restore, or refresh): replay the buffer
        takeBuffer().forEach((operation) => retry.next(operation));
      } else if (state.matches('unauthenticated') || state.matches('failure')) {
        // No token will arrive: fail buffered operations instead of letting them hang
        takeBuffer().forEach((operation) => {
          retriedKeys.delete(operation.key);
          failed.next(unauthenticatedResult(operation));
        });
      }
    });

    // The main exchange logic: forward is called once, on a single gated stream
    return (ops$) => {
      const gated$ = pipe(
        merge([retry.source, ops$]),
        filter((operation) => {
          if (operation.kind === 'teardown') {
            // The caller lost interest: drop any parked copy
            operationBuffer = operationBuffer.filter((op) => op.key !== operation.key);
            return true;
          }
          if (!requiresAuth(operation)) return true;
          const state = authActor.getSnapshot();
          if (state.matches('authenticated')) return true;
          if (state.matches('unauthenticated') || state.matches('failure')) {
            failed.next(unauthenticatedResult(operation));
          } else {
            // initializing, refreshingToken, etc.: wait for the outcome
            operationBuffer.push(operation);
          }
          return false;
        }),
        map((operation) => (requiresAuth(operation) ? withToken(operation) : operation))
      );

      const results$ = pipe(
        forward(gated$),
        filter((result) => {
          const { operation } = result;
          if (!requiresAuth(operation) || !isAuthError(result.error)) {
            retriedKeys.delete(operation.key);
            return true;
          }
          if (retriedKeys.has(operation.key)) {
            // Already retried with a fresh token: surface the error
            retriedKeys.delete(operation.key);
            return true;
          }
          // First 401: park the operation and ask the machine to refresh.
          // The machine ignores TOKEN_EXPIRED while it is already refreshing.
          retriedKeys.add(operation.key);
          operationBuffer.push(operation);
          authActor.send({ type: 'TOKEN_EXPIRED' });
          return false;
        })
      );

      return merge([failed.source, results$]);
    };
  };
};
Note: The error handling for failing buffered operations is complex within Wonka streams. The provided code gives a functional path, but a production system might require a more sophisticated mechanism to propagate errors back to the UI from the buffer.
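One way to make parked operations fail explicitly, independent of any stream library, is to store completion callbacks next to each buffered entry. The sketch below is illustrative only: PendingQueue, park, releaseAll, and failAll are hypothetical names, and it uses plain callbacks rather than Wonka sources.

```typescript
interface PendingOperation<T> {
  operation: T;
  onReady: (operation: T) => void; // called once a fresh token is available
  onFail: (error: Error) => void; // called when the refresh fails
}

class PendingQueue<T> {
  private entries: PendingOperation<T>[] = [];

  get size(): number {
    return this.entries.length;
  }

  park(entry: PendingOperation<T>): void {
    this.entries.push(entry);
  }

  // Swap the buffer out before iterating so callbacks may safely park again.
  private drain(): PendingOperation<T>[] {
    const drained = this.entries;
    this.entries = [];
    return drained;
  }

  releaseAll(): void {
    for (const entry of this.drain()) entry.onReady(entry.operation);
  }

  failAll(error: Error): void {
    for (const entry of this.drain()) entry.onFail(error);
  }
}

// Usage: two operations are parked during a refresh that then fails.
const queue = new PendingQueue<string>();
const outcomes: string[] = [];
for (const name of ['GetProfile', 'GetOrders']) {
  queue.park({
    operation: name,
    onReady: (op) => outcomes.push(`retry ${op}`),
    onFail: (err) => outcomes.push(`${name} failed: ${err.message}`),
  });
}
queue.failAll(new Error('refresh failed'));
console.log(outcomes); // [ 'GetProfile failed: refresh failed', 'GetOrders failed: refresh failed' ]
```

With this shape, no caller is left waiting on a timeout: every parked operation is either replayed or rejected.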
Assembling the Application
Now we wire everything together. We create a single AuthService instance and pass its actor to our urql client configuration.
// src/main.tsx or App.tsx
import { createClient, Provider, fetchExchange } from 'urql';
import { interpret } from 'xstate';
import { authMachine } from './lib/auth/authMachine';
import { authExchange } from './lib/graphql/authExchange';

// 1. Create and start the authentication state machine service.
//    Exported so that components can subscribe to it.
export const authService = interpret(authMachine).start();

// 2. Create the auth exchange, passing the service actor
const customAuthExchange = authExchange(authService);

// 3. Create the urql client with the auth exchange at the front of the pipeline
const client = createClient({
  url: 'https://your-api.com/graphql',
  exchanges: [
    // The auth exchange runs before any other exchange, including cache and fetch.
    customAuthExchange,
    // Add other exchanges like cache, etc.
    fetchExchange,
  ],
});

// Provide the client and service to your app
// ...
A React component can now consume the state directly from the service, ensuring the UI is always perfectly synchronized with the authentication state.
// src/components/Profile.tsx
import { useActor } from '@xstate/react';
import { authService } from '../main'; // Assuming authService is exported

export const Profile = () => {
  const [state, send] = useActor(authService);
  const { user } = state.context;

  if (state.matches('initializing') || state.matches('handlingCallback')) {
    return <div>Loading session...</div>;
  }

  if (state.matches('authenticated')) {
    return (
      <div>
        <h1>Welcome, {user?.profile.name}</h1>
        <button onClick={() => send('LOGOUT')}>Log Out</button>
        {/* Your GraphQL components would go here */}
      </div>
    );
  }

  return <button onClick={() => send('LOGIN')}>Log In</button>;
};
Testing the Untestable
One of the most significant wins of this approach is testability. We can test the entire authentication flow without a browser or a real identity provider. Using @xstate/test, we can create model-based tests to verify all paths through our machine.
// src/lib/auth/authMachine.test.ts
import { createModel } from '@xstate/test';
import { authMachine } from './authMachine';

const testMachine = authMachine.withContext({
  // Mock the userManager to control its behavior in tests
  userManager: {
    // ... mock implementations of signinRedirect, getUser, signinSilent etc.
    getUser: jest.fn().mockResolvedValue(null),
    signinSilent: jest.fn().mockResolvedValue({ /* mock user object */ }),
  } as any,
  user: null,
  error: null,
});

const authModel = createModel(testMachine).withEvents({
  TOKEN_EXPIRED: {
    exec: async () => {
      // No-op for the test, just triggering the event
    },
  },
  TOKEN_REFRESHED: {
    exec: async () => {},
    cases: [{ user: { /* mock user */ } }],
  },
  // ... other events
});

describe('authentication machine', () => {
  const testPlans = authModel.getSimplePathPlans();

  testPlans.forEach((plan) => {
    describe(plan.description, () => {
      it(plan.description, async () => {
        // `plan.test` executes the test case
        await plan.test({});
      });
    });
  });

  it('should have full coverage', () => {
    return authModel.testCoverage();
  });
});
This allows us to prove, for instance, that receiving a TOKEN_EXPIRED event while in the authenticated state correctly transitions the machine to refreshingToken, and a subsequent TOKEN_REFRESHED event returns it to authenticated with an updated user context. This was impossible with our old flag-based system.
Here is a visualization of the state machine we built, which XState’s tooling can generate automatically. This clarity is invaluable for onboarding new developers and reasoning about system behavior.
stateDiagram-v2
    [*] --> initializing
    initializing --> unauthenticated: SESSION_INVALID
    initializing --> authenticated: SESSION_VALID
    unauthenticated --> redirectingToLogin: LOGIN
    redirectingToLogin --> failure: onError
    [*] --> handlingCallback
    handlingCallback --> authenticated: CALLBACK_SUCCESS
    handlingCallback --> failure: CALLBACK_ERROR
    authenticated --> loggingOut: LOGOUT
    authenticated --> refreshingToken: TOKEN_EXPIRED
    refreshingToken --> authenticated: TOKEN_REFRESHED
    refreshingToken --> unauthenticated: REFRESH_ERROR
    loggingOut --> failure: onError
This architectural shift was not merely a code change; it was a paradigm shift in how we handle client-side state. By modeling the complex, asynchronous nature of security as a formal state machine, we eliminated an entire class of bugs related to race conditions and inconsistent state. The system is now predictable, resilient, and, most importantly, debuggable.
The solution is not without its trade-offs. It introduces XState as a new dependency, which has a learning curve. The authExchange itself contains complex logic that must be thoroughly tested. Furthermore, this pattern primarily addresses token lifecycle management; it does not solve token storage security (e.g., mitigating XSS attacks on tokens in localStorage), which remains a separate and critical concern. Future iterations could explore using service workers to move token refreshing off the main thread entirely, providing even greater resilience against application freezes or complex multi-tab scenarios.