The operational disconnect between a data science team’s model registry and the front-end interfaces that consume its models is a common source of friction. In our case, the process was manual: a model would be promoted to “Production” in MLflow, followed by a ticket to the platform team. That team would then manually update a simple JavaScript-based dashboard that displayed the model’s key performance indicators, artifact locations, and version. This manual handoff was not just slow; it was a reliable source of human error, leading to mismatched versions and stale information being presented to stakeholders.
Our objective was to create a zero-touch pipeline. When a specific model version is designated for production, an automated workflow should handle the entire lifecycle: fetching metadata from MLflow, building a fresh static front-end bundle with this new data, and deploying it to the existing fleet of virtual machines that host our internal tools. The existing infrastructure is managed by Puppet, a constraint we must engineer around, not replace.
The Orchestration Core: GitHub Actions Workflow
The entire process is orchestrated by a single GitHub Actions workflow. We opted for a workflow_dispatch trigger initially, allowing a team member to manually trigger a deployment for a specific model. This provides a control point before we evolve to a fully event-driven system based on MLflow webhooks.
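As a preview of that evolution, the sketch below shows the shape such a bridge could take: a small web service that receives a registry webhook and calls GitHub's workflow-dispatch REST endpoint (the same endpoint that backs the manual trigger). Flask, the repository name, the GITHUB_TOKEN variable, and the webhook payload field names are all illustrative assumptions, not part of the current pipeline.

# webhook_bridge.py - hypothetical sketch of the future event-driven trigger
import os

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)

# POST /repos/{owner}/{repo}/actions/workflows/{workflow_file}/dispatches is
# the standard GitHub REST endpoint behind the workflow_dispatch trigger.
DISPATCH_URL = (
    "https://api.github.com/repos/our-org/model-viewer"  # hypothetical repo
    "/actions/workflows/deploy-model-viewer.yml/dispatches"
)

@app.route("/mlflow-webhook", methods=["POST"])
def handle_webhook():
    payload = request.get_json(force=True)
    # Field names depend on the webhook provider; adjust to the real payload.
    if payload.get("to_stage") != "Production":
        return jsonify({"status": "ignored"}), 200
    resp = requests.post(
        DISPATCH_URL,
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        json={
            "ref": "main",
            "inputs": {
                "model_name": payload["model_name"],
                "model_version": str(payload["version"]),
            },
        },
        timeout=10,
    )
    resp.raise_for_status()  # GitHub returns 204 No Content on success
    return jsonify({"status": "dispatched"}), 202

if __name__ == "__main__":
    app.run(port=8080)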
The workflow is broken down into distinct jobs for clarity and potential parallelization: fetch-model-metadata, build-frontend, and trigger-deployment.
# .github/workflows/deploy-model-viewer.yml
name: Deploy ML Model Viewer

on:
  workflow_dispatch:
    inputs:
      model_name:
        description: 'The name of the MLflow model to deploy'
        required: true
        default: 'prod-recommendation-engine'
      model_version:
        description: 'The version of the model to deploy (e.g., 3)'
        required: true

env:
  MLFLOW_TRACKING_URI: ${{ secrets.MLFLOW_TRACKING_URI }}
  MLFLOW_TRACKING_USERNAME: ${{ secrets.MLFLOW_TRACKING_USERNAME }}
  MLFLOW_TRACKING_PASSWORD: ${{ secrets.MLFLOW_TRACKING_PASSWORD }}
  ARTIFACT_NAME: model-viewer-dist
  PYTHON_VERSION: '3.9'

jobs:
  fetch-model-metadata:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}
      - name: Install Python dependencies
        run: pip install -r ./scripts/requirements.txt
      - name: Fetch and write metadata
        id: fetch
        run: |
          python ./scripts/fetch_mlflow_metadata.py \
            --model-name "${{ github.event.inputs.model_name }}" \
            --model-version "${{ github.event.inputs.model_version }}" \
            --output-file "model_metadata.json"
      - name: Upload metadata artifact
        uses: actions/upload-artifact@v3
        with:
          name: model-metadata
          path: model_metadata.json

  build-frontend:
    needs: fetch-model-metadata
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: Download metadata artifact
        uses: actions/download-artifact@v3
        with:
          name: model-metadata
          path: ./public
      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install NPM dependencies
        run: npm install
      - name: Build static site
        # The build script reads ./public/model_metadata.json
        run: npm run build
      - name: Upload distributable artifact
        uses: actions/upload-artifact@v3
        with:
          name: ${{ env.ARTIFACT_NAME }}
          path: ./dist

  trigger-deployment:
    needs: build-frontend
    runs-on: ubuntu-latest
    # This job requires more complex secrets for git operations and artifact storage
    steps:
      # Checkout first so ./scripts/update_hiera.sh is available to later steps
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: Download distributable artifact
        uses: actions/download-artifact@v3
        with:
          name: ${{ env.ARTIFACT_NAME }}
          path: ./dist
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}
      - name: Upload artifact to S3
        id: upload_s3
        run: |
          TIMESTAMP=$(date +%s)
          VERSION="${{ github.event.inputs.model_name }}-${{ github.event.inputs.model_version }}-${TIMESTAMP}"
          aws s3 cp ./dist "s3://${{ secrets.S3_ARTIFACT_BUCKET }}/model-viewer/${VERSION}" --recursive
          echo "artifact_version=${VERSION}" >> "$GITHUB_OUTPUT"
      - name: Checkout Hiera data repository
        uses: actions/checkout@v3
        with:
          repository: 'our-org/puppet-hiera-data'
          token: ${{ secrets.HIERA_REPO_PAT }}
          path: 'hiera-data'
      - name: Update Hiera configuration
        run: |
          cd hiera-data
          # This script updates the YAML file with the new S3 path
          ../scripts/update_hiera.sh \
            "common.yaml" \
            "model_viewer::artifact_version" \
            "${{ steps.upload_s3.outputs.artifact_version }}"
      - name: Commit and push Hiera changes
        run: |
          cd hiera-data
          git config --global user.name "GitHub Actions Bot"
          git config --global user.email "actions@github.com"
          git add .
          git commit -m "Automated deployment of model-viewer ${{ steps.upload_s3.outputs.artifact_version }}" || echo "No changes to commit"
          git push
Bridging the Gap: The MLflow Client Script
A critical component is the Python script responsible for communicating with the MLflow Tracking Server. This script cannot be trivial; it must handle authentication, potential network issues, and cases where the requested model or version does not exist. It authenticates using environment variables populated by GitHub secrets. Its sole job is to query the MLflow API and serialize the required data into a structured JSON file that the front-end build process can consume without any ambiguity.
# scripts/fetch_mlflow_metadata.py
import argparse
import json
import logging
import sys

from mlflow.tracking import MlflowClient
from mlflow.exceptions import RestException

# Basic logging configuration
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')


def get_model_details(client, model_name, model_version):
    """
    Fetches comprehensive details for a specific model version from MLflow.
    Handles exceptions for missing models or versions.
    """
    try:
        logging.info(f"Fetching version '{model_version}' for model '{model_name}'...")
        version_details = client.get_model_version(name=model_name, version=model_version)

        # We need more than just the version details, like metrics from the parent run.
        run_id = version_details.run_id
        run_data = client.get_run(run_id).data
        metrics = run_data.metrics
        params = run_data.params
        tags = run_data.tags

        # Construct a clean, serializable dictionary for the frontend
        metadata = {
            "model_name": version_details.name,
            "version": version_details.version,
            "stage": version_details.current_stage,
            "description": version_details.description,
            "run_id": version_details.run_id,
            "source_artifact_uri": version_details.source,
            "created_timestamp": version_details.creation_timestamp,
            "last_updated_timestamp": version_details.last_updated_timestamp,
            "metrics": metrics,
            "params": params,
            # Filter out MLflow-internal tags
            "tags": {k: v for k, v in tags.items() if not k.startswith("mlflow.")},
        }
        logging.info(f"Successfully fetched metadata for run ID: {run_id}")
        return metadata
    except RestException as e:
        logging.error(f"Failed to fetch model '{model_name}' version '{model_version}'. Error: {e}")
        # In a CI environment, a non-zero exit code is crucial
        sys.exit(1)
    except Exception as e:
        logging.error(f"An unexpected error occurred: {e}")
        sys.exit(1)


def main():
    parser = argparse.ArgumentParser(description="Fetch MLflow model metadata for frontend consumption.")
    parser.add_argument("--model-name", required=True, help="The registered model name in MLflow.")
    parser.add_argument("--model-version", required=True, help="The model version to fetch.")
    parser.add_argument("--output-file", required=True, help="Path to write the output JSON file.")
    args = parser.parse_args()

    # The MLflow client automatically picks up credentials from environment
    # variables set in the GitHub Actions workflow (MLFLOW_TRACKING_URI, etc.)
    try:
        client = MlflowClient()
    except Exception as e:
        logging.error(f"Failed to initialize MLflow client. Check credentials and URI. Error: {e}")
        sys.exit(1)

    metadata = get_model_details(client, args.model_name, args.model_version)
    if metadata:
        try:
            with open(args.output_file, 'w') as f:
                json.dump(metadata, f, indent=4)
            logging.info(f"Metadata successfully written to {args.output_file}")
        except IOError as e:
            logging.error(f"Failed to write metadata to file {args.output_file}. Error: {e}")
            sys.exit(1)


if __name__ == "__main__":
    main()
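The workflow installs this script's dependencies from ./scripts/requirements.txt, which we haven't shown. At minimum it needs the MLflow client; the exact pin below is illustrative, not prescriptive.

# scripts/requirements.txt (illustrative pin; track the version you actually test against)
mlflow>=2.0,<3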
The JavaScript Front-End: Static Data Injection
The front-end is a simple, static single-page application. To maximize performance and simplify deployment, we avoid making client-side API calls to MLflow. Instead, the metadata is baked into the application at build time: the build reads the model_metadata.json file and resolves it as a static import available to the application code.
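We haven't shown the build configuration itself. Assuming Vite (one of the tools mentioned below), the defaults already cover this contract: files in public/ are copied into the bundle verbatim and JSON imports are resolved at build time. A minimal config with those defaults spelled out, so the hand-off to CI is explicit, might look like this:

// vite.config.js - a minimal sketch assuming Vite; the defaults are written
// out so the contract with the CI workflow (./public in, ./dist out) is visible.
import { defineConfig } from 'vite';

export default defineConfig({
  publicDir: 'public',   // CI downloads model_metadata.json here
  build: {
    outDir: 'dist',      // matches the path uploaded by the build-frontend job
    sourcemap: false,
  },
});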
Here is the core logic within our main application script. It assumes a build tool (like Webpack or Vite) has processed this file.
// src/app.js
// This import is handled by a build tool plugin or a custom build script.
// It effectively reads the JSON file and makes it available as a module.
import modelData from '../public/model_metadata.json';

// Simple error handling for the case where the data might be missing
function renderError(message) {
  const container = document.getElementById('app');
  container.innerHTML = `<div class="error-panel">
    <h2>Failed to Load Model Data</h2>
    <p>${message}</p>
  </div>`;
}

function renderKeyValuePairs(title, dataObject) {
  if (!dataObject || Object.keys(dataObject).length === 0) {
    return '';
  }
  const items = Object.entries(dataObject)
    .map(([key, value]) => `
      <div class="kv-pair">
        <span class="key">${key}</span>
        <span class="value">${typeof value === 'number' ? value.toFixed(4) : value}</span>
      </div>
    `)
    .join('');
  return `
    <div class="data-section">
      <h3>${title}</h3>
      <div class="kv-container">${items}</div>
    </div>
  `;
}

function render() {
  const container = document.getElementById('app');
  if (!container) {
    console.error('Root element #app not found.');
    return;
  }
  if (!modelData || !modelData.model_name) {
    renderError('The model_metadata.json file is either missing or malformed.');
    return;
  }

  // Convert the millisecond epoch timestamp to a readable format
  const updatedDate = new Date(modelData.last_updated_timestamp).toLocaleString();

  const appHTML = `
    <header>
      <h1>Model Viewer: ${modelData.model_name}</h1>
      <span class="version-tag">Version: ${modelData.version}</span>
      <span class="stage-tag stage-${modelData.stage.toLowerCase()}">${modelData.stage}</span>
    </header>
    <main>
      <div class="metadata-card">
        <h2>Details</h2>
        <p class="description">${modelData.description || 'No description provided.'}</p>
        <div class="info-grid">
          <div><strong>Run ID:</strong> <span>${modelData.run_id}</span></div>
          <div><strong>Last Updated:</strong> <span>${updatedDate}</span></div>
        </div>
        <div class="info-grid">
          <div><strong>Artifact Path:</strong> <code class="code-block">${modelData.source_artifact_uri}</code></div>
        </div>
      </div>
      ${renderKeyValuePairs('Metrics', modelData.metrics)}
      ${renderKeyValuePairs('Parameters', modelData.params)}
      ${renderKeyValuePairs('Tags', modelData.tags)}
    </main>
  `;
  container.innerHTML = appHTML;
}

// Ensure the DOM is ready before attempting to render
document.addEventListener('DOMContentLoaded', render);
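For completeness, the render function assumes a host page with a #app root element. A minimal sketch follows; the module script path is an assumption based on a Vite-style entry point.

<!-- index.html - minimal host page assumed by src/app.js -->
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>ML Model Viewer</title>
  </head>
  <body>
    <div id="app"></div>
    <script type="module" src="/src/app.js"></script>
  </body>
</html>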
Styling is handled by PostCSS, which allows us to use modern CSS features while ensuring browser compatibility. The configuration is minimal but effective for our needs: postcss-preset-env enables nesting and other future CSS features, and cssnano minifies the output for production.
// postcss.config.js
module.exports = {
  plugins: {
    'postcss-preset-env': {
      stage: 1,
      features: {
        'nesting-rules': true,
      },
    },
    cssnano: {}, // Minify for production builds
  },
};
An example of a styled component using nesting:
/* src/styles/main.css */
.metadata-card {
  background-color: #f9f9f9;
  border: 1px solid #e1e4e8;
  border-radius: 6px;
  padding: 24px;
  margin-bottom: 20px;

  h2 {
    margin-top: 0;
    border-bottom: 1px solid #ccc;
    padding-bottom: 8px;
  }

  .info-grid {
    display: grid;
    grid-template-columns: 1fr;
    gap: 12px;
    margin-top: 16px;

    @media (min-width: 768px) {
      grid-template-columns: 1fr 1fr;
    }
  }
}
The Puppet Integration: Declarative Deployment
Here lies the most significant architectural constraint. The target nodes are managed by Puppet. A common anti-pattern would be to have the GitHub Action SSH into the nodes and run deployment commands. This is brittle and violates the declarative nature of configuration management.
Instead, we adopt a GitOps-like pattern. The source of truth for our application’s deployed version is not in the CI job but in a Hiera data file stored in a separate Git repository.
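For context, the data repository follows the standard Hiera 5 layout. A minimal hiera.yaml along these lines (our real hierarchy is simplified here to a single level) is what lets Puppet resolve the key from common.yaml:

# hiera.yaml - simplified sketch of the Hiera 5 configuration
---
version: 5
defaults:
  datadir: data
  data_hash: yaml_data
hierarchy:
  - name: "Common defaults"
    path: "common.yaml"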
The trigger-deployment job in our workflow does two things:
- It uploads the built front-end artifact (the dist folder) to an S3 bucket under a unique versioned key (e.g., model-viewer/prod-recommendation-engine-3-1677610000).
- It checks out the puppet-hiera-data repository, programmatically modifies a YAML file to update the model_viewer::artifact_version key, and pushes the change.
A helper script handles the YAML update safely.
#!/bin/bash
# scripts/update_hiera.sh
set -euo pipefail

FILE_PATH=$1
KEY=$2
NEW_VALUE=$3

# A more robust solution would use a proper YAML parser like yq, but for a
# simple top-level key-value, grep/sed is sufficient and has fewer dependencies.
if grep -q "^${KEY}:" "${FILE_PATH}"; then
  # Key exists, update it
  sed -i "s|^${KEY}:.*|${KEY}: \"${NEW_VALUE}\"|" "${FILE_PATH}"
  echo "Updated key '${KEY}' in ${FILE_PATH}"
else
  # Key does not exist, append it
  echo "${KEY}: \"${NEW_VALUE}\"" >> "${FILE_PATH}"
  echo "Added key '${KEY}' to ${FILE_PATH}"
fi
Finally, the Puppet manifest on the server uses this data to ensure the correct version is deployed. It describes the desired state, and Puppet’s agent makes it so during its next run.
# modules/model_viewer/manifests/init.pp
class model_viewer (
  Optional[String] $artifact_version = undef,
  String $install_dir                = '/var/www/html/model-viewer',
  String $artifact_bucket            = 'our-internal-artifacts',
) {
  # Ensure the web server package is installed and the service is running
  # (ensure_packages comes from puppetlabs-stdlib)
  ensure_packages(['nginx'])

  service { 'nginx':
    ensure => running,
    enable => true,
  }

  # Ensure the installation directory exists and has correct permissions
  file { $install_dir:
    ensure => directory,
    owner  => 'www-data',
    group  => 'www-data',
    mode   => '0755',
  }

  # Use the Hiera-provided version to construct the S3 source URL
  $s3_source_path = "s3://${artifact_bucket}/model-viewer/${artifact_version}/"

  # A pitfall here is managing the cleanup of old files. When we deploy a new
  # version, any files from the old version that are no longer present must be
  # removed. `aws s3 sync` with the --delete flag is idempotent and handles this.
  exec { 'sync-model-viewer-from-s3':
    command   => "/usr/local/bin/aws s3 sync '${s3_source_path}' '${install_dir}' --delete --no-progress",
    path      => ['/bin', '/usr/bin'],
    # The shell provider is needed for the command substitution in `onlyif`.
    provider  => shell,
    # This `onlyif` is crucial. It prevents the command from running on every
    # Puppet agent run if the content is already up-to-date. We compare against
    # a VERSION file written by the exec below, which keeps the check reliable
    # on subsequent runs.
    onlyif    => "test \"\$(cat ${install_dir}/VERSION 2>/dev/null)\" != \"${artifact_version}\"",
    logoutput => true,
    require   => File[$install_dir],
    notify    => Exec['write-version-file'],
  }

  # Write a VERSION file to make the deployment state auditable on the machine
  # and to support the idempotency check above.
  exec { 'write-version-file':
    command     => "echo '${artifact_version}' > ${install_dir}/VERSION",
    path        => ['/bin', '/usr/bin'],
    provider    => shell, # Output redirection requires a shell
    refreshonly => true,  # This exec only runs when notified by the sync
  }
}
The data binding happens in Hiera:
# data/common.yaml
---
model_viewer::artifact_version: "prod-recommendation-engine-2-1677510000" # This value gets updated by CI
This architecture creates a clean separation of concerns. The CI/CD system is responsible for building the artifact and signaling intent by updating a data file. The configuration management system is responsible for observing that intent and converging the state of the infrastructure to match it.
graph TD
    subgraph GitHub Actions
        A[Manual Trigger: workflow_dispatch] --> B{Job: fetch-model-metadata};
        B --> C[Python Script: fetch_mlflow_metadata.py];
        C --> D[MLflow API];
        D --> C;
        C --> E[model_metadata.json];
        E --> F{Job: build-frontend};
        F --> G[npm run build];
        G --> H[Static Site Artifact];
        H --> I{Job: trigger-deployment};
    end
    subgraph AWS
        J[S3 Artifact Bucket]
    end
    subgraph Puppet Infra
        K[Hiera Git Repository]
        L[Puppet Server]
        M[Web Server VM]
    end
    I -->|Upload artifact| J;
    I -->|Commit new version| K;
    K -->|Hiera data lookup| L;
    L -->|Agent run| M;
    M -->|Syncs artifact from S3| J;
The primary limitation of this approach is the latency introduced by Puppet’s agent run interval. The deployment is not instantaneous upon the Hiera data update; it occurs during the next scheduled check-in, which could be up to 30 minutes away. For our internal dashboard use case, this delay is acceptable. For a system requiring near-instant deployments, a push-based mechanism using Puppet Bolt or a transition to a more dynamic container orchestration platform would be necessary. Furthermore, the reliance on a shell script to update Hiera YAML is functional but fragile; replacing it with a tool like yq within the runner would improve robustness against complex YAML structures.
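For reference, the yq v4 equivalent of update_hiera.sh's core operation is a one-liner; this assumes mikefarah's yq is installed on the runner:

# The key must be quoted inside the expression because it contains '::'
NEW_VALUE="prod-recommendation-engine-3-1677610000" \
  yq -i '."model_viewer::artifact_version" = strenv(NEW_VALUE)' common.yaml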