A Versioned Shared Kernel for ORM Models Manages Data Consistency Across Sanic Polyrepo Services


The move from a monolith to a polyrepo microservice architecture introduced an immediate and painful problem: managing shared data models. Our initial services, a user management service and an order processing service, both needed a concept of a User. In the monolith, this was a single, unambiguous ORM model. In our new distributed world, it became two separately maintained, increasingly divergent definitions. A change to the user’s data structure in one service would break the other unless a coordinated, manual update was performed. This was not the autonomy we were promised; it was a distributed monolith, the worst of both worlds.

Our first attempt to solve this with git submodules was a disaster. The developer experience was clumsy, and managing submodule versions across different service deployments created a complex dependency graph that was nearly impossible to reason about during an incident. The core issue remained: we were sharing code implicitly, not through a well-defined, versioned contract.

This led to the decision to adopt a “Shared Kernel” pattern, a concept from Domain-Driven Design. The plan was to extract the truly shared, stable parts of our domain—specifically, the ORM models and their associated data contracts—into a dedicated, versioned Python package. This package would become an internal library, a formal dependency for any service that needed to interact with these core entities. It wouldn’t contain business logic, which must remain within service boundaries, but it would provide a single source of truth for the database schema of shared tables.

For our stack, this meant creating a central library containing tortoise-orm models. The consumer services, built with Sanic for its async performance, would install this library like any other dependency (e.g., pip install our-shared-kernel==1.2.0). This approach forces developers to make conscious, deliberate decisions about when to adopt a change from the kernel by updating a version number in their pyproject.toml. It turns an implicit, dangerous coupling into an explicit, manageable dependency.

Here is the architectural flow we settled on:

graph TD
    subgraph Polyrepo Structure
        A[Repo: user-service]
        B[Repo: order-service]
        C[Repo: shared-kernel]
    end

    subgraph "CI/CD and Dependency Flow"
        C -- Publishes Package --> D{Private PyPI Registry}
        D -- "pip install our-shared-kernel==1.0.0" --> A
        D -- "pip install our-shared-kernel==1.0.0" --> B
    end

    subgraph Database Layer
        E[(Shared Database)]
    end

    subgraph Service Layer
        F[Sanic App: user-service]
        G[Sanic App: order-service]
    end

    A --> F
    B --> G

    F -- Manages Migrations & CRUD --> E
    G -- Reads/Writes Data --> E

    style D fill:#f9f,stroke:#333,stroke-width:2px
    style C fill:#ccf,stroke:#333,stroke-width:2px

This diagram illustrates the separation of concerns. The shared-kernel repository has its own lifecycle. When a change is made (e.g., adding a field to the User model), a new version of the package is published. The user-service and order-service are free to upgrade to this new version on their own schedules, ensuring independent deployability. The key, however, is that only one service—the primary owner of that data, in this case, the user-service—is responsible for running the database migrations.

Implementing the Shared Kernel Package

The first step is creating the shared-kernel repository. This is a standard Python package, not a service. Its only purpose is to define and distribute the shared models.

Repository: shared-kernel
File: pyproject.toml

[tool.poetry]
name = "our-shared-kernel"
version = "0.1.0"
description = "Shared ORM models and data contracts for our microservices."
authors = ["The Architect <[email protected]>"]

[tool.poetry.dependencies]
python = "^3.10"
tortoise-orm = "^0.20.0"
pydantic = "^2.0"

[tool.poetry.dev-dependencies]
pytest = "^7.0"
# Add other dev dependencies here

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

The core of the package is the model definitions. We’ll start with a Tenant and a User, as these are common cross-cutting concerns. A pragmatic choice is to include a base model that enforces consistency for primary keys and timestamps.

File: our_shared_kernel/models/base.py

# our_shared_kernel/models/base.py
import uuid
from tortoise import fields, models

class BaseModel(models.Model):
    """
    An abstract base model providing common fields.
    """
    id = fields.UUIDField(pk=True, default=uuid.uuid4)
    created_at = fields.DatetimeField(auto_now_add=True)
    updated_at = fields.DatetimeField(auto_now=True)

    class Meta:
        abstract = True

Now, we define the concrete models that inherit from our base.

File: our_shared_kernel/models/tenant.py

# our_shared_kernel/models/tenant.py
from tortoise import fields
from .base import BaseModel

class Tenant(BaseModel):
    """
    Represents a tenant in our multi-tenant system.
    """
    name = fields.CharField(max_length=100, unique=True)
    is_active = fields.BooleanField(default=True)

    def __str__(self):
        return self.name

    class Meta:
        table = "tenants"

File: our_shared_kernel/models/user.py

# our_shared_kernel/models/user.py
from tortoise import fields
from .base import BaseModel

class User(BaseModel):
    """
    Represents a system user, belonging to a tenant.
    """
    email = fields.CharField(max_length=255, unique=True)
    hashed_password = fields.CharField(max_length=255)
    is_active = fields.BooleanField(default=True)
    
    tenant = fields.ForeignKeyField(
        "models.Tenant", related_name="users", on_delete=fields.CASCADE
    )

    def __str__(self):
        return self.email

    class Meta:
        table = "users"

To make these models easily discoverable by consuming services, we create an __init__.py that exposes them.

File: our_shared_kernel/models/__init__.py

# our_shared_kernel/models/__init__.py
from .tenant import Tenant
from .user import User

__all__ = ["Tenant", "User"]

Finally, we also package Pydantic schemas for API data contracts. This ensures that the data structures used in HTTP requests and responses are also consistent with the database models.

File: our_shared_kernel/schemas.py

# our_shared_kernel/schemas.py
import uuid
from datetime import datetime
from pydantic import BaseModel, EmailStr, ConfigDict

# --- Tenant Schemas ---
class TenantBase(BaseModel):
    name: str

class TenantCreate(TenantBase):
    pass

class TenantSchema(TenantBase):
    model_config = ConfigDict(from_attributes=True)
    
    id: uuid.UUID
    is_active: bool
    created_at: datetime

# --- User Schemas ---
class UserBase(BaseModel):
    email: EmailStr

class UserCreate(UserBase):
    password: str
    tenant_id: uuid.UUID

class UserSchema(UserBase):
    model_config = ConfigDict(from_attributes=True)

    id: uuid.UUID
    is_active: bool
    tenant_id: uuid.UUID
    created_at: datetime

With this structure, we can build and publish version 0.1.0 of our-shared-kernel to our private package repository.
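
Before a new version is published, a lightweight contract test inside the kernel repository can catch drift between the ORM models and the Pydantic schemas. Below is a minimal sketch, not part of the package itself: it assumes an async-capable pytest setup (e.g., pytest-asyncio) and uses an in-memory SQLite database; the file name and assertions are illustrative.

# tests/test_contract.py -- illustrative; assumes an async pytest plugin is configured
import pytest
from tortoise import Tortoise

from our_shared_kernel.models import Tenant, User
from our_shared_kernel.schemas import UserSchema

@pytest.fixture
async def db():
    # An in-memory SQLite database keeps the kernel's tests self-contained.
    await Tortoise.init(
        db_url="sqlite://:memory:",
        modules={"models": ["our_shared_kernel.models"]},
    )
    await Tortoise.generate_schemas()
    yield
    await Tortoise.close_connections()

async def test_user_schema_reads_from_orm_model(db):
    tenant = await Tenant.create(name="acme")
    user = await User.create(
        email="alice@example.com",
        hashed_password="not-a-real-hash",
        tenant=tenant,
    )
    # from_attributes lets the API contract be built straight from the ORM object.
    payload = UserSchema.model_validate(user).model_dump(mode="json")
    assert payload["email"] == "alice@example.com"
    assert payload["tenant_id"] == str(tenant.id)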

Implementing the User Service

The user-service is the primary owner of the User and Tenant entities. It will handle their creation, modification, and, critically, their database migrations.

Repository: user-service
File: pyproject.toml

[tool.poetry]
name = "user-service"
version = "0.1.0"
description = "Manages users and tenants."
authors = ["The Architect <[email protected]>"]

[tool.poetry.dependencies]
python = "^3.10"
sanic = "^23.6"
sanic-ext = "^23.6"
tortoise-orm = "^0.20.0"
aerich = "^0.7.2" # For migrations
pydantic = "^2.0"
# This is the crucial dependency
our-shared-kernel = { version = "0.1.0", source = "private-pypi" }

# [[tool.poetry.source]]
# name = "private-pypi"
# url = "https://your-private-pypi/simple/"
# priority = "primary"

# ... other dependencies

The service configuration needs to define the database connection and tell Tortoise ORM where to find the models. The models are not in this repository; they are inside the installed our_shared_kernel package.

File: user_service/config.py

# user_service/config.py
import os

class AppConfig:
    DB_USER = os.environ.get("DB_USER", "postgres")
    DB_PASSWORD = os.environ.get("DB_PASSWORD", "password")
    DB_HOST = os.environ.get("DB_HOST", "localhost")
    DB_PORT = os.environ.get("DB_PORT", "5432")
    DB_NAME = os.environ.get("DB_NAME", "micro_db")
    
    DATABASE_URL = f"postgres://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}"
    
    # This is the key part for Tortoise ORM initialization.
    # It points to the location of models inside the installed package.
    TORTOISE_ORM_CONFIG = {
        "connections": {"default": DATABASE_URL},
        "apps": {
            "models": {
                "models": ["our_shared_kernel.models", "aerich.models"],
                "default_connection": "default",
            },
        },
    }

# Module-level alias so Aerich can load the config from a dotted path
# ("user_service.config.TORTOISE_ORM"): Aerich imports everything before the
# last dot as a module and reads the final attribute from it.
TORTOISE_ORM = AppConfig.TORTOISE_ORM_CONFIG

We configure aerich for migrations. The aerich release pinned above keeps its settings in a [tool.aerich] section of pyproject.toml (written by aerich init) rather than a standalone ini file. This configuration lives only in the user-service. The order-service will not and must not have migration tooling for these shared tables.

File: pyproject.toml ([tool.aerich] section)

[tool.aerich]
tortoise_orm = "user_service.config.TORTOISE_ORM"
location = "./migrations"
src_folder = "./."

Now, the Sanic application itself. The register_tortoise helper from tortoise.contrib.sanic wires up the Sanic listeners that initialize the ORM on startup and close its connections on shutdown.

File: user_service/main.py

# user_service/main.py
import logging
import uuid
from sanic import Sanic, response, Request
from sanic.exceptions import NotFound, SanicException
from sanic_ext import validate
from tortoise.contrib.sanic import register_tortoise
from tortoise.exceptions import IntegrityError

from our_shared_kernel.models import User, Tenant
from our_shared_kernel.schemas import UserCreate, UserSchema, TenantCreate, TenantSchema

from .config import AppConfig

# Basic logging setup
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

app = Sanic("UserService")
app.config.update(AppConfig.TORTOISE_ORM_CONFIG)

# Register Tortoise ORM with the Sanic app
register_tortoise(
    app,
    config=AppConfig.TORTOISE_ORM_CONFIG,
    generate_schemas=False # We use Aerich for migrations, so we don't generate schemas on startup.
)

@app.post("/tenants")
@validate(json=TenantCreate)
async def create_tenant(request: Request, body: TenantCreate) -> response.JSONResponse:
    """Create a new tenant."""
    try:
        tenant = await Tenant.create(**body.model_dump())
        logger.info(f"Created tenant '{tenant.name}' with id {tenant.id}")
        # The schema from the shared kernel is used for the response;
        # mode="json" converts UUIDs and datetimes into JSON-safe strings.
        return response.json(TenantSchema.model_validate(tenant).model_dump(mode="json"), status=201)
    except IntegrityError:
        # A common mistake is not handling database constraints properly.
        # This provides a clear error message.
        logger.warning(f"Attempted to create duplicate tenant with name '{body.name}'")
        raise SanicException(f"Tenant with name '{body.name}' already exists.", status_code=409)

@app.post("/users")
@validate(json=UserCreate)
async def create_user(request: Request, body: UserCreate) -> response.JSONResponse:
    """Create a new user for a given tenant."""
    # Production-grade code must validate foreign key existence.
    if not await Tenant.filter(id=body.tenant_id).exists():
        logger.warning(f"Attempted to create user for non-existent tenant '{body.tenant_id}'")
        raise NotFound(f"Tenant with id '{body.tenant_id}' not found.")

    try:
        # Note: Password hashing logic is omitted for brevity but is essential in a real project.
        user = await User.create(
            email=body.email, 
            hashed_password=f"hashed_{body.password}", 
            tenant_id=body.tenant_id
        )
        logger.info(f"Created user '{user.email}' with id {user.id}")
        return response.json(UserSchema.model_validate(user).model_dump(mode="json"), status=201)
    except IntegrityError:
        logger.warning(f"Attempted to create duplicate user with email '{body.email}'")
        raise SanicException(f"User with email '{body.email}' already exists.", status_code=409)


@app.get("/users/<user_id:uuid>")
async def get_user(request: Request, user_id: str) -> response.JSONResponse:
    """Retrieve a single user by their ID."""
    user = await User.get_or_none(id=user_id).prefetch_related("tenant")
    if not user:
        raise NotFound(f"User with id '{user_id}' not found.")
    
    logger.info(f"Retrieved user '{user.email}'")
    return response.json(UserSchema.model_validate(user).model_dump(mode="json"))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8001, debug=True, auto_reload=True)

To initialize the database, we would run aerich init -t user_service.config.TORTOISE_ORM (the dotted path must end at a module-level attribute, hence the alias in config.py) followed by aerich init-db. Any subsequent change to models in shared-kernel requires a new package version, a version bump in user-service's pyproject.toml, and then aerich migrate and aerich upgrade as part of the user-service deployment pipeline.
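
With the schema in place, a quick end-to-end check against a locally running instance confirms the shared contracts behave as expected. This is only a rough sketch: it assumes httpx is installed and the service is listening on port 8001, and the sample data is made up.

# smoke_test.py -- illustrative only, not part of the service
import asyncio
import httpx

async def main() -> None:
    async with httpx.AsyncClient(base_url="http://localhost:8001") as client:
        # Create a tenant, then a user that belongs to it.
        tenant_resp = await client.post("/tenants", json={"name": "acme"})
        tenant = tenant_resp.json()
        user_resp = await client.post(
            "/users",
            json={
                "email": "alice@example.com",
                "password": "s3cret",
                "tenant_id": tenant["id"],
            },
        )
        print(user_resp.status_code, user_resp.json())

if __name__ == "__main__":
    asyncio.run(main())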

Implementing the Order Service

The order-service needs to associate orders with users, but it doesn’t “own” the User entity. It is a consumer of the shared kernel.

Repository: order-service
File: pyproject.toml

[tool.poetry]
name = "order-service"
# ...
[tool.poetry.dependencies]
python = "^3.10"
sanic = "^23.6"
sanic-ext = "^23.6"
tortoise-orm = "^0.20.0"
pydantic = "^2.0"
# It depends on the exact same version of the kernel.
our-shared-kernel = { version = "0.1.0", source = "private-pypi" } 
# ...

This service will have its own models (e.g., Order), but it will also need to reference the User model from the shared kernel. Critically, it does not contain any aerich configuration for the shared models.

File: order_service/models/order.py

# order_service/models/order.py
from tortoise import fields
# For simplicity we re-use the kernel's base model here; in a real project
# this service might define its own.
from our_shared_kernel.models.base import BaseModel

class Order(BaseModel):
    """
    Represents an order placed by a user.
    """
    # This foreign key is the link to the shared kernel.
    # The string reference 'models.User' works because Tortoise ORM will
    # resolve it from the list of registered model paths.
    user = fields.ForeignKeyField("models.User", related_name="orders")
    
    item_description = fields.TextField()
    amount = fields.DecimalField(max_digits=10, decimal_places=2)

    class Meta:
        table = "orders"

The service configuration is similar, but it must register both its own local models and the shared kernel models.

File: order_service/config.py

# order_service/config.py
# ... (database configuration similar to the user-service) ...
class AppConfig:
    # ...
    TORTOISE_ORM_CONFIG = {
        "connections": {"default": DATABASE_URL},
        "apps": {
            "models": {
                # This is the important part: both model locations are registered.
                "models": ["our_shared_kernel.models", "order_service.models", "aerich.models"],
                "default_connection": "default",
            },
        },
    }
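
One detail worth noting: because the config registers the order_service.models package, Tortoise only discovers models that are imported into that package's __init__.py. A minimal sketch, mirroring the kernel's __init__.py:

# order_service/models/__init__.py
from .order import Order

__all__ = ["Order"]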

This service would have its own [tool.aerich] configuration, but its location would point to a different directory (e.g., ./order_migrations) so that it manages only the orders table schema. It must never generate migrations for the users or tenants tables.
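
Aerich itself has no mechanism to enforce this rule, so a small guard test in the order-service CI can fail the build if a generated migration ever touches a shared table. This is a rough sketch: the migrations directory name and the quoted-table-name heuristic are assumptions about the generated SQL, not an aerich feature.

# tests/test_migration_scope.py -- illustrative guard, not an aerich feature
from pathlib import Path

# Tables owned by user-service; order-service must never migrate them.
FORBIDDEN_TABLES = ("users", "tenants")

def test_migrations_do_not_touch_shared_tables():
    migrations_dir = Path(__file__).resolve().parent.parent / "order_migrations"
    for migration in migrations_dir.rglob("*.py"):
        contents = migration.read_text()
        for table in FORBIDDEN_TABLES:
            assert f'"{table}"' not in contents, (
                f"{migration.name} references shared table '{table}'; "
                "shared-kernel tables are migrated only by user-service."
            )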

The Sanic application in the order-service can now use the User model as if it were local.

File: order_service/main.py

# order_service/main.py
import logging
import uuid
from sanic import Sanic, response, Request
from sanic.exceptions import NotFound
from sanic_ext import validate
from tortoise.contrib.sanic import register_tortoise
from pydantic import BaseModel, ConfigDict, Field

from our_shared_kernel.models import User
from .models.order import Order
from .config import AppConfig

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = Sanic("OrderService")
app.config.update(AppConfig.TORTOISE_ORM_CONFIG)

register_tortoise(app, config=AppConfig.TORTOISE_ORM_CONFIG, generate_schemas=False)

# --- API Schemas specific to this service ---
class OrderCreate(BaseModel):
    user_id: uuid.UUID
    item_description: str
    amount: float = Field(gt=0)

class OrderSchema(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: uuid.UUID
    user_id: uuid.UUID
    item_description: str
    amount: float

@app.post("/orders")
@validate(json=OrderCreate)
async def create_order(request: Request, body: OrderCreate) -> response.JSONResponse:
    """Creates a new order for a user."""
    # A pitfall in this design is assuming the referenced entity exists.
    # In a real-world project, this service might need to call the user-service's API
    # or check the database directly to validate the user_id.
    # For this example, we check the DB.
    if not await User.filter(id=body.user_id).exists():
        logger.warning(f"Attempt to create order for non-existent user '{body.user_id}'")
        raise NotFound(f"User with id '{body.user_id}' not found.")
    
    order = await Order.create(
        user_id=body.user_id,
        item_description=body.item_description,
        amount=body.amount
    )
    logger.info(f"Created order '{order.id}' for user '{body.user_id}'")
    
    # Serialize through the service-local schema; mode="json" handles UUIDs.
    return response.json(
        OrderSchema.model_validate(order).model_dump(mode="json"), status=201
    )

@app.get("/users/<user_id:uuid>/orders")
async def get_orders_for_user(request: Request, user_id: uuid.UUID):
    """Retrieve all orders for a specific user."""
    # This query demonstrates the power of the shared ORM model.
    # We can perform joins and lookups across service boundaries at the data layer.
    orders = await Order.filter(user_id=user_id)
    
    logger.info(f"Found {len(orders)} orders for user '{user_id}'")
    
    # Serialize through the service-local OrderSchema defined above.
    result = [OrderSchema.model_validate(o).model_dump(mode="json") for o in orders]
    return response.json(result)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8002, debug=True, auto_reload=True)
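
The comment in get_orders_for_user hints at the larger payoff: because the order-service sees the full relation graph from the shared kernel, it can express queries that traverse entities it does not own. A short illustration follows; the helper function is ours, not part of the service above.

# Anywhere inside the order-service: traverse Order -> User -> Tenant purely
# at the data layer, using the relations defined in the shared kernel.
from order_service.models.order import Order

async def orders_for_tenant(tenant_name: str) -> list[Order]:
    # Double-underscore lookups follow the Order.user and User.tenant foreign keys.
    return await Order.filter(user__tenant__name=tenant_name).prefetch_related("user")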

This architecture provides a structured and maintainable way to handle shared data models in a polyrepo environment. The shared kernel becomes a formal, versioned contract that prevents schema drift. The clear ownership of migration responsibilities prevents conflicts in production.

However, this pattern is not a silver bullet. The biggest risk is that the shared kernel can grow too large, slowly turning back into a monolith of shared code that couples services together. Strict discipline is required to ensure only truly universal and stable entities are placed in the kernel. Anything specific to a single domain’s logic must remain within the service boundary. Furthermore, this approach assumes services can share a database. If services require complete data isolation, a different pattern based on asynchronous event-driven communication would be necessary to synchronize state, which introduces its own set of complexities around eventual consistency.

