
Deep Dive: How djust's ORM JIT Pipeline Serializes Django Models at Rust Speed

djust Team | 10 min read
[Figure: djust's five-stage JIT serialization pipeline, from template extraction to Rust serialization with Python fallback]

Every Django developer has written the same boilerplate: a serialize_post() function that manually plucks fields off a model, calls .all() on M2M relations, and builds a dict. It works. It's also slow and fragile, and it duplicates knowledge that already exists in your templates.

djust's JIT serialization pipeline eliminates that boilerplate entirely. It reads your template, figures out which fields you actually use, generates a custom serializer function, optimizes your database queries, and does most of the heavy lifting in Rust. This post explains the full pipeline, the performance characteristics, and the five fixes we shipped in 0.2.1 to close the remaining gaps.

The Problem with Manual Serialization

Consider a typical blog post view. You need to pass a post object to the template, but Django models aren't JSON-serializable. The traditional approach:

def serialize_post(post):
    return {
        'title': post.title,
        'slug': post.slug,
        'excerpt': post.excerpt,
        'category': {
            'name': post.category.name,  # N+1 query!
            'url': post.category.url,
        },
        'tags': [
            {'name': t.name, 'url': t.url}
            for t in post.tags.all()  # Another N+1!
        ],
        'featured_image_url': post.featured_image_url,  # @property
        'publish_date_formatted': post.publish_date_formatted,  # @property
        'reading_time': post.reading_time,
        'url': post.url,  # @property
    }

This has three problems:

  1. Duplication — the serializer mirrors what the template already declares via {{ post.title }}, {{ post.category.name }}, etc.
  2. N+1 queries — accessing post.category and post.tags.all() without select_related/prefetch_related fires extra queries per post.
  3. Maintenance burden — add a field to the template, forget the serializer, get a blank spot on the page.

How the JIT Pipeline Works

djust replaces manual serialization with a five-stage pipeline that runs automatically when your view renders:

Stage 1: Template Variable Extraction (Rust)

A Rust function (extract_template_variables) parses your Django template and builds a map of every variable access path. For a template containing:

<h1>{{ post.title }}</h1>
<span>{{ post.category.name }}</span>
{% for tag in post.tags.all %}
  <a href="{{ tag.url }}">{{ tag.name }}</a>
{% endfor %}
<time>{{ post.publish_date_formatted }}</time>

It produces:

{
    'post': [
        'title',
        'category.name',
        'tags.all.name',
        'tags.all.url',
        'publish_date_formatted',
    ]
}

This runs in sub-millisecond time because the parser is compiled Rust, exposed to Python through PyO3.
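To make the mapping from template to path list concrete, here is a rough Python approximation. It is not the Rust extract_template_variables implementation: extract_paths and both regexes are hypothetical, and the sketch only understands plain {{ var.path }} expressions and single-variable {% for %} loops, with no filters, nested loop aliases, or scoping.

```python
import re
from collections import defaultdict

# Hypothetical, simplified stand-in for the Rust extractor.
VAR_RE = re.compile(r"\{\{\s*([\w.]+)")
FOR_RE = re.compile(r"\{%\s*for\s+(\w+)\s+in\s+([\w.]+)\s*%\}")


def extract_paths(template):
    # Map each loop variable to the path it iterates over,
    # e.g. "tag" -> "post.tags.all".
    aliases = {m.group(1): m.group(2) for m in FOR_RE.finditer(template)}
    paths = defaultdict(list)
    for match in VAR_RE.finditer(template):
        root, _, rest = match.group(1).partition(".")
        # Rewrite a loop variable back to the context variable it came from.
        if root in aliases:
            root, _, prefix = aliases[root].partition(".")
            rest = f"{prefix}.{rest}" if rest else prefix
        if rest and rest not in paths[root]:
            paths[root].append(rest)
    return dict(paths)


TEMPLATE = """
<h1>{{ post.title }}</h1>
<span>{{ post.category.name }}</span>
{% for tag in post.tags.all %}
  <a href="{{ tag.url }}">{{ tag.name }}</a>
{% endfor %}
<time>{{ post.publish_date_formatted }}</time>
"""

paths = extract_paths(TEMPLATE)
```

The key step is the alias rewrite: tag.url only becomes useful to the later stages once it is expressed as tags.all.url on post.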

Stage 2: Query Optimization

The path list is analyzed to determine which relations need select_related (foreign keys like category) and which need prefetch_related (M2M like tags). The query optimizer rewrites your QuerySet automatically:

# Your code:
self.posts = BlogPost.published()

# What JIT actually executes:
BlogPost.published().select_related('category').prefetch_related('tags')

This collapses hundreds of N+1 queries into 1–3 queries total.
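The classification step can be sketched without Django. In the snippet below, plan_relations and the relation_kinds map are hypothetical; a real optimizer would presumably read relation types from the model's metadata, and this sketch only looks at the first segment of each path.

```python
def plan_relations(paths, relation_kinds):
    """Split template paths into select_related and prefetch_related names.

    relation_kinds is a hypothetical map from a top-level attribute name
    to "fk" or "m2m"; plain fields are simply absent from it.
    """
    select, prefetch = [], []
    for path in paths:
        head = path.split(".", 1)[0]
        kind = relation_kinds.get(head)
        if kind == "fk" and head not in select:
            select.append(head)
        elif kind == "m2m" and head not in prefetch:
            prefetch.append(head)
    return select, prefetch


select, prefetch = plan_relations(
    ["title", "category.name", "tags.all.name", "tags.all.url"],
    {"category": "fk", "tags": "m2m"},
)
# With Django available, the plan would be applied roughly as:
#   queryset.select_related(*select).prefetch_related(*prefetch)
```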

Stage 3: Code Generation

A Python code generator builds a custom serializer function from the path tree. The paths are first organized into a tree structure:

_build_path_tree(['title', 'category.name', 'tags.all.name', 'tags.all.url'])
# Returns:
{
    'title': {},
    'category': {'name': {}},
    'tags': {'all': {'name': {}, 'url': {}}}
}
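A tree like this can be built in a few lines. The function below is a sketch of the idea, not djust's actual _build_path_tree; the name build_path_tree is ours.

```python
def build_path_tree(paths):
    """Turn dotted paths into a nested dict, one level per path segment."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.split("."):
            # Reuse the existing child if present so shared prefixes merge.
            node = node.setdefault(part, {})
    return tree


tree = build_path_tree(["title", "category.name", "tags.all.name", "tags.all.url"])
```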

Then the tree is walked to emit Python code with safety checks at every level:

def serialize_post_a4f8b2(obj):
    result = {}
    if hasattr(obj, 'title') and obj.title is not None:
        result['title'] = obj.title
    if hasattr(obj, 'category') and obj.category is not None:
        result['category'] = {}
        if hasattr(obj.category, 'name') and obj.category.name is not None:
            result['category']['name'] = obj.category.name
    if hasattr(obj, 'tags') and obj.tags is not None:
        # tags.all has subtree {name, url} → iterate
        try:
            result['tags'] = {'all': []}
            for item in obj.tags.all():
                _item_result = {}
                if hasattr(item, 'name'):
                    _item_result['name'] = item.name
                if hasattr(item, 'url'):
                    _item_result['url'] = item.url
                result['tags']['all'].append(_item_result)
        except (TypeError, AttributeError):
            pass
    return result

Notice the key insight: when .all() appears as an intermediate node with children (name, url), the codegen emits a for loop that calls .all() and extracts fields from each item. When .all() is a leaf node (no children), it just calls the method and assigns the result directly.
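A much-reduced code generator shows the leaf-versus-subtree decision. This is a sketch under simplifying assumptions, not djust's codegen: generate_serializer is a hypothetical name, it handles only one level of nesting, and unlike the generated code above it keeps None values for nested fields.

```python
from types import SimpleNamespace


def generate_serializer(func_name, tree):
    """Emit Python source for a serializer built from a path tree."""
    lines = [f"def {func_name}(obj):", "    result = {}"]
    for attr, subtree in tree.items():
        lines.append(f"    value = getattr(obj, {attr!r}, None)")
        lines.append("    if value is not None:")
        if not subtree:
            # Leaf node: copy the value as-is.
            lines.append(f"        result[{attr!r}] = value")
        elif subtree.get("all"):
            # .all with children: call it and extract fields from each item.
            fields = list(subtree["all"])
            lines.append(f"        result[{attr!r}] = {{'all': [")
            lines.append(f"            {{f: getattr(item, f, None) for f in {fields!r}}}")
            lines.append("            for item in value.all()")
            lines.append("        ]}")
        else:
            # Nested object: extract one level of named fields.
            fields = list(subtree)
            lines.append(
                f"        result[{attr!r}] = "
                f"{{f: getattr(value, f, None) for f in {fields!r}}}"
            )
    lines.append("    return result")
    return "\n".join(lines)


class FakeManager:
    """Stand-in for a Django related manager."""

    def all(self):
        return [SimpleNamespace(name="rust", url="/tags/rust/")]


source = generate_serializer(
    "serialize_post",
    {"title": {}, "category": {"name": {}}, "tags": {"all": {"name": {}, "url": {}}}},
)
namespace = {}
exec(source, namespace)  # defines serialize_post in `namespace`

post = SimpleNamespace(
    title="Hello", category=SimpleNamespace(name="Dev"), tags=FakeManager()
)
serialized = namespace["serialize_post"](post)
```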

Stage 4: Compilation & Caching

The generated source is compiled to a code object with compile(), then run with exec() to define the serializer function. That function is cached under the key (template_hash, variable_name, model_hash). On subsequent renders, the pipeline skips stages 1–3 entirely and uses the cached serializer function.

The template hash is itself cached per template content string, so the SHA-256 digest is computed at most once per unique template, no matter how many variables that template uses.
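The two cache layers fit in a short sketch. The cache names mirror the ones mentioned in this post, but the functions, the make_source callback, and the string used as a model hash are assumptions for illustration only.

```python
import hashlib

_template_hash_cache = {}
_jit_serializer_cache = {}


def template_hash(template_source):
    # Hash each distinct template string once, then reuse the digest.
    if template_source not in _template_hash_cache:
        digest = hashlib.sha256(template_source.encode("utf-8")).hexdigest()
        _template_hash_cache[template_source] = digest
    return _template_hash_cache[template_source]


def get_serializer(template_source, variable_name, model_hash, make_source):
    """Return a cached serializer, compiling it on the first request.

    make_source is a hypothetical callback returning (func_name, source);
    it stands in for stages 1-3 and only runs on a cache miss.
    """
    key = (template_hash(template_source), variable_name, model_hash)
    if key not in _jit_serializer_cache:
        func_name, source = make_source()
        namespace = {}
        # compile() builds the code object; exec() runs the def statement,
        # which binds the new function inside `namespace`.
        exec(compile(source, f"<jit:{func_name}>", "exec"), namespace)
        _jit_serializer_cache[key] = namespace[func_name]
    return _jit_serializer_cache[key]


calls = []


def make_source():
    calls.append(1)  # count how often the expensive path runs
    return "serialize_post", "def serialize_post(obj):\n    return {'title': obj['title']}\n"


template = "<h1>{{ post.title }}</h1>"
first = get_serializer(template, "post", "BlogPost", make_source)
second = get_serializer(template, "post", "BlogPost", make_source)
```

The second call returns the same function object without invoking make_source again, which is the behavior that keeps warm renders cheap.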

Stage 5: Rust Serialization with Python Fallback

For maximum performance, the pipeline first tries a Rust serializer (serialize_queryset) that reads Django model fields directly through PyO3. However, the Rust serializer only reads concrete fields; it doesn't evaluate Python @property attributes or custom get_*() methods. So after Rust serialization completes, the pipeline checks whether all expected top-level keys are present:

expected_keys = len(set(p.split('.')[0] for p in paths))
if result and len(result[0]) < expected_keys:
    # Rust missed @property attrs — fall back to Python codegen
    serializer = compile_serializer(code, func_name)
    result = [serializer(obj) for obj in items]

This expected key count is also cached to avoid recomputing the set operation on every render.
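Here is the same check as a self-contained sketch. The fast path is faked with a plain Python function that, like the Rust serializer described above, skips @property attributes; every name in the snippet is hypothetical.

```python
def needs_python_fallback(rows, paths):
    """True when the first row has fewer top-level keys than the template uses."""
    expected_keys = len({path.split(".")[0] for path in paths})
    return bool(rows) and len(rows[0]) < expected_keys


def serialize_with_fallback(items, paths, fast_serialize, python_serialize):
    rows = fast_serialize(items)
    if needs_python_fallback(rows, paths):
        # The fast path missed something, so redo the work in Python.
        rows = [python_serialize(obj) for obj in items]
    return rows


class Post:
    title = "Hello"

    @property
    def url(self):
        return "/blog/hello/"


def fake_fast_serialize(items):
    # Stand-in for the Rust serializer: concrete fields only.
    return [{"title": obj.title} for obj in items]


def python_serialize(obj):
    return {"title": obj.title, "url": obj.url}


rows = serialize_with_fallback(
    [Post()], ["title", "url"], fake_fast_serialize, python_serialize
)
```

Note that the check inspects only the first row, as the excerpt above does, so it assumes every row in the result has the same shape.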

The Five Fixes in 0.2.1

The JIT pipeline described above is the current state. Getting here required closing five gaps that surfaced when we dogfooded the pipeline on this very blog:

Fix 1: .all() with Subtree Generates Iteration

The original codegen treated .all() as a leaf node regardless of context. If the template contained {% for tag in post.tags.all %}{{ tag.name }}{% endfor %}, the path tree correctly showed tags.all.name and tags.all.url, but the code generator didn't recognize that all with children meant "iterate and extract." It would try to assign obj.tags.all as a value instead of calling obj.tags.all() and looping.

The fix adds a branch in _generate_nested_access: when the current attribute is all, count, exists, or starts with get_, and it has a non-empty subtree, emit a for loop that calls the method and extracts nested fields from each item.

Fix 2: Deep Dict Serialization

Context values aren't always QuerySets or Models. The blog's series navigation is a plain dict containing Model instances:

self.series_nav = {
    'previous': some_blog_post,  # Django Model
    'next': another_blog_post,   # Django Model
    'total': 5,
    'current': 3,
}

The original pipeline only JIT-serialized top-level QuerySets and Models. Nested Models inside dicts were passed through as-is and failed JSON serialization. The fix adds a recursive _deep_serialize_dict pass that walks dict values, JIT-serializing any Model or QuerySet it finds.
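The recursive pass might look roughly like this. The Model class is a stand-in so the sketch runs without Django, serialize_model stands in for the JIT serializer, and QuerySet handling is left out; none of it is djust's actual _deep_serialize_dict.

```python
class Model:
    """Stand-in for django.db.models.Model."""


class BlogPost(Model):
    def __init__(self, title):
        self.title = title


def deep_serialize_dict(data, serialize_model):
    """Walk a dict and serialize any Model found, recursing into nested dicts."""
    result = {}
    for key, value in data.items():
        if isinstance(value, Model):
            result[key] = serialize_model(value)
        elif isinstance(value, dict):
            result[key] = deep_serialize_dict(value, serialize_model)
        else:
            # Plain values such as ints and strings pass through unchanged.
            result[key] = value
    return result


def serialize_model(obj):
    # Hypothetical stand-in for the generated JIT serializer.
    return {"title": obj.title}


series_nav = {
    "previous": BlogPost("Part 2"),
    "next": BlogPost("Part 4"),
    "total": 5,
    "current": 3,
}
serialized_nav = deep_serialize_dict(series_nav, serialize_model)
```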

Fix 3: @property in Fallback Serialization

The DjangoJSONEncoder fallback (used when JIT is unavailable or fails) only serialized concrete model fields and explicit get_*() methods. It missed @property attributes entirely, so properties like featured_image_url, publish_date_formatted, and url disappeared from the serialized output.

The fix adds _add_property_values, which walks the model's MRO (stopping at django.db.models.Model) and includes any @property that returns a primitive value. Property names are cached per model class to avoid repeated MRO walks on subsequent serializations.

Fix 4: Model Before Duck-Typing

The encoder used duck-typing to detect file-like objects: anything with both url and name attributes was treated as a file field and serialized to its URL string. But Django Models can also have url and name properties (as our BlogPost model does). The fix reorders the checks to test isinstance(obj, models.Model) before the duck-typing heuristic.
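The effect of the ordering is easy to reproduce with the standard library's json.JSONEncoder. The encoder and classes below are hypothetical stand-ins, not djust's encoder.

```python
import json


class Model:
    """Stand-in for django.db.models.Model."""


class BlogPost(Model):
    def __init__(self, name, url):
        self.name = name
        self.url = url


class UploadedImage:
    """File-like object: has both name and url, but is not a Model."""

    def __init__(self, name, url):
        self.name = name
        self.url = url


class SketchEncoder(json.JSONEncoder):
    def default(self, obj):
        # Check for models first, so a model that happens to expose
        # name and url is not mistaken for a file field.
        if isinstance(obj, Model):
            return {"name": obj.name, "url": obj.url}
        # Duck-typed file detection runs only for non-model objects.
        if hasattr(obj, "url") and hasattr(obj, "name"):
            return obj.url
        return super().default(obj)


encoded = json.dumps(
    {
        "post": BlogPost("JIT", "/blog/jit/"),
        "image": UploadedImage("a.png", "/media/a.png"),
    },
    cls=SketchEncoder,
)
```

Swapping the two checks would collapse the post to the bare string "/blog/jit/", which is the bug described above.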

Fix 5: Rust→Python Fallback When Incomplete

The Rust serializer can access concrete fields but not Python @property methods. Before this fix, nothing checked whether the Rust output was complete. The pipeline now compares the number of top-level keys in the Rust output against the expected count derived from the template paths. If Rust returned fewer keys, the Python codegen serializer takes over for that variable.

Performance Results

With all five fixes and caching optimizations in place, here are the numbers for this blog's actual views (measured as get_context_data() wall time, warm cache, 100 iterations):

| View | Mean | Median | Min | P95 |
|---|---|---|---|---|
| BlogPostView | 10.08 ms | 9.85 ms | 9.26 ms | 10.76 ms |
| BlogIndexView | 2.52 ms | 2.46 ms | 2.30 ms | 2.81 ms |
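This post doesn't include the benchmark harness itself. For readers who want comparable numbers from their own views, a minimal wall-time harness might look like the following; the benchmark function and its nearest-rank P95 are our assumptions, not the script used for the table above.

```python
import statistics
import time


def benchmark(func, iterations=100):
    """Time func() repeatedly and summarize the samples in milliseconds."""
    func()  # warm-up call so caches are populated before timing starts
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    p95_index = (95 * len(samples) + 99) // 100 - 1
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": samples[0],
        "p95": samples[p95_index],
    }


# Stand-in workload; in a real project this would call get_context_data().
stats = benchmark(lambda: sum(range(1000)), iterations=20)
```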

The serializer microbenchmarks show the individual pipeline stages:

| Operation | Time | Target |
|---|---|---|
| Code generation | 0.012 ms/call | <5 ms |
| Code compilation | 0.060 ms/call | <10 ms |
| Deep nesting (5 levels) | 0.62 µs/call | <100 µs |
| Wide fields (10 fields) | 0.62 µs/call | <50 µs |

The caching strategy means that after the first render, the pipeline cost is dominated by database queries and Python attribute access; code generation and compilation are skipped entirely on a cache hit.

What Gets Cached (and Where)

Understanding the cache layers helps explain why warm renders are fast:

  1. Template hash cache (_template_hash_cache) — maps template content string to its SHA-256 hash. Avoids recomputing the hash for each variable in the same template.
  2. JIT serializer cache (_jit_serializer_cache) — keyed by (template_hash, variable_name, model_hash). Stores the compiled serializer function and query optimization plan.
  3. Expected keys cache (_expected_keys_cache) — keyed by (template_hash, variable_name). Stores the precomputed count of expected top-level keys for the Rust fallback check.
  4. Property name cache (DjangoJSONEncoder._property_cache) — keyed by model class. Stores the list of @property attribute names discovered via MRO walk.

All caches are process-level dictionaries, so they persist across requests within the same worker process. They reset whenever the worker restarts, including on deploy, which is the right behavior since template content or model definitions may have changed.

The Private/Public Variable Pattern

One more piece makes the pipeline practical: the private/public variable pattern. In your view's mount() method, store QuerySets in private attributes (prefixed with _) while you're building and filtering them. Then in get_context_data(), assign them to public attributes just before calling super(). This prevents premature serialization:

class MyListView(LiveView):
    template_name = 'my_list.html'

    def mount(self, request, **kwargs):
        self._items = MyModel.objects.filter(active=True)

    def get_context_data(self, **kwargs):
        # Expose for JIT serialization at the last moment
        self.items = self._items.order_by('-created_at')
        return super().get_context_data(**kwargs)

The JIT pipeline then handles select_related, prefetch_related, serializer generation, and caching automatically. No manual serialize_item() function required.

Conclusion

The JIT pipeline turns a common Django pain point, manual model serialization, into a zero-config optimization. By analyzing templates, generating code, and leveraging Rust for the hot path, it serializes context in roughly 10 ms or less for real-world views with complex model relationships, M2M fields, and @property attributes (about 2.5 ms and 10.1 ms mean for this blog's two views).

The five fixes in 0.2.1 close the gaps that appeared when the pipeline met production data: M2M iteration, nested dicts, property access, type ordering, and Rust fallback detection. Combined with caching at every layer, the result is a serialization system that stays in sync with the template and removes the N+1 queries and forgotten fields that hand-written serializers invite.
