Epic: Source-Generated Serialization #7465

Aaronontheweb · 2025-01-14T02:15:55Z

Abstract

There are two major performance issues that affect Akka.NET overall:

1. Using a single TCP socket in remoting (to be solved via "Implement Akka.Remote.Artery transport" #4436 and "Classic Akka Remoting limitations (things to consider)" #4757).
2. Using 2013-15 style serialization: working with byte[], allowing every serializer to have its own idea of how to allocate memory, reflection, and lots of redundant copying for envelope types (such as RemotingEnvelope, DDataEnvelope, and many others).

This epic is about addressing issue 2 - the serialization system. There have been many good proposals on how to do this already, such as:

These are great ideas for making the serialization system faster, but they do not address the following:

Writing custom serializers in Akka.NET is peak tedium and is generally not pleasant to do.
Default serialization in Akka.NET uses Newtonsoft.Json, which is ancient and not supported long-term. It's not coming with us to high-performance land.
Reflection-based polymorphic serialization is not 100% secure, in addition to being slow and bug-prone - you have to add features such as "Akka.Serialization: add type include / exclude for polymorphic serializers" #5026, which we did for Hyperion in "Add dangerous type blacklist feature to Akka.Serialization.Hyperion" #5208. Schema-based serialization is the secure option here.
Replacing default serialization so users can kick the tires on Akka.Remote without having to manually write a serializer is a must-have for people trying to use the framework. Having Akka.Remote "just work" on the first try is a magical experience that is actually pretty important to keeping Akka.NET users happy and engaged.
Finally, we have a new requirement: AOT support ("Epic: Full AOT Support for Akka.NET" #7246). Reflection-based serialization is a no-no under AOT and will never be supported without some type of manual schema.

Given all of this - there's a clear solution that solves all the problems at once: compile-time generated serialization.

Requirements

I'm going to break our requirements out into two areas - mandatory and "nice to have" (stretch goals).

Mandatory

All message definitions must be explicitly marked with an interface or attribute indicating that they're intended for remote or persistent serialization. This is the marker the generator is going to look for.
The generator will fill in the following members on source-generated serializer / message types: a size estimator, a writer method for writing to a System.Memory<byte> / whatever, and a reader for returning the original type T. The size estimator is actually the most important piece for performance reasons - this is what will allow memory pooling to work efficiently, because a buffer of the right size can be rented before anything is written. We use this technique very successfully inside TurboMqtt: https://www.youtube.com/watch?v=owTeEYqi0AM&t=1002s (skips to 16:42)
Serializable types will be organized into SerializerV2 classes, generated on a per-assembly basis, that use System.Memory<T> constructs in their primary signatures.
Registrations for using the custom serializer will be generated using either Akka.Hosting or an ActorSystemSetup - the user might have to pass these in manually for it to work, as doing so automatically might be a bridge too far for the serializer definition. We'll see what we can do automatically, but if we're trying to be AOT friendly that means reflection-based type loading for the serializer itself (which is how we would do it auto-magically, typically) might be a no-no as well.
All existing serializers will be wrapped inside a SerializerV2Adapter and made backwards compatible - this is 10000000% necessary in order to prevent bricking historical data inside Akka.Persistence AND it's also necessary for people who are already using custom serialization to have some backwards compatibility.
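
To make the estimator / writer / reader trio concrete, here is a minimal hand-written sketch of what a generated stub could look like. Everything named here is an assumption for illustration only: `AkkaSerializableAttribute`, `OrderPlaced`, `OrderPlacedStub`, and the wire layout are hypothetical, and the sketch writes to `Span<byte>` rather than committing to a final `Memory<T>`-based signature. It only shows how an upper-bound size estimate lets the caller rent a pooled buffer before writing.

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;
using System.Text;

// Usage: size the rented buffer with the estimator, write, then read back.
var original = new OrderPlaced(42, "widget");
byte[] rented = ArrayPool<byte>.Shared.Rent(OrderPlacedStub.EstimateSize(original));
try
{
    int written = OrderPlacedStub.Write(original, rented);
    OrderPlaced roundTripped = OrderPlacedStub.Read(rented.AsSpan(0, written));
    Console.WriteLine(roundTripped == original); // records compare by value
}
finally
{
    ArrayPool<byte>.Shared.Return(rented);
}

// Hypothetical marker the generator would look for (not an existing Akka.NET type).
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public sealed class AkkaSerializableAttribute : Attribute { }

[AkkaSerializable]
public sealed record OrderPlaced(long OrderId, string Sku);

// Hand-written stand-in for generator output.
// Assumed layout: [8 bytes OrderId][4 bytes Sku byte length][Sku as UTF-8].
public static class OrderPlacedStub
{
    private const int HeaderSize = sizeof(long) + sizeof(int);

    // Upper bound, not the exact size: cheap to compute and safe for renting.
    public static int EstimateSize(OrderPlaced msg) =>
        HeaderSize + Encoding.UTF8.GetMaxByteCount(msg.Sku.Length);

    // Writes the message and returns the number of bytes actually used.
    public static int Write(OrderPlaced msg, Span<byte> destination)
    {
        BinaryPrimitives.WriteInt64LittleEndian(destination, msg.OrderId);
        int skuBytes = Encoding.UTF8.GetBytes(msg.Sku, destination.Slice(HeaderSize));
        BinaryPrimitives.WriteInt32LittleEndian(destination.Slice(sizeof(long)), skuBytes);
        return HeaderSize + skuBytes;
    }

    public static OrderPlaced Read(ReadOnlySpan<byte> source)
    {
        long orderId = BinaryPrimitives.ReadInt64LittleEndian(source);
        int skuBytes = BinaryPrimitives.ReadInt32LittleEndian(source.Slice(sizeof(long)));
        string sku = Encoding.UTF8.GetString(source.Slice(HeaderSize, skuBytes));
        return new OrderPlaced(orderId, sku);
    }
}
```

The estimator deliberately over-estimates: a pooled buffer that is slightly too large costs nothing, while an under-estimate would force a second rent and a copy, which is exactly the redundant copying this epic wants to eliminate.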

Stretch Goals

I would love to have some degree of automatic detection and enforcement of extend-only design: https://aaronstannard.com/extend-only-design/ - this will stop developers from "having to know" to preserve this practice and will instead force them to do battle with the compiler. This will require the source generator to have some prior knowledge of what the code looked like. Probably tough to do.
I would love to do native code emission for F# if possible but the state of the Roslyn toolchain does not give me high hopes that this will be feasible.
Platform support - it's an open question whether or not we're going to drop .NET Standard support entirely in v1.6. I'd love to keep all of this and have it work in .NET Standard 2.0 / 2.1, but if we have to drop it (and non-.NET SDK projects, which required us to lower our Roslyn target recently for Akka.Analyzers - see "Akka.Analyzers can't install in .NET Framework projects using pre-SDK project styles" #7307), then we will.

Approach

Given the phases of code generation I've described here, the relatively new Incremental Source Generators from Roslyn sound like the most promising way to accomplish this.

We've been building our muscle working with Roslyn on the https://github.com/akkadotnet/akka.analyzers project over the past year, partly because we knew we'd be headed down this road for v1.6. That's given us a lot of practical experience on how to keep a Roslyn project organized / versioned / tested.

I think that's the route we're going to take, unless a better / cheaper / faster alternative appears.

Implementation

Design the message-specific serialization stub / interface - this is 80% of the output from the code generator. What should this look like?
"Introduce System.Memory APIs into all Serializer base types" #3740 - we need to design some serializer APIs that leverage the stub.
Have a serializerId / manifest generation system that ensures non-collision between serializers in the same project.
Write the stub generator
Write the serializer generator
Write the serializer configuration generator

There are going to be a billion edge cases and lots of wacky garbage users do that needs help ("oh no, my double upside down partial static abstract protected internal discriminated union that uses custom IReadOnlyCollection<T> implementations that are actually mutable doesn't serialize Exceptions correctly!") - we'll deal with that as best we can.

The most important requirement we have to observe is not allowing the serializer to rick-roll itself on successive runs - i.e. the serializer ids have to remain stable and constant. Hell, maybe we force the user to specify that in order to take the computer out of the equation as a potential problem.
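
As a rough sketch of the "force the user to specify it" option, the registry below takes an explicit serializer id and explicit manifest strings and rejects collisions. All names (`ManifestRegistry`, the message records, the id `1001`, the manifests) are hypothetical and not part of Akka.NET. A real source generator would more likely report a collision as a compile-time diagnostic than throw at runtime. Explicit values also sidestep one known trap: `string.GetHashCode()` is randomized per process on modern .NET, so ids derived from it would not be stable across runs.

```csharp
using System;
using System.Collections.Generic;

var registry = new ManifestRegistry(serializerId: 1001); // id chosen by the user
registry.Register<OrderPlaced>("OP");
registry.Register<OrderShipped>("OS");
Console.WriteLine(registry.ManifestFor(typeof(OrderShipped)));

try
{
    registry.Register<OrderCancelled>("OP"); // collides with OrderPlaced
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message);
}

public sealed record OrderPlaced(long OrderId);
public sealed record OrderShipped(long OrderId);
public sealed record OrderCancelled(long OrderId);

// Hypothetical registry: manifests are user-specified constants, so they
// cannot drift between builds the way auto-assigned values could.
public sealed class ManifestRegistry
{
    private readonly Dictionary<string, Type> _byManifest = new(StringComparer.Ordinal);
    private readonly Dictionary<Type, string> _byType = new();

    public ManifestRegistry(int serializerId) => SerializerId = serializerId;

    public int SerializerId { get; }

    public void Register<T>(string manifest)
    {
        if (_byManifest.TryGetValue(manifest, out var existing))
            throw new InvalidOperationException(
                $"Manifest '{manifest}' is already used by {existing.Name}.");
        if (_byType.ContainsKey(typeof(T)))
            throw new InvalidOperationException(
                $"{typeof(T).Name} already has a manifest.");

        _byManifest.Add(manifest, typeof(T));
        _byType.Add(typeof(T), manifest);
    }

    public string ManifestFor(Type type) => _byType[type];

    public Type TypeFor(string manifest) => _byManifest[manifest];
}
```

Keeping manifests short and decoupled from CLR type names also means a message type can be renamed or moved between namespaces without invalidating data already written to Akka.Persistence.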


Aaronontheweb · 2025-01-14T15:32:20Z

Previous AOT proof of concept: #6904

Aaronontheweb added discussion serialization perf labels Jan 14, 2025

Aaronontheweb added this to the 1.6.0 milestone Jan 14, 2025

Aaronontheweb added this to Akka.NET v1.6 Jan 14, 2025

Aaronontheweb moved this to Backlog in Akka.NET v1.6 Jan 14, 2025

This was referenced Jan 14, 2025

Epic: Akka.Remote Quic-based Transport (Artery) #7466

Open

Epic: Full AOT Support for Akka.NET #7246

Open
