This repository has been archived by the owner on Jun 1, 2021. It is now read-only.

Advanced schema evolution support #224

Open
krasserm opened this issue Feb 27, 2016 · 0 comments


krasserm commented Feb 27, 2016

Update: complete rewrite by @volkerstampa in Nov 2016

Event schema evolution

Eventuate-based applications persist DurableEvents in event logs. Each DurableEvent carries an application-defined payload: the events of the application. During the lifetime of an application its event model may evolve: fields may be added to or removed from individual events, new events may be introduced, or support for obsolete events may be dropped. To avoid a persistent migration of an entire event log, the application should be able to process old binary representations, whether it receives them through replication or reads them from an event log, so that consistency is ensured within a location and across locations.

Scope

The main purpose of schema evolution is to avoid persistent migration of event logs. In this context Eventuate has to ensure that an event schema can evolve in a distributed setup of multiple replicated and/or collaborating applications without requiring a stop-the-world approach. Locations must stay available, and replication among them must continue, while other locations are being upgraded.

Schema changes

To ease development, the application logic should not have to deal with multiple event schemas. Instead the application should be provided with the current view of the events. How to ensure a consistent view for individual schema changes is out of scope of this issue; this has to be handled by the corresponding serializers. The section on Examples for schema evolution lists typical schema changes and how to deal with them. The application-defined serializers should therefore always be able to read events they have read or written before. Note that this does not necessarily include replicated events the application has not seen before.

Upgrade scenarios

In this context only application upgrades that involve changes in the event schema are relevant. Depending on the use case, different strategies can be correct when dealing with replication and processing of events in a network of partially upgraded locations. Eventuate offers options that enable applications to ensure their state consistency requirements. The section on Examples for upgrading distributed applications lists, for typical scenarios, suitable Eventuate options and the consequences for event replication and processing during the upgrade.

Problem space

Affected serialization processes

Problems regarding schema evolution can occur when deserializing events. In contrast, problems when serializing an event are considered a bug in the application. An Eventuate application deserializes DurableEvents during the following processes:

  1. reading replicated DurableEvents from another location over the net (deserialize) and writing them to the local log (serialize) as well as delivering them to event-sourced components
  2. reading DurableEvents from the local log (deserialize) to apply replication filters and replicate them to another location over the net (serialize)
  3. reading DurableEvents from the local log (deserialize) to deliver them to event-sourced components

Assuming replication filters based on the application defined payload of a DurableEvent, each process requires deserialization of this payload.

Potential deserialization problems

In the face of schema evolution two problems can occur when deserializing replicated events:

  1. Deserialization failure. A typical example for this is that a new event is introduced and gets persisted at an already upgraded location and replicated to a location that is not yet upgraded.
  2. Incomplete deserialization as parts of the serialized data are ignored. Accumulated state may differ from locations where those parts are not ignored. A typical example for this is that a field is added to an event and the event gets persisted at an already upgraded location and replicated to a location that is not yet upgraded.

Invariants

The following invariants regarding replicated events are guaranteed by Eventuate.

  1. The binary representation of an event is identical at all locations.
  2. Assuming that an application is always able to read (deserialize) events it has read or written before, Eventuate guarantees that its local event log will never contain events the application cannot deserialize.

The consequences of these guarantees are

  1. Replication of events must preserve their binary representation.
  2. When reading a replicated event from the net that cannot be deserialized it must not be written to the local log.

The second guarantee ensures that reading events from the local log for replication (process 2) or local consumption (process 3) will never see deserialization failures (problem 1). With the first guarantee, incomplete deserialization (problem 2) is not an issue when reading events for replication (process 2): even if an event can only be deserialized incompletely, it is still replicated completely to other locations because its binary representation is preserved.
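
The two consequences can be illustrated with a small sketch. It is not Eventuate code: ReceivedEvent, ReplicationWriter and the deserialize parameter are hypothetical names, and failing events are simply dropped here (stopping replication instead is an equally valid reaction, discussed further below). The point is only that an accepted event is passed on with exactly the bytes that were received, and that an event that cannot be deserialized never reaches the local log.

```scala
import scala.util.Try

// Hypothetical wire-level envelope: manifest plus payload bytes as received.
final case class ReceivedEvent(manifest: String, payloadBytes: Vector[Byte])

object ReplicationWriter {
  // Returns the events that may be written to the local log.
  // - Events whose payload cannot be deserialized are not returned
  //   (consequence 2).
  // - Returned events are the received instances, so their bytes are
  //   untouched (consequence 1); the deserialized value is only used
  //   as a check and is never re-serialized.
  def accept(received: Seq[ReceivedEvent],
             deserialize: ReceivedEvent => Try[Any]): Seq[ReceivedEvent] =
    received.filter(event => deserialize(event).isSuccess)
}
```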

Behavior in case of deserialization problems

Even with the invariants in place, reading replicated events from the net (process 1) may face both deserialization problems (problems 1 and 2), and reading events for local consumption (process 3) may face incomplete deserialization (problem 2). The idea is to let the application deal with deserialization problems only when reading replicated events from the net (process 1). The application can decide whether replication is stopped or continued, or whether the affected events are filtered. If the application needs to prevent incomplete deserialization (problem 2) during local consumption (process 3) to ensure its state consistency, it must prevent such events from being written to the local log during replication, either by stopping the replication or by filtering the events.

Event Versioning

Incomplete event deserialization cannot be detected by the serializer as there is no deserialization failure. To allow the replication to detect that, event version information must be transmitted along with the binary representation and the manifest. Event versioning should follow semantic versioning so that potentially incomplete deserialization can be detected by differing minor version numbers.

For this, Eventuate defines a special API for event serializers that returns, for each event manifest, the event version supported by the serializer:

trait EventPayloadSerializer {
  def eventVersion(manifest: String): EventVersion
}

With this, incomplete deserialization is equivalent to event minor version incompatibility.
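
A minimal sketch of such a version comparison follows. The issue only specifies eventVersion(manifest); the fields of EventVersion and the names VersionCheck, Compatible, MinorIncompatible and MajorIncompatible are assumptions made for illustration. The text speaks of "differing" minor versions; the sketch assumes that only a newer remote minor version signals potentially incomplete deserialization.

```scala
// Hypothetical shape of EventVersion; the issue does not define its fields.
final case class EventVersion(major: Int, minor: Int)

sealed trait Compatibility
case object Compatible extends Compatibility
// The sender knows a newer minor version: data may be silently ignored.
case object MinorIncompatible extends Compatibility
// Major versions differ: treated as incompatible in this sketch.
case object MajorIncompatible extends Compatibility

object VersionCheck {
  // local:  version supported by the local serializer for a manifest
  // remote: version transmitted along with the replicated event
  def check(local: EventVersion, remote: EventVersion): Compatibility =
    if (local.major != remote.major) MajorIncompatible
    else if (remote.minor > local.minor) MinorIncompatible
    else Compatible
}
```

With this, a location that supports version 1.0 of an event and receives version 1.1 would see MinorIncompatible and could apply one of the options described in the next section.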

Options when reading replicated events from the net

For each deserialization problem the application can define how Eventuate shall behave:

  1. On deserialization failure:
    • Stop replication, i.e. neither this event nor any further event from the same source log is written to the local log until deserialization works again (after an application upgrade).
    • Filter affected event, i.e. this event is not written to the local log, but replication from the same source log continues with following events.
  2. On incomplete deserialization (aka event minor version incompatibility):
    • Stop replication (as above).
    • Filter affected event (as above).
    • Continue with replication just like normal including the event that can only incompletely be deserialized (but preserve binary representation of event when writing it to the local log).

These strategies can be applied depending on the source log id where the replicated event is read from.
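
The options can be modelled as a small policy that is looked up per source log id. This is only a sketch of the decision table above; none of the names (Problem, Action, Strategy, ReplicationPolicy) are Eventuate API.

```scala
// All names are hypothetical; they only model the options listed above.
sealed trait Problem
case object DeserializationFailure extends Problem
case object IncompleteDeserialization extends Problem

sealed trait Action
case object StopReplication extends Action
case object FilterEvent extends Action
case object ContinueReplication extends Action

final case class Strategy(onFailure: Action, onIncomplete: Action) {
  // Continuing is not offered for deserialization failures: the event
  // could not be read from the local log afterwards.
  require(onFailure != ContinueReplication,
    "cannot continue on deserialization failure")

  def actionFor(problem: Problem): Action = problem match {
    case DeserializationFailure    => onFailure
    case IncompleteDeserialization => onIncomplete
  }
}

final case class ReplicationPolicy(default: Strategy,
                                   perSourceLog: Map[String, Strategy]) {
  // Uses the strategy configured for the source log, else the default.
  def actionFor(sourceLogId: String, problem: Problem): Action =
    perSourceLog.getOrElse(sourceLogId, default).actionFor(problem)
}
```

For example, a policy with default Strategy(StopReplication, StopReplication) and an entry "external-log" -> Strategy(FilterEvent, ContinueReplication) stops replication from all logs except the one named "external-log", where unknown events are filtered and incompletely deserializable events are accepted.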

Examples for upgrading distributed applications

Replicated application

Scenario: An application running in multiple locations with state being replicated is upgraded one location after the other to retain availability. Temporarily running different versions in different locations concurrently must not lead to permanent divergence of replicated state.

Behavior: Stop the replication from a location in case of any deserialization problem.

Upgrade: Locations can be upgraded one after the other in any order.

Collaborating application, all upgraded

Scenario: Multiple applications collaborating remotely over a replicated event log are upgraded one after the other and the schema changes affect the collaboration (i.e. the shared events). Temporarily running some of the applications with the old schema must not lead to event loss. For example, if a new event type cannot be deserialized by an application that has not yet been upgraded, it must not simply be ignored by this application.

Behavior: Stop the replication in case of deserialization failures. In case of incomplete deserialization (aka event minor version incompatibility) depending on the application's requirements:

  • Stop the replication (limiting replication availability) or
  • continue with it (risking state divergence)

Upgrade: To minimize downtime of the replication, the location that actively emits new or updated events should be upgraded last so that all other (consuming or reacting) locations are already able to properly deserialize all events. If multiple locations actively emit new or updated events, replication downtime can be minimized by ensuring that new or updated events are only produced after all locations have been upgraded (e.g. through configuration or a two-step upgrade).

Two collaborating applications, only one upgraded

Scenario: Two applications collaborate remotely over a replicated event log and only one is upgraded while the other one is not upgraded any more and schema changes affect the collaboration. In this case the applications will permanently run with different versions of the shared schema and this must not lead to a permanent interruption of the collaboration, while ignoring unknown events is acceptable.

Behavior: Filter events on deserialization failures. In case of incomplete deserialization (aka event minor version incompatibility) depending on the application's requirements events are filtered or replication continues.

Upgrade: Only one application is upgraded.

Two replicated applications collaborate, only one upgraded

Scenario: Two applications, both replicated to multiple locations, collaborate remotely over a replicated event log. Only one is upgraded while the other one is not upgraded any more, and schema changes affect the collaboration. In this case the applications will permanently run with different versions of the shared schema and this must not lead to a permanent interruption of the collaboration, while ignoring unknown events is acceptable.

Behavior: In case of any deserialization problems for events from:

  • application internal locations: Stop the replication
  • application external locations:
    • Filter events or
    • Stop the replication (which is the recommended strategy in case of collaborating applications where all applications are upgraded). To keep the replication from actually being stopped, the upgraded application must filter incompatible events (through a normal replication filter).

Upgrade: The locations of the upgraded application can be upgraded one after the other in any order.

Examples for schema evolution

Add a field to an existing event

Add country to support international addresses:

CustomerMoved(id: String, street: String, city: String, zipCode: String) -> 
CustomerMoved(id: String, street: String, city: String, zipCode: String, country: String)

How to handle with protobuf

Add optional field country with default value that matches the previously supported home country.

optional string country = 5 [default = "US"];

Alternatively the default value can be provided by the serializer implementation instead of the protobuf schema (note that schema defined default values are not supported in proto3).
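
A minimal sketch of the serializer-side default. CustomerMovedFormat is a hypothetical stand-in for the generated protobuf message (an unset optional field is modelled as None), and AddCountryExample and DefaultCountry are illustrative names.

```scala
object AddCountryExample {
  // Application-level event (new schema).
  final case class CustomerMoved(id: String, street: String, city: String,
                                 zipCode: String, country: String)

  // Hypothetical stand-in for the generated protobuf message.
  final case class CustomerMovedFormat(id: String, street: String, city: String,
                                       zipCode: String, country: Option[String])

  // Home country that was implied before the field existed.
  val DefaultCountry = "US"

  // Old serializations carry no country; the serializer fills the default.
  def fromFormat(f: CustomerMovedFormat): CustomerMoved =
    CustomerMoved(f.id, f.street, f.city, f.zipCode,
      f.country.getOrElse(DefaultCountry))

  // New serializations always carry the country explicitly.
  def toFormat(e: CustomerMoved): CustomerMovedFormat =
    CustomerMovedFormat(e.id, e.street, e.city, e.zipCode, Some(e.country))
}
```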

Rename existing field of an existing event

Rename country to countryCode for consistent naming:

CustomerMoved(id: String, street: String, city: String, zipCode: String, country: String) ->
CustomerMoved(id: String, street: String, city: String, zipCode: String, countryCode: String)

How to handle with protobuf

As protobuf identifies fields by their field ids and not their names a field of a message can simply be renamed as long as the field id remains the same:

optional string countryCode = 5 [default = "US"];

Remove existing field of an existing event

Remove redundant timestamp as this is already tracked in the DurableEvent:

CustomerMoved(id: String, street: String, addressLine: String, city: String, zipCode: String, timestamp: Instant) ->
CustomerMoved(id: String, street: String, addressLine: String, city: String, zipCode: String)

How to handle with protobuf

Remove the field from the schema. Note that a message serialized with the new schema will result in default values when deserialized with the old schema. If the application cannot handle this correctly replication from new to old locations must be prevented by setting event serialization versions accordingly or making deserialization on old locations fail.

Split existing field of an existing event

addressLine, which contains a multi-line string, is split into several fields addressLine1, addressLine2, addressLine3, each containing just one line:

CustomerMoved(id: String, addressLine: String, street: String, city: String, zipCode: String) -> 
CustomerMoved(id: String, addressLine1: String, addressLine2: String, addressLine3: String, street: String, city: String, zipCode: String) 

How to handle with protobuf

Depending on the requirements, a valid option could be to rename addressLine to addressLine1 and add addressLine2 and addressLine3. When deserializing an old representation, addressLine2 and addressLine3 are unset and the single addressLine can be split into multiple fields. Typically, if all replication partners will get upgraded eventually, replication from new to old locations must be prevented by setting event serialization versions accordingly or by making deserialization on old locations fail.

Alternatively the protobuf schema is kept and address lines are always split/joined on de-/serialization.
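
The second alternative (unchanged schema, split and join in the serializer) could look roughly like this. CustomerMovedFormat is a hypothetical stand-in for the unchanged protobuf message, and the sketch assumes that lines are separated by "\n" and that no more than three lines were ever stored.

```scala
object AddressLineExample {
  // Application-level event (new schema) with one field per line.
  final case class CustomerMoved(id: String, addressLine1: String,
                                 addressLine2: String, addressLine3: String,
                                 street: String, city: String, zipCode: String)

  // Hypothetical stand-in for the unchanged protobuf message.
  final case class CustomerMovedFormat(id: String, addressLine: String,
                                       street: String, city: String,
                                       zipCode: String)

  def fromFormat(f: CustomerMovedFormat): CustomerMoved = {
    // Split on newlines and pad to three entries; lines beyond the
    // third would be dropped (assumed not to occur).
    val lines = f.addressLine.split("\n", -1).toList.padTo(3, "")
    CustomerMoved(f.id, lines(0), lines(1), lines(2),
      f.street, f.city, f.zipCode)
  }

  def toFormat(e: CustomerMoved): CustomerMovedFormat = {
    // Drop trailing empty lines so the joined string matches the old shape.
    val lines = List(e.addressLine1, e.addressLine2, e.addressLine3)
      .reverse.dropWhile(_.isEmpty).reverse
    CustomerMovedFormat(e.id, lines.mkString("\n"),
      e.street, e.city, e.zipCode)
  }
}
```

Because the binary format does not change in this variant, old and new locations can keep replicating in both directions.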

Join existing fields of an existing event

street and streetNumber are joined into a single street:

CustomerMoved(id: String, street: String, streetNumber: String, city: String, zipCode: String) -> 
CustomerMoved(id: String, street: String, city: String, zipCode: String) 

How to handle with protobuf

Assuming that the information must not get lost the new schema must be able to read streetNumber from old serializations. So the field cannot be dropped from the protobuf schema. Instead it can be joined on deserialization. If it is not split on serialization (and all replication partners get upgraded eventually) replication from new to old locations must be prevented by setting event serialization versions accordingly or making deserialization on old locations fail.
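
A sketch of joining on deserialization. CustomerMovedFormat is a hypothetical stand-in for the protobuf message that still declares streetNumber, and joining with a single space is an assumption about the address format.

```scala
object JoinStreetExample {
  // Application-level event (new schema) without streetNumber.
  final case class CustomerMoved(id: String, street: String,
                                 city: String, zipCode: String)

  // Hypothetical stand-in for the protobuf message; streetNumber is
  // kept so that old serializations remain readable.
  final case class CustomerMovedFormat(id: String, street: String,
                                       streetNumber: Option[String],
                                       city: String, zipCode: String)

  def fromFormat(f: CustomerMovedFormat): CustomerMoved = {
    // Old serializations: append the number to the street.
    // New serializations: streetNumber is unset, street is used as is.
    val street = f.streetNumber.filter(_.nonEmpty)
      .fold(f.street)(number => s"${f.street} $number")
    CustomerMoved(f.id, street, f.city, f.zipCode)
  }

  // New events are written with the joined street and no streetNumber.
  // Old locations cannot tell this apart from a missing number, which is
  // why replication from new to old locations has to be prevented.
  def toFormat(e: CustomerMoved): CustomerMovedFormat =
    CustomerMovedFormat(e.id, e.street, None, e.city, e.zipCode)
}
```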

Add sub type used in an existing event

A new address-subtype AddressDE is added to model the addresses of a new country:

CustomerMoved(id: String, address: Address)
AddressUS(street: StreetUS, city: CityUS, zipCode: ZipCodeUS) extends Address
->
AddressDE(street: StreetDE, city: CityDE, zipCode: ZipCodeDE) extends Address

How to handle with protobuf

There are various options on how to handle sub-classing in protobuf. Typically deserialization on old locations will fail and (if all locations get upgraded eventually) thus prevent further replication from new to old locations.

Rename a sub type used in an existing event

AddressGermany is renamed to AddressDE to avoid ambiguities:

AddressGermany(street: StreetDE, city: CityDE, zipCode: ZipCodeDE) extends Address ->
AddressDE(street: StreetDE, city: CityDE, zipCode: ZipCodeDE) extends Address

How to handle with protobuf

If the concrete sub-type is for example transmitted in form of an enum the message can simply be renamed as long as the enum-id remains the same.

enum TypeId {
  ...
  ADDRESS_DE = 5
}

message AddressDE {
...
}

Remove a sub type used in an existing event

AddressDE is no longer needed as shipping to Germany is no longer supported:

AddressDE(street: StreetDE, city: CityDE, zipCode: ZipCodeDE) extends Address ->

How to handle with protobuf

An unknown type can be deserialized into an application defined tombstone. The event containing the tombstone can then be ignored in the application logic.
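
A sketch of the tombstone approach, assuming the sub-type is transmitted as a numeric type id as in the enum example above. AddressFormat, RemovedAddress and AddressUsTypeId are hypothetical names. Note that this sketch maps every unknown type id to the tombstone, which would also hide types introduced by newer locations; an application may prefer to list the removed ids explicitly.

```scala
object TombstoneExample {
  sealed trait Address
  final case class AddressUS(street: String, city: String, zipCode: String)
    extends Address
  // Application-defined tombstone for address types no longer supported.
  case object RemovedAddress extends Address

  final case class CustomerMoved(id: String, address: Address)

  // Hypothetical wire form: a numeric type id plus the raw fields.
  final case class AddressFormat(typeId: Int, street: String,
                                 city: String, zipCode: String)

  val AddressUsTypeId = 1

  def addressFromFormat(f: AddressFormat): Address = f.typeId match {
    case AddressUsTypeId => AddressUS(f.street, f.city, f.zipCode)
    case _               => RemovedAddress // e.g. the former ADDRESS_DE = 5
  }

  // Application logic ignores events that carry the tombstone.
  def relevant(events: Seq[CustomerMoved]): Seq[CustomerMoved] =
    events.filterNot(_.address == RemovedAddress)
}
```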

Add a new event type

CustomerMoved event added to support use case:

->
CustomerMoved(id: String, street: String, city: String, zipCode: String) 

How to handle with protobuf

Adding a new event type is very similar to Add sub type used in an existing event. Deserialization of the new event will fail on old locations and (if all locations get upgraded eventually) thus prevent further replication from new to old locations.

Rename existing event type

AddressChanged renamed to CustomerMoved to match use case:

AddressChanged(id: String, street: String, city: String, zipCode: String) ->
CustomerMoved(id: String, street: String, city: String, zipCode: String) 

How to handle with protobuf

As long as the string manifest for the event is kept stable the protobuf message can simply be renamed.
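
A sketch of a stable string manifest. It assumes that old events were written with the manifest "AddressChanged"; the actual manifest value depends on the application's serializer. fromFields is a hypothetical decoder working on already-parsed fields; a real serializer would parse the protobuf bytes instead.

```scala
object ManifestExample {
  // Renamed application-level event.
  final case class CustomerMoved(id: String, street: String,
                                 city: String, zipCode: String)

  // Assumed manifest of the old event; it stays unchanged although the
  // class is now called CustomerMoved.
  val CustomerMovedManifest = "AddressChanged"

  def manifest(event: AnyRef): String = event match {
    case _: CustomerMoved => CustomerMovedManifest
    case other =>
      throw new IllegalArgumentException(s"unsupported event: $other")
  }

  def fromFields(manifest: String, fields: Map[String, String]): AnyRef =
    manifest match {
      case CustomerMovedManifest =>
        CustomerMoved(fields("id"), fields("street"),
          fields("city"), fields("zipCode"))
      case other =>
        throw new IllegalArgumentException(s"unknown manifest: $other")
    }
}
```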

Remove existing event type

CustomerBlinked removed as not evaluated any more:

CustomerBlinked(id: String) ->

How to handle with protobuf

Removing an existing event is very similar to Remove a sub type used in an existing event. The removed type can be deserialized into an application defined tombstone that can be ignored by the application logic.

Replace existing event type with multiple new types

AddressChanged replaced by CustomerMoved or AddressTypoFixed to match use case:

AddressChanged(id: String, street: String, city: String, zipCode: String) ->
CustomerMoved(id: String, street: String, city: String, zipCode: String) 
AddressTypoFixed(id: String, street: String, city: String, zipCode: String) 

How to handle with protobuf

The message definition for the replaced event cannot be dropped. On deserialization it has to be converted to one of the replacing types. Deserialization of the new events on old locations will fail and (if all locations get upgraded eventually) thus prevent further replication from new to old locations.
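
A sketch of the conversion on deserialization. AddressChangedFormat is a hypothetical stand-in for the retained protobuf message. How to decide between the two new types is application-specific and not stated in this issue, so the rule is passed in as a parameter (isTypoFix) rather than invented here.

```scala
object ReplaceEventExample {
  sealed trait AddressEvent
  final case class CustomerMoved(id: String, street: String,
                                 city: String, zipCode: String)
    extends AddressEvent
  final case class AddressTypoFixed(id: String, street: String,
                                    city: String, zipCode: String)
    extends AddressEvent

  // Hypothetical stand-in for the retained message of the old event.
  final case class AddressChangedFormat(id: String, street: String,
                                        city: String, zipCode: String)

  // Converts an old AddressChanged into one of the replacing types.
  def fromOldFormat(f: AddressChangedFormat,
                    isTypoFix: AddressChangedFormat => Boolean): AddressEvent =
    if (isTypoFix(f)) AddressTypoFixed(f.id, f.street, f.city, f.zipCode)
    else CustomerMoved(f.id, f.street, f.city, f.zipCode)
}
```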

Split existing event type into multiple new types

CustomerUpdated split into CustomerNameChanged and CustomerMoved to match use case:

CustomerUpdated(id: String, name: String, street: String, city: String, zipCode: String) ->
CustomerMoved(id: String, street: String, city: String, zipCode: String)
CustomerNameChanged(id: String, name: String)

Splitting an event into two is currently not supported by Eventuate as this would violate the uniqueness constraints for vector timestamps.
