Aspect Versioning
As each version of metadata aspect is immutable, any update to an existing aspect results in the creation of a new version. Typically one would expect the version number increases sequentially with the largest version number being the latest version, i.e. v1 (oldest), v2 (second oldest), ..., vN (latest). However, this approach results in major challenges in both rest.li modeling & transaction isolation and therefore requires a rethinking.
Rest.li Modeling
As it's common to create dedicated rest.li sub-resources for a specific aspect, e.g. /datasets/{datasetKey}/ownership, the concept of versions become an interesting modeling question. Should the sub-resource be a Simple or a Collection type?
If Simple, the GET method is expected to return the latest version, and the only way to retrieve non-latest versions is through a custom ACTION method, which is going against the REST principle. As a result, a Simple sub-resource doesn't seem to a be a good fit.
If Collection, the version number naturally becomes the key so it's easy to retrieve specific version number using the typical GET method. It's also easy to list all versions using the standard GET_ALL method or get a set of versions via BATCH_GET. However, Collection resources don't support a simple way to get the latest/largest key directly. To achieve that, one must do one of the following
- a GET_ALL (assuming descending key order) with a page size of 1
- a FINDER with special parameters and a page size of 1
- a custom ACTION method again
None of these options seems like a natural way to ask for the latest version of an aspect, which is one of the most common use cases.
Transaction Isolation
Transaction isolation is a complex topic so make sure to familiarize yourself with the basics first.
To support concurrent update of a metadata aspect, the following pseudo DB operations must be run in a single transaction,
1. Retrieve the current max version (Vmax)
2. Write the new value as (Vmax + 1)
Operation 1 above can easily suffer from Phantom Reads. This subsequently leads to Operation 2 computing the incorrect version and thus overwrites an existing version instead of creating a new one.
One way to solve this is by enforcing Serializable isolation level in DB at the cost of performance. In reality, very few DB even supports this level of isolation, especially for distributed document stores. It's more common to support Repeatable Reads or Read Committed isolation levels—sadly neither would help in this case.
Another possible solution is to transactionally keep track of Vmax directly in a separate table to avoid the need to compute that through a select (thus prevent Phantom Reads). However, cross-table/document/entity transaction is not a feature supported by all distributed document stores, which precludes this as a generalized solution.
Solution: Version 0
The solution to both challenges turns out to be surprisingly simple. Instead of using a "floating" version number to represent the latest version, one can use a "fixed/sentinel" version number instead. In this case we choose Version 0 as we want all non-latest versions to still keep increasing sequentially. In other words, it'd be v0 (latest), v1 (oldest), v2 (second oldest), etc. Alternatively, you can also simply view all the non-zero versions as an audit trail.
Let's examine how Version 0 can solve the aforementioned challenges.
Rest.li Modeling
With Version 0, getting the latest version becomes calling the GET method of a Collection aspect-specific sub-resource with a deterministic key, e.g. /datasets/{datasetkey}/ownership/0, which is a lot more natural than using GET_ALL or FINDER.
Transaction Isolation
The pseudo DB operations change to the following transaction block with version 0,
1. Retrieve v0 of the aspect
2. Retrieve the current max version (Vmax)
3. Write the old value back as (Vmax + 1)
4. Write the new value back as v0
While Operation 2 still suffers from potential Phantom Reads and thus corrupting existing version in Operation 3, Repeatable Reads isolation level will ensure that the transaction fails due to Lost Update detected in Operation 4. Note that this happens to also be the default isolation level for InnoDB in MySQL.