🚧 This documentation is for the developer preview of m-ld.

m-ld Specification

This documentation presents the implementation-independent technical specification of m-ld. It covers internal and external APIs and protocols. Architecture, principles, use-cases, a list of clone engine implementations, and guiding narrative can be found on the m-ld documentation portal.

Terminology

Term	Definition
domain	A logical, decentralised set of shared data
clone	One of many physical containers of the domain data, typically embedded in the app
(clone) engine	A clone implementation, which presents the clone API to an app and enacts the clone protocol
app	The domain-specific application code that uses m-ld for data sharing
subject	A resource, represented as a JSON object, that is part of the domain data

Clone API

This narrative and the associated API definitions use TypeScript as an abstract specification language, for familiarity. m-ld itself is language- and platform-agnostic. Clone engine implementations provide language-specific bindings which will differ from these definitions, both syntactically and in the completeness with which the engine implements them.

Initialisation

A m-ld clone must be initialised before it is ready to participate in data transactions. To initialise, it must be configured with (at least):

A domain name, representing the domain identity, such as test.m-ld.org. This is used to establish communication with other clones.
A default JSON-LD context (optional). This can be used to simplify entity and property identities when using the clone API.

m-ld natively uses JSON-LD as its data syntax, to ensure the widest possible applicability and ease of integration with existing systems. However, users do not generally need intimate knowledge of JSON-LD and Linked Data, except in advanced use-cases.

The clone API comprises three primary methods for interacting with data:

Read transactions, for an app to read data
Write transactions, for an app to write data
Followed events, for an app to react to data changes

In addition, it is possible to react to clone status via the status property.

During initialisation, a clone will determine its initial 'online' status, and if possible, rev-up with recent updates from the domain. See the Clone Protocol for more details.

Transactions

The read and write APIs take a single parameter, a JSON object which declaratively describes the transaction. The read method returns an observable stream of subjects, which represent query results.

The transaction request JSON object and each returned subject are a json-rql Pattern and Subject, respectively. json-rql is a superset of JSON-LD, designed for query expressions. The following provides an informal introduction to the syntax. Note that clone engines may legitimately offer a limited subset of the full json-rql syntax. Check the engine documentation for details.

The simplest transaction inserts some data. In this case the transaction description is just the data, a JSON subject, such as:

{
  "@id": "fred",
  "name": "Fred"
}

No data is returned.

The subject is identified in the domain by the keyword property @id. The value of this property must be unique.

This property is defined to be an IRI, but by default, m-ld will scope a relative IRI to the domain. For example, if the domain is test.m-ld.org, this subject's identity will actually be http://test.m-ld.org/fred. This scoping is not significant in most use-cases, since queries for this data also use and retrieve the un-scoped identity, as shown below.
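
The scoping described above can be pictured with a small sketch. The helper names `scopeId` and `unscopeId` are hypothetical and are not part of any clone API; the sketch assumes the `http://` scheme shown in the example and treats any identity that already carries a scheme as absolute.

```typescript
// Illustrative only: these helpers are hypothetical, not part of any m-ld API.

// Matches an IRI that already has a scheme, such as "http:" or "urn:".
const ABSOLUTE_IRI = /^[a-zA-Z][a-zA-Z0-9+.-]*:/;

/** Scopes a relative @id to the domain; absolute IRIs pass through unchanged. */
function scopeId(domain: string, id: string): string {
  return ABSOLUTE_IRI.test(id) ? id : "http://" + domain + "/" + id;
}

/** Reverses scopeId for identities that belong to the domain. */
function unscopeId(domain: string, iri: string): string {
  const base = "http://" + domain + "/";
  return iri.indexOf(base) === 0 ? iri.slice(base.length) : iri;
}

const scoped = scopeId("test.m-ld.org", "fred");
// scoped === "http://test.m-ld.org/fred"
const unscoped = unscopeId("test.m-ld.org", scoped);
// unscoped === "fred"
```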

To retrieve this data subject, a Query JSON object is used as the transaction description. A query uses a keyword property, in this case @describe, to indicate the data filter and return format:

{
  "@describe": "fred"
}

The return stream contains a single subject:

{
  "@id": "fred",
  "name": "Fred"
}

In this case the response to the query returns an identical subject to that first inserted. In general though, the inserted subject can be an arbitrarily nested JSON object, but a describe query will only return the top-level attributes.

A key difference between m-ld and typical JSON stores is that in m-ld, the JSON is a representation of a graph, and there is no storage of the original structure of any subject.

This affects how write transactions are processed. All raw subject transactions are treated as insertions to the data that already exists. For example, following the above transactions with:

{
  "@id": "fred",
  "age": 40
}

results in data that will be Described as:

{
  "@id": "fred",
  "name": "Fred",
  "age": 40
}
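
The accumulation of inserted data can be modelled with a toy in-memory graph. This is a sketch for intuition only, not how any clone engine stores data; the names `ToyGraph`, `insertSubject` and `describeSubject` are invented for this example, and only flat subjects with atomic values are handled.

```typescript
// A toy in-memory model, NOT an m-ld engine: it only mimics how inserts accumulate.
type Atom = string | number | boolean;
type FlatSubject = { "@id": string; [property: string]: Atom | Atom[] };
type ToyGraph = { [id: string]: { [property: string]: Atom[] } };

/** Adds every property value of the subject to the graph, ignoring duplicates. */
function insertSubject(graph: ToyGraph, subject: FlatSubject): void {
  const id = subject["@id"];
  const stored = graph[id] || (graph[id] = {});
  for (const property of Object.keys(subject)) {
    if (property === "@id") continue;
    const value = subject[property];
    const values = Array.isArray(value) ? value : [value];
    const existing = stored[property] || (stored[property] = []);
    for (const v of values) {
      if (existing.indexOf(v) < 0) existing.push(v);
    }
  }
}

/** Rebuilds a subject from the graph; a single value is unwrapped from its array. */
function describeSubject(graph: ToyGraph, id: string): FlatSubject | undefined {
  const stored = graph[id];
  if (stored === undefined) return undefined;
  const result: FlatSubject = { "@id": id };
  for (const property of Object.keys(stored)) {
    const values = stored[property];
    if (values.length > 0) result[property] = values.length === 1 ? values[0] : values;
  }
  return result;
}

const toyGraph: ToyGraph = {};
insertSubject(toyGraph, { "@id": "fred", "name": "Fred" });
insertSubject(toyGraph, { "@id": "fred", "age": 40 });
const described = describeSubject(toyGraph, "fred");
// described is { "@id": "fred", "name": "Fred", "age": 40 }
```

Because nothing is removed on insert, writing a second, different `name` in this model would leave both values in place.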

In order to update a subject with changed data, it is necessary to explicitly remove unwanted old data. This can be done with the more verbose Update syntax, for example:

{
  "@delete": {
    "@id": "fred",
    "name": "Fred"
  },
  "@insert": {
    "@id": "fred",
    "age": 40
  }
}

See the Data Semantics section below for more detail of subject representation.

The need for explicit removal of prior data can lead to unexpected data structure changes if not accounted for. Some clone engines provide an explicit PUT- or UPDATE-like API to reduce verbosity. However, similar situations can also arise due to concurrent data changes, so it is important for an app to be aware of this characteristic.
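
Where an engine offers no PUT-like API, an app could build the verbose Update itself. The following is a minimal sketch of such a helper, assuming flat subjects with single atomic values; `buildPutUpdate` is a hypothetical name and not part of any clone API. Note that the `current` values it receives may already be stale if another clone has changed them concurrently.

```typescript
// Hypothetical helper, not an m-ld API: builds an Update that replaces changed values.
type Atom = string | number | boolean;
type Props = { [property: string]: Atom };
type FlatSubject = { "@id": string; [property: string]: Atom };
type UpdatePattern = { "@delete"?: FlatSubject; "@insert"?: FlatSubject };

function buildPutUpdate(id: string, current: Props, desired: Props): UpdatePattern {
  const toDelete: FlatSubject = { "@id": id };
  const toInsert: FlatSubject = { "@id": id };
  let deletes = 0;
  let inserts = 0;
  // Old values that differ from (or are absent in) the desired state must be removed.
  for (const p of Object.keys(current)) {
    if (current[p] !== desired[p]) { toDelete[p] = current[p]; deletes++; }
  }
  // New values that differ from the current state must be inserted.
  for (const p of Object.keys(desired)) {
    if (current[p] !== desired[p]) { toInsert[p] = desired[p]; inserts++; }
  }
  const update: UpdatePattern = {};
  if (deletes > 0) update["@delete"] = toDelete;
  if (inserts > 0) update["@insert"] = toInsert;
  return update;
}

const putUpdate = buildPutUpdate(
  "fred",
  { "name": "Fred", "age": 40 },
  { "name": "Fred Flintstone", "age": 40 }
);
// putUpdate is:
// { "@delete": { "@id": "fred", "name": "Fred" },
//   "@insert": { "@id": "fred", "name": "Fred Flintstone" } }
```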

The query language also supports @select statements, which are able to gather data values in arbitrarily complex ways from subjects in the domain. This requires the use of a @where clause and Variables, which are placeholders for subject keys, properties or values. For example:

{
  "@select": "?nm",
  "@where": { "@id": "fred", "name": "?nm" }
}

The return stream contains a single pseudo-subject with matching values for the variable:

{
  "?nm": "Fred"
}
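
To give a feel for how a variable is bound, here is a deliberately small matcher over an in-memory array of flat subjects. It is a sketch under strong assumptions: it only supports variables in value positions (including the `@id` value), it is not the json-rql evaluation algorithm, and `matchWhere` and `selectValues` are names invented for this example.

```typescript
// Toy pattern matcher for intuition only; real engines evaluate json-rql differently.
type Atom = string | number | boolean;
type FlatSubject = { "@id": string; [property: string]: Atom };
type WherePattern = { [property: string]: Atom };
type Bindings = { [variable: string]: Atom };

/** Returns the variable bindings if the subject matches the pattern, else null. */
function matchWhere(subject: FlatSubject, where: WherePattern): Bindings | null {
  const bindings: Bindings = {};
  for (const property of Object.keys(where)) {
    const term = where[property];
    const actual = subject[property];
    if (actual === undefined) return null;
    if (typeof term === "string" && term.charAt(0) === "?") {
      // A variable matches any value, but must bind consistently.
      if (term in bindings && bindings[term] !== actual) return null;
      bindings[term] = actual;
    } else if (term !== actual) {
      return null;
    }
  }
  return bindings;
}

/** Produces one pseudo-subject per matching subject, holding the selected variable. */
function selectValues(variable: string, where: WherePattern, subjects: FlatSubject[]): Bindings[] {
  const results: Bindings[] = [];
  for (const subject of subjects) {
    const bindings = matchWhere(subject, where);
    if (bindings !== null && variable in bindings) {
      const row: Bindings = {};
      row[variable] = bindings[variable];
      results.push(row);
    }
  }
  return results;
}

const domainData: FlatSubject[] = [
  { "@id": "fred", "name": "Fred" },
  { "@id": "wilma", "name": "Wilma" }
];
const selected = selectValues("?nm", { "@id": "fred", "name": "?nm" }, domainData);
// selected is [{ "?nm": "Fred" }]
```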

🚧 Further documentation and examples coming soon. Please get in touch to tell us about your use-case!

Events

Whenever data changes in a clone, an update event is sent to "followers" who have subscribed using the follow API. Data can change due to both local and remote transactions, so this API is essential for an app to maintain a current view on the domain data. Such a view may be used for:

Displaying the live data to the user
Synchronising the data to some other (non-m-ld) database
Indexing the data in some domain-specific way

Each update has a strict structure indicating the data that has been deleted and inserted, in both cases as arrays of Subjects. Note that each subject is partial: it contains only the properties that were affected by the transaction.

For example, given the following subject:

{
  "@id": "fred",
  "name": "Fred"
}

and the following transaction (whether remote or local):

{
  "@id": "fred",
  "age": 40
}

the resultant update event will include:

{
  "@delete": [],
  "@insert": [{ "@id": "fred", "age": 40 }]
}

On receipt of this update the app may not need to know any more about the current state of the object, for example because it is already displayed in the user interface; and the update can be trivially applied. If this is not so, then the app can make a query to retrieve current state.
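
As an illustration of how an update might be applied to a view the app already holds, the sketch below keeps a simple in-memory view and folds an update into it. The `LocalView` shape and the `applyUpdate` function are assumptions made for this example, as is the choice to process deletes before inserts; how an update is actually delivered depends on the engine's binding of the follow API.

```typescript
// Sketch of an app-side view; the view shape and function name are hypothetical.
type Atom = string | number | boolean;
type PartialSubject = { "@id": string; [property: string]: Atom | Atom[] };
type UpdateEvent = { "@delete": PartialSubject[]; "@insert": PartialSubject[] };
type LocalView = { [id: string]: { [property: string]: Atom[] } };

function toArray(value: Atom | Atom[]): Atom[] {
  return Array.isArray(value) ? value : [value];
}

/** Removes the deleted values from the view, then adds the inserted ones. */
function applyUpdate(view: LocalView, update: UpdateEvent): void {
  for (const subject of update["@delete"]) {
    const stored = view[subject["@id"]];
    if (stored === undefined) continue;
    for (const p of Object.keys(subject)) {
      if (p === "@id" || stored[p] === undefined) continue;
      const removed = toArray(subject[p]);
      stored[p] = stored[p].filter(v => removed.indexOf(v) < 0);
      if (stored[p].length === 0) delete stored[p];
    }
  }
  for (const subject of update["@insert"]) {
    const id = subject["@id"];
    const stored = view[id] || (view[id] = {});
    for (const p of Object.keys(subject)) {
      if (p === "@id") continue;
      const values = stored[p] || (stored[p] = []);
      for (const v of toArray(subject[p])) {
        if (values.indexOf(v) < 0) values.push(v);
      }
    }
  }
}

const localView: LocalView = { "fred": { "name": ["Fred"] } };
applyUpdate(localView, {
  "@delete": [],
  "@insert": [{ "@id": "fred", "age": 40 }]
});
// localView is { "fred": { "name": ["Fred"], "age": [40] } }
```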

Since data updates can arise at any time, to guarantee consistency in downstream data representations like a database, care may need to be taken to ensure that asynchronous queries do not receive data from more recent updates than intended. Update events and clone status include a field for local logical clock ticks, which can be used by the clone engine to identify a specific data snapshot. However, due to differences in engine data stores and language concurrency models, engines may vary in how this field is used. Check the engine documentation for the necessary details.

Status

A clone engine's status can be obtained using the status property. This provides the current status description, an observable stream of changing status, and a way to await a particular status. This can be used to refine the app's behaviour depending on its requirements, for example:

Awaiting the latest data before showing the user interface (see Initialisation)
Warning the user or disabling features when offline
Additional data safety measures when the clone is 'siloed'

Data Semantics

Data in m-ld is structured, stored as a graph, and represented as JSON in the clone API.

This graph nature, along with the convergence model for concurrent updates, gives rise to the following set of semantic rules, awareness of which will help an app developer to correctly handle the data.

Subjects

A top-level JSON object represents a Subject, that is, something interesting to talk about in the domain.

Every Subject may have an identity, given with the @id field. If an identity is not provided on first insertion, an identifier will be generated of the form .well-known/genid/GUID, which is visible when querying the Subject.
```
"@id": "fred",
```
Properties of a Subject can be:
- The native JSON atomic values: strings, numbers, booleans
```
"name": "Fred Flintstone",
```
- Other Subjects, represented as JSON objects
```
"address": { "number": 55, "street": "Cobblestone Rd" },
```
- References to other Subjects: JSON objects with a single @id field
```
"spouse": { "@id": "wilma" },
```
- Arrays of any of the above.
Array properties have Set semantics by default, unlike normal JSON arrays, unless they are qualified with the @list keyword (see next). They do not contain duplicate members, and they are unordered. Insertion of duplicate values in a transaction results in only one of the values being stored.
```
"interests": ["bowling", "pool", "golf", "poker"],
```
A Subject having an @list property represents a list. The value of the @list key is the full, ordered content of the list (if an array), or a set of index-item pairs (if a JSON object). See Lists below for more details.
```
"episodes": {
   "@list": ["The Flintstone Flyer", "Hot Lips Hannigan", "The Swimming Pool"]
},
```
In the absence of a single-valued constraint (see below), any property of a Subject except the @id property can become multi-valued (an array) in the data. This can happen by inserting a value without deleting the old one, or due to conflicting edits.
```
"height": [5, 6]
```
In the absence of a mandatory constraint (see below), any property of a Subject except the @id property can become empty (see next).
When accepting data in a transaction, the following JSON values are equivalent, and represent an empty property:
- an empty array ([])
- null
- omission of the property
In particular, it is not possible to 'nullify' a value using an @insert clause, because passing a value of null actually tells the engine that the transaction has nothing to say about the value – as if it were not mentioned at all. To remove a value, it is necessary to use a @delete clause.
When providing data in response to a Read transaction, an engine will never emit null or an empty array ([]) – the property will be omitted.
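
The equivalence of empty values and the Set semantics of arrays can be sketched as a normalisation step. This is an illustration only, limited to flat subjects with atomic values; `normaliseSubject` is a hypothetical name, and the sketch keeps array members in first-seen order purely for convenience, since sets are unordered.

```typescript
// Illustration of the rules above, not an engine implementation.
type Atom = string | number | boolean;
type InputSubject = { [property: string]: Atom | Atom[] | null | undefined };
type NormalSubject = { [property: string]: Atom | Atom[] };

/** Drops empty properties (null, undefined, []) and removes duplicate array members. */
function normaliseSubject(input: InputSubject): NormalSubject {
  const result: NormalSubject = {};
  for (const p of Object.keys(input)) {
    const value = input[p];
    if (value === null || value === undefined) continue; // says nothing about p
    if (!Array.isArray(value)) {
      result[p] = value;
      continue;
    }
    const unique: Atom[] = [];
    for (const v of value) {
      if (unique.indexOf(v) < 0) unique.push(v);
    }
    if (unique.length > 0) result[p] = unique; // an empty array is omitted
  }
  return result;
}

const normalised = normaliseSubject({
  "@id": "fred",
  "name": null,
  "interests": [],
  "height": [5, 5, 6]
});
// normalised is { "@id": "fred", "height": [5, 6] }
```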

Constraints

A 'constraint' is a semantic rule that describes invariants about the data. As part of m-ld's concurrency model, engines may provide a set of available constraints that can be declared in the engine initialisation.

🚧 Inclusion of declarative integrity constraints in m-ld is an experimental feature, and the subject of active research. The available constraints and the means by which they are declared for a domain is likely to change. Please do get in touch with your requirements.

Declarative constraints have two modes of operation:

They are checked during a local write transaction. If the update violates the constraint's invariant, then the transaction fails and no data changes are made. This is a 'fail fast' mechanism which prevents the majority of violations before they are committed to the domain.
They are applied when remote updates are applied to the local data. Application of a constraint involves an automatic resolution, the rules for which are defined in the constraint. This mechanism catches invariant violations arising due to conflicts between clone updates.

The following is a list of candidate declarable constraints. See the engine documentation for supported constraints and syntax.

single-valued: A subject property must have a single atomic value.

Conflict Scenario: Any subject property in the domain can become multi-valued (an array) if concurrent inserts are made to the same subject property.

Resolution: Pick a 'winning' value using a rule. This could be based on the conflicting values (e.g. maximum or average), or based on another property value (e.g. a timestamp).

mandatory: A subject must have a value for a property.

Conflict Scenario: If one app instance removes a subject in its entirety at the same time as another app instance updates a property, then the updated property value remains in the converged domain – all other properties are now missing, even if mandatory. (Note that neither app instance violated the rule locally.)

Resolution: Treat a subject without a value for a mandatory field as an invalid subject. The subject is deleted (note that in the conflict scenario, this was the intention of one of the updates).

unique: A set of subjects in the domain (e.g. of a specific type) must have unique values for a property (besides their identity).

Conflict Scenario: Concurrent updates to two different subjects could both update the property to the same value.

Resolution: Decide which Subject receives the conflicting value, and delete the other Subject's property. If the property is mandatory, revert its value to the previous one (which must exist in the same transaction).
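
The two modes of operation can be sketched for the single-valued candidate, using numeric values and the "maximum" rule mentioned above. This is an assumption-laden illustration: `checkSingleValued`, `resolveSingleValued` and `maximum` are hypothetical names, and real engines declare and apply constraints in their own way.

```typescript
// Sketch of a single-valued constraint over numbers; all names are hypothetical.

/** A resolution rule picks the winning value from the conflicting ones. */
type Rule = (values: number[]) => number;
const maximum: Rule = values => values.reduce((a, b) => (a > b ? a : b));

/** 'Fail fast' mode: rejects a local write that would leave more than one value. */
function checkSingleValued(existing: number[], inserted: number[]): void {
  const all = existing.concat(inserted.filter(v => existing.indexOf(v) < 0));
  if (all.length > 1) throw new Error("single-valued constraint violated");
}

/** Apply mode: after a remote update, collapses conflicting values using the rule. */
function resolveSingleValued(value: number | number[], rule: Rule): number | undefined {
  if (!Array.isArray(value)) return value;
  return value.length === 0 ? undefined : rule(value);
}

// Concurrent inserts of 5 and 6 have converged to a multi-valued property:
const resolvedHeight = resolveSingleValued([5, 6], maximum);
// resolvedHeight === 6
```

Because every clone applies the same deterministic rule to the same converged values, each arrives at the same winner.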

Lists

As noted above, plain JSON arrays as Subject property values are interpreted as unordered sets. However as in most programming languages, an ordered collection or list is also natively supported by m-ld, using additional syntax as follows.

A list in m-ld is a kind of Subject. It can therefore have an identity and properties. It differs from a normal Subject by the inclusion of the @list keyword.

{ "@id": "shopping", "@list": ["Bread", "Milk"] }

This syntax is a superset of standard JSON-LD, which does not permit a list object to have other properties. JSON-LD list objects can be loaded into m-ld as anonymous Subjects, but the reverse is typically not possible without some pre-processing.

The value of the @list property represents the ordered collection of 'items', which can be any normal Subject property value type such as JSON values (except null), Subjects and References. Duplicate items are allowed, and will remain duplicated when retrieved.

When retrieving a list, the contents of the @list property will always be consistently ordered. In the example above, "Milk" will always follow "Bread" unless an update has been made to the list.

Updating and querying a list makes use of an alternate syntax for the @list key, using a JSON object to specify index positions.

{ "@insert": { "@id": "shopping", "@list": { "2": "Spam" } } }

This inserts "Spam" into the shopping list at index position 2, which is the end of the list, so it is appended. After this update, the shopping list content will be ["Bread", "Milk", "Spam"]. If the given index position were "1" instead, the final content would be ["Bread", "Spam", "Milk"].

Each key of the @list value object must be a non-negative integer JSON number or a variable. As with all JSON keys, this must be surrounded by quotes. Any other key format will cause an error.
Keys whose value is greater than the list length are interpreted as the list length. It is not possible to create a 'sparse' list with empty or undefined values.
Keys always represent indexes in the list before any other part of the update has been processed.
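
The three rules above can be sketched as a function over a plain array. This is an illustration, not an engine algorithm: `insertIntoList` is a hypothetical name, variables are not supported as keys here (they are meaningful for queries and deletes rather than for this insert-only sketch), and the relative order of several items clamped to the same position is an assumption.

```typescript
// Sketch of index-keyed list inserts; the function name is hypothetical.
type Item = string | number | boolean;
type IndexedInserts = { [index: string]: Item };

function insertIntoList(list: Item[], inserts: IndexedInserts): Item[] {
  // Group inserted items by their position in the ORIGINAL list,
  // interpreting any index beyond the end as the list length.
  const byIndex: { [index: number]: Item[] } = {};
  for (const key of Object.keys(inserts)) {
    if (!/^\d+$/.test(key)) throw new Error("Invalid list index: " + key);
    const index = Math.min(parseInt(key, 10), list.length);
    (byIndex[index] || (byIndex[index] = [])).push(inserts[key]);
  }
  // Rebuild the list, placing inserted items before the original item at each index.
  const result: Item[] = [];
  for (let i = 0; i <= list.length; i++) {
    if (byIndex[i] !== undefined) {
      for (const item of byIndex[i]) result.push(item);
    }
    if (i < list.length) result.push(list[i]);
  }
  return result;
}

const shopping: Item[] = ["Bread", "Milk"];
const appended = insertIntoList(shopping, { "2": "Spam" });
// appended is ["Bread", "Milk", "Spam"]
const inserted = insertIntoList(shopping, { "1": "Spam" });
// inserted is ["Bread", "Spam", "Milk"]
const clamped = insertIntoList(shopping, { "9": "Eggs" });
// clamped is ["Bread", "Milk", "Eggs"], with no sparse gap
```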

A variable can be used in the index or item position to query a list. The following query selects the index position of "Spam" in the shopping list.

{
  "@select": "?spamIndex",
  "@where": { "@id": "shopping", "@list": { "?spamIndex": "Spam" } }
}

The following query selects the item at position 1 in the shopping list.

{
  "@select": "?item",
  "@where": { "@id": "shopping", "@list": { "1": "?item" } }
}

It is therefore possible to delete items from a list using this syntax; for example, the item at index 1 regardless of its value:

{ "@delete": { "@id": "shopping", "@list": { "1": "?" } } }

Moving an item in a list can require a little more syntax, depending on what the list means and therefore how concurrent edits should be interpreted. Like all transactions in m-ld, a move comprises a delete and an insert. Since list items can have duplicates, the outcome of a concurrent move of an item to two different locations could be:

The item now exists in both locations. This would make sense if the list were the notes of a piece of music.
The item is moved to one of the locations, but not both. This would make sense for a shopping list.

In m-ld, the latter meaning is captured with the concept of list slots. In this case it's not the item that is moved but the slot – like a box containing the item. Slots can only appear once in a list, so the final position is chosen as one of the two user-specified positions.

Most of the time list slots are implicit in the interface, for simplicity. They can be made explicit using the keyword @item. Just like a list is identified by having a @list property, a slot is identified by having an @item property.

{ "@insert": { "@id": "shopping", "@list": { "2": "Spam" } } }

(implicit slot) is the same as:

{ "@insert": { "@id": "shopping", "@list": { "2": { "@item": "Spam" } } } }

(the explicit slot is { "@item": "Spam" }). A slot is a Subject, and has an @id, which is normally automatically generated for implicit slots.

Using slots, it is possible to move an item as follows:

{
  "@delete": {
    "@id": "shopping",
    "@list": { "2": { "@id": "?slot", "@item": "Spam" } }
  },
  "@insert": {
    "@id": "shopping",
    "@list": { "0": { "@id": "?slot", "@item": "Spam" } }
  }
}

This moves the slot containing Spam at index 2 to the head of the list.
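
The effect of the move can be sketched on an in-memory list of explicit slots. The slot identities `s1` to `s3` and the function `moveSlot` are invented for this illustration, and the sketch assumes that both the deleted and the inserted index refer to the list as it was before the update.

```typescript
// Sketch of moving a slot; slot identities and the function name are hypothetical.
type Slot = { "@id": string; "@item": string };

/** Moves the identified slot so that it sits at toIndex of the ORIGINAL list. */
function moveSlot(list: Slot[], slotId: string, toIndex: number): Slot[] {
  const matches = list.filter(s => s["@id"] === slotId);
  if (matches.length !== 1) throw new Error("Slot not found: " + slotId);
  const slot = matches[0];
  const target = Math.min(toIndex, list.length);
  const result: Slot[] = [];
  for (let i = 0; i <= list.length; i++) {
    if (i === target) result.push(slot);            // insert at the new position
    if (i < list.length && list[i]["@id"] !== slotId) {
      result.push(list[i]);                         // keep every other slot in order
    }
  }
  return result;
}

const slots: Slot[] = [
  { "@id": "s1", "@item": "Bread" },
  { "@id": "s2", "@item": "Milk" },
  { "@id": "s3", "@item": "Spam" }
];
const moved = moveSlot(slots, "s3", 0);
const movedItems = moved.map(s => s["@item"]);
// movedItems is ["Spam", "Bread", "Milk"]
```

Because the slot is addressed by its identity rather than by its item value, it appears exactly once in the result, even if another slot holds the same item.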

Clone Protocol

🚧 Documentation coming soon. If you are interested in the protocol details, or in developing a clone engine, please do get in touch.

Index

Enumerations

MeldErrorStatus

Interfaces

@m-ld/m-ld-spec - v0.7.0. Source code licensed MIT.