mesh protocol

BlackEdder edited this page Sep 9, 2017 · 14 revisions

The protocol used by the mesh is JSON based. This page describes the role of, and the JSON schema used for, the different messages. The messages sent between nodes can be subdivided into control messages and user messages. Control messages are sent between nodes to exchange routing information and to sync the time between nodes. User messages are generated by the user and carry a JSON object either to a single node or as a broadcast over the whole network.

The basic JSON schema for any message is:

{
    "dest": 887034362,
    "from": 37418,
    "type":6
}

where dest is the destination node, from is the previous node the message passed through, type is the type of message, and timestamp (not shown in the example above) is the time at which the message was sent.
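As an illustration of the common envelope, here is a minimal Python sketch that builds and validates such a message. The helper names `make_message` and `parse_message` are hypothetical and not part of the mesh library; only the three fields shown in the schema above are checked.

```python
import json

REQUIRED_FIELDS = ("dest", "from", "type")


def make_message(dest, sender, msg_type, **extra):
    """Serialize the common envelope plus any type-specific fields."""
    message = {"dest": dest, "from": sender, "type": msg_type}
    message.update(extra)
    return json.dumps(message)


def parse_message(raw):
    """Parse a JSON message and check that the common fields are present."""
    message = json.loads(raw)
    missing = [key for key in REQUIRED_FIELDS if key not in message]
    if missing:
        raise ValueError("missing fields: %s" % ", ".join(missing))
    return message
```

For example, `parse_message(make_message(887034362, 37418, 6))` returns a dictionary with the same three fields as the schema above.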

Addresses

A node's address is derived from its SOFT_AP MAC address: it consists of the last 4 bytes of that MAC address.
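The derivation can be sketched in Python as follows. The function name `node_id_from_mac` and the sample MAC addresses are hypothetical, and big-endian byte order is an assumption: the text above only states that the last 4 bytes are used.

```python
def node_id_from_mac(mac):
    """Return a node address built from the last 4 bytes of a MAC address.

    Assumption: the 4 bytes are combined in big-endian order.
    """
    octets = bytes.fromhex(mac.replace(":", ""))
    if len(octets) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return int.from_bytes(octets[-4:], byteorder="big")
```

Under that assumption, a MAC address ending in `00:00:92:2A` gives the address 37418 used in the examples on this page.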

TCP connection

Connections between the nodes are TCP based. Since v0.6.0 a message can span multiple packets, but it has to be explicitly terminated by a '\0' byte. In earlier versions each message needed to be sent as a separate TCP packet.
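A receiver therefore has to buffer incoming bytes and split them on the terminator. The following Python sketch shows one way to do this; `split_messages` is a hypothetical helper, and UTF-8 decoding is an assumption.

```python
from typing import List, Tuple


def split_messages(buffer: bytes) -> Tuple[List[str], bytes]:
    """Split a receive buffer into complete messages and leftover bytes.

    Every message ends with a '\\0' byte, so whatever follows the last
    terminator is an incomplete message that must be kept for later.
    """
    *complete, remainder = buffer.split(b"\0")
    messages = [chunk.decode("utf-8") for chunk in complete if chunk]
    return messages, remainder
```

The caller appends newly received bytes to the returned remainder and calls the function again, so a message that arrives in several TCP packets is only delivered once its terminator has been seen.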

Control packages

Control packages are only exchanged between neighbouring nodes and consist of a number of types.

Time sync

Node clocks are synchronized so that all nodes share the same clock and can run tasks synchronously.

Time in a node is synchronized with its neighbors. When a node's time is updated, all of its connections except the one just adjusted are marked to be updated. In this way mesh time is brought into sync within a span of a few seconds, with a precision of a few milliseconds.

Time sync is started periodically, every 10 minutes. A random delay (plus or minus 35%) is added to every iteration to avoid time sync collisions, although if collisions do happen they are detected and processed accordingly.
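The jittered interval can be sketched as below. The constant and function names are hypothetical, and drawing the delay from a uniform distribution is an assumption; the text only gives the 10 minute period and the 35% range.

```python
import random

SYNC_INTERVAL_S = 10 * 60  # nominal period: 10 minutes
JITTER = 0.35              # plus or minus 35 %


def next_sync_delay(rng=random):
    """Return the delay in seconds until the next time sync.

    Assumption: the jitter is drawn uniformly from [-35 %, +35 %].
    """
    return SYNC_INTERVAL_S * (1 + rng.uniform(-JITTER, JITTER))
```

With these values every delay falls between 390 and 810 seconds.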

In addition to this near-periodic synchronization, a time sync is started every time a node connects to an AP.

Before starting a sync request, the originating node decides which node (its peer or itself) should adapt its clock. The node with fewer connections and subconnections adopts the new time. If both have an equal number of connections, the node behaving as an AP is the time master.
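This decision rule can be written as a small Python function. The name `should_adopt_time` and its parameters are hypothetical; the counts are assumed to include both direct connections and subconnections.

```python
def should_adopt_time(own_count, peer_count, own_is_ap):
    """Return True if this node should adopt the peer's time.

    own_count / peer_count: number of connections and subconnections
    of this node and of its peer (assumed to be known to the caller).
    own_is_ap: True if this node acts as the AP on this connection.
    """
    if own_count != peer_count:
        # The node with fewer connections adopts the new time.
        return own_count < peer_count
    # Tie: the AP is the time master, so the other side adopts.
    return not own_is_ap
```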

Sync precision is measured on every synchronization. If a precision of 10 ms or better is not achieved, the same process is repeated until it is. After first start-up this normally takes 2 to 4 time sync requests, and usually less than one second.

Once the network is accurately synchronized, 1 or 2 messages are enough to maintain the same precision.

Time Sync JSON messages

After a node first connects to the mesh, or connects to a new AP node, it determines whether it needs to adopt the other node's time, as explained above.

If it does not have to adopt the other node's time, it asks the other party to request time using this message:

{
    "dest": 887034362,
    "from": 37418,
    "type":4,
    "msg":{
         "type":0
    }
}

The recipient then starts the time sync procedure described below.

If, on the other hand, the node has to adopt the other node's time, it sends a time request like this:

{
    "dest": 887034362,
    "from": 37418,
    "type":4,
    "msg":{
        "type":1,
        "t0":32990
    }
}

t0 is the internal clock value at the moment the packet was generated in the time sync adopter.

The recipient then fills in the response with two more timestamps, t1 and t2:

{
    "dest": 37418,
    "from": 887034362,
    "type":4,
    "msg":{
        "type":2,
        "t0":32990,
        "t1":448585896,
        "t2":448596056
    }
}

t1 is the timestamp when the request was received, and t2 is the timestamp when the response was generated.

The adopter records t3 as the timestamp when the response is received. In this example it can be 63221.

The adopter calculates the time offset and round trip delay as follows:

$$\text{offset} = \frac{t1 - t0}{2} + \frac{t2 - t3}{2}$$

$$\text{tripDelay} = (t3 - t0) - (t2 - t1)$$

In this example:

$$t0 = 32990, t1 = 448585896, t2 = 448596056, t3 = 63221$$

The calculated values are then:

$$\text{offset} = \frac{448585896 - 32990}{2} + \frac{448596056 - 63221}{2} = 448542871 \, \mu s$$

$$\text{tripDelay} = (63221 - 32990) - (448596056 - 448585896) = 20071 \, \mu s$$
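The same calculation can be checked with a few lines of Python. The function name `time_sync` is hypothetical; all timestamps are in microseconds, as in the example.

```python
def time_sync(t0, t1, t2, t3):
    """Compute clock offset and round trip delay from four timestamps.

    t0: request sent (adopter clock)
    t1: request received (responder clock)
    t2: response sent (responder clock)
    t3: response received (adopter clock)
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    trip_delay = (t3 - t0) - (t2 - t1)
    return offset, trip_delay


offset, trip_delay = time_sync(32990, 448585896, 448596056, 63221)
print(offset, trip_delay)  # 448542870.5 20071
```

The exact offset is 448542870.5 µs, which rounds to the 448542871 µs given in the worked example.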

This process is repeated as many times as required until the calculated offset is less than 10 ms. Normally, the first sync takes 2 to 4 iterations.

This process is then repeated every 10 minutes. The interval is varied by ± 35% to avoid time sync collisions.

Routing information

Routing information is shared in the form of node synchronization messages. Every node informs its neighbors about all other nodes it is directly connected to and all their respective subconnections. In this way every node has a real time picture of the whole mesh and knows which nodes are connected to it. This information is refreshed roughly every 3 seconds. Synchronization consists of a pair of messages. First the node sends a NODE_SYNC_REQUEST to its neighbors. This message is of type 5 ("type": 5). Then the neighbors reply with a NODE_SYNC_REPLY ("type": 6). Both messages have the following schema:

{
  "dest": ...,
  "from": ...,
  "type": ...,
  "subs": [
    {
      "nodeId": ...,
      "subs": [
        {
          "nodeId": ...,
          "subs": []
        }
      ]
    }
   ]
}

Every sub connection has a subs property that in turn contains its own sub connections.
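Because the subs structure is recursive, the list of nodes reachable through a neighbor can be collected with a short recursive function. This is a Python sketch; `collect_node_ids` and the node ids in the usage example are hypothetical.

```python
def collect_node_ids(subs):
    """Return every nodeId found in a nested "subs" list, depth first."""
    ids = []
    for sub in subs:
        ids.append(sub["nodeId"])
        ids.extend(collect_node_ids(sub.get("subs", [])))
    return ids


# Usage with made-up node ids:
example = [{"nodeId": 1, "subs": [{"nodeId": 2, "subs": []}]},
           {"nodeId": 3, "subs": []}]
print(collect_node_ids(example))  # [1, 2, 3]
```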

User packages

User messages can be sent over the network. Two kinds of messages are possible.

Single addressed messages

These messages are tagged with the originator's node address and the destination node address. The message type is 9 and a string containing the message is added, resulting in the following JSON schema:

{
    "dest": 887034362,
    "from": 37418,
    "type":9,
    "msg": "The message I want to send"
}

Broadcast messages

Broadcast messages are virtually identical to single addressed messages, but their type is set to 8 and the destination is equal to the receiving node id. When such a message is forwarded, the destination field is changed to the next node the message is sent to.
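The forwarding step can be sketched in Python as follows. The function name `forward_broadcast` and its parameters are hypothetical. Skipping the neighbor the message arrived from and leaving the from field unchanged are assumptions that the text above does not confirm.

```python
def forward_broadcast(message, neighbours, arrived_from):
    """Return one copy of a broadcast message per neighbour to forward to.

    Assumptions: the neighbour the message arrived from is skipped,
    and only the "dest" field is rewritten on each copy.
    """
    copies = []
    for neighbour in neighbours:
        if neighbour == arrived_from:
            continue
        copy = dict(message)
        copy["dest"] = neighbour  # destination becomes the next node
        copies.append(copy)
    return copies
```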