Log storage durability plan

To better understand this plan you should be familiar with how Log Storage works.

Master / Slave approach

In this approach, each box runs both a LogWriter and an IndexesLogServer. The API sends every write to the LogWriter on both boxes and accepts the request only if both writes succeed. This guarantees that all incoming data is written to live segments on two boxes. The master IndexesLogServer behaves as usual, dealing and serving the data. The slave IndexesLogServer instead periodically syncs the dealt data from the master box. Once data has been synced, the slave can drop any live segments that are fully contained in the dealt data.
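The write path and the slave's segment cleanup can be sketched as follows. This is only an illustration under stated assumptions: the in-memory `LogWriter` stand-in, the `LiveSegment` offset ranges, and the `dealt_through` watermark are hypothetical names, not part of the actual Log Storage code. The sketch also does not roll back a write that succeeded on only one box, since the plan does not say how that case is handled.

```python
from dataclasses import dataclass, field


@dataclass
class LogWriter:
    """Hypothetical stand-in for a box's LogWriter; keeps records in memory."""
    name: str
    healthy: bool = True
    live_records: list = field(default_factory=list)

    def write(self, record: str) -> bool:
        # A real writer would append to a live segment on disk.
        if not self.healthy:
            return False
        self.live_records.append(record)
        return True


def dual_write(record: str, master: LogWriter, slave: LogWriter) -> bool:
    """Accept the request only if both boxes acknowledge the write."""
    master_ok = master.write(record)
    slave_ok = slave.write(record)
    return master_ok and slave_ok


@dataclass(frozen=True)
class LiveSegment:
    """Assumed representation: a live segment covers an inclusive offset range."""
    first_offset: int
    last_offset: int


def prune_live_segments(segments, dealt_through: int):
    """Keep only segments that are NOT fully contained in the synced dealt data.

    dealt_through is the assumed highest offset covered by the dealt data
    the slave has already synced from the master.
    """
    return [s for s in segments if s.last_offset > dealt_through]
```

For example, with segments covering offsets 0-9, 10-19 and 20-29 and dealt data synced through offset 14, only the first segment is dropped; the second is kept because it is only partially covered.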

Failure handling

If one of the boxes fails to respond, errors are returned to the end user until a decision is made that the box is lost. If the lost box is the slave, a new slave is spawned and the service is restored. If it is the master, the slave is promoted to master and a new slave is spawned.
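The recovery decision can be written as a small mapping from the lost role to the steps that follow. This is a minimal sketch: the `Role` enum and the action names are hypothetical, and how a box is declared lost is left out because the plan does not specify it.

```python
from enum import Enum


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"


def recovery_actions(lost_role: Role) -> list:
    """Return the ordered recovery steps once a box has been declared lost."""
    if lost_role is Role.SLAVE:
        # Losing the slave only requires a replacement slave.
        return ["spawn_slave"]
    # Losing the master: promote the surviving slave, then replace it.
    return ["promote_slave", "spawn_slave"]
```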

Slave -> Master promotion

Promotion is trivial because the slave has everything it needs to become a master: a copy of the data dealt up to a recent point, plus every later change in its live segments. Once promoted, it should continue the dealing itself and start serving read requests.

Spawning new slaves

Creating the box and pointing the API to it is sufficient. The new IndexesLogServer starts syncing from the current master, and eventually its synced dealt data overlaps with its live data. Until then, the slave should be marked incomplete. An incomplete slave can still be promoted to master if necessary, but its IndexesLogServer can neither deal segments nor serve reads; the dealt data must first be recovered from backups to restore the full data set. In the meantime, two boxes can keep running and simply write live segments. Once the master is fully restored, the slave syncs up and becomes a valid slave.
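The incomplete state and its effect on promotion can be modelled roughly as below. The sketch assumes that data is addressed by increasing integer offsets, so that "overlap" means there is no gap between the synced dealt data and the oldest live segment. The `Box` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical model of one box's replication state."""
    role: str                   # "master" or "slave"
    synced_dealt_through: int   # highest dealt offset held locally, -1 if none
    first_live_offset: int      # lowest offset still held in live segments

    @property
    def complete(self) -> bool:
        # Complete once dealt data reaches (or overlaps) the live data.
        return self.synced_dealt_through + 1 >= self.first_live_offset

    def promote(self) -> None:
        # Promotion is always allowed, even for an incomplete box.
        self.role = "master"

    @property
    def can_deal_and_serve(self) -> bool:
        # An incomplete master only writes live segments until the dealt
        # data has been restored, for example from backups.
        return self.role == "master" and self.complete
```

A freshly spawned slave such as `Box("slave", -1, 500)` is incomplete; if it is promoted, `can_deal_and_serve` stays false until its dealt data is restored through offset 499.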
