Problem: No BEP discussing the UTXO implementation #20

25 changes: 25 additions & 0 deletions utxo-info/README.md
@@ -0,0 +1,25 @@
```
shortname: ?/UTXO-IMPL
name: A short description of the current UTXO implementation
type: informational
status: Draft
editor: Vanshdeep Singh <vanshdeep@bigchaindb.com>
```

## Description
The UTXO (unspent transaction output) of each transaction is tracked using a [merkle tree](https://en.wikipedia.org/wiki/Merkle_tree). The merkle root of this tree is used as the app hash and returned to Tendermint during `commit`. Below is a short summary of how the UTXO merkle root is calculated:

- Each unspent output is considered as a leaf node in the merkle tree.
- The unspent outputs are stored in the `utxo` collection in MongoDB.
- The unspent outputs are added to or removed from the `utxo` collection with each incoming transaction, i.e. during `commit`, when the transactions are being processed for bulk write, the `utxo` collection is updated as well.
- Once the transactions in a block are processed and stored, all the objects in the `utxo` collection are fetched.
- The unspent outputs fetched from the `utxo` collection form the leaves of the UTXO merkle tree (see the image below).
- Each leaf is hashed, the resulting list of hashes is sorted, and the merkle root is then calculated by pairing adjacent hashes.
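
The steps above can be sketched roughly as follows. This is only an illustration, not the code in the repository: the SHA3-256 hash, the JSON serialization of a leaf, the duplication of the last hash on odd-sized levels, and the names `leaf_hash`, `merkle_root` and `utxo_app_hash` are all assumptions made for the example.

```python
import hashlib
import json


def leaf_hash(utxo):
    # Assumption: a leaf is serialized as key-sorted JSON before hashing.
    encoded = json.dumps(utxo, sort_keys=True).encode()
    return hashlib.sha3_256(encoded).digest()


def merkle_root(hashes):
    # Assumption: an empty tree hashes to the digest of the empty string.
    if not hashes:
        return hashlib.sha3_256(b"").digest()
    level = list(hashes)
    while len(level) > 1:
        # Assumption: an odd-sized level repeats its last hash.
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Pair adjacent hashes to build the next level up.
        level = [
            hashlib.sha3_256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def utxo_app_hash(utxos):
    # Hash every leaf, sort the hashes, then reduce to a single root.
    sorted_hashes = sorted(leaf_hash(u) for u in utxos)
    return merkle_root(sorted_hashes).hex()
```

Calling `utxo_app_hash` with the documents fetched from the `utxo` collection would give a hex string that could serve as the app hash. Because the hashes are sorted first, the result does not depend on the order in which MongoDB returns the documents.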

![Merkel Tree](https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Hash_Tree.svg/800px-Hash_Tree.svg.png)

Contributor

Merkle

Contributor

🇩🇪 🌲



Contributor

It's still not clear to me where one puts a new leaf node (UTXO). There are many possible spots to choose from. And it matters. The choice will affect the final hash. It seems that's where the AVL rules come into use. Is this right?

Contributor Author

Okay, let's take a step back. The unspent outputs (UTXO) are stored in the utxo collection in MongoDB (there is no insert order here). When the merkle root is calculated, all the leaves are fetched from the utxo collection and hashed (let's call this list H), then this hash list is sorted (Hs) and the merkle root is calculated by pairing adjacent hashes. So when a new leaf is inserted, we don't insert it at a specific position in the tree; its position among the leaves is determined by the sort function that sorts the list of hashes H.
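
To illustrate why the sort step matters, here is a small sketch, assuming SHA3-256 and a hypothetical `root` helper that repeats the last hash on odd-sized levels (neither is taken from the actual code). The leaf contents are made up for the example.

```python
import hashlib


def root(leaf_hashes):
    # Hypothetical helper: reduce a list of leaf hashes to one root hash.
    level = list(leaf_hashes)
    if not level:
        return ""
    while len(level) > 1:
        # Assumption: an odd-sized level repeats its last hash.
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            hashlib.sha3_256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


# Five made-up leaves, hashed (the list H), in two different fetch orders.
leaves = [hashlib.sha3_256(f"utxo-{i}".encode()).digest() for i in range(5)]
reversed_leaves = leaves[::-1]

# Without sorting, the fetch order changes the root.
print(root(leaves) == root(reversed_leaves))

# With sorting (the list Hs), both fetch orders give the same root.
print(root(sorted(leaves)) == root(sorted(reversed_leaves)))
```

The sort here is just Python's byte-wise ordering of the hash values, so the position of a leaf depends only on its hash and not on when it was inserted.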

Contributor

Hmm, seems like we are relying on some mysterious sort function. It would be nice to know what it does and how it works. Can we reliably get a nice balanced tree doing this?

Contributor Author

@kansi kansi Mar 15, 2018

@ttmc You can have a look at the code here. The list of hashes is sorted and the merkle root is calculated. Since the whole tree is rebuilt every time the merkle root is calculated, it will be balanced.

Contributor

Oh wow, rebuilding the whole tree from scratch every time will make it balanced, but that's a real waste of computing time and power. It's okay for now, but we need to do it smart in the future, using a proper IAVL+ library. That library should have a method like tree.insert_node(new_node) and then there will be other methods to compute the Merkle root of the tree (i.e. the app hash). That will also be fast, because most of those hash computations were done before and are stored somewhere handy.

Contributor Author

@ttmc You are right. We need to have some discussions around how to improve this implementation and make it crash recovery friendly.



## Copyright Waiver
To the extent possible under law, the person who associated CC0 with this work has waived all copyright and related or neighboring rights to this work.