patest_new

Programs to test preferential attachment in a growing network (new implementation in C++).

This repository contains code and scripts that can be used to test for the presence of preferential attachment in growing networks such as Bitcoin and Ethereum. This code was used to create the main results in our paper:

Kondor D, Bulatovic N, Stéger J, Csabai I, Vattay G (2021).
The rich still get richer: Empirical comparison of preferential attachment via linking statistics in Bitcoin and Ethereum.
Frontiers in Blockchain, 4 (August), 35.
https://doi.org/10.3389/fbloc.2021.668510
https://arxiv.org/abs/2102.12064

Data used in our analysis is available as separate downloads for Bitcoin and Ethereum.

Programs included

This repository includes various code, organized under the following subdirectories:

  • patestgen: Code that reads a set of edges and nodes, and creates a set of "events" that can be used to generate transformed rank statistics.
  • patestrun: Code that calculates transformed ranks for a set of events (patest_ranks.cpp) or for a set of transaction inputs and outputs, based on the balance distribution (patest_balances.cpp). Both are based on a custom binary tree implementation that allows the efficient computation of partial sums of a function over ordered sets and maps.
  • misc: Additional code used during preprocessing and programs to calculate the indegree and balance distribution of the networks at given time intervals.
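
To illustrate what "efficient computation of partial sums" means in practice, here is a minimal sketch. It is not the data structure used in patestrun: it swaps in a Fenwick (binary indexed) tree over a fixed range of integer positions, which is simpler than a custom binary tree over ordered sets and maps but offers the same logarithmic-time updates and prefix sums. The names `PartialSums`, `sum_below` and `weight_fraction_below` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: a Fenwick (binary indexed) tree, NOT the repository's
// custom binary tree. It stores one weight per position 0..n-1 and supports
// point updates and prefix sums in O(log n).
class PartialSums {
public:
    explicit PartialSums(std::size_t n) : tree_(n + 1, 0.0) {}

    // Add delta to the weight stored at position pos (0-based).
    void add(std::size_t pos, double delta) {
        assert(pos + 1 < tree_.size());
        for (std::size_t i = pos + 1; i < tree_.size(); i += i & (~i + 1))
            tree_[i] += delta;
    }

    // Sum of the weights at all positions strictly below pos.
    double sum_below(std::size_t pos) const {
        assert(pos < tree_.size());
        double s = 0.0;
        for (std::size_t i = pos; i > 0; i -= i & (~i + 1))
            s += tree_[i];
        return s;
    }

    // Sum of all weights.
    double total() const { return sum_below(tree_.size() - 1); }

private:
    std::vector<double> tree_;  // 1-based internal layout
};

// Hypothetical helper: share of the total weight held by positions below pos.
// Returns 0 when the structure is empty.
double weight_fraction_below(const PartialSums& ps, std::size_t pos) {
    const double t = ps.total();
    return t > 0.0 ? ps.sum_below(pos) / t : 0.0;
}
```

In a setting like this one, a position could stand for a node's indegree or a balance bin, and the weight for a function of that value summed over all nodes currently at that position. When a node's state changes, two `add` calls (one negative, one positive) keep the sums current, so a query such as `weight_fraction_below` stays cheap even after millions of events. Whether this matches the exact quantities computed by patest_ranks.cpp is an assumption; consult the source and the paper for the actual definitions.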

Scripts included

  • compile_programs.sh: Simple commands to compile all C++ code with GCC. Feel free to open an issue if you run into problems or think a build system is needed.

Bitcoin

  • bitcoin_download.fsh: Script to download Bitcoin data used in our paper from Dryad.
  • bitcoin_preprocess.sh: Preprocessing needed to generate a list of edges from the list of transaction inputs and outputs.
  • bitcoin_patest_run.fsh: Main computations for calculating the transformed rank statistics.
  • bitcoin_figs.fsh: Script to generate the main figures in the paper and many additional variants.

Ethereum

  • eth_download.sh: Script to download the Ethereum dataset used in our paper from Zenodo.
  • eth_preprocess.sh: Script to preprocess the Ethereum dataset to create properly sorted lists of edges.
  • eth_patest_run.fsh: Main computations for calculating the transformed rank statistics for Ethereum.
  • eth_figs.fsh: Script to generate the main figure in the paper and many additional variants.

Note: all scripts can be run as a whole. It may still make sense to run them step by step, especially the main processing scripts, which compute many variants of the main analysis. Some of the steps can be run in parallel if there is enough memory.

Requirements

  • At least 64 GiB of memory to process the Bitcoin network; at least 4 GiB for the Ethereum network.
  • At least 150 GiB free disk space for the Bitcoin network (including all the raw and processed data and results); at least 20 GiB free disk space for the Ethereum network.
  • A C++ compiler that supports the C++14 standard. GCC is used in the example scripts, but other compilers should work as well. Feel free to open an issue if your compiler does not work. GCC is typically available as the g++ package on Linux distributions, including recent versions of Ubuntu.
  • The bash shell, typically the default shell on many Linux distributions or available as a package.
  • The fish shell, typically available as a package (fish) on most Linux distributions.
  • awk and mawk, typically installed by default on most Linux distributions. If mawk is not available on your system, feel free to replace it with awk everywhere (mawk is only used as a performance optimization).

The above packages should be available by default or via Homebrew on macOS as well. On Windows, it is recommended to use WSL.

Additional requirements for creating figures

The following programs are only required if you use the included scripts to create figures (i.e. bitcoin_figs.fsh and eth_figs.fsh). The main analysis can be run without these, and you can use any other software to create figures.

  • gnuplot
  • The epstopdf script, included in the texlive-font-utils package on Ubuntu.
  • ImageMagick (the convert utility is used to create PNG figures).