  ________                    .__   __________.___  ________
 /  _____/___________  ______ |  |__\______   \   |/  _____/
/   \  __\_  __ \__  \ \____ \|  |  \|    |  _/   /   \  ___
\    \_\  \  | \// __ \|  |_> >   Y  \    |   \   \    \_\  \
 \______  /__|  (____  /   __/|___|  /______  /___|\______  /
        \/           \/|__|        \/       \/            \/

GraphBIG

GraphBIG is a graph benchmarking effort initiated by Georgia Tech HPArch and inspired by IBM System G. By supporting a wide selection of CPU and GPU workloads, GraphBIG covers a broad spectrum of graph computing and addresses several major requirements: framework, representativeness, coverage, and graph data support.

Introduction

GraphBIG is a comprehensive benchmark suite for graph computing. The workloads are selected from real-world use cases of IBM System G customers. GraphBIG covers a broad scope of graph computing applications, going well beyond simple graph traversals. To ensure the representativeness and coverage of the workloads, we analyzed real-world use cases and summarized graph computing features by computation type and graph data source. The GraphBIG workloads cover all major computation types and data sources.

GraphBIG benchmarks are built on an open source graph framework named "openG", which follows a design methodology similar to that of the IBM System G framework. It represents the architectural and system behaviors of real-world graph computing practices.

(For commercial packages of IBM System G, please visit the IBM System G website.)

Features

GraphBIG has the following main features:

  • Framework: based on the property graph framework from real-world graph computing practices
  • Representativeness: workloads are selected from real-world use cases
  • Coverage: covers multiple graph computation types, much more than just graph traversal
  • GPU: provides GPU workloads under the unified framework
  • Dataset: provides both real-world and synthetic datasets
  • C++ code base: pure C++ code requiring only C++0x, which is supported by most gcc versions
  • Standalone package: can be compiled without external libraries
  • Profiling tools: provides tools to profile the code section of interest with hardware performance counters (libpfm code is integrated)
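To make the "property graph" idea from the feature list concrete, here is a minimal sketch in which vertices and edges both carry key/value properties, followed by a simple traversal over the structure. Every name in it (Props, Vertex, Edge, PropertyGraph, bfs_levels) is hypothetical and chosen for illustration; this is not the openG API, and the real framework may organize its data quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only: these types are NOT the openG API.
typedef std::map<std::string, std::string> Props;

struct Edge {
    std::size_t target;  // index of the destination vertex
    Props props;         // per-edge properties, e.g. {"type", "knows"}
};

struct Vertex {
    Props props;                  // per-vertex properties, e.g. {"name", "a"}
    std::vector<Edge> out_edges;  // outgoing adjacency list
};

class PropertyGraph {
public:
    // Adds a vertex and returns its index.
    std::size_t add_vertex(Props props = Props()) {
        Vertex v;
        v.props = std::move(props);
        vertices_.push_back(std::move(v));
        return vertices_.size() - 1;
    }

    // Adds a directed edge src -> dst; both endpoints must already exist.
    void add_edge(std::size_t src, std::size_t dst, Props props = Props()) {
        assert(src < vertices_.size() && dst < vertices_.size());
        Edge e;
        e.target = dst;
        e.props = std::move(props);
        vertices_[src].out_edges.push_back(std::move(e));
    }

    const Vertex& vertex(std::size_t id) const { return vertices_.at(id); }
    std::size_t size() const { return vertices_.size(); }

private:
    std::vector<Vertex> vertices_;
};

// Breadth-first search from root. Returns the hop distance of every vertex,
// or -1 for vertices that cannot be reached from root.
std::vector<int> bfs_levels(const PropertyGraph& g, std::size_t root) {
    std::vector<int> level(g.size(), -1);
    if (root >= g.size()) return level;
    std::queue<std::size_t> frontier;
    level[root] = 0;
    frontier.push(root);
    while (!frontier.empty()) {
        std::size_t u = frontier.front();
        frontier.pop();
        const std::vector<Edge>& edges = g.vertex(u).out_edges;
        for (std::size_t i = 0; i < edges.size(); ++i) {
            std::size_t v = edges[i].target;
            if (level[v] == -1) {  // first visit
                level[v] = level[u] + 1;
                frontier.push(v);
            }
        }
    }
    return level;
}
```

The traversal reads only the adjacency lists, while the property maps travel with each vertex and edge; workloads that update properties would touch those maps as well.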

Publication

Lifeng Nai, Yinglong Xia, Ilie G. Tanase, Hyesoon Kim, and Ching-Yung Lin. GraphBIG: Understanding Graph Computing in the Context of Industrial Solutions. To appear in the proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Nov. 2015

Tutorial

The World is Big and Linked: Whole Spectrum Industry Solutions towards Big Graphs, IEEE BigData 2015, Oct. 2015

Updates

v3.2 is released. It includes a few new workloads, multiple issue fixes, simulation annotations, and a new compile/test structure. Please feel free to contact us if you notice an issue.

Compile/Run

  • CPU benchmarks:
$ git clone https://github.com/graphbig/graphBIG.git GraphBIG
$ cd GraphBIG
$ cd benchmark
$ make clean all
$ cd [bench dir]
$ make run
$ cat output.log
  • GPU benchmarks:
$ git clone https://github.com/graphbig/graphBIG.git GraphBIG
$ cd GraphBIG
$ cd gpu_bench
$ make clean all
$ cd [bench dir]
$ make run
$ cat output.log

Documents

Documents can be found in the GraphBIG-Doc repository in the same graphbig organization.

Datasets

To cover the diverse features of graph data, GraphBIG presents two types of graph data sets: real-world data and synthetic data. The real-world data sets illustrate the features of real graph data, while the synthetic data helps workload characterization because its size can be chosen freely.

The detailed dataset list and download links can be found on our wiki page.
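As an illustration of why synthetic data is convenient for characterization, the sketch below produces a random directed edge list whose vertex and edge counts are plain parameters. The function make_random_edges and its parameters are hypothetical; this is not the generator used to produce the GraphBIG synthetic datasets, and a uniform random graph does not reproduce the degree distributions of real-world data.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

typedef std::pair<std::uint32_t, std::uint32_t> EdgePair;

// Hypothetical helper, not part of GraphBIG: returns num_edges directed edges
// whose endpoints are drawn uniformly from [0, num_vertices).
// Self-loops and duplicate edges are possible and are not filtered out.
std::vector<EdgePair> make_random_edges(std::uint32_t num_vertices,
                                        std::size_t num_edges,
                                        std::uint32_t seed) {
    std::vector<EdgePair> edges;
    if (num_vertices == 0) return edges;  // no valid endpoints to draw from
    edges.reserve(num_edges);
    std::mt19937 rng(seed);  // fixed seed gives repeatable runs on one platform
    std::uniform_int_distribution<std::uint32_t> pick(0, num_vertices - 1);
    for (std::size_t i = 0; i < num_edges; ++i) {
        std::uint32_t src = pick(rng);
        std::uint32_t dst = pick(rng);
        edges.push_back(EdgePair(src, dst));
    }
    return edges;
}
```

Scaling an experiment up or down then only means changing num_vertices and num_edges, which is the flexibility the synthetic datasets are meant to provide.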

Contributors

  • Lifeng Nai, Georgia Tech (lnai3 at gatech.edu / lifeng at us.ibm.com)
  • Yinglong Xia, IBM Thomas J. Watson Research Center (yxia at us dot ibm dot com)
  • Ilie G. Tanase, IBM Thomas J. Watson Research Center
  • Hyesoon Kim, Georgia Tech
  • Ching-Yung Lin, IBM Thomas J. Watson Research Center

Development

Want to contribute? Great!

The GraphBIG benchmarks and the underlying framework are written in C++ with a bit of STL. You are more than welcome to contribute new workloads, new datasets, or new tools. Please feel free to contact us.

License

BSD license

Version

3.2

Contact us

Lifeng Nai (lnai3 at gatech.edu / nailifeng at gmail.com)

Or submit issues via GitHub

Graph Computing, Hell Yeah!