Skip to content

Code for the paper: Performance of Local Push Algorithms for Personalized PageRank on Multi-core Platforms. Link to the paper: https://par.nsf.gov/servlets/purl/10339666

License

Notifications You must be signed in to change notification settings

madhavaggar/Forward_Local_Push

Repository files navigation

Single Source Personalized PageRank

Tested Environment

  • Linux Ubuntu
  • C++ 11
  • GCC 4.8
  • Boost (Boost 1.65 on our machine)
  • cmake

Compile

$ cmake .
$ make

Parameters

./fora action_name --algo <algorithm> [options]
  • action:
    • query
    • topk
    • generate-ss-query: generate queries file
    • gen-exact-topk: generate ground truth by power-iterations method
  • algo: which algorithm you prefer to run
    • fwdpush: Forward Push
    • revpush: Reverse Push
  • options
    • --prefix <prefix>
    • --epsilon <epsilon>
    • --dataset <dataset> [Can use any dataset]
    • --query_size <queries count>
    • --k <top k>
    • --exact_ppr_path <directory to place generated ground truth>
    • --result_dir <directory to place results>

Data

The example data format is in ./data/webstanford/ folder. The data for DBLP, Pokec, Livejournal, Twitter, Friendster are not included here for size limitation reason. You can find them online. For datasets with node numbers greater than the node count kindly reassign remap variable in config.h to True and define maxid in attribute.txt and then recompile. Set the parameter after --dataset in order to change the dataset. You can download the datasets at https://snap.stanford.edu/data/. Define your attribute.txt under ./data/<dataset> along with graph.txt as given below.

m = (number of nodes), 
n = (number of edges), 
maxid = (maximum possible node ID)

make

Changing the Number of Threads

In order to change the number of threads you can set the NUM_THREADS macro under query.h to a number of your choice. Post the change please run:

make

Generate queries

Generate query files for the graph data. Each line contains a node id.

$ ./fora generate-ss-query --prefix <data-folder> --dataset <graph-name> --query_size <query count>
  • Example:
$ ./fora generate-ss-query --prefix ./data/ --dataset webstanford --query_size 1000

Query

Process queries.

$ ./fora query --algo <algo-name> --prefix <data-folder> --dataset <graph-name> --result_dir <output-folder> --epsilon <relative error> --query_size <query count> (--opt)
  • Example:
$ ./fora query --algo fwdpush --prefix ./data/ --dataset webstanford --epsilon 0.5 --query_size 20



## Top-K
Process top-k queries.

```sh
$ ./fora topk --algo <algo-name> --prefix <data-folder> --dataset <graph-name> --result_dir <output-folder> --epsilon <relative error> --query_size <query count> --k <k>
  • Example
$ ./fora topk --algo fwdpush --prefix ./data/ --dataset webstanford --epsilon 0.5 --query_size 20 --k 500

Exact PPR (ground truth)

Construct ground truth for the generated queries.

$ ./fora gen-exact-topk --prefix <data-folder> --dataset <graph-name> --k <k> --query_size <query count> --exact_ppr_path <folder to save exact ppr>
  • Example
$ mkdir ./exact
$ ./fora gen-exact-topk --prefix ./data/ --dataset webstanford --k 1000 --query_size 100 --exact_ppr_path ./exact/

About

Code for the paper: Performance of Local Push Algorithms for Personalized PageRank on Multi-core Platforms. Link to the paper: https://par.nsf.gov/servlets/purl/10339666

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published