DataSystemsGroupUT/PAPyA

Prescriptive Performance Analysis in Python Actions

This library provides prescriptive analysis for the complex solution space (RDF relational schema, partitioning technique, and storage format) that emerges when querying large RDF graphs over relational Big Data (BD) systems, e.g., Apache Spark SQL.

  • Documentation and more details about PAPyA

Installation

PAPyA is published on PyPI, so users can find and install it easily in their environment.

To install PAPyA in an environment, run:

pip install PAPyA

Or clone this repo:

git clone https://github.com/DataSystemsGroupUT/PAPyA.git
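After a pip install, a quick way to confirm that the package is visible to the current interpreter is to look up its distribution metadata. The sketch below uses only the Python standard library; the helper name `installed_version` is hypothetical, and the distribution name `PAPyA` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "PAPyA" is the name used in the pip command above.
papya_version = installed_version("PAPyA")
if papya_version is None:
    print("PAPyA is not installed in this environment")
else:
    print(f"PAPyA {papya_version} is installed")
```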

Examples (PAPyA in practice)

Below are some examples of experiments we ran using the PAPyA library with different sets of configurations, generated from IPython notebooks:

When running the experiments, we used two different datasets, Watdiv and SP2Bench, to test replicability on the system.

Watdiv

  • Full Experiment
    The complete experiment, run without removing any configurations.
  • Mini
    Removes some configurations in each dimension (schemas: {extvp, wpt}, partition: {predicate}, storage: {avro, csv}).
  • Single Partition
    Uses only one partitioning technique (horizontal).
  • No ExtVp & WPT
    Removes the extvp and wpt schemas from the configuration.

SP2Bench

  • No Partition
    Removes one dimension (partition) from the configuration.
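
To make the experiment variants above concrete, the following sketch enumerates a configuration space as the Cartesian product of its dimensions and then prunes it the way each variant describes. It is plain Python and does not use the PAPyA API. The full value sets for each dimension are assumptions for illustration (only extvp, wpt, predicate, horizontal, avro, and csv are named above), and the `schema.partition.storage` label format and the helper names are hypothetical.

```python
from itertools import product

# Assumed dimension values, for illustration only. The text above names
# extvp, wpt, predicate, horizontal, avro and csv; the rest are placeholders.
SCHEMAS = ["st", "vt", "pt", "extvp", "wpt"]
PARTITIONS = ["horizontal", "subject", "predicate"]
STORAGES = ["csv", "avro", "parquet", "orc"]


def config_space(*dimensions):
    """Every combination of one value per dimension, joined as a dotted label."""
    return [".".join(combo) for combo in product(*dimensions)]


def without(values, removed):
    """Copy of values with the removed entries dropped, order preserved."""
    return [v for v in values if v not in removed]


# Full Experiment: nothing removed.
full = config_space(SCHEMAS, PARTITIONS, STORAGES)

# Mini: drop some values from every dimension.
mini = config_space(
    without(SCHEMAS, {"extvp", "wpt"}),
    without(PARTITIONS, {"predicate"}),
    without(STORAGES, {"avro", "csv"}),
)

# Single Partition: keep only the horizontal partitioning technique.
single_partition = config_space(SCHEMAS, ["horizontal"], STORAGES)

# No ExtVp & WPT: drop two schemas, keep the other dimensions intact.
no_extvp_wpt = config_space(without(SCHEMAS, {"extvp", "wpt"}), PARTITIONS, STORAGES)

# No Partition: the partition dimension is removed entirely.
no_partition = config_space(SCHEMAS, STORAGES)

for name, space in [
    ("full", full),
    ("mini", mini),
    ("single_partition", single_partition),
    ("no_extvp_wpt", no_extvp_wpt),
    ("no_partition", no_partition),
]:
    print(f"{name}: {len(space)} configurations")
```

Under these assumed value sets, each variant is a strictly smaller slice of the full space, which is why the reduced experiments are cheaper to run and rank.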