Prescriptive Performance Analysis in Python Actions
This library provides prescriptive analysis for the complex solution space (RDF relational schema, partitioning, and storage format) that emerges when querying large RDF graphs over relational Big Data (BD) systems, e.g., Apache Spark-SQL.
- Documentation and more details about PAPyA
PAPyA is available on PyPI, so users can find and install it easily in their environment.
To install PAPyA in your environment, run:
pip install PAPyA
Or clone this repo:
git clone https://github.com/DataSystemsGroupUT/PAPyA.git
We ran the experiments on two different datasets, WatDiv and SP2Bench, to test replicability on the system.
Below are some examples of experiments run with the PAPyA library using different sets of configurations, generated from IPython notebooks:
- Full Experiment
Complete running experiment without removing any configurations.
- Mini
Remove some configurations in each dimension (schemas: {extvp, wpt}, partition: {predicate}, storage: {avro, csv}).
- Single Partition
Keep only one partitioning technique (horizontal).
- No ExtVP & WPT
Remove the extvp and wpt schemas from the configuration.
- No Partition
Remove one dimension from the configuration (partition).
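
To make the difference between these experiment variants concrete, the following is a minimal sketch in plain Python that enumerates the configuration space for each one. It does not use the PAPyA API. The helper names (`remove`, `combinations`) and the dotted label format are hypothetical, and the dimension values other than extvp, wpt, predicate, horizontal, avro, and csv are assumptions added for illustration; check your own configuration file for the actual values.

```python
from itertools import product

# Assumed full configuration space. Only extvp, wpt, predicate, horizontal,
# avro and csv are named in this README; the other values are illustrative.
FULL = {
    "schemas": ["st", "vt", "pt", "extvp", "wpt"],
    "partition": ["horizontal", "predicate", "subject"],
    "storage": ["csv", "avro", "parquet", "orc"],
}


def remove(space, **dropped):
    """Return a copy of `space` without the listed values in each dimension."""
    return {
        dim: [v for v in values if v not in dropped.get(dim, ())]
        for dim, values in space.items()
    }


def combinations(space):
    """List every configuration as a dotted label (hypothetical format)."""
    return [".".join(combo) for combo in product(*space.values())]


# The experiment variants described above, expressed as reduced spaces.
mini = remove(FULL, schemas=["extvp", "wpt"], partition=["predicate"],
              storage=["avro", "csv"])
single_partition = {**FULL, "partition": ["horizontal"]}
no_extvp_wpt = remove(FULL, schemas=["extvp", "wpt"])
no_partition = {dim: vals for dim, vals in FULL.items() if dim != "partition"}

variants = {
    "Full Experiment": FULL,
    "Mini": mini,
    "Single Partition": single_partition,
    "No ExtVP & WPT": no_extvp_wpt,
    "No Partition": no_partition,
}

for name, space in variants.items():
    print(f"{name}: {len(combinations(space))} configurations")
```

Under these assumed values, each variant simply shrinks the Cartesian product of the dimensions, which is why removing a whole dimension (No Partition) changes the shape of each configuration rather than only the number of configurations.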