update README
ShubhamVashisth7 committed Jul 29, 2023
1 parent 7080766 commit b472a6e
Showing 2 changed files with 12 additions and 3 deletions.
15 changes: 12 additions & 3 deletions README.md
@@ -16,12 +16,10 @@
## 📐 System Design
<p align="center"><img src="docs/graphics/architecture.png" alt="kgfarm" height="450" width="400"/></p>

<p align="justify">Data preparation and feature engineering are critical for improving model accuracy. However, data scientists often work independently and spend most of their time writing code for these steps without support for automatic learning from each other’s work. To address this challenge, we developed KGFarm, a holistic platform automating data preparation and feature engineering based on machine learning models trained using the semantics of data science artifacts, including pipeline scripts applied to different datasets. We capture the semantics of these artifacts as a knowledge graph (KG). KGFarm provides seamless integration with existing data science platforms, enabling scientific communities to automatically discover and learn about each other’s work. We trained KGFarm’s models on top of a KG constructed from the 1000 top-rated Kaggle datasets and the 13800 pipeline scripts with the highest number of votes. KGFarm is tested on <a href="experiments/benchmark/README.md">130 unseen datasets</a> collected from different AutoML benchmarks to compare KGFarm against the state-of-the-art (SOTA) systems in data cleaning, transformation, and feature engineering. Our <a href="experiments/README.md">experiments</a> show that KGFarm consumes significantly less time and memory than the SOTA systems while achieving comparable or better accuracy.</p>
<p align="justify">Data preparation is critical for improving model accuracy. However, data scientists often work independently and spend most of their time writing code for preparing data without support for automatic learning from each other’s work. To address this challenge, we developed KGFarm, a holistic platform for automating data preparation using machine learning models trained on a knowledge graph that captures the semantics of data science artifacts, including datasets and pipeline scripts. KGFarm provides seamless integration with existing data science platforms, enabling scientific communities to automatically discover and learn about each other’s work.</p>

<br>
<p align="center" style="margin-top: 50px"><b>Unleashing the power of Automated <img src="docs/graphics/icons/data_preparation.gif" width="19%" style="margin-bottom: -9px"/></b></p>


## ⚡ Quick Start

Try the sample <a href="https://colab.research.google.com/drive/1u4z4EKGd8G1ju61Q3sPk5fH9BrMp8IRM?usp=sharing"><span style="color: orange;">KGFarm Colab Notebook</span></a> for a quick hands-on!
@@ -53,6 +51,17 @@ python build.py -db Database_name
## ⚙️ APIs & Library Interface
KGFarm APIs are designed to promote seamless integration with conventional ML workflows. To use KGFarm with your own data, check out [KGFarm_tutorial.ipynb](docs/KGFarm_tutorial.ipynb).

[//]: # (## 🧪 Experiments )

[//]: # ()
[//]: # (We [evaluated]&#40;experiments/README.md&#41; KGFarm against several state-of-the-art systems on [130 open datasets]&#40;experiments/benchmark/README.md&#41;. More information regarding our evaluations per task is available below:)

[//]: # (1. [Data Cleaning]&#40;experiments/results/data_cleaning.pdf&#41;)

[//]: # (2. [Data Transformation]&#40;experiments/results/data_transformation.pdf&#41;)

[//]: # (3. [Feature Engineering]&#40;experiments/results/feature_engineering.pdf&#41;)

## <img src="docs/graphics/icons/youtube.svg" alt="youtube" height="20" width="29"> KGFarm Demo
<a href="https://rebrand.ly/kgfarm"><img src="docs/graphics/icons/kgfarm_tutorial.png"/></a>

Binary file modified docs/graphics/architecture.png
