- In this project I use Spark to read the datasets
- I merge the data with spark.sql and build small matrices of data for training
- I apply the PySpark FPGrowth model and association rules to the matrices
All the datasets are available on Kaggle: https://www.kaggle.com/blackgraywhitewhite/healthcare-dataset

The complete notebook is available on Google Colab: https://colab.research.google.com/drive/1w-Q7aoweihgWMA8lsrL-TBZXpSejmOW1#scrollTo=7WLony0jO4We