Privacy-preserving data mining has become a serious concern. Data used in the data mining process must be secure and must not infringe on anyone's privacy. This software uses randomization techniques to mask the private information in a data set while keeping the data usable for data mining. Two randomization techniques are implemented: Random Rotation Perturbation and Random Projection Perturbation.
This software was built for a research project titled Analysis of Random Rotation Perturbation and Random Projection Perturbation Techniques in Randomizing Data for Data Mining.
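The two techniques named above can be illustrated with a minimal NumPy sketch. This is not the code used in this software; the function names, the QR-based way of drawing the orthogonal matrix, and the Gaussian projection matrix are assumptions chosen for illustration. Note that the matrix drawn here is orthogonal, so it may combine a rotation with a reflection.

```python
import numpy as np


def random_rotation_perturbation(data, rng):
    """Multiply the records by a random orthogonal matrix (hypothetical helper)."""
    n_features = data.shape[1]
    # The Q factor of a Gaussian matrix's QR decomposition is orthogonal.
    q, r = np.linalg.qr(rng.standard_normal((n_features, n_features)))
    # Fix the column signs so Q is drawn uniformly over orthogonal matrices.
    q = q * np.sign(np.diag(r))
    return data @ q


def random_projection_perturbation(data, n_components, rng):
    """Project the records onto fewer random dimensions (hypothetical helper)."""
    n_features = data.shape[1]
    # Gaussian entries scaled by 1/sqrt(k) preserve squared distances on average.
    projection = rng.standard_normal((n_features, n_components))
    projection /= np.sqrt(n_components)
    return data @ projection


rng = np.random.default_rng(0)
data = rng.standard_normal((5, 4))  # 5 records, 4 numeric attributes
rotated = random_rotation_perturbation(data, rng)
projected = random_projection_perturbation(data, 2, rng)

# Distance between the first two records, before and after rotation.
before = np.linalg.norm(data[0] - data[1])
after = np.linalg.norm(rotated[0] - rotated[1])
print(np.isclose(before, after))
```

Rotation changes every attribute value but keeps the distances between records exactly, which is why distance-based mining algorithms can still work on the perturbed data. Projection also reduces the number of attributes, so distances are only approximately preserved.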
This software runs in any environment that supports Python 3. You need to install the following programs. Installation instructions for these prerequisites are given in the next section.
- Python 3 - https://www.python.org/
- Kivy 2 - https://kivy.org/
- Plyer - https://github.com/kivy/plyer
- Sklearn - https://scikit-learn.org/
- Pandas - https://pandas.pydata.org/
The following instructions show how to install the dependencies and run the software from source on Microsoft Windows using Python.
- Install Python 3 using the executable installer. Download it from https://www.python.org/ and follow the instructions there
- Place the source code of this software in a directory on your local drive
- Go to that directory and open a command prompt there
- Create a Python virtual environment (Note: you can skip this step, but using a virtual environment is the better practice)
python -m venv venv
- Activate the newly created Python virtual environment
venv\Scripts\activate
- Install Kivy 2
pip install kivy[base] kivy_examples --pre --extra-index-url https://kivy.org/downloads/simple/
- Install Plyer, Sklearn (published on PyPI as scikit-learn), and Pandas
pip install plyer scikit-learn pandas
- Run the main source file. Always run the software inside the Python virtual environment; if it is not active, activate it first as described in the activation step above
python __main__.py
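If the run step fails with an import error, a short check like the one below can show which dependency is missing from the active environment. It is a sketch and not part of this software; the helper name `find_missing` is hypothetical, and the module list assumes the usual import names of the packages installed above.

```python
import importlib.util

# Import names of the required packages (assumption: scikit-learn imports as "sklearn").
REQUIRED_MODULES = ["kivy", "plyer", "sklearn", "pandas"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it with the same `python` command used for the software, so that it inspects the virtual environment and not the system-wide installation.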
- Python - Programming Language
- Kivy - UI Python Framework
- Plyer - Platform-independent Native Functionality API
- Sklearn - Machine Learning Python Library
- Pandas - Data Manipulation and Analysis Python Library
- Numpy - A Python Library Used for Working with Arrays
- Scipy - Scientific and Technical Computing Python Library
- Chris Eldon - Computer Science - Undergraduate Student at Parahyangan Catholic University