Anomaly (fraud) detection pipeline on credit card transaction data using Isolation Forest machine learning model and Kedro framework
Link to article: https://neptune.ai/blog/data-science-pipelines-with-kedro
Develop a data science pipeline to detect anomalous (fraudulent) credit card transactions using:
- Isolation Forest machine learning model - For unsupervised anomaly detection
- Kedro - An open-source Python framework for creating reproducible, maintainable, and modular data science code. It helps accelerate data pipelining, enhance data science prototyping, and promote pipeline reproducibility.
- Explore how unsupervised anomaly detection works, and better understand the concept and implementation of Isolation Forest
- Use the Kedro framework to structure data science pipeline projects
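To make the Isolation Forest idea concrete, here is a minimal sketch in plain Python. It is not the project's implementation, which may well rely on a library; the function names, the synthetic two-feature data, and the injected outlier are all assumptions made for illustration. The idea is that points which are easy to separate from the rest with random splits end up on short tree paths, and a short average path gives an anomaly score closer to 1.

```python
import math
import random


def average_path_length(n):
    """Expected path length of an unsuccessful binary search tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    # Harmonic number approximated with ln(n - 1) plus the Euler-Mascheroni constant.
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split the points on a random feature at a random threshold until isolated."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    values = [p[feature] for p in points]
    low, high = min(values), max(values)
    if low == high:  # nothing left to split on this feature
        return {"size": len(points)}
    threshold = rng.uniform(low, high)
    left = [p for p in points if p[feature] < threshold]
    right = [p for p in points if p[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Depth at which the point lands, adjusted for unsplit points left in the leaf."""
    if "size" in tree:
        return depth + average_path_length(tree["size"])
    branch = "left" if point[tree["feature"]] < tree["threshold"] else "right"
    return path_length(tree[branch], point, depth + 1)


def fit_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the data."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, point):
    """Score in (0, 1]; higher means the point is isolated sooner (more anomalous)."""
    trees, sample_size = forest
    mean_depth = sum(path_length(t, point) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Synthetic stand-in for transaction features (an assumption, not the real dataset).
data_rng = random.Random(42)
transactions = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
transactions.append((8.0, 8.0))  # hypothetical injected outlier

forest = fit_forest(transactions)
outlier_score = anomaly_score(forest, (8.0, 8.0))
inlier_score = anomaly_score(forest, (0.0, 0.0))
print(f"outlier: {outlier_score:.3f}, inlier: {inlier_score:.3f}")
```

No labels are used at any point, which is what makes the approach unsupervised: the score depends only on how quickly random splits isolate a point.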
The credit card transaction data comes from a collaboration between Worldline and the Machine Learning Group. It is a realistic simulation of real-world credit card transactions, designed to reflect the complications that make fraud detection difficult.
- Change to the project directory in the command line:
cd C:/Anomaly-Detection-Pipeline-Kedro
- Activate the Conda virtual environment (create it first if it does not exist yet):
conda activate env_kedro
- Execute a pipeline run with
kedro run
Please see the walkthrough article linked above for details.