This repository is a practice project in fraud detection, using data from the Kaggle competition IEEE-CIS Fraud Detection.
The project consists of two major parts: the fraud detection analysis notebook and the Dash app.
- `dash` folder: contains the Python scripts and assets for the Dash app.
- `notebooks` folder: contains two notebooks, `EDA.ipynb` and `Fraud_Detection.ipynb`. The `Fraud_Detection.ipynb` notebook contains the end-to-end fraud detection analysis.
The fraud detection analysis contains 4 sections:
- Data Cleaning
- Feature Engineering
  - Missing values analysis
  - Dimensionality Reduction
    - for categorical variables, merge small levels
    - for numerical variables, PCA
- Fraud Detection Modelling
  - LightGBM
  - XGBoost
  - Hyperparameter fine-tuning + cross-validation
- Feature Importance Analysis
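
The "merge small levels" step for categorical variables can be illustrated with a minimal sketch. It is not taken from the notebook: the function name `merge_small_levels`, the `min_count` threshold, the `"other"` label, and the sample column are all hypothetical, and plain Python is used here only to keep the example dependency-free.

```python
from collections import Counter


def merge_small_levels(values, min_count=2, other_label="other"):
    """Replace levels seen fewer than `min_count` times with `other_label`.

    Hypothetical helper for illustration; the threshold and label are assumptions.
    """
    counts = Counter(values)  # frequency of each categorical level
    return [v if counts[v] >= min_count else other_label for v in values]


# Made-up categorical column, not real competition data
card_type = ["visa", "visa", "mastercard", "visa", "discover", "amex", "mastercard"]
merged = merge_small_levels(card_type, min_count=2)
print(merged)
```

Collapsing rare levels this way reduces the number of distinct categories a model has to handle, at the cost of losing the distinction between the merged levels.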
The Dash app is written in Python; two screenshots are shown below.