An application to predict heart disease using patient data.

maxim-eyengue/Heart-Disease-App

❀️ Heart Disease Prediction Project

This project focuses on predicting heart disease using machine learning models. It includes data cleaning, exploratory data analysis (EDA), feature importance analysis, model selection, parameter tuning, and deployment via a web service. The solution is designed for effective containerization and deployment.


🗂️ Table of Contents

  1. 📌 Project Overview
  2. 📁 Directory Structure
  3. ❓ Problem Description
  4. ⚙️ Installation and Setup
  5. ▶️ Running the Project
  6. 🚀 Local Model Deployment
  7. 🐳 Docker Containerization
  8. ☁️ AWS Elastic Beanstalk Deployment
  9. 🧪 Testing the Application
  10. 🤝 Contributing
  11. 📜 License

📌 Project Overview

Heart disease remains one of the leading causes of death globally. This project leverages machine learning techniques to predict the likelihood of heart disease based on patient data.

Key features include:

  • 🧹 Data preparation and cleaning.
  • πŸ” Exploratory Data Analysis (EDA) to uncover patterns and relationships.
  • 🧠 Model training, evaluation, and parameter optimization.
  • 🌐 Deployment via Flask and containerization using Docker for scalable web service hosting.
  • ☁️ Cloud deployment using AWS Elastic Beanstalk.

📁 Directory Structure

Heart-Disease-App/
│
├── data/                          # Contains the dataset
├── images/                        # Illustrations and deployment screenshots
├── midterm_project.ipynb          # Jupyter Notebook with data preparation, analysis and model planning
├── train.py                       # Script for training and saving the model
├── predict.py                     # Web service for serving the model
├── no_app_predict_test.py         # Test script for direct model testing
├── predict_test.py                # Script for testing the web service
├── predict_test_cloud.py          # Script for testing the app deployed on AWS Elastic Beanstalk
├── Pipfile                        # Dependencies for pipenv
├── Pipfile.lock                   # Locked versions of dependencies
├── Dockerfile                     # Docker configuration for containerization
├── LICENSE.txt                    # Project MIT License
└── README.md                      # Project description and instructions

❓ Problem Description

Cardiovascular diseases are a major global health challenge. This project aims to use machine learning to:

  • ⚠️ Identify individuals at risk of heart disease.
  • 🩺 Assist healthcare professionals in making informed decisions.
  • 🌍 Provide an easily deployable service for real-world applications.

Heart Disease Prediction Dataset 📊

The dataset combines five publicly available heart disease datasets, with a total of 2,181 records:

  • Heart Attack Analysis & Prediction Dataset: 304 records from Rahman, 2021
  • Heart Disease Dataset: 1,026 records from Lapp, 2019
  • Heart Attack Prediction (Dataset 3): 295 records from Damarla, 2020
  • Heart Attack Prediction (Dataset 4): 271 records from Anand, 2018
  • Heart CSV Dataset: 290 records from Nandal, 2022

Merging these datasets provides a larger, more varied foundation for training machine learning models aimed at early detection and prevention of heart disease. The resulting dataset contains anonymized patient records with both medical and demographic features, such as age, sex, cholesterol level, and resting blood pressure, which are used to predict heart attack risk.
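As a rough illustration of the merging step, the sketch below stacks rows from several CSV sources that share the same columns. The two in-memory sources, their values, and the `merge_csv_sources` helper are hypothetical; the project's actual merging code is not shown here and may differ (for example, in how it handles duplicate rows).

```python
import csv
import io

# Hypothetical stand-ins for two source files (the values are made up).
SOURCE_A = "age,sex,chol,target\n63,1,233,1\n41,0,204,1\n"
SOURCE_B = "age,sex,chol,target\n67,1,286,0\n"


def merge_csv_sources(sources):
    """Stack rows from several CSV texts that share the same header."""
    merged = []
    expected_columns = None
    for text in sources:
        reader = csv.DictReader(io.StringIO(text))
        if expected_columns is None:
            expected_columns = reader.fieldnames
        elif reader.fieldnames != expected_columns:
            raise ValueError("sources must share the same columns")
        merged.extend(reader)
    return merged


records = merge_csv_sources([SOURCE_A, SOURCE_B])
print(len(records))  # 3
```

Checking the header before stacking guards against silently mixing files whose columns are named or ordered differently.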

Features Description:

  • age: age of the patient [years: Numeric]
  • sex: gender of the patient [1: Male, 0: Female]
  • cp: chest pain type [0: Typical Angina, 1: Atypical Angina, 2: Non-Anginal Pain, 3: Asymptomatic]
  • trestbps: resting blood pressure [mm Hg: Numeric]
  • chol: serum cholesterol level [mg/dl: Numeric]
  • fbs: fasting blood sugar [1: if fasting blood sugar > 120 mg/dl, 0: otherwise]
  • restecg: resting electrocardiographic results [0: Normal, 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), 2: showing probable or definite left ventricular hypertrophy by Estes' criteria]
  • thalach: maximum heart rate achieved [Numeric value between 60 and 202]
  • exang: exercise-induced angina [1: Yes, 0: No]
  • oldpeak: ST depression induced by exercise relative to rest [Numeric value measured in depression]
  • slope: slope of the peak exercise ST segment [0: Upsloping, 1: Flat, 2: Downsloping]
  • ca: number (0-3) of major vessels (arteries, veins, and capillaries) colored by fluoroscopy [0, 1, 2, 3]
  • thal: Thalassemia types [1: Normal, 2: Fixed defect, 3: Reversible defect]
  • target: outcome variable for heart attack risk [1: disease or more chance of heart attack, 0: normal or less chance of heart attack]
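The feature list above can be turned into a simple input check. This is only a sketch: the `validate_patient` helper and the example record are hypothetical and not part of the repository, and the allowed codes are copied from the descriptions above.

```python
# Allowed codes for the categorical features, taken from the list above.
CATEGORICAL_CODES = {
    "sex": {0, 1},
    "cp": {0, 1, 2, 3},
    "fbs": {0, 1},
    "restecg": {0, 1, 2},
    "exang": {0, 1},
    "slope": {0, 1, 2},
    "ca": {0, 1, 2, 3},
    "thal": {1, 2, 3},
}
NUMERIC_FEATURES = ("age", "trestbps", "chol", "thalach", "oldpeak")


def validate_patient(patient):
    """Return a list of problems found in a patient record (empty if none)."""
    problems = []
    for name in NUMERIC_FEATURES:
        value = patient.get(name)
        if not isinstance(value, (int, float)) or isinstance(value, bool):
            problems.append(f"{name}: expected a number, got {value!r}")
    for name, allowed in CATEGORICAL_CODES.items():
        if patient.get(name) not in allowed:
            problems.append(f"{name}: expected one of {sorted(allowed)}")
    return problems


# Illustrative record; the values are made up, not taken from the dataset.
example_patient = {
    "age": 54, "sex": 1, "cp": 0, "trestbps": 130, "chol": 246,
    "fbs": 0, "restecg": 1, "thalach": 150, "exang": 0,
    "oldpeak": 1.2, "slope": 1, "ca": 0, "thal": 2,
}
print(validate_patient(example_patient))  # []
```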

⚙️ Installation and Setup

Requirements: Python 3.11, Ubuntu with WSL 2.0

a. Clone the Repository

git clone https://github.com/maxim-eyengue/Heart-Disease-App.git
cd Heart-Disease-App

b. Install Dependencies

Use pipenv to manage dependencies:

pip install pipenv
pipenv install flask scikit-learn==1.5.1 gunicorn

c. Create and Activate the Environment

pipenv shell

NB: You can also run a single command inside the environment without activating it:

pipenv run <command>

▢️ Running the Project

i. πŸ‹οΈβ€β™‚οΈ Training the Model

Train the model and save it as a binary file:

python train.py
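One common way to save a Python object as a binary file is the standard `pickle` module. Whether `train.py` uses it is an assumption, so the sketch below only illustrates the save and load round trip. A plain dictionary stands in for the trained model, and the file name `model.bin` is hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model: a plain dict (the values are made up).
model_stub = {"features": ["age", "chol"], "weights": [0.03, 0.01], "bias": -2.5}


def save_model(obj, path):
    """Serialize an object to a binary file."""
    with open(path, "wb") as f_out:
        pickle.dump(obj, f_out)


def load_model(path):
    """Read the object back from the binary file."""
    with open(path, "rb") as f_in:
        return pickle.load(f_in)


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.bin"  # hypothetical file name
    save_model(model_stub, model_path)
    restored = load_model(model_path)

print(restored == model_stub)  # True
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.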

ii. 🌐 Running the Web Service

Start the Flask application with Gunicorn:

gunicorn --bind 0.0.0.0:9696 predict:app

iii. ✅ Testing the Web Service

Send a test request using predict_test.py:

python predict_test.py
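For readers who want to write their own client, here is a minimal sketch of what a test request might look like using only the standard library. It is not the contents of `predict_test.py`. The `/predict` path, the JSON payload shape, and the helper names are assumptions; only the port 9696 comes from the command above.

```python
import json
import urllib.request

# Assumptions: the service listens on localhost:9696 (the port used above)
# and exposes a POST endpoint at /predict that accepts a JSON patient record.
SERVICE_URL = "http://localhost:9696/predict"


def build_request(patient, url=SERVICE_URL):
    """Build (but do not send) a JSON POST request for one patient."""
    body = json.dumps(patient).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(request, timeout=5):
    """Send the request and decode the JSON reply (needs a running service)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Illustrative patient; the values are made up.
patient = {"age": 54, "sex": 1, "cp": 0, "trestbps": 130, "chol": 246,
           "fbs": 0, "restecg": 1, "thalach": 150, "exang": 0,
           "oldpeak": 1.2, "slope": 1, "ca": 0, "thal": 2}
request = build_request(patient)
print(request.get_method(), request.full_url)
# With the service running: print(send_request(request))
```

Passing a different `url` to `build_request` would point the same client at another deployment, such as a container or a cloud host.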

🚀 Local Model Deployment

The model is served with Flask inside the environment created with pipenv.

With the web service running, send a test request to check that it works:

python predict_test.py

You can now transition to containerized deployment with Docker.


🐳 Docker Containerization

a. Build the Docker Image

Create a Docker image for the project:

docker build -t heart-prediction-app .

b. Run the Docker Container

Run the image and map the port:

docker run -it --rm -p 9696:9696 heart-prediction-app
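Before sending requests, it can help to confirm that the published port is reachable from the host. The helper below is a hypothetical convenience, not part of the repository. It only checks that something accepts TCP connections on the port, not that the model service itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9696 is the port published by the docker run command above.
    print(is_port_open("localhost", 9696))
```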

c. Test the Application

Send a request to the service using:

python predict_test.py


☁️ AWS Elastic Beanstalk Deployment

1. Install AWS Elastic Beanstalk CLI

Install the AWS Elastic Beanstalk CLI in your environment:

pipenv install awsebcli --dev

2. Initialize the Application

After activating the environment with pipenv shell, initialize the project for Elastic Beanstalk:

eb init -p docker -r us-east-1 heart-prediction-app

If errors occur, use:

eb init -p "Docker running on 64bit Amazon Linux 2" heart-prediction-app -r us-east-1

Provide your AWS credentials when prompted. These can be generated from the AWS IAM service.

NB: You can follow Alexey's tutorial to create an account on AWS.

3. Deploy Locally

Deploy the application locally:

eb local run --port 9696

Use python predict_test.py to send a request to the locally running app for testing.

4. Deploy to the Cloud

Deploy the application to Elastic Beanstalk:

eb create heart-prediction-app-env --enable-spot

After deployment, the app was accessible at the Elastic Beanstalk URL.

To test the deployment, we used:

python predict_test_cloud.py

5. Terminate the Service

To terminate the Elastic Beanstalk environment:

eb terminate heart-prediction-app-env

🧪 Testing the Application

We tested the model in the following ways:

i. 🔬 Without Flask: Directly test the model using:

python no_app_predict_test.py

ii. 🌐 Flask Web Service, Docker & Local EB: Send requests to the Flask app, to the Docker container, or to the app running locally under Elastic Beanstalk:

python predict_test.py

iii. ☁️ Cloud Deployment: Test the application on AWS:

python predict_test_cloud.py

🀝 Contributing

We welcome contributions to enhance this project. Please:

  • Fork the repository.
  • Create a new branch for your feature or bug fix.
  • Submit a pull request with a detailed description of your changes.

📜 License

This project is licensed under the MIT License.


;) We will miss you...
