This project aims to segment the skin in images depicting human hands. The solution is based on the analysis of the RGB and HSV color spaces in 4 (four) images (i.e., 1.jpg, 2.jpg, 3.jpg, and 4.jpg) available in the folder `dataset`.

I have decided to create a Python script to define a binary threshold manually based on 2 (two) color spaces. This script converts each pixel of an input image into a row with the RGB values (3 numbers), the HSV values (3 numbers), and the label (0: non-skin or 1: skin). You can use any of the images available in the folder `dataset` to generate the dataset as a text file.
This Python script contains a user interface (see the figure below) to help you segment the human skin. It has 6 (six) trackbars, namely: (1) `RGB <=> HSV`, to choose the RGB or HSV color space; (2) `R <=> H`, to select the Red or Hue values; (3) `G <=> S`, to select the Green or Saturation values; (4) `B <=> V`, to select the Blue or Value values; and (5) `Lower Range` and (6) `Upper Range`, to define the lower and upper boundary parameters used by the function `cv2.inRange()`.
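As a rough illustration, the sketch below shows how such a control panel can be built with OpenCV trackbars. The window name, slider ranges, and the exact semantics of each slider are assumptions here, not the script's actual code.

```python
import cv2

def nothing(_):
    pass  # trackbars require a callback, but we read the positions manually

# Hypothetical control window with the six trackbars described above
cv2.namedWindow("controls")
cv2.createTrackbar("RGB <=> HSV", "controls", 0, 1, nothing)     # 0: RGB, 1: HSV (assumed)
cv2.createTrackbar("R <=> H",     "controls", 0, 1, nothing)     # channel toggle (assumed)
cv2.createTrackbar("G <=> S",     "controls", 0, 1, nothing)     # channel toggle (assumed)
cv2.createTrackbar("B <=> V",     "controls", 0, 1, nothing)     # channel toggle (assumed)
cv2.createTrackbar("Lower Range", "controls", 0, 255, nothing)
cv2.createTrackbar("Upper Range", "controls", 255, 255, nothing)

# The current positions are read back inside the main loop:
use_hsv = cv2.getTrackbarPos("RGB <=> HSV", "controls")
lower   = cv2.getTrackbarPos("Lower Range", "controls")
upper   = cv2.getTrackbarPos("Upper Range", "controls")
```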
Execute the following command to run this script:
```
python 00_manual_segmentation.py --image "datasets/1.jpg"
```
Then, move the trackbars until you define a good mask to segment the user's hand (see the OpenCV windows `binary` and `mask`). When you are satisfied with the final result, press the `s` key on your keyboard to save the text file with the RGB and HSV values. The script saves the text file in the same folder as the input image. Each line of the dataset contains a sequence of numbers representing: R, G, B, H, S, V, label.
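The sketch below illustrates the core of this step: building a binary mask with `cv2.inRange()` and exporting one "R,G,B,H,S,V,label" row per pixel. The boundary values, output delimiter, and file names are assumptions; in the real script the boundaries come from the trackbars.

```python
import cv2
import numpy as np

image = cv2.imread("datasets/1.jpg")             # OpenCV loads images as BGR
hsv   = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# Example HSV boundaries (assumed); the script takes them from the trackbars
lower = np.array([0, 48, 80], dtype=np.uint8)
upper = np.array([20, 255, 255], dtype=np.uint8)
mask  = cv2.inRange(hsv, lower, upper)           # 255 where skin, 0 elsewhere

cv2.imshow("mask", mask)
cv2.imshow("binary", cv2.bitwise_and(image, image, mask=mask))

if cv2.waitKey(0) & 0xFF == ord("s"):
    # One row per pixel: R, G, B, H, S, V, label (0: non-skin, 1: skin)
    rgb   = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).reshape(-1, 3)
    label = (mask.reshape(-1) > 0).astype(np.uint8)
    rows  = np.column_stack([rgb, hsv.reshape(-1, 3), label])
    np.savetxt("datasets/1.txt", rows, fmt="%d", delimiter=",")
```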
Info: In my experiments, the best option was to use the HSV color space because I could segment the user's hand and remove the entire background.
I have evaluated whether it is feasible to use the RGB and HSV color spaces to develop a skin color detector. This Python script plots 3 (three) graphs based on: (1) the RGB color space [3D]; (2) the HSV color space [3D]; and (3) RGB+HSV [6D]. Each plot shows the non-skin (red) and skin (blue) classes, and the decision boundary (plane) based on Logistic Regression.
As each dataset has information about 7,990,272 pixels, I have decided to use Pandas to speed up the file reading process. I have also decided to use a random sample of 1,000 pixels to generate and render the graphs using Matplotlib.

Hint: You can change the sample size on line 58 [`sample = dataset.sample(1000, random_state=1)`].
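A minimal sketch of this loading and sampling step is shown below; the column names and the comma delimiter are assumptions about the text file layout.

```python
import pandas as pd

# Assumed column order, matching the "R,G,B,H,S,V,label" rows described above
columns = ["R", "G", "B", "H", "S", "V", "label"]
dataset = pd.read_csv("datasets/1.txt", names=columns)

# Plotting ~8 million points is too slow for Matplotlib, so draw a random
# sample of 1,000 rows (the line quoted in the hint above)
sample = dataset.sample(1000, random_state=1)
```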
I have used dimensionality reduction to plot the RGB+HSV data in 3D. The figure below shows an example of one plot generated with this Python script. It shows that the dataset contains two distinct clusters representing the non-skin and skin pixels. The grey plane represents the decision boundary used to classify the pixels into these two classes. According to my initial evaluation, it is possible to use any of the 3 (three) approaches mentioned above as features to segment the skin color in images depicting human hands.
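The sketch below shows one way to produce such a plot for the RGB+HSV [6D] case: reduce the six features to three dimensions (PCA is assumed here as the dimensionality reduction method), fit a Logistic Regression, and draw the points and the decision plane. File names and column names are assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

columns = ["R", "G", "B", "H", "S", "V", "label"]
sample = pd.read_csv("datasets/1.txt", names=columns).sample(1000, random_state=1)

X = sample[["R", "G", "B", "H", "S", "V"]].values
y = sample["label"].values

X3 = PCA(n_components=3).fit_transform(X)        # 6D -> 3D for plotting
clf = LogisticRegression().fit(X3, y)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(X3[y == 0, 0], X3[y == 0, 1], X3[y == 0, 2], c="red",  label="non-skin")
ax.scatter(X3[y == 1, 0], X3[y == 1, 1], X3[y == 1, 2], c="blue", label="skin")

# Decision boundary of the logistic regression drawn as a grey plane
w, b = clf.coef_[0], clf.intercept_[0]
xx, yy = np.meshgrid(np.linspace(X3[:, 0].min(), X3[:, 0].max(), 10),
                     np.linspace(X3[:, 1].min(), X3[:, 1].max(), 10))
zz = -(w[0] * xx + w[1] * yy + b) / w[2]
ax.plot_surface(xx, yy, zz, alpha=0.3, color="grey")

ax.legend()
plt.show()
```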
Execute the following command to run this script:
```
python 01_plot_dataset.py --dataset "datasets/1.txt"
```
I have decided to use 5 (five) different supervised Machine Learning (ML) methods for binary classification, namely: (1) Logistic Regression; (2) K-Nearest Neighbors (KNN); (3) Decision Tree; (4) Support Vector Machine (SVM); and (5) Neural Network (NN) based on a Multi-Layer Perceptron (MLP). I have used the scikit-learn library to create, train, and evaluate the skin color classifiers.
This Python script uses one of the datasets generated with the Python script `00_manual_segmentation.py`. Some ML methods can process all 7,990,272 pixels in a short time. However, other methods (i.e., KNN, SVM) are very slow to process the entire dataset. Therefore, you can change the sample size on line 208.
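As a rough outline of what such a training and evaluation loop can look like with scikit-learn, the sketch below fits the five classifiers on a sample and prints the accuracy and confusion matrix for each. Column names, hyperparameters, and the choice of HSV features are assumptions, not a copy of `02_training_classifiers.py`.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

columns = ["R", "G", "B", "H", "S", "V", "label"]
dataset = pd.read_csv("datasets/1.txt", names=columns).sample(100_000, random_state=1)

X = dataset[["H", "S", "V"]].values   # or ["R","G","B"], or all six features
y = dataset["label"].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN":                 KNeighborsClassifier(),
    "Decision Tree":       DecisionTreeClassifier(),
    "SVM":                 SVC(),          # slow on large samples
    "MLP":                 MLPClassifier(max_iter=500),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(name)
    print("Accuracy:", accuracy_score(y_test, y_pred))
    print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
```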
I have evaluated each ML method with the RGB, HSV, and RGB+HSV features. I calculated the accuracy and the confusion matrix for each method, using 70% of the data for training and 30% for testing. The sample in this analysis had 100,000 pixels from the dataset `./datasets/1.txt`. All methods achieved an accuracy higher than 99%. The best one was Decision Tree using the HSV and RGB+HSV color spaces, which achieved 100% accuracy. See below the results of my evaluation:
Accuracy: 0.9991666666666666

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21138 | 13 |
| Class 2 Actual | 12 | 8837 |
Accuracy: 0.9999666666666667

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21150 | 1 |
| Class 2 Actual | 0 | 8849 |
Accuracy: 0.9999

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21148 | 3 |
| Class 2 Actual | 0 | 8849 |
Accuracy: 0.9990666666666667

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21139 | 12 |
| Class 2 Actual | 16 | 8833 |
Accuracy: 0.9997333333333334

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21148 | 3 |
| Class 2 Actual | 5 | 8844 |
Accuracy: 0.9996

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21147 | 4 |
| Class 2 Actual | 8 | 8841 |
Accuracy: 0.9992

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21145 | 6 |
| Class 2 Actual | 18 | 8831 |
Accuracy: 1.0

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21151 | 3 |
| Class 2 Actual | 0 | 8849 |
Accuracy: 1.0

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21151 | 0 |
| Class 2 Actual | 0 | 8849 |
Accuracy: 0.9973666666666666

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21075 | 76 |
| Class 2 Actual | 3 | 8846 |
Accuracy: 0.9947

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 20992 | 159 |
| Class 2 Actual | 0 | 8849 |
Accuracy: 0.9939

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 20969 | 182 |
| Class 2 Actual | 1 | 8848 |
Accuracy: 0.9993

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21138 | 13 |
| Class 2 Actual | 8 | 8841 |
Accuracy: 0.9987333333333334

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21120 | 31 |
| Class 2 Actual | 7 | 8842 |
Accuracy: 0.9992666666666666

Confusion Matrix:

| | Class 1 Predicted | Class 2 Predicted |
|---|---|---|
| Class 1 Actual | 21133 | 18 |
| Class 2 Actual | 4 | 8845 |
Execute the following command to run this script:
```
python 02_training_classifiers.py --dataset "datasets/1.txt"
```
Finally, this Python script detects and segments the user's hand based on the HSV color space. It uses the models trained with the script `02_training_classifiers.py`. Execute the following command to run this script:
```
python 03_human_hand_detector.py --path "models"
```
First, it uses the 4 (four) images from the dataset (i.e., 1.jpg, 2.jpg, 3.jpg, and 4.jpg). The figure below shows the results of Logistic Regression, Decision Tree, SVM, and Multi-Layer Perceptron using the image `3.jpg`. As you can see, the detector is able to segment the user's hand and remove the entire background from the image.
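In essence, the detector classifies every pixel of the image with one of the trained models. The sketch below shows the idea; the use of joblib for model persistence and the model file name are assumptions for illustration.

```python
import cv2
import numpy as np
import joblib  # assumption: models persisted with joblib

model = joblib.load("models/decision_tree.pkl")  # hypothetical file name

image = cv2.imread("datasets/3.jpg")
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# Classify each pixel by its HSV values: 0 = non-skin, 1 = skin
pred = model.predict(hsv.reshape(-1, 3))
mask = (pred.reshape(hsv.shape[:2]) * 255).astype(np.uint8)

segmented = cv2.bitwise_and(image, image, mask=mask)
cv2.imshow("segmented", segmented)
cv2.waitKey(0)
```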
Hint: Press any key on your keyboard to go to the next image.
After showing all the images from the dataset, the Python script opens the video capture device and uses the detector to segment the hands in the frames from your main webcam. You have to press the `q` key to quit the program.
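A minimal sketch of such a webcam loop is shown below, again assuming a joblib-persisted model with a hypothetical file name.

```python
import cv2
import joblib  # assumption: models persisted with joblib

model = joblib.load("models/decision_tree.pkl")  # hypothetical file name

cap = cv2.VideoCapture(0)                        # main webcam
while True:
    ret, frame = cap.read()
    if not ret:
        break
    hsv  = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    pred = model.predict(hsv.reshape(-1, 3))
    mask = (pred.reshape(hsv.shape[:2]) * 255).astype("uint8")
    cv2.imshow("skin", cv2.bitwise_and(frame, frame, mask=mask))
    if cv2.waitKey(1) & 0xFF == ord("q"):        # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```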
To improve the skin detector, it is necessary to perform some additional tasks, e.g.:
- Join the features of all images from the dataset into a single dataset (see the sketch after this list)
- Add more images of human hands with different backgrounds and illumination conditions
- Use pre-processing methods (e.g., spatial filtering, mathematical morphology) to remove noise from the input images
- Include infrared images to handle
- Adapt the detector to handle different skin colors
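For the first suggestion, a simple way to merge the per-image text files is sketched below; the folder path, column names, and output file name are assumptions.

```python
import glob
import pandas as pd

# Concatenate all per-image datasets into a single combined dataset
columns = ["R", "G", "B", "H", "S", "V", "label"]
parts = [pd.read_csv(path, names=columns) for path in glob.glob("datasets/*.txt")]
combined = pd.concat(parts, ignore_index=True)
combined.to_csv("datasets/combined.txt", header=False, index=False)
```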