My project, digit gesture recognition, recognizes hand gestures for the digits 0 to 5. It runs on a Raspberry Pi: when the camera captures an image that contains a hand, the program automatically selects the hand region, a machine learning model predicts the digit, and the result is printed on the screen together with the selected hand shape.
Implement skin detection and select only the area that is marked as the hand.
Extract features from the hand shape using contours and Fourier descriptors.
Build a random forest model that maps each hand shape to a digit.
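The feature-extraction goal above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `fourier_descriptor` and the descriptor length `n_coeffs=16` are assumptions, and the contour is taken to be an ordered array of boundary points (for example, one contour returned by OpenCV's `cv2.findContours`). It shows how a contour of any length and size can be reduced to a vector of fixed dimension.

```python
import numpy as np

def fourier_descriptor(contour_xy, n_coeffs=16):
    """Fixed-length shape descriptor from an ordered closed contour.

    contour_xy: boundary points, shape (N, 2) or (N, 1, 2), with N > n_coeffs.
    n_coeffs:   hypothetical descriptor length (an even number).
    """
    pts = np.asarray(contour_xy, dtype=float).reshape(-1, 2)
    # Treat each boundary point (x, y) as the complex number x + iy.
    z = pts[:, 0] + 1j * pts[:, 1]
    coeffs = np.fft.fft(z)
    # Skip coefficient 0 (the centroid term) so the result ignores position,
    # and keep only the lowest positive and negative frequencies.
    half = n_coeffs // 2
    kept = np.concatenate([coeffs[1:half + 1], coeffs[-half:]])
    # Magnitudes discard phase, which removes rotation and start-point effects.
    mags = np.abs(kept)
    # Dividing by the largest magnitude removes the effect of overall size.
    return mags / mags.max()
```

Because only the low-frequency terms are kept, fine boundary detail is dropped; a larger `n_coeffs` keeps more detail at the cost of a longer feature vector.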
First, OpenCV and the camera module were installed on the Raspberry Pi. The next step was to use a skin detection algorithm to select the areas of skin and separate the hand from the head and other parts of the body. Several methods were developed and tested under various lighting conditions, and Otsu thresholding on the Cr channel (from YCrCb) was finally chosen because it was the most robust to changes in illumination. After that, I chose Fourier descriptors rather than a convex hull to represent each hand shape. Fourier descriptors ensure that hand shapes of different sizes all produce a feature vector of the same dimension for the machine learning model in the next step. In the third step, I compared an SVM with a random forest and chose the random forest to predict the digit, which is displayed on the real-time image on the screen. I collected all of the training data myself to ensure variety in the data set.
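The skin detection step can be illustrated with a small NumPy-only sketch. It is an assumption-laden example rather than the project's code: the helper names `cr_channel`, `otsu_threshold` and `skin_mask` are hypothetical, the input is assumed to be an 8-bit RGB image, and the Cr formula is the 8-bit one documented by OpenCV. In OpenCV itself the same two steps are available as `cv2.cvtColor` with `cv2.COLOR_BGR2YCrCb` and `cv2.threshold` with the `cv2.THRESH_OTSU` flag.

```python
import numpy as np

def cr_channel(rgb):
    """Cr (red-difference chroma) channel of an 8-bit RGB image."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b      # luma
    cr = (r - y) * 0.713 + 128.0               # offset 128 for 8-bit images
    return np.clip(np.rint(cr), 0, 255).astype(np.uint8)

def otsu_threshold(gray):
    """Threshold that maximizes the between-class variance (Otsu's method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                # fraction of pixels with value <= t
    w1 = 1.0 - w0                       # fraction of pixels with value > t
    mu = np.cumsum(prob * levels)       # cumulative mean up to t
    mu_total = mu[-1]
    sigma_b = np.zeros(256)
    valid = (w0 > 0) & (w1 > 0)         # skip thresholds with an empty class
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(sigma_b))

def skin_mask(rgb):
    """Boolean mask that is True where Cr is above the Otsu threshold."""
    cr = cr_channel(rgb)
    return cr > otsu_threshold(cr)
```

Because the threshold is recomputed from each frame's own Cr histogram, it adapts to the lighting of that frame instead of relying on a fixed skin-color range.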
I tested the program under different lighting conditions and found it robust enough to handle them. The closer the hand is to the camera, the higher the accuracy. When the hand is farther from the camera, there is a small chance that the algorithm misrecognizes the number 4 as the number 5.
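One way to quantify a confusion such as 4 being read as 5 is a confusion matrix on held-out data. The sketch below uses scikit-learn's `RandomForestClassifier` on synthetic 16-dimensional feature vectors that stand in for Fourier descriptors; the data, the feature count, the split ratio and the number of trees are all assumptions made for illustration, not the project's actual data or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: six digit classes (0 to 5), 16 features each.
rng = np.random.default_rng(0)
n_per_class, n_features = 60, 16
centers = rng.normal(0.0, 1.0, size=(6, n_features))
X = np.vstack([c + rng.normal(0.0, 0.15, size=(n_per_class, n_features))
               for c in centers])
y = np.repeat(np.arange(6), n_per_class)

# Hold out a quarter of the samples, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true digits, columns are predicted digits.
cm = confusion_matrix(y_test, y_pred, labels=np.arange(6))
accuracy = np.trace(cm) / cm.sum()
four_as_five = cm[4, 5]   # test samples of digit 4 predicted as 5
```

Grouping the test images by hand-to-camera distance and computing one matrix per group would show whether the 4-versus-5 errors are concentrated at longer distances.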
This version of the project can predict the digits 0 to 5 with more than 92% accuracy. I did not take pictures of the number 7 because it is represented by different gestures in different cultures and countries. The current model can be extended with the numbers 7, 8 and 9, but with the current amount of training data this would lower the test accuracy. Maintaining the accuracy would take more time to collect enough data (pictures) and would require more features for each hand shape, which may slow the program down and make real-time display difficult to achieve.
Designed the overall project and tested it (individual project).