Wrap-up

Results and Conclusion

We successfully created a robot that is controlled by audio commands and can be trained to learn new hand gestures and new audio commands. We had not seen the training process of a pet animal demonstrated in robotics before, and we are glad that we could show it through Stage 1 and Stage 2 Training. We achieved all the functionality mentioned in the Project Proposal except the FETCH command. The speech recognition module and the motor control were implemented fairly easily by the second week; however, we fell a week behind our proposed schedule due to initial issues with the hand detection module. Once we were able to detect hand gestures, implementing Stage 1 training went very smoothly. The results from the ML models were impressive, especially for such small data sets. To implement Stage 2 training, we had to rework our core implementation significantly to accommodate a sequence of motions instead of a single motion, as well as the new command functionality. Separating the four modules from the beginning was extremely beneficial, as it kept our code clean and let us work on the modules independently. Our design strategy was to implement the functionalities one by one; once a functionality was tested individually, it was integrated with the rest of the system.

As mentioned above, the setbacks we faced were due to installation issues and the use of external libraries. We had to spend a lot of time setting up the initial system, including switching the kernel back to the original version. Throughout the process we took backups and uploaded the code to GitHub to ensure that our work was safe. As described in the Software Implementation section, dividing the system into four modules helped us test and debug each module individually before integrating it with the others. Given the nature of our project, our primary testing methodology was observation over multiple runs, which helped us catch and fix many bugs. We also included log files for each module, along with the FIFOs, which let us see exactly how the system behaves and verify proper functionality.
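As an illustration, each module's debug logging and blocking FIFO read loop looked roughly like the sketch below; the file names and message format here are placeholders rather than our exact implementation.

    import logging
    import os

    # Placeholder names; the actual log and FIFO paths in our code differ.
    LOG_FILE = 'speech.log'
    FIFO_PATH = 'fifo_speech'

    logging.basicConfig(filename=LOG_FILE, level=logging.DEBUG,
                        format='%(asctime)s %(levelname)s %(message)s')

    # Create the named FIFO once; the writing module opens the same path.
    if not os.path.exists(FIFO_PATH):
        os.mkfifo(FIFO_PATH)

    # Opening the FIFO for reading blocks until a writer connects, and each
    # line read blocks until a full command arrives; this keeps the modules
    # synchronized and makes the logs easy to follow.
    with open(FIFO_PATH, 'r') as fifo:
        for line in fifo:
            command = line.strip()
            logging.debug('received command: %s', command)
            # ... act on the command here ...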

The final product of this project, PiDog, is a completely autonomous, wireless, embedded system that uses the Raspberry Pi as its brain. We built the robot such that almost the entire functionality runs on the embedded Linux operating system on the Pi, applying several techniques learnt in this course to build an effective and efficient device. GPIO and PWM on the Raspberry Pi are used with the motor driver to run both motors. The PyGame library is used to display the animations on the PiTFT. We have four modules, each running as its own process (making use of the multi-core processor), which communicate with each other using blocking FIFOs. We have also written bash scripts and used Python's subprocess module to run Linux commands.
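For example, driving one motor channel of the motor driver with PWM looks roughly like the following sketch; the BCM pin numbers, PWM frequency, and duty cycle are illustrative placeholders, not the exact values from our motor control module.

    import time
    import RPi.GPIO as GPIO

    # Placeholder BCM pin numbers for one channel of the motor driver.
    PWM_PIN = 26     # enable / PWM input of the driver
    DIR_PIN_A = 5    # direction inputs
    DIR_PIN_B = 6

    GPIO.setmode(GPIO.BCM)
    GPIO.setup([PWM_PIN, DIR_PIN_A, DIR_PIN_B], GPIO.OUT)

    # 50 Hz software PWM on the enable pin; the duty cycle sets the speed.
    pwm = GPIO.PWM(PWM_PIN, 50)
    pwm.start(0)

    try:
        # Spin the motor forward at roughly half speed for two seconds.
        GPIO.output(DIR_PIN_A, GPIO.HIGH)
        GPIO.output(DIR_PIN_B, GPIO.LOW)
        pwm.ChangeDutyCycle(50)
        time.sleep(2)
    finally:
        pwm.stop()
        GPIO.cleanup()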

Future Work

Given more time for the project, we would have liked to add the functionality for the FETCH command with a red ball, which was a part of our initial proposal. FETCH is a built-in command: on recognizing it, the PiDog would look for its small red ball, which should be thrown in front of it. Using object detection (a rough sketch is given below), the PiDog would track the ball and move towards it, barking when it finds the ball. To include this functionality, a front-facing camera is required. We could either use two different cameras (the current PiCam facing upwards and an additional front-facing webcam) or use a Pan Tilt Platform to point the PiCam in a different direction. Adding this feature would increase PiDog's resemblance to a real pet dog. Improving the voice recognition and gesture detection to work in unpredictable environments would also be very helpful. Replacing the wheels and the robot frame with four legs would mimic the movements of a real dog to a greater extent. Each leg would have three servos, one for each joint: ankle, knee, and hip. With legs, more basic commands like left leg forward or jump could be added. We hope to implement these additional features in the future and develop the PiDog prototype into a fully functioning product.
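A first attempt at the FETCH ball tracking could use simple colour thresholding in OpenCV, along the lines of the sketch below; the camera index, HSV range, and radius threshold are rough placeholders that would still need to be tuned on the real hardware.

    import cv2
    import numpy as np

    # A front-facing camera would be needed; index 0 is a placeholder.
    cap = cv2.VideoCapture(0)

    # Rough HSV bounds for a red ball; these would need tuning.
    LOWER_RED = np.array([0, 120, 70])
    UPPER_RED = np.array([10, 255, 255])

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, LOWER_RED, UPPER_RED)
        # OpenCV 4 return signature for findContours.
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            # Track the largest red blob and steer towards its centre.
            ball = max(contours, key=cv2.contourArea)
            (x, y), radius = cv2.minEnclosingCircle(ball)
            if radius > 10:
                # x relative to the frame centre decides whether to turn
                # left, turn right, or drive straight; once the radius is
                # large enough, the ball has been reached and PiDog barks.
                print('ball at x = %d, radius = %d' % (int(x), int(radius)))

    cap.release()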

Team

Aryaa Vivek Pai

Computer Science, 2022

avp34[at]cornell.edu

Krithik Ranjan

Electrical and Computer Engineering, 2022

kr397[at]cornell.edu


We worked on this project collaboratively. Initially, Krithik worked on the Speech Recognition and Motor Control modules, while Aryaa implemented the Hand Detection module with its computer vision and machine learning components. The Stage 1 and Stage 2 Training, along with the animations, were implemented together. We used GitHub to maintain the code base. As we are both currently located in Ithaca, we were able to meet in person to work together. We completed the documentation and website together as well.
