ECE5725 Spring 2019 Final Project
Mark Li (mkl53) and Serena Krech (sk2282)
In this project, we used a Raspberry Pi to process images and control servos to create a small robot that mimics a person's movements. The RasPi received images as input from a PiCam and processed them in real time. We used OpenCV to identify the person as well as find their chest, elbow, and arm points. These points were then used to find the angle between the upper arm and forearm as well as the angle of the upper arm at the shoulder. With these angles, we mapped each value to a PWM pulsewidth to control each of our four servos so that our robot copies the movements of the person.
Body detection starts with noise reduction. To reduce small imperfections in the image that may interfere with detection, we run our initial frame through a Gaussian filter. Since the thresholding operation requires a grayscale image, we also convert from standard RGB to grayscale in this step.
For thresholding, we use Otsu's Binarization. This method of thresholding attempts to separate pixels in the image into two categories, based on brightness. We found that this method works the best when performed on a person wearing dark clothes, standing against a white background. The output of the thresholding operation is a binary image, with darker pixels set to value 1, and lighter pixels set to value 0.
We then find contours in the image, which gives us all of the separate blobs of dark pixels, and choose the largest blob to be our person. An example of this contour is shown on the right.
To perform elbow detection, we first draw the person contour onto a blank canvas, resulting in the left image below. To restrict the search, we then perform an erosion on the image, which cuts away from the edges. The result is the right image below.
Using a designated chest point, we perform a sweeping angle search, following rays from the chest point to the edge of the body. The longest rays found are the arms, which are highlighted as green lines in the image to the right. Since following the ray returns the intersection point between the ray and edge of the body, we can step backwards from this point into the body, which gives a decent elbow point.
Redrawing the contour onto a blank canvas. The right image is after erosion.
According to this paper, a simplified YUV colorspace conversion, coupled with a few simple checks, is enough to determine human skin colors. The U component, which is normally a float operation, can be simplified to U = R - G. For human skin, 10 < U < 74. Another simple check, described in this paper, is R > B.
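The sweeping ray search from the elbow detection description can be sketched in plain NumPy as below. This is not the project's findWhite function; `cast_ray`, `longest_ray`, the `backoff` distance, and the toy body mask are hypothetical names and values chosen for illustration. Angles are assumed to be measured in degrees with 0 pointing right and positive values pointing up in the image.

```python
import math
import numpy as np


def cast_ray(mask, origin, angle_deg, step=1.0):
    """Walk from origin along angle_deg until leaving the body; return (last_point, length)."""
    h, w = mask.shape
    dx = math.cos(math.radians(angle_deg))
    dy = -math.sin(math.radians(angle_deg))  # image y grows downward
    x, y = float(origin[0]), float(origin[1])
    length = 0.0
    while True:
        nx, ny = x + dx * step, y + dy * step
        ix, iy = int(round(nx)), int(round(ny))
        # Stop at the image border or at the first background pixel.
        if not (0 <= ix < w and 0 <= iy < h) or mask[iy, ix] == 0:
            break
        x, y = nx, ny
        length += step
    return (int(round(x)), int(round(y))), length


def longest_ray(mask, origin, angles, backoff=5):
    """Sweep the given angles; return (angle, elbow_point) for the longest ray."""
    best_angle, best_len = None, -1.0
    for a in angles:
        _, length = cast_ray(mask, origin, a)
        if length > best_len:
            best_angle, best_len = a, length
    # Step back from the body edge toward the chest to land inside the arm.
    r = max(best_len - backoff, 0.0)
    ex = origin[0] + r * math.cos(math.radians(best_angle))
    ey = origin[1] - r * math.sin(math.radians(best_angle))
    return best_angle, (int(round(ex)), int(round(ey)))


# Toy body: a torso block with one arm pointing straight right at chest height.
body = np.zeros((100, 100), dtype=np.uint8)
body[30:80, 40:60] = 255      # torso
body[38:43, 60:95] = 255      # horizontal arm
chest = (50, 40)              # (x, y)
angle, elbow = longest_ray(body, chest, range(-60, 61, 5))
```

In this sketch one sweep covers one side of the body; running a second sweep over the mirrored angle range would give the other arm. The winning angle is returned alongside the point, which matches the idea of reusing the sweep angle as the shoulder angle.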
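As a rough illustration of the two checks above, the following NumPy sketch builds a binary skin mask from U = R - G and R > B. The function name `skin_mask` and the sample pixel values are assumptions, and the input is assumed to be in RGB channel order (OpenCV frames are BGR and would need their channels swapped first).

```python
import numpy as np


def skin_mask(frame_rgb):
    """Binary mask (uint8, 0 or 255) of pixels passing the U = R - G and R > B checks."""
    # Use a signed type so R - G cannot wrap around in uint8 arithmetic.
    r = frame_rgb[..., 0].astype(np.int16)
    g = frame_rgb[..., 1].astype(np.int16)
    b = frame_rgb[..., 2].astype(np.int16)
    u = r - g
    mask = (u > 10) & (u < 74) & (r > b)
    return mask.astype(np.uint8) * 255


pixels = np.array([[[200, 150, 120],    # skin-like: U = 50 and R > B
                    [255, 255, 255],    # white: U = 0
                    [30, 200, 40],      # green: U = -170
                    [120, 90, 200]]],   # U = 30 but R < B
                  dtype=np.uint8)
result = skin_mask(pixels)
```

Only the first sample pixel passes both checks. The cast to a signed type matters: subtracting two uint8 channels directly would wrap negative values around to large positive ones.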
After performing these operations, we are left with a somewhat noisy, but binary image of where human skin is found. To reduce noise, we use the morphological transformation, opening, which is erosion followed by dilation. This removes small patches of noise. We then perform an additional dilation, which closes small holes between larger blobs, and makes contours more accurate. The final image is shown to the left.
Similar to body detection, we draw contours to find all the blobs, and choose the biggest three. From left to right, we label these left hand, head, and right hand. The centroid points of these blobs, found using the moments of the contours, are used instead of the blobs themselves. A small y-axis offset from the head centroid gives us a suitable chest point.
There were two angles that we needed to know per arm, the angle of the shoulder joint, and the angle at the elbow. Finding the angle of the shoulder joint was relatively easy since we iterated through angles to find the elbow point. We just returned this angle from the function to use. The angle of the elbow was calculated based on three points: chest, elbow, and hand. From these points we created two vectors and found the angle between them using dot product, cross product and acos. We also needed to find a way to determine which way the angle was opening, up or down. We discovered that the sign of the cross product revealed this information so we returned that value along with the angle.
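The elbow angle calculation above can be illustrated with a small standalone function. This is a sketch and not the project's getAngleABC(); the name `angle_abc` and the choice to return the sign as -1, 0, or +1 are assumptions. Points are (x, y) pixel coordinates with y growing downward.

```python
import math


def angle_abc(a, b, c):
    """Return (angle_deg, side) for the angle at b between rays b->a and b->c.

    side is the sign of the 2D cross product: +1, -1, or 0 when the points are collinear.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0, 0            # two of the points coincide
    # Clamp to guard against rounding pushing the cosine just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    side = (cross > 0) - (cross < 0)
    return math.degrees(math.acos(cos_theta)), side


chest, elbow = (0, 0), (10, 0)
straight = angle_abc(chest, elbow, (20, 0))     # arm fully extended
bent_up = angle_abc(chest, elbow, (10, -10))    # forearm pointing up in the image
bent_down = angle_abc(chest, elbow, (10, 10))   # forearm pointing down in the image
```

Because acos alone only returns values between 0 and 180 degrees, the two bent cases produce the same angle; the sign of the cross product is what tells them apart.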
From here, we needed to map this angle to an angle based on our robot's orientation. So for the left side, straight down was 180 degrees and straight up was 0 degrees. This means that if the arm is straight, the getAngleABC() function would return 180, and this would be mapped to the robot's 90 degrees.
Without any optimization, the code uses just a single core. On a larger computer this is fine, but on the Raspberry Pi, we need a bit more processing power. The solution is parallel processing, as each frame can be processed largely independently of other frames. Although Python does offer a threading module, it is still limited by the Global Interpreter Lock. To get around this, we use the multiprocessing module, which creates worker processes that bypass the GIL.
Spawning 4 worker processes saturates the Raspberry Pi's 4 cores, as the main process mostly waits for asynchronous results. Because the main process blocks while waiting for results from the workers, only 4 processes are ever doing meaningful work at one time.
The result is going from about 25-30% CPU usage to about 95% CPU usage at all times. This gives a big increase in throughput, but frame delay still remains the same.
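The worker-pool arrangement described above can be sketched with the standard library as follows. The per-frame function here is a trivial stand-in, and the names `process_frame` and `run_pipeline` are hypothetical; the real workers would run the vision code on each captured frame.

```python
import multiprocessing as mp


def process_frame(frame_id):
    """Stand-in for the per-frame vision work; returns (frame_id, result)."""
    return frame_id, frame_id * frame_id


def run_pipeline(num_frames, workers=4):
    """Submit frames to a pool of worker processes and collect results in order."""
    results = []
    with mp.Pool(processes=workers) as pool:
        # apply_async returns immediately, so all workers can be kept busy.
        pending = [pool.apply_async(process_frame, (i,)) for i in range(num_frames)]
        for handle in pending:
            # get() blocks the main process until that frame's result is ready.
            results.append(handle.get())
    return results


if __name__ == "__main__":
    print(run_pipeline(8))
```

Collecting results in submission order keeps frames from being displayed out of sequence, at the cost of waiting on the oldest outstanding frame. This also illustrates why throughput improves while per-frame delay does not: each frame still takes just as long to process inside its worker.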
To help with frame delay, speeding up any function that gets run often helps a lot. For our implementation, this function is findWhite, which performs the raytrace described in Elbow Detection. Since it is pure Python, fairly simple, and run almost constantly, it is a prime candidate for precompiling using Cython. Just precompiling using Cython gives a ~2x speed increase for the function, but with a couple of small changes, we achieved up to ~100x over pure Python. The major changes that we needed to make were just static type declarations and a couple of Cython compiler directives. Below is a comparison between the original code and the code with static declarations:
Function findWhite. Right is pure Python, and left is statically typed.
Adding a few Cython compiler directives helps as well. We disable array bounds checking, array index wraparound, and Nonetype checks.
The result is a Python module that can be imported and functions identically to the original, but up to 100 times faster. Although we don't get 100 times more frames per second, the difference is very noticeable. Coupled with the multiprocessing changes, we're able to run at full resolution without it being unreasonably slow.
The body of the robot was a relatively simple design. For the front, two rectangular holes were made for the motors to fit through as well as two screw holes for each motor for mounting. We originally 3D printed two triangles to make the body stand up. However, we found during testing that these were not very strong and were too light compared to the front of the robot, causing it to easily tip over. We then made the base larger, filling in the area between the triangles. This helped balance the robot and created more surface area to epoxy the base on, creating a stronger hold. As for the camera, this was mounted to the top. It protrudes from the back of the body and two screws are used to hold the camera in place. The arms have the same mounting structures for the servos as the main body. The arms also include two other holes to mount the servo horns to connect each arm segment together.
We used four micro servos in this project. Each servo was controlled by a different GPIO pin. In order to generate a PWM signal, we used the Pigpio library. This allowed us to use hardware timed PWM signals, creating a stable signal so the motor wouldn't jitter. The motors required 5V for power which we sourced from a power supply. The motor ground connection was wired to the ground of the RasPi so that they would have a common reference ground for the signal. We also added resistors between the RasPi pins and the connection to the servo in order to protect the GPIO pins from current spikes. Once we knew we could control the servos, we found their PWM range for the 180 degrees that we wanted. We found that our lower bound was a pulsewidth of 670, middle was 1550, and the upper bound was 2475. The middle value was used in initialization to attach the arms in the correct orientation, and the upper and lower bounds were used in the mapping function to convert angles to pulsewidths.
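A mapping function along the lines described above might look like the sketch below. It assumes a plain linear map between the measured lower and upper bounds, and it assumes the lower bound corresponds to 0 degrees; neither detail is confirmed by the text, and the name `angle_to_pulsewidth` is hypothetical.

```python
PW_MIN = 670     # measured lower bound (assumed here to correspond to 0 degrees)
PW_MAX = 2475    # measured upper bound (assumed here to correspond to 180 degrees)


def angle_to_pulsewidth(angle_deg):
    """Linearly map an angle in [0, 180] degrees to a servo pulsewidth, clamping the input."""
    angle = max(0.0, min(180.0, float(angle_deg)))
    return int(round(PW_MIN + (PW_MAX - PW_MIN) * angle / 180.0))


low = angle_to_pulsewidth(0)
mid = angle_to_pulsewidth(90)
high = angle_to_pulsewidth(180)
clamped = angle_to_pulsewidth(200)
```

With a straight linear map, 90 degrees lands close to, but not exactly on, the measured middle value of 1550. With the pigpio Python module, the resulting value could then be sent to a servo with a call such as `pi.set_servo_pulsewidth(gpio, pulsewidth)`, where `pi` is a `pigpio.pi()` connection and `gpio` is whichever pin the servo is wired to.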
Software and hardware were mainly tested separately until we had a stable working algorithm, since early tracking was quite jumpy. Testing of the servos was relatively simple. We started by testing control with a software PWM signal. The motors could not hold in one position, which was expected since we knew from earlier labs that a software PWM signal was generally not accurate or stable. We then looked into using hardware PWM signals. The RasPi has two hardware PWM modules that we could have used; however, with four motors this was not sufficient. We next turned to the pigpio library, which uses hardware timers to create a PWM signal on any of the GPIO pins. Once the timers were set, they would constantly create the signal until updated with a new PWM value. With this method, we were able to stably control our servos. We also tested the servos' ability to lift weight using thin popsicle sticks found in lab. We wanted to make sure the servos would have enough power to lift an arm with another servo attached. This test was successful and we proceeded with our servos.
Software was first tested on our own computers where processing would be faster. When we started testing we used a neural network based library, OpenPose. It was extremely accurate, but also ran very slowly even on our computers. A single frame took up to 50 seconds on the Raspberry Pi, so we had to move on to different methods. We first experimented with contours and obtaining points from the outline as well as optical flow. While we did use contours to an extent in our final version, we did not use the outline of points that we were originally considering since the group of points didn't give us much information. As for optical flow, test code was developed to select hands, elbows, and chest points to track. However, if a point didn't have a good reference, it was easily lost, so we did not continue with this method.
A lot of testing and iteration was performed on the raytracing method for both performance and accuracy, and we put a lot of work into trying to get hand locations using ray tracing. However, we eventually settled on the fact that it wasn't accurate enough, and with the help of an old ECE 5760 project, put together the current hand detection algorithm. However, we were satisfied with its elbow detection, and kept that part for the final code.
Optimization was a slightly bigger challenge, as on our laptops there was no significant drop in framerate from the webcam's 30 frames per second, even for processing-heavy versions of the code. Careful examination of the code, and timing tests on the Cythonized findWhite function helped us achieve better performance on the Raspberry Pi despite no real difference on our laptops.
When choosing where we were going to display the image of the person and their limbs, we wanted to use the PiTFT at first. However, OpenCV is unable to easily output to the PiTFT, and duplicating the desktop to the PiTFT was both unwieldy and took precious processing power. We settled on displaying it on the big screen.
Once confident in the code, we put the system together to test. Software and hardware integrated easily: we already had the angles we needed within the code, so all we needed to do was map the values. We first initialized the motors to horizontal so we could attach the arms in the correct orientation. When running the full program, we found that the arms would shake a lot and often wouldn't stay in one place. This was most likely because the motors at the elbows would shake a bit when they moved, and the torque that the shoulders needed to move the upper arm was higher than the motor could provide. In order to rectify this, we reprinted each arm 0.75" shorter than it was previously. After installing the new arms, the robot was much more stable.
We found during testing that the arms were quite jumpy, responding to every little change in the detected arms. This was a problem because the detection was jumpy on its own. We decided to add a rolling average of angles from the last five frames to help smooth out the movements. We also added bounds to the angles that we could move to so that the system would not overexert itself attempting to move to an angle over 180 degrees, which would be impossible for it to reach.
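The smoothing and bounding described above can be sketched with a fixed-length queue. The class name `AngleSmoother` and the sample angle values are hypothetical; the window of five frames and the 0 to 180 degree bounds come from the text.

```python
from collections import deque


class AngleSmoother:
    """Rolling average over the last few angles, clamped to the servo's reachable range."""

    def __init__(self, window=5, lo=0.0, hi=180.0):
        self.history = deque(maxlen=window)   # old samples fall off automatically
        self.lo, self.hi = lo, hi

    def update(self, angle):
        self.history.append(angle)
        mean = sum(self.history) / len(self.history)
        return max(self.lo, min(self.hi, mean))


smoother = AngleSmoother()
# A single spurious detection (150) among otherwise steady readings.
outputs = [smoother.update(a) for a in (90, 92, 150, 91, 89, 90)]
```

Averaging damps a single spurious reading instead of sending the servo straight to it, though the outlier still influences the output until it leaves the five-frame window. The trade-off is a small lag behind real movements.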
In the end we were successfully able to create a robot that would accurately mimic a user's movements in real time, as shown in the video above. The only limitation on the system is that the servos are restricted to 180 degrees so we were not able to have sharper angles at the elbow. Overall, we were able to meet the goals that we had set in the project outline.
This project could be further expanded to body movements such as moving up and down or possibly angling the upper body side to side by adding other servos to control the movement in the chest area. By tracking the chest point, up and down motion would be relatively simple. To track side to side motion, another point of reference might be needed. Controlling more servos wouldn't be a problem since any GPIO can use a hardware timer for PWM. The body of the robot would need to be modified to be able to move at all since it is currently just a stand.
Our project was able to successfully mimic the arm movements of a person. We originally found a program to detect all body parts and overlay a stick figure onto each frame; however, this ran on a neural net which took 40 seconds per frame to process. We were able to achieve a similar program using OpenCV that ran much more efficiently, so we were able to track movements in real time. One limitation of our system was that the robot's arms were limited in their motion since they can only rotate 180 degrees. So while the software can detect sharper angles, our robot could not move to that location.
Software design, system testing
Robot and hardware design, system testing
| Item | Quantity | Unit cost | Total cost |
|---|---|---|---|
| Tower Pro Micro Servos | 4 | 5.95 | 23.80 |