A radio controlled car that navigates through traffic autonomously

The autonomous radio-controlled (RC) car utilizes a Raspberry Pi 3 as a means of providing basic self-driving capabilities to our RC car. These capabilities include road tracking and following on straight and curved roads, stop sign and traffic light detection, as well as collision avoidance. The RPi3, along with a Pi camera module, streams a video link to a computer via Wi-Fi. Through machine learning and training, the computer takes in the video link and performs image processing. From there, road tracking and traffic sign detection were achieved. Corresponding actions from both road tracking and traffic sign detection such as forward, brake, left, and right, are determined and sent through a modified RC car controller that is connected to an Arduino. Also equipped with an ultrasonic distance sensor, the RC car continuously checks for possible obstacles in front, and brakes if deemed necessary.


Overview & Objective

In recent years, both the public and industry's interest in autonomous cars have increased dramatically. The reason that our group decided to pursue such a project is due to our shared interest in this matter. The goal of this project is to provide the team with hands on experience on creating an autonomous car that has basic self-driving capabilities. Most importantly, we learned and understood the enormous challenges it takes to create a fully functional autonomous car safe for real-life use. Despite our project is only a simple model, we aimed to perfect each of our capabilities - road tracking, traffic sign detection and collision avoidance.

Fig. 1 Autonomous Driving RC Car Model with stop sign and traffic light models


Overall Design

Fig. 2 Design overview with self driving RC car and Raspberry Pi 3 components.

Input Stage - Pi Camera & Ultrasonic Sensor

Our project uses Picamera and ultrasonic sensor as our input stage, the Picamera and ultrasonic sensor are connected to the Raspberry Pi and communicates with the laptop via TCP socket, each input has its unique port number and same IP address, which is our laptop current IP address. The Picamera takes 10 frames per second, and send the streamed image frames to the laptop, the image data is used to achieve the road tracking and traffic signs detection. The ultrasonic sensor is HC-SR04, and its operating voltage is 5V. Raspberry Pi can only provide 3.3V, and the voltage level translator BOB-12009 is used to step down 5V signal to 3.3V signal. As the Raspberry Pi gets the sensor data, the data will send to the laptop. The sensor is put on the front of car, it is used to measure the distance of the front object in order to avoid collision.

Fig. 3 Sparkfun BOB-12009 Voltage level translator for 5V - 3.3V step down.

Training Stage - Artificial Neural Network (ANN) & Classifier

After making sure that we can stream real-time video into our own computer by frames, we began to collect data and then train the data by using two simple machine learning algorithms which are 3-layer Artificial Neural Network for road tracking and Haar feature-based cascade classifiers for object detection.

For road tracking part, we aimed to achieve that car can predict four directions while driving by classifying road frames token in real time into those four categories thus it can run automatically by tracking the road. Every input frame will be reshaped into 320*120 size with 38,400 pixels. Every pixel is used as one neuron xk in the input layer. The hidden layer will have 32 neurons as the basic feature extraction. For the output layer, we will have four neurons for the four different directions which are forward, left, right, and reverse.

Fig. 4 3-layer neural network consisting of input layer, hidden layer, and output layer.

We wrote two python scripts to collect frames data first and to train data for direction prediction. In details, we built white road and drove the car running along the road manually by using the four-direction keys in our keyboard to control the directions for the car. While car was running, it would stream the real-time video back to our laptop and when we clicked the right decision for one input frame by using pygame module, the frame would be added into image array as the data for the input layer and the decision will be added into label array as the date for the output layer. All the data will be kept in file. After collecting more than 4500 frames each time, we splitted all the frames into training data (0.6) and testing data (0.4). We put all the training data into training python script which we used the 3-layer ANN module with backpropagation in OpenCV to get the right prediction parameters. At the same time, testing data can help us see the performance of the prediction parameters. We have trained for numerous times and finally got the 96% accuracy for training and  93% accuracy for testing.


Fig. 5 Forward training sample image

"Forward and Right"

Fig. 6 Forward and right training sample image

“ Forward and left”

Fig. 7 Forward and right training sample image

For object detection part, our objects are “stop sign” and “traffic lights”. We mainly used the module of Haar feature-based cascade classifiers in OpenCV to implement the whole process. The training process need positive samples and negative samples. The positive samples are images which only contain target objects and the negative samples are arbitrary images without objects. There are applications in OpenCV to train the object classifiers we need based on the positive and negative samples and all the parameters are stored in xml file.

Driving Stage - Image Processing, Arduino & Radio Controller

For the image processing of the streamed image frames from the Picamera, first in order to speed up real-time image processing time, the frames from Picamera are scaled down to 320*240 pixels. Since we need to detect the distance between the traffic signs and our car, the detection of object through Picamer needs to be used. Because each Raspberry Pi can only support one camera module, the distance measurement function by using one camera is used in our project. Based on the equation d=h/tan(α+arctan((v−v0)/a_y)), where h is the height of optical center of Picamera, α is the angle between horizontal line and optical lense of Picamera, v is the camera coordinates on vertical position and can be returned from the object detection process, other parameters are the intrinsic parameters of Picamera, we can get them by the function of camera calibration in OpenCV. From the camera calibration, we can get the parameters in camera matrix, which is a_y = 332.213346347 and v0 = 119.907780142. The distance measurement results are shown in the traffic signs detection image. For our project, we set the distance threshold between the traffic signs and Picamera optical center is 25cm, if the distance is less than 25cm, it will generate the driving signals according to the different traffic signs, such as for stop sign and red traffic light, it will stop the car, and for the green traffic light, it will drive the car continue along the road.

For the real-time traffic signs detection, since the OpenCV has the detection function of cascading classifier, we directly use it to detect the stop sign and traffic light based on the classifier we trained in the training stage. The cascading classifier is an effective object detection method, it can handle the real-time object detection for our project, which has 10 frames per second. The detection results are shown in figures 8 - 10.

For the road tracking, with the training results from the 3-layer Artificial Neural Network in the training stage, it can make the prediction according to the position of the road for every frames taken from Picamera. Based on the prediction, we can generate the driving signals and send the signals to the Arduino to control the radio controller. And the radio controller will steer the car along the road.

Fig. 8 Stop sign detection result with distance measurement

Fig. 9 Green traffic light detection result with distance measurement

Fig. 10 Red traffic light detection result with distance measurement

In order to control the radio controller, we use Arduino Mega 2560 as our radio controller driver. The Arduino is connected to the laptop using the serial communication. In order to drive the radio controller, we replaced the 4-pin buttons in the radio controller with K4017 N-channel MOSFET. Its gate threshold voltage is 2.5V with turn-on time 25ns, the Arduino can easily drive it with 5V output voltage and output duration time is 50ms. So it can be considered as a good switch to drive the radio controller. Four GPIOs on Arduino are used to send the forward, backward, left and right output signals. When the laptop sends the driving signals, which are generated of image processing, to the Arduino, the Arduino will send the corresponding signals from each GPIO to drive the MOSFET, and the MOSFET will act as the switch to open or close the circuit of each direction on the radio controller, then the radio controller will steer the car.

Fig. 11 Arduino Mega 2560 and Radio controller


Artificial Neural Network Implementation

For our 3-layer ANN training model, initially we only got the training and testing accuracy of about and even blow 30%, in order to increase our accuracy, we tried different methods, such as increase the number of layer, increasing the size of second layer, and try different driving method to collect data. For increasing the number of layer, we found that it took much longer time for training and the accuracy of results is not improved much, so it was not a good choice for improving the accuracy. The second method is to increase the size of second layer, we tried the different number of neurons on the second layer, such as 16, 32 and 64, we found that if the number is 16, the training time is fast, but the accuracy was not improved much; if the number of neurons is 32, it cost a little bit longer time for training, but the accuracy improved a lot, which is more than 80%; and for 64 neurons, it took much longer time than 32 neurons, but the accuracy was close to 32 neurons. So we set the neuron number is 32 of second layer as a good balanced number. And we changed our method for collecting data, initially the speed of driving the car manually for collecting data was high, and it caused the data from each loop is not much and about 100 images. Then we slow down the speed for collecting data, and increase the data to more than 200 images each loop, it highly increased the data for different cases on the road, and deeply increased our training accuracy. Based on the different method we tried, our final training accuracy achieve more than 96%.  

Challenges & Potential Improvements

While testing and validating our design, we discovered many variables that would directly affect the perfromance and reliability of the autonomous car .The following three issues are some of the main issues we faced:

1) Battery Issues

The RC car model that we used for our project is powered by 4 AA batteries. As the batteries’ capacity decreases, the speed and turning capabilities of the RC car are affected. As the batteries deplete, the output voltage and current draw both drop. The voltage directly affects the RC car’s speed, while the current directly affects the RC car’s torque. The biggest problem this created is a discrepancy between training and actual trial runs. If we had chosen an RC car that is powered by Lithium ion batteries, the results and performance may be more robust.

2) Pi Camera Module Limitations and Sensitivity to Error

The Pi Camera presents us with several challenges as well. As mentioned above, during our training process, thousands of pictures were taken at a specific angle aimed towards the ground. If the camera angle was moved or adjusted, the system may not be able to recognize the current track status.

3) Network Issues

In order to determine the current track status, the Pi camera connected to the RPi3 provides a live feed to our computer via a Wi-Fi link. Ideally, our computer will perform a real-time evaluation on the live feed, determining the correct command to be sent through the RC controller. However, if the network was unable, the video feed may lag and get stuck on a single frame of picture. Once that occurs, the system performs detection on the same picture over and over again, providing incorrect commands to the RC car. A reasonable and robust way to tackle this issue would be to perform the real time detection on the Raspberry Pi. Since the more computationally intensive processes such as machine learning and training can be done ahead of time on a more powerful machine, we can transfer these data sets over to the RPi3 upon completion. If we perform detection directly on the RPi3, we eliminate the possible issues presented in an unstable Wi-Fi network. If this alternative works, we can even lift our 10 frames/sec limitation from our Pi camera module. A higher framerate will provide smoother and faster updates. Combined with the smooth detection, we can make road tracking even more accurate and robust.

Track Setup

Fig. 12 Track setup for stop sign detection

Fig. 13 Track setup for road tracking and path following




Most things performaed as planned in this project. As mentoined in our challenges section, high variability of the Wi-Fi connection along with other factors greatly reduces our test reliability. However, our team was still able to meet all goals outlined in the description. Our self-driving car demonstrated abilities to follow along a track, track traffic signs and lights, as well as avoiding collisions. Videos showing sample runs of our project can be found at:


Self driving cars will be the future of travelling in no time. With great interests shown by the general public as well as industry, hands-on experience on creating such a project has provided us with great opportunities to understand the challenges, complexity, and effort that engineers have put into developing them.

The overall experience of the project was extremely rewarding. It incorporated skills acquired from labs 1-4 throughout the semester, and our team was able to successfully create an autonomous car that was able to perform the following tasks: 1) Perform road tracking on both striaght and curved paths, 2) Perform stop sign and traffic light detection, and 3) Perform collision avoidance using ultrasonic sensors. Due to the nature of training and neural network that we utilized, we were only able to perform this experiment on a track that we designed, as the performacne highly depends on training data from the same track.

Our autonomous RC car adheres strictly to IEEE’s Code of Ethics. Design decisions were certainly  made consistent with the law, safety, health, and welfare of the public. It does not endanger the public or the environment in any way. We did not encounter any sorts of conflicts of interest, and are prepared to disclose them to affected parties if that does happen. Our data presented were realistic. We did not receive any sort of bribery, and aimed to improve the appropriate application and understanding of this product as specified in the previous paragraph. Moreover, we are open to any sort of honest and constructive criticism because we strive for a more robust design. Each team member was treated fairly with absolutely no discrimination based on race, religion, gender, disability, age, national origin, sexual orientation, gender identity, or gender expression. With safety precautions done in all kinds of areas, we avoided any sort of injuries on others, their property, reputation, or employment by any sort of false or malicious action. Most importantly, we assisted and supported each other professionally through the design phase. As mentioned, this is a project that provided each of us with a great opportunity to learn more about autonomous cars. A fully functional and safe autonomous car can greatly benefit the public as it eliminates many human errors made in every day driving. However, it is absolutely crucial that all components are robust and are bug-free because the mass public can begin using.

Future Work

Adding image filter for training image data

Currently, we have not added any image filter for the training image data, so the training result parameters have limitation that it can only work under the similar environment and road situation. We want to add more image filter function, such as low pass filter (Gaussian filter, mean or median filter) to remove image noise, edge detection and adding threshold. Since the image data after adding these function, the feature will become more distinct and make it more general. Then after training the processed image data, the training result will be more general that it can work not only under the similar environment, but also the unknown situation. Thus, it will make our autonomous car more powerful to drive on more different road situations.

Running everthing on the Raspberry Pi 3

Since now most of our processing stage is on the laptop, we plan to move all the processing programs to the Raspberry Pi. To achieve that, we will finish our training parameters, and then move all the training results to the Raspberry Pi, then the image processing part can run on the Raspberry Pi. And also instead of driving the radio controller to steer the car, we will use DC motor driver to drive the DC motor in the car directly, and use Raspberry Pi to control the motor driver. It will not only make our car more autonomous, but also can deal with some current issues, such as the unstable Wi-Fi connection issue and speed of image processing.



Appendix A

The group approves this report for inclusion on the course website.
The group approves the video for inclusion on the course youtube channel.

Appendix B - Raw Code


// Aruidno Control
// This program is uploaded to Arduino in order to control the radio controller

// assign 4 pins for 4 directions
int right = 6;
int left = 7;
int forward = 2;
int reverse = 5 ;

// duration time for signals
int time = 50;

// set initial command
int command = 0;

void setup() {
  pinMode(right, OUTPUT);
  pinMode(left, OUTPUT);
  pinMode(forward, OUTPUT);
  pinMode(reverse, OUTPUT);

void loop() {
  //receive command from laptop
  if (Serial.available() > 0){
    command =;

void right(int time){
  digitalWrite(right, HIGH);

void left(int time){
  digitalWrite(left, HIGH);

void forward(int time){
  digitalWrite(forward, HIGH);

void reverse(int time){
  digitalWrite(reverse, HIGH);

void forward_right(int time){
  digitalWrite(forward, HIGH);
  digitalWrite(right, HIGH);

void reverse_right(int time){
  digitalWrite(reverse, HIGH);
  digitalWrite(right, HIGH);

void forward_left(int time){
  digitalWrite(forward, HIGH);
  digitalWrite(left, HIGH);

void reverse_left(int time){
  digitalWrite(reverse, HIGH);
  digitalWrite(left, HIGH);

void reset(){
  digitalWrite(right, LOW);
  digitalWrite(left, LOW);
  digitalWrite(forward, LOW);
  digitalWrite(reverse, LOW);

void send_command(int command, int time){
  switch (command){

     //reset command
     case 0: reset(); break;

     // single command
     case 1: forward(time); break;
     case 2: reverse(time); break;
     case 3: right(time); break;
     case 4: left(time); break;

     //combination command
     case 6: forward_right(time); break;
     case 7: forward_left(time); break;
     case 8: reverse_right(time); break;
     case 9: reverse_left(time); break;

     default: Serial.print("Inalid Command\n");

# Picamera
# This program is to drive Picamera and set the resolution 320 * 240 and 10 fps,
# using socket communicatin to connect with laptop

import io
import socket
import struct
import time
import picamera

    with picamera.PiCamera() as camera:
        # scale down the resolution to 320 * 240
        camera.resolution = (320, 240)
        # the speed of camera is 10 fps
        camera.framerate = 10
        time.sleep(1)                       \
        start = time.time()
        stream = io.BytesIO()
        # set the format as jpeg
        for foo in camera.capture_continuous(stream, 'jpeg', use_video_port = True):
            connection.write(struct.pack('L', stream.tell()))
            if time.time() - start > 400:
    connection.write(struct.pack('L', 0))

# using socket to connect to laptop
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect(('', 8000))
connection = client_socket.makefile('cam')

# sensor
# This program is set the pin for ultrasonic sensor and
# sending the sensor data to laptop using socket connection
from socket import *
import time
import RPi.GPIO as GPIO

def measure():
    GPIO.output(GPIO_TRIGGER, True)
    GPIO.output(GPIO_TRIGGER, False)
    start = time.time()
    while GPIO.input(GPIO_ECHO)==0:
        start = time.time()
    while GPIO.input(GPIO_ECHO)==1:
        stop = time.time()
    elapsed = stop-start
    distance = (elapsed * 34300)/2
    return distance
# assign TIGR pin as GPIO 5
# assign ECHO pin as GPIO 6
GPIO_ECHO    = 6
# Trigger pin
# Echo pin
# initial to low
GPIO.output(GPIO_TRIGGER, False)
    while True:
        distance = measure()
        print "Distance : %.1f cm" % distance
        # send data to the host every 0.5 sec
# create a socket and bind socket to the host
client_socket = socket(AF_INET, SOCK_STREAM)
client_socket.connect(('', 8008))

# car self-driving while processing a set of works including road tracking,
# traffic signs detection, and distance sensoring 

import cv2
import numpy as np
import math
import threading
import SocketServer
import serial

# multi-threads for video and distance processing
class Thread(object):
    def thread1(host, port):
        server = SocketServer.TCPServer((host, port), SensorDataHandler)
    def thread2(host, port):
        server = SocketServer.TCPServer((host, port), VideoStreamHandler)
    distance = threading.Thread(target=thread1, args=('', 8008))

    video = threading.Thread(target=thread2, args=('', 8000))

distance_data = " "

# processing sensor data in real time by thread 1
class Distance_data(SocketServer.BaseRequestHandler):
    data = " "
    def handle(self):
    	global sensor_data
       = self.request.recv(1024)
                distance_data = round(float(, 1)
                print distance_data
            print "sensor connection closed"

# processing video data in real time by thread 2
class Video_data(SocketServer.StreamRequestHandler):
    # create 3-layer ANN model
    model = 3_ANN()
    traffic_sign = ObjectDetection()
    self_drivingCar = Car()

    h1 = 0.5 # stop sign
    d_stop_sign = 25
    h2 = 0.5 # traffic light
    d_light = 25

    # cascade classifiers
    stop = cv2.CascadeClassifier('cascade_xml/stop_sign.xml')
    light = cv2.CascadeClassifier('cascade_xml/traffic_light.xml')
    d_camera = Distance_camera()

    # time the stop time
    stop_time = 0
    stop_start = 0              
    stop_finish = 0
    drive_time = 0

    def handle(self):
		global distance_data
        stream_bytes = ' '
        stop_flag = False
        stop_sign = True
        # stream video frames one by one
            while True:
                stream_bytes +=
                first = stream_bytes.find('\xff\xd8')
                last = stream_bytes.find('\xff\xd9')
                if first != -1 and last != -1:
                    jpg = stream_bytes[first:last+2]
                    stream_bytes = stream_bytes[last+2:]
                    gray = cv2.imdecode(np.fromstring(jpg, dtype=np.uint8), cv2.CV_LOAD_IMAGE_GRAYSCALE)
                    half_g= gray[120:240, :]
                    image = cv2.imdecode(np.fromstring(jpg, dtype=np.uint8), cv2.CV_LOAD_IMAGE_UNCHANGED)
					# detection and distance measurement
                    stop_d = self.obj_detection.detect(self.stop_cascade, gray, image)
                    light_d = self.obj_detection.detect(self.light_cascade, gray, image)
                    if v_param1 > 0 or v_param2 > 0:
                        d1 = self.d_to_camera.calculate(v_param1, self.h1, 300, image)
                        d2 = self.d_to_camera.calculate(v_param2, self.h2, 100, image)
                        self.d_stop_sign = d1
                        self.d_light = d2
                    cv2.imshow('image', image)
                    image_array = half_gray.reshape(1, 38400).astype(np.float32)
                    prediction = self.model.predict(image_array)
                    # obstacle detection
                    if distance_data is not None and distance_data < 80:
                        print("Stop for obstacle")
                    # traffic light detection
                    elif 0 < self.d_light < 30:
                        if self.obj_detection.red_light:
                            print("Red light")
                        elif self.obj_detection.green_light:
                            print("Green light")
                        self.d_light = 30
                        self.traffic_sign.red_light = False
                        self.traffic_sign.green_light = False
                    # stop sign detection, when detected, stop for 5 seconds
                    elif 0 < self.d_stop_sign < 25 and stop_sign:
                        print("Stop for stop sign")
                        if stop_flag is False:
                            self.stop_start = cv2.getTickCount()
                            stop_flag = True
                        self.stop_finish = cv2.getTickCount()
                        self.stop_time = (self.stop_finish - self.stop_start)/cv2.getTickFrequency()
                        if self.stop_time > 5:
                            stop_flag = False
                            stop_sign = False
                    # normal situation
                        self.stop_start = cv2.getTickCount()
                        self.d_stop_sign = 25

                        if stop_sign_active is False:
                            self.drive_time_after_stop = (self.stop_start - self.stop_finish)/cv2.getTickFrequency()
                            if self.drive_time_after_stop > 5:
                                stop_sign_active = True

                    if cv2.waitKey(1) & 0xFF == ord('q'):
            print "Video connection closed"

class ObjectDetection(object):
    def __init__(self): = False = False
    def detect(self, cascade_classifier, gray_image, image):
        detect_f = 0
        # minimum value to proceed traffic light state validation
        threshold = 150     
        cascade_obj = cascade_classifier.detectMultiScale(
            minSize=(10, 10),
        for (x_pos, y_pos, width, height) in cascade_obj:
            cv2.rectangle(image, (x_pos+5, y_pos+5), (x_pos+width-5, y_pos+height-5), (255, 255, 255), 2)
            v = y_pos + height - 5
            # stop sign
            if width/height == 1:
                cv2.putText(image, 'STOP', (x_pos, y_pos-10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
            # traffic lights
                roi = gray_image[y_pos+10:y_pos + height-10, x_pos+10:x_pos + width-10]
                mask = cv2.GaussianBlur(roi, (25, 25), 0)
                (minVal, maxVal, minLoc, maxLoc) = cv2.minMaxLoc(mask)
                # check if light is on
                if maxVal - minVal > threshold:
          , maxLoc, 5, (255, 0, 0), 2)
                    # Green light
                    if 5.5/8*(height-30) < maxLoc[1] < height-30:
                        cv2.putText(image, 'Green', (x_pos+5, y_pos - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
               = Tr
                    # Red light
                    else 1.0/8*(height-30) < maxLoc[1] < 4.0/8*(height-30):
                        cv2.putText(image, 'Red', (x_pos+5, y_pos-5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
               = True
        return detect_f
# car driving
class self_drivingCar(object):
    def __init__(self):
        self.port = serial.Serial('/dev/tty.usbmodem14521', 115200, timeout=1)

    def steer(self, prediction):
        if prediction == 2:
        elif prediction == 0:
        elif prediction == 1:

    def stop(self):

# distance measurment between camera to objects
class Distance_camera(object):
    def __init__(self):
    	# calibration parameters
        self.v0 = 119.865631204
        self.ay = 332.262498472
        self.alpha = 8.0 * math.pi / 180
    def measurement(self, v, h, x_shift, image):
        dis = h / math.tan(self.alpha + math.atan((v - self.v0) / self.ay))
        if dis > 0:
            cv2.putText(image, "%.1fcm" % d,
                (image.shape[1] - x_shift, image.shape[0] - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)
        return dis

# 3-layer ANN model
class 3_ANN(object):
    def __init__(self):
        self.model = cv2.ANN_MLP()
    def create(self):
        neuron_size = np.int32([38400, 32, 4])
    def predict(self, samples):
        ret, resp = self.model.predict(samples)
        return resp.argmax(-1)

if __name__ == '__main__':