Building an Image Recognition Car with Raspberry Pi, Arduino, and TensorFlow

Zhao Zhichen: undergraduate in physics at Yuanpei College, Peking University; PhD in physics from the University of Michigan; currently doing financial modeling and quantitative work at a hedge fund in New York; and a Raspberry Pi enthusiast.

Since buying my first Arduino kit, I have been playing with robotics for several years, but only recently started building complete projects. Along the way, two skills opened a new world for me: Python and Linux. Behind both stands a powerful open-source community; master these two tools (really, meta-tools) and it feels as though the internet is full of ready-made weapons for any problem. During an internal programming training at my company last week, one remark resonated with me: we are software engineers, not programmers. Our job is not to write programs but to solve problems with the right tools. At Google, the saying goes, if you feel you have to build a feature from scratch, it just means you haven’t found the right tool yet. This is especially true in the open-source community.

This is a remote-controlled car that can be driven by an infrared remote or a wireless keyboard, while TensorFlow monitors the camera feed in real time and announces aloud the objects it recognizes. All the code is available on my GitHub.

The idea is not originally mine; it comes from a blog post Lukas Biewald wrote last September. The core part, in which TensorFlow recognizes the camera images and speaks the result, is based on open-source work by Google AI engineer Pete Warden. Unlike the original post, I added an Arduino for mechanical control and learned how to make the Arduino and the Raspberry Pi talk to each other over a serial connection. The project exercised many useful skills and tools, which I summarize here; fellow enthusiasts are welcome to share their thoughts!

The entire project was completed at the command line, without a graphical interface, so it may be challenging if you don’t know your way around Linux. But since you have started playing with robotics, how could you not learn Linux? I taught myself with “The Linux Command Line” by William Shotts, later tried building Linux from source, and finally shed the aversion to the command line instilled by growing up on Windows. Trust me, getting past that barrier opens the door to a new world. Besides, working at the command line is cooler and geekier, isn’t it? Beyond Linux, you will also need some C++ and Python for this project.

Also, this article covers only the electronics, not the mechanical design or the aesthetics. As you can see, the car is ugly enough to breach even my own aesthetic bottom line; I simply didn’t spend much effort on its looks. I hope to build projects that combine form and function in the future, perhaps in collaboration with designer friends!

1. Raspberry Pi

First, you need a recent Raspberry Pi with its customized Linux system installed and connected to the wireless network, plus the official camera module, configured so that the Pi can use it. You can attach a monitor over HDMI, but it is far more convenient to log in remotely via SSH: during debugging you no longer have to repeatedly unplug the Pi, carry it to a screen, and reconnect everything, and you can modify the car’s core code remotely in real time. Even my Arduino program was written, uploaded, and communicated with through the Raspberry Pi, so the Arduino never needs its own computer connection, making everything smooth and seamless.

The Linux system on the Raspberry Pi supports a graphical desktop, and you can use RealVNC (for Windows) or TightVNC (for Mac) to log in to the graphical desktop remotely (this project doesn’t require it).

2. TensorFlow

This is the core of the project, yet operationally the simplest part, because everything is clearly documented here; just follow the steps and run the code here.

Note: a pre-trained model is used here, meaning the network’s parameters are fixed in advance, with ImageNet as the training image library. In other words, the car can only recognize objects that appear among that library’s labels; there is no on-board “learning” process.
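To make this concrete, here is a minimal sketch of ImageNet-style classification with a pre-trained model. This is not the project’s actual pipeline (which follows Pete Warden’s example and also speaks the result aloud); it assumes a recent TensorFlow with tf.keras and a camera frame already saved as frame.jpg:

import numpy as np
import tensorflow as tf

# Load a network pre-trained on ImageNet; no training happens on the car.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

def classify(image_path):
    # Resize the frame to the network's expected input size.
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=(224, 224))
    x = tf.keras.preprocessing.image.img_to_array(img)[np.newaxis, ...]
    x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
    preds = model.predict(x)
    # Map the output scores back to human-readable ImageNet labels.
    return tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=3)[0]

print(classify("frame.jpg"))

Whatever the model, the labels it can output are fixed by its training set, which is exactly why the car can only ever name objects that exist in ImageNet.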

3. The Car

There are many robot chassis kits; choose one you like. A standard kit includes a base plate, two motor-and-wheel sets, a caster wheel, and a battery holder. This project does not require four-wheel drive, and the motor controller used later may only support two motors. I used the first DIY kit Zhang Yaojie gave me: a wooden board full of holes, with 3D-printed wheels and connectors. It is probably the earliest kit from Luobotai, a maker space in Silicon Valley; Luobotai has since become a leading robotics-education company in China, and its officially produced “Origin” kit is refined and complete, with mature teaching resources online. The servo and metal connectors in this project come from the second kit Yaojie sent me, the “Origin”. Sentimentally, though, that rough wooden-board kit feels closer to me, in keeping with the idea of building prototypes from the simplest materials.

Power supply: the Raspberry Pi needs 5V at 2A, so on the car it requires a sufficiently powerful power bank. The cable between the Raspberry Pi and the Arduino also powers the Arduino. For the motors, however, I used an external power source (the battery holder). You will find that even without it the power bank can still drive the motors (slowly), but good practice is to power the mechanical parts independently and run the logic circuits off the power bank.

The next step is to control the car. There are two options here; the first does not require Arduino. I used the second one.

3.1: Raspberry Pi as the Mechanical Control

I believe the essence of a microcontroller is not its small size but its rich set of GPIO (general-purpose input/output) pins, the windows through which a program interacts with the outside world. All the electronic components, probes, solder joints, and breadboards you see are ultimately talking to GPIO. You need basic circuit knowledge and the pin layout of your board. The Raspberry Pi has a very handy GPIO Python library, gpiozero, which is very straightforward to use.

Typically, four pins control the motors, connected to the positive and negative terminals of the two motors; running each motor forward or in reverse produces forward and backward motion and turning. The standard circuit for driving current in both directions is the H-bridge, and you can buy a basic H-bridge module.

Since I didn’t have an H-bridge on hand, I didn’t implement this option.
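For readers who do have one, a minimal sketch of option 3.1 with gpiozero might look like this (the GPIO pin numbers are placeholders for your actual wiring, and the tuple form is the gpiozero 1.x API):

from time import sleep
from gpiozero import Robot

# Each motor is a (forward_pin, backward_pin) pair wired to the H-bridge inputs.
robot = Robot(left=(4, 14), right=(17, 18))

robot.forward()   # both motors forward
sleep(1)
robot.left()      # left motor backward, right motor forward: spin left
sleep(1)
robot.stop()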

3.2: Arduino as the Mechanical Control

I didn’t have a bare H-bridge, but I did have a motor shield that stacks on the Arduino, which is essentially an H-bridge for it. So I simply let the Arduino control the mechanics (motors + servo), acting as the body, while the Raspberry Pi handles only image recognition, acting as the brain.

The Arduino does not run Linux and cannot be logged into via SSH to write programs directly; you have to write, compile, and upload the code from outside. I connected the Raspberry Pi and the Arduino with a data cable, then wrote and uploaded the program from the Raspberry Pi. I found a very useful command-line IDE, PlatformIO (which also has a nice graphical editor); on Linux it installs on top of Python 2.7. Some initialization is needed; if you use an Arduino Uno board like mine, just enter the following command:

pio init -b uno

The C++ source code for the Arduino is here. After entering that folder, upload it with the following command:

pio run --target upload

Later I discovered that PlatformIO doesn’t seem to support C++11 for Arduino boards; if you need this, consider using inotool.

4. Wireless Remote Control

There are also two options: wireless keyboard and infrared remote control. I implemented both options.

4.1: Wireless Keyboard

If you chose option 3.1 in the previous step, the keyboard-control module can be embedded directly into the mechanical-control code (I didn’t implement this). If you chose option 3.2, you need to convert key presses into mechanical-control signals (in text form) on the Raspberry Pi and send them to the Arduino over serial. The Python code is here, using a library I wrote to detect key presses. It maps single key presses to actions: forward, backward, turn, stop. What I really wanted was hold-to-drive: move while a key is held, stop on release. I never found a ready-made library for that (update: PyGame reportedly has one), and my own attempt using background threads (threading) and system delays didn’t work well; the errors from system delays and program running time never matched up, so I gave up. The current code uses the single-press action/stop scheme. If readers know a good library, please recommend it!

One thing to note: before using serial communication you need to disable the serial login shell (you are already logged in remotely via SSH anyway); this article explains it clearly.
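As a rough illustration of this control path (my actual code uses the key-press library linked above), the sketch below reads one keystroke at a time and forwards a one-character command over serial with pyserial. The device path /dev/ttyACM0 and the w/s/a/d/x command letters are assumptions; match them to your Arduino program:

import sys
import termios
import tty
import serial  # pyserial

ser = serial.Serial('/dev/ttyACM0', 9600)

def read_key():
    # Switch the terminal to raw mode, read one keystroke, then restore it.
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    try:
        tty.setraw(fd)
        return sys.stdin.read(1)
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old)

commands = {'w': b'w', 's': b's', 'a': b'a', 'd': b'd', 'x': b'x'}
while True:
    key = read_key()
    if key == 'q':        # quit
        break
    if key in commands:   # forward the command byte to the Arduino
        ser.write(commands[key])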

4.2: Infrared Remote Control

My wife is urging me to go watch a movie, so I won’t explain how infrared remote control works. The key point is that a long press returns a separate value (REPEAT), which makes it easy to implement “hold the button, the car moves; release it, the car stops.” Also, the infrared code is written directly into the Arduino C++ code, so no communication through the Raspberry Pi is needed, in keeping with the design principle of the Arduino as the mechanical controller.

PlatformIO does not come with an infrared library; I used this one. Using third-party libraries in PlatformIO is very simple: there is nothing to download and install, you just add the GitHub link to the configuration, as in my configuration file.
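For reference, the relevant part of such a config looks roughly like this; the lib_deps URL is a placeholder, not the actual library link:

; PlatformIO fetches the library straight from GitHub at build time.
[env:uno]
platform = atmelavr
board = uno
framework = arduino
lib_deps = https://github.com/<user>/<ir-library>.git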

Another point: every infrared remote is different. The remotes for your TV, speakers, and air conditioner will all work; you just need to map the keys to their codes first. The KEY codes I defined in my code apply only to my remote; you can use this code to read out your own. Note: infrared remotes use several protocols, and mine uses the most common one, NEC. If you read out a bunch of garbled codes, check the other modes in the library.

By the way, if you use infrared remote control, you also need to install an IR receiver on the car. I attached mine to the Arduino, on pin 8.

If you chose option 3.1, you can also attach the IR receiver directly to the Raspberry Pi’s GPIO.

5. Others

That is enough to get you on the road. I also mounted a servo on the car to tilt the camera up and down; the operation is intuitive, and you can understand it from the code. I didn’t install an ultrasonic sensor, which could detect obstacles and stop the car before it hits a wall.
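In my build the servo hangs off the Arduino, but if you went the option 3.1 route, a camera-tilt servo driven from the Pi with gpiozero might look like this (GPIO 12 and the angle range are assumptions; adjust them to your servo):

from time import sleep
from gpiozero import AngularServo

# GPIO 12 and the +/-45 degree range are placeholders for your hardware.
servo = AngularServo(12, min_angle=-45, max_angle=45)
servo.angle = 0     # center the camera
sleep(1)
servo.angle = 30    # tilt up
sleep(1)
servo.angle = -30   # tilt down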

If you want to view the camera feed remotely in real time, VNC may not suffice; consider this option. In that case, however, TensorFlow can no longer use the camera. There should be a way to share it between the two, but I haven’t explored one.

That’s about it. My code doesn’t have many comments yet; I’ll add more when I have time. If you have questions, feel free to leave a comment.

Bonus: here is a simple time-lapse photography program. I set it up in crontab to take a photo every minute and then, at midnight, turn that day’s photos into a video. Next week I plan to bring it to the office, find a scenic spot, leave it for a few days, and capture the 24-hour scenery of New York.
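A minimal sketch of the capture half, assuming the picamera library; the paths and the crontab entries (shown as comments) are placeholders:

# Crontab entries might look like:
#   * * * * *  python3 /home/pi/snap.py         (photo every minute)
#   0 0 * * *  python3 /home/pi/make_video.py   (assemble the video at midnight)
from datetime import datetime
from picamera import PiCamera

camera = PiCamera()
camera.resolution = (1920, 1080)
# One timestamped JPEG per run, e.g. /home/pi/timelapse/20170612-1430.jpg
filename = datetime.now().strftime('/home/pi/timelapse/%Y%m%d-%H%M.jpg')
camera.capture(filename)
camera.close()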
