Since I bought my first Arduino kit several years ago, I have dabbled in robotics, but only recently did I start working on complete projects. During this time, two skills opened the door to a new world for me: Python and Linux. Behind each stands a powerful open-source community. Once you master these two tools (meta-tools, really), you will find an arsenal of handy weapons waiting for you online.
Last week, during an internal programming training at my company, one phrase resonated with me: we are software engineers, not just programmers. Our job is not to write programs but to use tools effectively to solve problems. At Google, if you feel you have to build a feature from scratch, it usually means you haven't found the right tool yet. This is even more true in the open-source community.
This is a remote-controlled car that can be controlled via infrared remote control or a wireless keyboard, with TensorFlow monitoring the camera’s feed in real-time and vocalizing the recognized objects. All the code is available on my GitHub.
This idea is not originally mine; it comes from a blog post by Lukas Biewald from last September. The core part, where TensorFlow recognizes the camera’s images and provides vocal output, is the open-source work of my company’s AI engineer, Pete Warden. Unlike the original blog, I incorporated Arduino as the main control unit and learned how to communicate between Arduino and Raspberry Pi (serial communication) during the process. Many useful skills and tools were utilized, and I’m compiling them here for fellow enthusiasts to exchange ideas!
The entire project is done in a command-line environment, with no graphical interface. If you are not familiar with Linux, it may be a bit challenging; but since you are getting into robotics, how can you not learn Linux? I taught myself through 'The Linux Command Line' and later tried building Linux from source, finally overcoming the aversion to the command line that came from growing up on Windows. Trust me: crossing this barrier will open the door to a new world for you. Besides, working in the command line is much cooler and geekier, isn't it? Beyond Linux, you will also need some C++ and Python to complete this project.
Additionally, this article mainly covers the electronic part and does not discuss the mechanical and aesthetic aspects. As you can see, this car is quite ugly and does not meet my aesthetic standards; I did not invest effort into its appearance. I hope to do some electronic projects that combine aesthetics and functionality in the future, perhaps collaborating with designer friends!
1. Raspberry Pi
First, you need the latest model of Raspberry Pi, with a customized Linux system installed and connected to Wi-Fi. You will also need the official camera module, set up for use on the Raspberry Pi. You can connect the Pi to a monitor via HDMI, but it is far more convenient to SSH in remotely: during debugging you no longer have to repeatedly unplug the Pi from the car, hook it up to a screen, and mount it back on the car; you can modify the code running on the car in real time. Even my Arduino program is written, uploaded, and communicated with through the Raspberry Pi, so there is no need to connect a computer to the Arduino at all, making everything smoother and more seamless.
The Linux system on Raspberry Pi supports a graphical desktop, and you can use RealVNC (for Windows) or TightVNC (for Mac) to log in remotely to the graphical desktop. (This project does not require this.)
2. TensorFlow
This is the core of the project, and surprisingly it is the easiest part to get running, because every step is clearly documented and you just need to follow along. The code lives here.
Note: this uses a pre-trained model, i.e., TensorFlow ships with a parameter set already trained on the ImageNet image library. In other words, the car can only recognize objects that appear in that library; no 'learning' takes place on board.
3. The Car
There are many robot chassis kits available; choose one you like. A standard kit includes a base plate, two motors with wheels, a caster wheel, and a battery box. This project does not require four-wheel drive, and the motor controller used later may only support two motors. I used the first DIY kit Zhang Yaojie gave me: a wooden board full of holes, with 3D-printed wheels and connectors. This is probably the earliest kit from Luobo Tailala, sourced from a maker space in Silicon Valley.
Now, the officially produced ‘Origin’ kit by Luobo Tailala is quite complete, and there are mature teaching resources available online. The servos and metal connecting parts used in this project come from the second kit given to me by Yaojie—the ‘Origin’ kit. But emotionally, that rough wooden kit feels closer to me, aligning with the idea of ‘using the simplest materials to realize prototypes.’
Power supply: the Raspberry Pi needs 5V at 2A, so to put it on the car you will need a power bank that can deliver enough current. The cable between the Raspberry Pi and the Arduino also powers the Arduino. For the motors, however, I used a separate power source (the battery box). You will find that even without it, the power bank can still drive the motors (very slowly), but good practice is to power the mechanical parts independently and let the power bank feed only the logic circuits.
The next step is to control the car. There are two options: the first does not require Arduino. I used the second option.
3.1 Raspberry Pi as the Main Control Unit
I believe the essence of a microcontroller is not its small size but its rich GPIO (General-Purpose Input/Output), the window through which a program talks to the outside world. All the electronic components, probes, soldering, and breadboards you see are ultimately interacting with GPIO. You need basic circuit knowledge and the pin layout of your board. The Raspberry Pi has a very handy GPIO Python library, gpiozero, which is straightforward to use.
Typically, four pins are used to control the motors, connected to the positive and negative terminals of the two motors; driving each motor forward or in reverse gives the car its forward/backward/turning movements. The standard circuit for bidirectional current is the H-bridge, and you can buy a basic H-bridge module.
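Before wiring anything, the H-bridge logic can be sketched in plain Python: each motor gets two inputs, and the pattern of high/low levels sets its direction. The command names and pin ordering below are hypothetical, chosen only to illustrate the idea; with gpiozero you would drive real pins instead of returning tuples.

```python
# Map drive commands to H-bridge input states for two motors.
# Each motor has two inputs: (1, 0) spins it forward, (0, 1) reverse,
# (0, 0) stops it. The command names are illustrative placeholders.

COMMANDS = {
    #            left motor   right motor
    "forward":  ((1, 0),      (1, 0)),
    "backward": ((0, 1),      (0, 1)),
    "left":     ((0, 1),      (1, 0)),  # left motor back, right forward
    "right":    ((1, 0),      (0, 1)),  # left forward, right back
    "stop":     ((0, 0),      (0, 0)),
}

def pin_states(command):
    """Return the four H-bridge input levels (IN1..IN4) for a command."""
    left, right = COMMANDS[command]
    return left + right
```

On real hardware, the four returned levels would be written to the four GPIO pins wired to the H-bridge inputs.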
Since I didn’t have an H-bridge on hand, I did not implement this option.
3.2 Arduino as the Main Control Unit
I didn't have an H-bridge, but I did have a stackable motor shield for Arduino, which provides the H-bridge function on the Arduino. So I simply let the Arduino handle the mechanical side (motors + servo), the body, while the Raspberry Pi handles only image recognition, the brain.
Arduino does not run Linux, so you cannot SSH into it and write programs directly; you write the code elsewhere, compile it, and upload it. I connected the Raspberry Pi and Arduino with a USB cable and uploaded programs from the Pi after writing them there. I found a very handy command-line IDE, PlatformIO (it also has a great graphical editor). On Linux, its installation is based on Python 2.7. Some initialization is needed; if, like me, you are using an Arduino Uno, enter:
pio init -b uno
The C++ source code for Arduino is here. After entering this folder, simply enter the following command to upload:
pio run --target upload
Later I found that PlatformIO does not seem to support C++11 for Arduino boards; if you need that, consider inotool instead.
4. Wireless Remote Control
There are also two options: wireless keyboard and infrared remote control. I implemented both options.
4.1 Wireless Keyboard
If you chose option 3.1 in the previous step, keyboard control can be embedded directly in the motor-control code (I did not implement this). If you chose option 3.2, you need to translate key presses on the Raspberry Pi into mechanical control messages (plain text) and send them to the Arduino over the serial link.
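The text protocol can be as simple as one character per command. The sketch below is a hypothetical illustration of that idea, not the author's actual code: `FakeSerial` stands in for a real pyserial `Serial` object so the example is self-contained, and the key-to-command mapping is invented.

```python
# Translate key presses into single-character commands for the Arduino.
# On the real car you would open serial.Serial("/dev/ttyACM0", 9600)
# with pyserial; FakeSerial is a stand-in that just records the bytes.

KEY_TO_COMMAND = {
    "w": b"f",  # forward
    "s": b"b",  # backward
    "a": b"l",  # turn left
    "d": b"r",  # turn right
    " ": b"x",  # stop
}

class FakeSerial:
    """Minimal stand-in for pyserial's Serial, recording written bytes."""
    def __init__(self):
        self.sent = b""
    def write(self, data):
        self.sent += data
        return len(data)

def send_key(port, key):
    """Send the command for a key; unknown keys send stop, for safety."""
    port.write(KEY_TO_COMMAND.get(key, b"x"))

port = FakeSerial()
for key in "wwd ":       # simulate: forward, forward, right, stop
    send_key(port, key)
```

The Arduino-side loop then reads one byte at a time and switches on it to drive the motor shield.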
The Python code is here; it uses a library of my own to detect key presses. The library maps single key presses to actions such as forward/backward/turn/stop. What I really wanted, though, was hold-to-drive: the car moves while a key is held and stops when nothing is pressed. I never found an existing library for that (update: I'm told PyGame has one).
Later, I tried to write one myself using background threads (threading) and system delays, but the results were poor; the mismatch between system delays and program running time kept causing glitches, so I gave up. The current code uses the single-press action/stop scheme. If readers know of a good library, please recommend it!
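For reference, the hold-to-drive idea can be approximated with a dead-man timer: every key event resets a timer, and if no key arrives before it expires, the car is stopped. This is only a sketch of the thread-and-delay approach described above, with hypothetical names, and it inherits the same timing caveats that made the author abandon it.

```python
import threading

class DeadManSwitch:
    """Stop the car if no key event arrives within `timeout` seconds."""

    def __init__(self, stop_action, timeout=0.2):
        self.stop_action = stop_action
        self.timeout = timeout
        self._timer = None

    def key_event(self):
        """Call on every key press/repeat: cancel and restart the timer."""
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.timeout, self.stop_action)
        self._timer.daemon = True
        self._timer.start()

# Simulate one key press followed by silence: the timer should fire.
stopped = []
switch = DeadManSwitch(lambda: stopped.append(True), timeout=0.05)
switch.key_event()           # a key press arrives
threading.Event().wait(0.2)  # ...then nothing, so stop_action runs
```

In practice the timer's resolution and the keyboard's auto-repeat rate have to be tuned against each other, which is exactly the mismatch that caused trouble.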
One thing to note: before using serial communication, you need to disable the serial console login (you are already logged in over SSH anyway); this article explains it quite clearly.
4.2 Infrared Remote Control
A long press on an infrared remote returns a distinct value (REPEAT), which makes it easy to implement "key held, car moves; key released, car stops." Also, the IR handling is written directly into the Arduino's C++ code, so no detour through the Raspberry Pi and the serial link is needed, which fits the design principle of Arduino as the main control unit better.
PlatformIO does not come with an infrared library; I used this one. Using third-party libraries in PlatformIO is incredibly easy: just add the GitHub link to the configuration, no downloading or installing needed, as shown in my configuration file.
Another point: every infrared remote is different. Your TV, stereo, or air-conditioner remote will all work; you just need to map the keys to their codes first. The key codes defined in my code only apply to my remote. You can use this code to read out the key codes. Note: IR remotes use several protocols; mine uses the most common one, NEC, so if you get a pile of garbage codes, try the other protocols in the library.
By the way, if you use an infrared remote, you also need to install an IR Receiver on the car. I installed it on the Arduino, using pin 8.
If you used option 3.1, you can also directly connect the IR Receiver to the GPIO of the Raspberry Pi.
5. Others
That is enough to get you on the road. I also mounted a servo on the car to tilt the camera up and down. The operation is very intuitive; the code speaks for itself. I did not install an ultrasonic sensor, which could detect obstacles and force a stop before hitting a wall.
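For those new to servos: a hobby servo is positioned by pulse width, roughly 1 ms for one end of travel and 2 ms for the other, repeated every 20 ms. The conversion from angle to pulse width is a simple linear map. The 1-2 ms range below is the common convention, though individual servos vary; this helper is an illustration, not code from the project.

```python
def angle_to_pulse_ms(angle, min_angle=-90.0, max_angle=90.0,
                      min_pulse=1.0, max_pulse=2.0):
    """Linearly map a servo angle (degrees) to a pulse width (ms).

    Assumes the common 1 ms..2 ms convention over the full travel;
    check your servo's datasheet for its actual limits.
    """
    if not min_angle <= angle <= max_angle:
        raise ValueError("angle out of range")
    span = (angle - min_angle) / (max_angle - min_angle)
    return min_pulse + span * (max_pulse - min_pulse)
```

On the Arduino, the Servo library hides this math behind `write(angle)`; it is still worth knowing when a servo jitters or won't reach its endpoints.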
If you want to view the camera's feed remotely in real time, VNC cannot handle it; you might consider this solution instead. Note, however, that TensorFlow will then no longer be able to use the camera. There should be a way to share it between the two, but I have not explored that.
That’s about it; my code doesn’t have many comments, and I will add them when I have time. If you have any questions, feel free to leave a message for me.
Bonus: here is a simple time-lapse photography program. I use crontab to take a picture every minute, then each night convert that day's pictures into a video. Next week I plan to take it to the office, find a scenic spot, leave it for a few days, and capture a 24-hour view of New York.
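The scheduling side is just two crontab entries; the only Python really needed is a snippet that names each capture by timestamp so the nightly job can glob one day's frames. The paths, script names, and crontab lines below are hypothetical examples of this pattern, not the author's actual setup.

```python
# Generate a sortable, timestamped filename for each capture, e.g.
# frames/2017-04-01_13-05-00.jpg, so a nightly job can collect one
# day's frames with a simple glob such as frames/2017-04-01_*.jpg.
#
# Hypothetical crontab entries driving it:
#   * * * * *  python capture.py        # one picture per minute
#   0 0 * * *  python make_video.py     # assemble yesterday's video

from datetime import datetime

def frame_name(now=None, directory="frames"):
    """Return a path whose lexical order matches chronological order."""
    now = now or datetime.now()
    return "{}/{}.jpg".format(directory, now.strftime("%Y-%m-%d_%H-%M-%S"))

name = frame_name(datetime(2017, 4, 1, 13, 5, 0))
```

Zero-padded year-month-day_hour-minute-second names sort correctly as plain strings, which keeps both the glob and the frame ordering for the video trivial.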