An article has been very popular recently: How to build a robot that “sees” with $100 and TensorFlow, by Lukas, founder of CrowdFlower. Its Chinese translation, “如何用100美金和TensorFlow来造一个能‘看’东西的机器人”, has been reprinted by many public accounts.
The article is quite interesting, and most of the technologies involved are familiar to me, so I decided to implement it myself.
I do not yet have a car chassis for the full project, so I will start with its core: using a Raspberry Pi and TensorFlow to recognize objects in the real world. I will add the car later when I have time.
I took a quick photo of an orange that a colleague had given me, sitting on my desk, and then tried to get the Raspberry Pi to recognize it.
Task Description
The article How to build a robot that “sees” with $100 and TensorFlow states clearly what we need to do:
Object recognition is one of the hot topics in the field of machine learning recently. Computers are already adept at recognizing faces or distinguishing between cats and dogs, but recognizing a specific object in a larger set of images is still the “Holy Grail” of artificial intelligence, although significant progress has been made in recent years.
We will build a robot that can recognize objects by itself (without cloud services).
Tool Introduction
Raspberry Pi
Raspberry Pi is a single-board computer that runs Linux. It is only the size of a palm, yet powerful enough to be used as a regular computer.
The mission of Raspberry Pi is to create a computer that inspires children and lowers the cost of trial and error for them.
The latest version of Raspberry Pi is Raspberry Pi 3, which has upgraded its processor to a 64-bit Broadcom BCM2837, and for the first time includes Wi-Fi and Bluetooth functionality, without increasing the price.
TensorFlow
TensorFlow is a machine learning library developed by researchers from the “Google Brain” team, which is open-sourced under the Apache License 2.0. This system can be used in various fields such as speech recognition and image recognition.
In this project, we mainly use a model called Inception, trained on the ImageNet dataset. It performs object recognition, and we use the pre-trained model directly, because training a model ourselves would be time-consuming and labor-intensive.
There is no need to feel guilty about taking a shortcut and treating an intelligent system as a black box (haha, I still feel a bit guilty).
When the electrical era arrived, society was transformed not only by those who generated electricity but also by those who understood how to use it to remake traditional industries and create new ones. The latter may have had the more profound impact on social change, even if they knew nothing about the Carnot cycle or how the kinetic energy of steam is converted into work to drive a generator.
ImageNet Dataset
This dataset contains about 1.2 million training images, 50,000 validation images, and 100,000 test images, divided into 1,000 different categories, used to train image recognition systems in machine learning.
Preparation Work
We first prepare the Raspberry Pi; I am using Raspberry Pi 3 with the raspbian-2016-05-31 version installed (using other versions should also be fine). For related configuration of Raspberry Pi, you can refer to my previous article: Raspberry Pi Tinkering Notes on System Installation and Configuration.
Installing TensorFlow
In the article How to build a robot that “sees” with $100 and TensorFlow, the author used the makefile provided by TensorFlow to compile it locally on the Raspberry Pi, which took several hours. The advantage is that everything is done in one go. After installation, you can run:
tensorflow/contrib/pi_examples/label_image/gen/bin/label_image
to recognize objects.
I do not plan to compile it myself: the process is cumbersome, and I would spend those several hours worrying that some dependency problem might cause a failure and force a recompile. I suffered greatly when compiling OpenCV by hand.
My Installation Process
First we install a TensorFlow build that already targets the Raspberry Pi. Someone was bound to have done this work already, and sure enough, a quick search on GitHub leads to: tensorflow-on-raspberry-pi. Let’s start the installation:
wget https://github.com/samjabrahams/tensorflow-on-raspberry-pi/raw/master/bin/tensorflow-0.9.0-cp27-none-linux_armv7l.whl
sudo pip install tensorflow-0.9.0-cp27-none-linux_armv7l.whl # This step also installs the other dependencies; if it is too slow, use pip's -i option to point at the Douban mirror
The installation process is very quick, just enough time for a cup of tea, and the process is very smooth.
Once TensorFlow is installed, the next step is to fetch the model. For the procedure, you can refer to: pi_examples.
mkdir ~/tf
cd /usr/local/lib/python2.7/dist-packages/tensorflow/models/image/imagenet
python classify_image.py --model_dir ~/tf/imagenet #--model_dir specifies the directory to store model data
After completion, let’s test if everything is normal.
python /usr/local/lib/python2.7/dist-packages/tensorflow/models/image/imagenet/classify_image.py --model_dir ~/tf/imagenet
With no --image_file argument, the script classifies its bundled sample image of a panda. If the output looks like the following, everything is ready:
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89233)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00859)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00264)
custard apple (score = 0.00141)
earthstar (score = 0.00107)
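Each output line has the shape "labels (score = number)", so it is easy to turn the text into data for later steps such as voice output. The following is a minimal sketch, assuming that line format stays as shown above; the function name parse_predictions is my own and not part of TensorFlow.

```python
import re

# One prediction per line: comma-separated labels, then "(score = 0.12345)".
LINE_RE = re.compile(r"^(?P<labels>.+) \(score = (?P<score>[0-9.]+)\)$")


def parse_predictions(text):
    """Return a list of (labels, score) tuples parsed from the script output."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip log lines or blank lines that are not predictions
        labels = [name.strip() for name in match.group("labels").split(",")]
        results.append((labels, float(match.group("score"))))
    return results


sample = """giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89233)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00859)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00264)
custard apple (score = 0.00141)
earthstar (score = 0.00107)"""

predictions = parse_predictions(sample)
```

The first label of the first tuple is a reasonable short name to speak or display; the remaining labels are synonyms for the same category.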
Testing
Let’s try my umbrella (taken in the office):
python /usr/local/lib/python2.7/dist-packages/tensorflow/models/image/imagenet/classify_image.py --model_dir ~/tf/imagenet --image_file /tmp/test.jpg # The image must be in jpg format
The output is
The program returns five candidate labels. The highest-scoring one is umbrella, which is accurate.
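If the recognition step is later driven from a program (for example, from the car's control loop), the same command can be launched from Python. This is only a sketch under my own assumptions: the helper names are hypothetical, the script path and the ~/tf/imagenet directory are the ones used in this article, and the ~ is expanded explicitly because no shell is involved.

```python
import os
import subprocess

# Path of the example script as installed by the whl file used in this article.
CLASSIFY_SCRIPT = ("/usr/local/lib/python2.7/dist-packages/"
                   "tensorflow/models/image/imagenet/classify_image.py")


def build_classify_command(image_file, model_dir="~/tf/imagenet"):
    """Build the argument list for the classify_image.py call shown above."""
    return [
        "python", CLASSIFY_SCRIPT,
        "--model_dir", os.path.expanduser(model_dir),  # the shell is not expanding ~ here
        "--image_file", image_file,
    ]


def classify(image_file, model_dir="~/tf/imagenet"):
    """Run the script and return its text output (requires the Pi setup above)."""
    output = subprocess.check_output(build_classify_command(image_file, model_dir))
    return output.decode("utf-8")
```

Calling classify("/tmp/test.jpg") on the Raspberry Pi should return the same text the command prints in the terminal.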
Next, let’s show it a picture of an orange:
The output is
lemon (score = 0.72036)
orange (score = 0.16516)
spaghetti squash (score = 0.01571)
butternut squash (score = 0.00304)
ocarina, sweet potato (score = 0.00298)
It rates lemon as the most likely answer, with orange in second place. The two do look very similar.
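One way to read this result is through "top-k" hits: the correct label is not the first guess, but it is among the first two. The sketch below checks that with the scores printed above; in_top_k is a hypothetical helper of mine, not a function from the example script.

```python
def in_top_k(predictions, target, k):
    """Return True if target is among the labels of the k highest-scoring predictions."""
    ranked = sorted(predictions, key=lambda item: item[1], reverse=True)
    return any(target in labels for labels, _score in ranked[:k])


# (labels, score) pairs copied from the output for the orange photo.
orange_run = [
    (["lemon"], 0.72036),
    (["orange"], 0.16516),
    (["spaghetti squash"], 0.01571),
    (["butternut squash"], 0.00304),
    (["ocarina", "sweet potato"], 0.00298),
]

top1_hit = in_top_k(orange_run, "orange", 1)
top2_hit = in_top_k(orange_run, "orange", 2)
```

Here top1_hit is False and top2_hit is True, which matches the observation that the model confuses the orange with a lemon but still keeps orange as its second choice.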
If you want to train your own model, you can refer to this article on googleblog: Train your own image classifier with Inception in TensorFlow.
Optimization
Currently, recognition is slow: each image takes some time to process. Lukas’s robot handles this in an interesting way: every time it takes a photo and starts calculating, it says “I'm thinking”. The delay feels very natural, and the machine is indeed “thinking”.
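The same trick is easy to imitate: announce "I'm thinking" before the slow call and measure how long the call takes. This is a minimal sketch, not Lukas’s code. The say callback is a placeholder that only prints; a real robot would send the text to a speaker, and classify stands for whatever function performs the recognition.

```python
import time


def default_say(message):
    """Placeholder for speech output: just print the message."""
    print(message)


def classify_with_feedback(classify, image_file, say=default_say):
    """Announce that work has started, run the classifier, and time it."""
    say("I'm thinking")
    start = time.time()
    result = classify(image_file)
    elapsed = time.time() - start
    return result, elapsed


# Usage with a stand-in classifier that answers immediately.
result, seconds = classify_with_feedback(lambda path: "umbrella", "/tmp/test.jpg")
```

Logging the elapsed time for each photo also gives a baseline for judging whether the optimizations discussed in this section actually help.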
Here are a few possible ways to improve computation speed:
- Using the GPU for computation. Raspberry Pi supports GPU computation, but tensorflow-on-raspberry-pi currently has no GPU version of the whl file, whereas GPU versions exist for both Linux and Mac. A detailed discussion of this issue can be found in: Question on GPU.
- Overclocking the Raspberry Pi to speed up computation.
- Running TensorFlow on a local computer that controls the car, so that the heavy computation happens on that machine rather than on the Raspberry Pi (this lets any client use TensorFlow, but requires a network connection).
- Deploying TensorFlow in the cloud and exposing it as a network service.
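The last two options share one shape: the Raspberry Pi sends the photo over the network and another machine sends the labels back. The sketch below shows only that shape, using the Python 3 standard library. Everything in it is an assumption of mine: the function names, the JSON reply format, and above all placeholder_classify, which stands in for the real model call.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


def placeholder_classify(image_bytes):
    """Hypothetical stand-in for running the model on the server."""
    return [{"label": "placeholder", "score": 0.0,
             "bytes_received": len(image_bytes)}]


def make_server(classify, host="127.0.0.1", port=0):
    """Create an HTTP server that answers POSTed image bytes with JSON labels."""

    class ClassifyHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            results = classify(self.rfile.read(length))
            body = json.dumps(results).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # silence per-request logging in this sketch

    return HTTPServer((host, port), ClassifyHandler)


def serve_in_background(server):
    """Run the server loop in a daemon thread and return the thread."""
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    return thread


def post_image(host, port, image_bytes):
    """Client side (for example on the Raspberry Pi): send an image, get labels."""
    conn = HTTPConnection(host, port, timeout=10)
    try:
        conn.request("POST", "/", body=image_bytes,
                     headers={"Content-Type": "image/jpeg"})
        response = conn.getresponse()
        return json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()
```

With port=0 the operating system picks a free port, which is convenient for trying the sketch on one machine; a real deployment would bind a fixed port on an address the Raspberry Pi can reach. Whether this is faster overall depends on the network, since upload time is added to every photo.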
Another optimization is to reduce the image size; you can use the convert command provided by ImageMagick:
convert -resize 100x100 test.png dest.jpg
This way, the image is converted to a smaller size (100×100), which can effectively improve computation speed.
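To shrink each photo automatically before recognition, the same convert call can be issued from Python. This is a small sketch that assumes ImageMagick is installed and convert is on the PATH; the helper names are hypothetical.

```python
import subprocess


def build_resize_command(src, dst, size=100):
    """Build the ImageMagick command used above for a size x size bounding box."""
    geometry = "%dx%d" % (size, size)
    return ["convert", "-resize", geometry, src, dst]


def shrink_image(src, dst, size=100):
    """Run convert; raises CalledProcessError if ImageMagick reports a failure."""
    subprocess.check_call(build_resize_command(src, dst, size))


command = build_resize_command("test.png", "dest.jpg")
```

Note that a plain WxH geometry keeps the aspect ratio, so the result fits inside the 100×100 box rather than being stretched to exactly that size. Writing to a .jpg destination also takes care of the requirement that the classifier input be in jpg format.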
Todo
1. Chinese voice output
   - Bluetooth speaker
   - English to Chinese translation
   - Voice output
2. Load onto the car model
   - L298N driver board