Wang Xiaoxin, compiled from Google Cloud Blog | Produced by QbitAI

For programmers, even parenting gets the tech treatment…

This summer, Kaz Sato, who is responsible for developer relations at Google Cloud, built a “rock-paper-scissors machine” with his son. It uses a few sensors and a simple linear machine learning model to detect the hand gestures for rock, paper, and scissors.

Recently, he also wrote a tutorial based on this process, detailing how to build this machine and how to use machine learning algorithms to solve everyday problems.

Qbit has compiled and organized the tutorial below. It is suitable for readers with some programming experience and requires about $200 worth of hardware.

Let’s first take a look at this machine:

In the video above, the system we built detects my son’s hand gestures through sensors on a glove, classifies them with a simple machine learning algorithm written in TensorFlow, and then selects the corresponding option: rock, paper, or scissors.

The project source code is here: https://github.com/kazunori279/ml-misc/tree/master/glove-sensor

How is this implemented? Next, I will explain step by step.

Step 1: Making the Glove Sensor

We used littleBits to construct the hardware system. This set of devices is very friendly for children, containing various components such as LED lights, motors, switches, sensors, and controllers that can be linked magnetically without soldering. In this experiment, we used three bending sensors and attached them to a plastic glove.

△ littleBits bending sensor

When you wear the glove and bend your fingers, each sensor outputs a voltage signal that varies from 0V to 5V. By adding an indicator, such as an LED bar, you can see in real time how much each sensor is bent.

△ Bending sensor outputs 0V-5V signal

Step 2: Installing Arduino and Servo Module

To read the output signals from the bending sensors and control the movement of the machine, we used an Arduino module and a servo module. The Arduino module contains a microcontroller chip and several input and output ports. You can write a program on your laptop in the Arduino IDE, using the Arduino language (a dialect of C/C++), compile it, and upload it to the module via USB.

△ littleBits Arduino module

△ Servo module

△ My son drawing the indicator diagram

Now all the hardware needed for the rock-paper-scissors machine is in place. Next, we need to write the code.

△ Hardware part of the rock-paper-scissors machine

Step 3: Write the Program to Read Data from the Bending Sensors

After configuring the hardware, we started writing code for the Arduino module to read data from the bending sensors. The program, written in the Arduino IDE, reads the sensor data every 0.1 seconds and logs it to the serial console. The code is as follows.

△ Writing the program in Arduino IDE

When you run this code, you will see numbers like these on the console:

The three numbers on each line are the outputs of the three bending sensors. The Arduino module converts the input signal voltage (0V – 5V) into an integer ranging from 0 to 1023.

The above image shows the data for the “rock” gesture, where all sensors are bent. For the “paper” gesture, no sensor is bent and the values stay close to 0.
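The voltage-to-number conversion can be sketched in Python. This is only an illustration: it assumes an ideal 10-bit converter with a 5V reference, which matches the 0 to 1023 range quoted above, and the function name adc_reading is hypothetical. The real module may round slightly differently.

```python
def adc_reading(voltage, v_ref=5.0, levels=1024):
    """Map an input voltage to an integer reading, as an ideal 10-bit ADC would."""
    # Clamp the input to the measurable range first.
    clamped = min(max(voltage, 0.0), v_ref)
    # Scale to the number of levels and cap at the highest code (1023).
    return min(int(clamped / v_ref * levels), levels - 1)


# Readings for an unbent sensor, a half-scale signal, and a full-scale signal.
readings = [adc_reading(v) for v in (0.0, 2.5, 5.0)]
print(readings)
```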

Step 4: Visualizing Data with Cloud Datalab

How do we determine which combination of these three numbers represents “rock”, “paper”, or “scissors”?

The simplest way is to write IF statements that can judge thresholds and conditions. For example:

When all three output values are below 100, output “paper”;
When all three output values are above 400, output “rock”;
If neither of the above conditions is met, output “scissors”.
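Written as code, the three rules above might look like the following Python sketch. The function name is hypothetical, and the thresholds 100 and 400 are simply the example values from the rules.

```python
def classify_by_threshold(s1, s2, s3):
    """Classify a gesture from three sensor readings (0-1023) with fixed thresholds."""
    values = (s1, s2, s3)
    if all(v < 100 for v in values):
        return "paper"      # no sensor is bent
    if all(v > 400 for v in values):
        return "rock"       # every sensor is bent
    return "scissors"       # anything in between


print(classify_by_threshold(10, 20, 30))
print(classify_by_threshold(600, 650, 700))
print(classify_by_threshold(50, 600, 620))
```

Every new gesture or sensor would need another hand-tuned branch here, which is the weakness the next paragraphs describe.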

This program may meet the current task requirements, but it is neither flexible nor robust.

What if my son asks me to add more sensors to the glove to capture 10 different gestures? Or to attach many sensors to a bodysuit to recognize different body postures? Obviously, the above program cannot handle such complex tasks.

Of course, the main reason is that I am a bit lazy: I want to write more powerful and flexible code that can handle the changing requests of my client (my son) without changing the basic design.

To find a better way to process the data, I ran some quick analyses of the glove sensor data. The tool I used is Cloud Datalab, a version of the popular Jupyter Notebook integrated into Google Cloud Platform that provides a one-stop service for cloud data analysis. You can write Python code in the Web UI, using libraries like NumPy, Scikit-learn, and TensorFlow, and combine them with Google Cloud services (such as BigQuery, Cloud Dataflow, and Cloud ML Engine).

Based on different gestures, I separated the glove sensor data and saved them into three CSV files, each containing 800 rows of data. You can write Python code in Cloud Datalab to read them and convert them into NumPy arrays. Sample code is as follows:

△ Using Cloud Datalab to read CSV files into NumPy arrays

Full code: https://github.com/kazunori279/ml-misc/blob/master/glove-sensor/Rock-paper-scissors.ipynb
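For orientation, here is a minimal sketch of loading per-gesture CSV files into NumPy arrays. It is not the notebook's code: the file names, the temporary directory, and the sensor values are hypothetical stand-ins, written out first so the example runs on its own.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for the three recorded files; the values are synthetic.
tmp_dir = tempfile.mkdtemp()
samples = {
    "rock": [[610, 598, 632], [605, 601, 640]],
    "paper": [[12, 8, 15], [10, 9, 11]],
    "scissors": [[15, 20, 620], [11, 18, 615]],
}
for name, rows in samples.items():
    np.savetxt(os.path.join(tmp_dir, name + ".csv"), rows, fmt="%d", delimiter=",")


def load_gesture(name):
    """Read one gesture's CSV file into an (n_rows, 3) NumPy array."""
    return np.loadtxt(os.path.join(tmp_dir, name + ".csv"), delimiter=",")


rock_data = load_gesture("rock")
paper_data = load_gesture("paper")
scissors_data = load_gesture("scissors")
print(rock_data.shape, paper_data.shape, scissors_data.shape)
```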

You can also use the Matplotlib library to visualize the NumPy arrays. The following code draws a 3D graph, where each axis corresponds to a different sensor.

△ Plotting sensor data in 3D, scaling the original multidimensional data

By observing the above 3D graph, you can see the spatial distribution of the data more clearly.
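A plot of this kind can be sketched with Matplotlib as follows. The three clusters are synthetic data generated around assumed centers, not the recorded readings, and the off-screen "Agg" backend and the output file name are choices made only so the sketch runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Assumed cluster centers for each gesture (synthetic, for illustration only).
centers = {"rock": (600, 600, 600), "paper": (20, 20, 20), "scissors": (20, 20, 600)}

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for name, center in centers.items():
    # 50 noisy points around each center, one column per sensor.
    points = rng.normal(loc=center, scale=15.0, size=(50, 3))
    ax.scatter(points[:, 0], points[:, 1], points[:, 2], label=name)

ax.set_xlabel("sensor 1")
ax.set_ylabel("sensor 2")
ax.set_zlabel("sensor 3")
ax.legend()

output_path = os.path.join(tempfile.mkdtemp(), "glove_scatter.png")
fig.savefig(output_path)
```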

Step 5: Creating a Linear Model

Next, we need to classify these raw sensor data into three different gesture categories. This will involve linear algebra, which you should have learned in high school or college.

Linear algebra is the branch of mathematics that deals with linear mappings from one space to another. For example, the following formula represents a linear mapping from one-dimensional space to another one-dimensional space.

△ Univariate formula

Here, x and y are variables in two one-dimensional spaces, w is the weight, and b is the bias. Using this formula, you can map the one-dimensional space “the distance traveled by a NYC taxi” to another one-dimensional space “taxi fare”, where 2.5 dollars (cost per mile) is set as the weight, and 3.3 dollars (initial fare) is set as the bias.

△ Mapping function for “distance traveled” and “taxi fare”

From the figure, you can see that the weight and bias (also called parameters) define the slope and the intercept (starting point) of the line. You can create any linear mapping from one-dimensional space to another by adjusting these parameters.
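As a worked example of y = wx + b, the taxi mapping can be written in a few lines of Python. The function name is hypothetical; the weight 2.5 and bias 3.3 are the illustrative values from the text, not actual fares.

```python
def taxi_fare(miles, w=2.5, b=3.3):
    """Linear mapping from distance (miles) to fare (dollars): y = w * x + b."""
    return w * miles + b


# A 4-mile trip costs 2.5 * 4 + 3.3 dollars.
fare = taxi_fare(4)
print(round(fare, 2))
```

Changing w tilts the line, and changing b shifts it up or down.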

The advantage of linear algebra is that the same formula can be used for linear mappings from any m-dimensional space to any n-dimensional space. For example, when mapping a point from three-dimensional space (x1, x2, x3) to another three-dimensional space (y1, y2, y3), the following formula can be used.

△ Mapping function between three-dimensional spaces

Mathematicians found the above formula too lengthy, so they designed an easier representation: matrix multiplication.

The three-dimensional mapping relationship can also be represented as follows:

Or, more simply written as:

Where x and y are 3-dimensional column vectors, W is a 3×3 weight matrix, and b is a 3-dimensional bias column vector. Yes, it has exactly the same form as the mapping function in one-dimensional space. Moreover, this formula can be applied to any linear mapping between m-dimensional and n-dimensional spaces, which is called a “linear model”.
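Written out from the definitions above, the three forms of the mapping look as follows. This is a reconstruction in LaTeX, and the index convention w_{ij} (row i, column j of W) is an assumption.

```latex
% Expanded form: each output is a weighted sum of all three inputs plus a bias.
\begin{aligned}
y_1 &= w_{11} x_1 + w_{12} x_2 + w_{13} x_3 + b_1 \\
y_2 &= w_{21} x_1 + w_{22} x_2 + w_{23} x_3 + b_2 \\
y_3 &= w_{31} x_1 + w_{32} x_2 + w_{33} x_3 + b_3
\end{aligned}

% Matrix form of the same three equations.
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
=
\begin{pmatrix}
w_{11} & w_{12} & w_{13} \\
w_{21} & w_{22} & w_{23} \\
w_{31} & w_{32} & w_{33}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
+
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}

% Compact form.
y = W x + b
```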

So, what role does the linear model play in this project? We can use it to convert the “glove sensor data” in 3-dimensional space into the “rock-paper-scissors” in 3-dimensional space, as shown below:

△ Dynamic transformation of 3-dimensional space

After mapping the glove sensor data into the “rock-paper-scissors” 3-dimensional space, it is easy to write IF statements for classification as follows:

When the value in the rock direction is higher than the other directions, output “rock”;
When the value in the paper direction is higher than the other directions, output “paper”;
When the value in the scissors direction is higher than the other directions, output “scissors”.
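These three rules amount to picking the direction with the largest value. A minimal Python sketch, with a hypothetical function name and made-up scores:

```python
def pick_gesture(rock_score, paper_score, scissors_score):
    """Return the gesture whose direction has the highest value."""
    scores = {"rock": rock_score, "paper": paper_score, "scissors": scissors_score}
    return max(scores, key=scores.get)


print(pick_gesture(4.2, -1.0, 0.3))
print(pick_gesture(-2.0, 3.5, 1.1))
print(pick_gesture(0.1, 0.2, 5.0))
```

Unlike thresholds on raw readings, this rule stays the same no matter how many sensors feed the model.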

The linear model can transform the original input data into feature space, where different directions can be set for each feature to be captured, making it easier to handle the transformed data. That’s why I think linear algebra is not only a wonderful mathematical tool for data scientists but also for lazy programmers.

Linear models are particularly important when input data has multiple dimensions or multiple different attributes.

For example, when you connect dozens of bending sensors to a bodysuit, you can use linear models to map the raw data from the sensors to a feature space represented by multiple directions for different body postures (such as standing, sitting, or squatting), without writing many unstable IF statements based on the raw data.

Of course, linear models can also handle unstructured or dense data to extract the specific features you need. This type of data generally has hundreds or even thousands of dimensions, such as images, audio, natural language, and time-series data.

But be aware that linear models are not a panacea.

To achieve higher accuracy in complex unstructured or dense data classification tasks, you may need to use nonlinear models, such as neural networks or support vector machines. In this way, you can extract useful features through nonlinear transformations, which can adjust the raw data in a more flexible manner.

When first dealing with complex data, you can start by trying linear models, and if you cannot extract the required features satisfactorily, you can further try nonlinear models for better results.

Step 6: Let TensorFlow Find Parameters

Now that we have seen how useful and powerful linear models are, you may wonder:

How do we determine the best mapping parameters (i.e., weights and biases)?

The answer is: machine learning.

You can use machine learning to let the computer calculate the best parameter combination for the linear model based on the measured input data. Using TensorFlow, it is easy to implement these ideas. In TensorFlow, you simply define the linear model formula “y = Wx + b” as a computation graph, as follows:

In the above code, tf.Variable creates two variables initialized to 0, which store the 3 x 3 weight matrix and the 3-dimensional bias vector, respectively. tf.placeholder creates a placeholder that can receive any number of rows of glove sensor data as input. tf.matmul multiplies the glove sensor data by the weights; because each row of the input is one sample, the glove data is passed as the first argument and the weights as the second.

Note that when you call these functions (the low-level API in TensorFlow), no calculations are performed; you are only building a computation graph, as shown below:

△ Computation graph
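To make the shapes concrete, here is a NumPy stand-in for the same linear model. It is not the TensorFlow code described above: NumPy computes the result immediately instead of building a graph, and the sample readings are synthetic. Because each row of the input is one sample, the formula becomes xW + b.

```python
import numpy as np

weights = np.zeros((3, 3))  # 3 x 3 weight matrix, initialized to 0
biases = np.zeros(3)        # 3-dimensional bias vector, initialized to 0


def linear_model(glove_data, weights, biases):
    """Apply y = xW + b to every row of glove_data (shape: n_rows x 3)."""
    # The data comes first in the product, the weights second.
    return np.matmul(glove_data, weights) + biases


# Two synthetic rows of sensor readings (not recorded data).
glove_data = np.array([[610.0, 598.0, 632.0], [12.0, 8.0, 15.0]])
rps_data = linear_model(glove_data, weights, biases)
print(rps_data.shape)
```

With all parameters at 0, every output is 0; training is what turns these into useful values.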

The power of machine learning and TensorFlow lies in the ability to let the computer find the best parameters (that is, the weights and biases). In the above example, we feed in the three glove sensor readings together with their expected output (rock, paper, or scissors). TensorFlow can use this data to work backwards through the graph and find the weights and biases that best achieve the desired linear transformation. This process is called “training the machine learning model”.

With machine learning, you only need to set the inputs and outputs, and you can use the computer to train the best mapping function, which is like automatic programming. In the 21st century, machine learning can be seen as a calculator for engineers to some extent. Anyone can use it to perform simple tasks, reducing coding work.

Step 7: Define a Training “Coach”

When training the linear model, a supervising “coach” is needed. We guide model training to achieve the desired effect with the following two lines of code.

rps_labels is a placeholder that receives the label for each row of glove sensor data. The labels are written in the following format:

Where [1 0 0] represents rock, [0 1 0] represents paper, and [0 0 1] represents scissors. This is called one-hot encoding, a common method to represent labels in training classification models.
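One-hot labels in this format can be produced with a few lines of NumPy. The helper name and the example gesture sequence are hypothetical; the class order (rock, paper, scissors) follows the text.

```python
import numpy as np

GESTURES = ["rock", "paper", "scissors"]


def one_hot(gesture):
    """Return a vector with a 1 at the gesture's index and 0 elsewhere."""
    vec = np.zeros(len(GESTURES), dtype=int)
    vec[GESTURES.index(gesture)] = 1
    return vec


# One label row per row of sensor data.
labels = np.array([one_hot(g) for g in ["rock", "paper", "scissors", "rock"]])
print(labels)
```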

In the second line, we called tf.losses.softmax_cross_entropy to define the loss function. For a detailed introduction to softmax, cross-entropy, and loss functions, you can refer to Wikipedia. For these three, you just need to understand the following:

Softmax compresses the values in rps_data to the range [0, 1] so that they sum to 1, which lets them serve as estimated probabilities for rock, paper, and scissors.
Cross-entropy returns the degree of difference between two probability distributions: the one-hot labels in rps_labels (true values) and the estimated probabilities output by the softmax function.
The loss function measures how far the model’s output is from the true values, that is, the model’s error. Cross-entropy does exactly this, so we use it as the loss function.
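The two ingredients can be sketched in NumPy as follows. This illustrates the math rather than TensorFlow's implementation; the function names are hypothetical, the input values are made up, and averaging the loss over rows is an assumption of this sketch.

```python
import numpy as np


def softmax(logits):
    """Turn each row of raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(probs, one_hot_labels):
    """Mean negative log-probability assigned to the true class."""
    eps = 1e-12  # avoids log(0)
    return float(-np.mean(np.sum(one_hot_labels * np.log(probs + eps), axis=1)))


rps_data = np.array([[2.0, 0.5, 0.1], [0.0, 0.0, 0.0]])  # made-up model outputs
rps_labels = np.array([[1, 0, 0], [0, 1, 0]])            # rock, then paper
probs = softmax(rps_data)
loss = cross_entropy(probs, rps_labels)
print(probs.sum(axis=1), loss)
```

The loss shrinks toward 0 as the probability assigned to the correct gesture approaches 1.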

△ “Cross-entropy indicates the degree of difference between actual labels and computed probabilities” — Martin Gorner, TensorFlow and deep learning, without a PhD

In this case, the loss function is the combination of the softmax function and cross-entropy, and it gives the error for the current parameters of the linear model. This function is the “coach” in TensorFlow, guiding the model to find the best parameters in the right direction.

By the way, the combination of linear models and softmax function is called multinomial logistic regression or softmax regression, which is a commonly used classification algorithm in statistics and machine learning.

Step 8: Train the Linear Model

Next, we prepare to add an optimizer in TensorFlow to train the model.

tf.train.GradientDescentOptimizer is a commonly used optimizer in TensorFlow that adjusts parameters through the gradient descent algorithm to minimize the error returned by the loss function.

In actual training, you need to create a session and pass the glove sensor data and labels to the optimizer. Since the optimizer changes the parameter values only gradually, at a specified learning rate, it may need to run thousands of times. If you watch the loss value during training, you will see it gradually decrease, which indicates that the model’s error is shrinking.

After training is complete, you will have a set of trained weights and biases, which map glove sensor data to the decision space; softmax then turns the result into probabilities. The figure below plots the softmax probability distribution in the original glove sensor space.
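The training loop can be sketched in plain NumPy to show what gradient descent does at each step. This is a stand-in, not the TensorFlow session code: the data is synthetic, and the cluster centers, input scaling, learning rate, and step count are all assumptions chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic readings: 100 rows per gesture around assumed centers (not recorded data).
centers = np.array([[600, 600, 600], [20, 20, 20], [20, 20, 600]], dtype=float)
glove_data = np.vstack([rng.normal(c, 15.0, size=(100, 3)) for c in centers])
labels = np.repeat(np.eye(3), 100, axis=0)  # one-hot rows: rock, paper, scissors
# Scale the inputs so that a single learning rate works for every sensor.
glove_data = (glove_data - glove_data.mean(axis=0)) / glove_data.std(axis=0)

weights = np.zeros((3, 3))
biases = np.zeros(3)
learning_rate = 0.5
losses = []
for step in range(500):
    logits = glove_data @ weights + biases
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    losses.append(-np.mean(np.sum(labels * np.log(probs + 1e-12), axis=1)))
    # Gradient of the mean softmax cross-entropy with respect to the logits.
    grad_logits = (probs - labels) / len(glove_data)
    # Move each parameter a small step against its gradient.
    weights -= learning_rate * glove_data.T @ grad_logits
    biases -= learning_rate * grad_logits.sum(axis=0)

accuracy = np.mean(probs.argmax(axis=1) == labels.argmax(axis=1))
print(losses[0], losses[-1], accuracy)
```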

△ Estimated probability distribution for rock, paper, and scissors

Step 9: Apply Linear Model on Arduino

Now that we have a practical method to classify glove sensor data, we can complete the coding for Arduino.

Run sess.run(weights) in Datalab to output the trained weight values. Copy these weight values and write them into the Arduino code, and do the same for the bias.

Finally, with the linear model running on the Arduino, you can map the glove sensor data to the decision space. You can use the following Arduino code to multiply the sensor data by the weights and add the biases.

Then compare the three resulting values and find the largest. Once the gesture made with the glove is determined, the servo can steer the robotic hand to the move that wins the game. In this example, you don’t need to compute the softmax value: softmax preserves the order of its inputs, so it is enough to compare the three output values of the linear transformation, which correspond to rock, paper, and scissors.
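The inference step is small enough to mirror in Python with plain loops, much as it might be written for a microcontroller. This is not the Arduino code from the project: the weights and biases below are hypothetical values chosen for illustration, and real ones would come from training.

```python
import math

# Hypothetical trained parameters; WEIGHTS[i][j] links sensor i to gesture j.
WEIGHTS = [
    [0.01, -0.01, -0.01],
    [0.01, -0.01, -0.01],
    [0.00, -0.01, 0.01],
]
BIASES = [-6.0, 3.0, 0.0]
GESTURES = ["rock", "paper", "scissors"]
WINNING_MOVE = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def linear_scores(sensors):
    """Compute sensors x WEIGHTS + BIASES with explicit loops."""
    scores = list(BIASES)
    for i in range(3):
        for j in range(3):
            scores[j] += sensors[i] * WEIGHTS[i][j]
    return scores


def argmax(values):
    """Index of the largest value."""
    best = 0
    for k in range(1, len(values)):
        if values[k] > values[best]:
            best = k
    return best


def softmax(scores):
    """Only used to check that softmax does not change the winner."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def respond(sensors):
    """Return the detected gesture and the move that beats it."""
    gesture = GESTURES[argmax(linear_scores(sensors))]
    return gesture, WINNING_MOVE[gesture]


print(respond([600, 600, 600]))
print(respond([20, 20, 20]))
print(respond([20, 20, 600]))
```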

At this point, you have completed the project, and you can use machine learning to create your own rock-paper-scissors machine.

Next Steps

As mentioned in this article, linear models are powerful tools that can map any m-dimensional space to n-dimensional space through linear transformations. If you find it too tedious to write many IF statements that check raw input data under complex conditions, consider using this tool. Data that has been mapped into a feature space is much simpler to handle than the raw data. In this article, the feature space refers to the decision space of rock, paper, and scissors.

The key technologies used here are machine learning and TensorFlow, which can help you find the best parameters when constructing linear models. These technologies are not just tools for deep learning and artificial intelligence but can also be used to build powerful and flexible code for various programming tasks.

Original article link: https://cloud.google.com/blog/big-data/2017/10/my-summer-project-a-rock-paper-scissors-machine-built-on-tensorflow


Mastering Parenting with Technology: Build Your Own Rock-Paper-Scissors Glove for $200 Using TensorFlow and Sensors
