Embedded Machine Learning (TinyML) Practical Tutorial

Introduction: This issue is the AI Briefing for 2021-02-19, bringing you 8 news items. The first wave after the Spring Festival is here: spring has arrived and work has resumed! New year, new energy. I hope everyone keeps an eye on the community, where more great content awaits!

This article is about 2,760 words and takes roughly 6 to 10 minutes to read.

1. STM32 Embedded Machine Learning (TinyML) Practical Tutorial – 01 | Edge Intelligence Lab

Embedded Machine Learning (TinyML) Practical Tutorial
Part 1: Overview
This tutorial, officially published by ST (STMicroelectronics), walks through developing machine vision applications on the STM32H747I Discovery board using machine learning techniques.
Chinese subtitles by Edge Intelligence Lab; thanks for their support.
Documents mentioned in the video (also available for download from ST’s official website):

Link: https://pan.baidu.com/s/1K1Dr2vMUZ8UmtVbZHkKAVA

Extraction code: w41p


2. Born for AI, Breaking the Memory Wall: Georgia Tech and Others Propose a New Embedded Capacitor-less DRAM | Machine Heart

One of the biggest problems in modern computing is the “memory wall”: the gap between a processor’s speed and the time it takes to move data from separate DRAM chips to the processor. The growing prevalence of AI will only make this worse, since the huge networks behind facial recognition, speech understanding, and product recommendations rarely fit in a processor’s onboard memory.
At the IEEE International Electron Devices Meeting (IEDM) held in December 2020, several research groups argued that a new type of DRAM could be a solution to the memory wall. According to the researchers, this new DRAM, made from oxide semiconductors and built in layers above the processor, holds its bits hundreds to thousands of times longer than commercial DRAM and could save a significant amount of area and energy when running large neural networks.
The new embedded DRAM is built from just two transistors and no capacitor, hence the abbreviation 2T0C. This works because a transistor’s gate acts as a natural, if small, capacitor, so the charge representing a bit can be stored there. The design has several key advantages, especially for AI.

Original link: https://spectrum.ieee.org/tech-talk/semiconductors/memory/new-type-of-dram-could-accelerate-ai

3. Next Year, I Want to Use AI to Write Couplets for the Whole Village | HyperAI Super Neural

As the Spring Festival comes to an end, are you still immersed in the festive atmosphere?
On the 29th and 30th of the last lunar month, every household starts pasting couplets. This year, various AI couplet writing applications have gone online to help everyone write couplets. Why not give it a try?
Couplets are all about pairing: they demand symmetry and tonal coordination. Modern writers’ couplet skills are far inferior to those of the ancient literati, and people sometimes even mix up the upper and lower lines. Yet clever AI has learned to write couplets on its own; the demo, code, and dataset are linked below, with a small data-loading sketch after the links.
  • Test address:

    https://ai.binwang.me/couplet/

  • Github:

    https://github.com/wb14123/couplet-dataset

  • Dataset address:

    https://hyper.ai/datasets/14547
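
For readers who want to poke at the training data, here is a minimal loading sketch. It assumes the archive from the GitHub repo unpacks into train/in.txt (upper lines) and train/out.txt (matching lower lines), one couplet half per line; verify the paths against the repository’s README.

# Minimal sketch for inspecting the couplet dataset (paths assumed
# from the repository description; adjust if the archive differs).
with open("train/in.txt", encoding="utf-8") as f_in, \
     open("train/out.txt", encoding="utf-8") as f_out:
    pairs = list(zip(f_in, f_out))

upper, lower = pairs[0]
print("Upper line:", upper.strip())
print("Lower line:", lower.strip())
print("Total pairs:", len(pairs))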

4. DeepMind’s Latest Research NFNet: Achieving Unprecedented Accuracy in Deep Learning Models Without Normalization | Machine Heart

  • Paper:

    https://arxiv.org/abs/2102.06171

  • DeepMind also released the implementation of the model:

    https://github.com/deepmind/deepmind-research/tree/master/nfnets

Data fed to machine learning models is typically normalized first.
Normalization “flattens” the data into a uniform range, for example scaling every value to between 0 and 1. The conventional wisdom is that this makes the search for an optimum smoother, allowing the model to converge more easily to its best level.
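
For instance, a common min-max scaling maps each feature into the range [0, 1]. A minimal numpy sketch (our illustration, not taken from the paper):

import numpy as np

# Toy data: 4 samples, 2 features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0],
              [4.0, 800.0]])

# Min-max normalization: scale each feature (column) into [0, 1].
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)

print(X_scaled)  # every column now runs from 0.0 to 1.0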
However, this “stereotype” has recently been challenged. DeepMind researchers proposed NFNet, a deep learning model with no normalization layers (such as batch normalization), yet it achieves state-of-the-art performance on large-scale image classification.
The paper’s first author, DeepMind research scientist Andrew Brock, stated: “We focused on developing high-performance architectures that can be trained quickly, and demonstrated a simple technique (Adaptive Gradient Clipping, AGC) that lets us train with large batch sizes and large-scale data augmentation while achieving state-of-the-art performance.”
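
The paper defines AGC per unit (for example, per output row of a weight matrix); the sketch below simplifies to one norm per tensor to show the core idea. This is a hedged illustration, not DeepMind’s released implementation:

import numpy as np

def adaptive_gradient_clip(grad, param, clip=0.01, eps=1e-3):
    # Simplified AGC for a single weight tensor: if the ratio of
    # gradient norm to parameter norm exceeds `clip`, rescale the
    # gradient so the ratio equals `clip`. `eps` keeps weights that
    # start near zero trainable.
    g_norm = np.linalg.norm(grad)
    p_norm = max(np.linalg.norm(param), eps)
    max_norm = clip * p_norm
    if g_norm > max_norm:
        grad = grad * (max_norm / g_norm)
    return grad

# Example: a gradient far larger than the weights gets scaled down.
w = np.ones((3, 3)) * 0.1
g = np.ones((3, 3)) * 5.0
clipped = adaptive_gradient_clip(g, w)
print(np.linalg.norm(clipped))  # equals 0.01 * ||w||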

5. Recreate Your Face in Games, Just Give the AI a Photo, by NetEase & University of Michigan | AAAI 2021 Open Source | Quantum Bit

  • Paper address: https://arxiv.org/abs/2102.02371

  • GitHub address: https://github.com/FuxiCV/MeInGame

Give the AI a photo of the singer Mao Buyi, and it can automatically generate an ancient-style game character.
Now, if you want to customize your face in a game, you no longer have to spend time figuring out parameters.
Friends familiar with games may recognize that this AI face customization technique comes from NetEase Fuxi AI Lab and the University of Michigan.
Now, the latest relevant research has been published at AAAI 2021.
According to the authors, this method called MeInGame can be integrated into most existing 3D games and is more cost-effective and generalizable compared to methods based solely on 3D Morphable Face Models (3DMM).
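
For context, a 3DMM represents a face shape as a mean mesh plus a linear combination of learned shape bases, and fitting a photo amounts to estimating the combination coefficients. A toy numpy sketch of that linear model (illustrative only; the sizes are made up and this is not MeInGame’s pipeline):

import numpy as np

n_vertices, n_bases = 1000, 50  # toy sizes, not from the paper

# 3DMM: mean face shape plus a linear combination of shape bases.
mean_shape = np.random.rand(n_vertices * 3)            # flattened (x, y, z) mesh
shape_bases = np.random.rand(n_vertices * 3, n_bases)  # learned PCA bases
alpha = np.random.randn(n_bases) * 0.1                 # per-face coefficients

face_shape = mean_shape + shape_bases @ alpha          # reconstructed mesh
print(face_shape.shape)  # (3000,)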

6. Google Open Source Computing Framework JAX: 30 Times Faster than Numpy and Can Run on TPU! | New Intelligence

  • Github:

    https://github.com/google/jax

  • Quick start link:

    https://jax.readthedocs.io/en/latest/notebooks/quickstart.html

Everyone is already familiar with numpy, TensorFlow, and PyTorch, but do you know about JAX?
After JAX was released, some users tested it and found that using JAX, Numpy operations can be over thirty times faster!
Below is the performance using Numpy:
import numpy as np  # standard numpy: operations execute on the CPU

x = np.random.random([5000, 5000]).astype(np.float32)
%timeit np.matmul(x, x)  # %timeit is an IPython magic

Run results:

1 loop, best of 3: 3.9 s per loop

And below is the performance using JAX’s Numpy:

import jax.numpy as np  # the "JAX version" of numpy, used here as a drop-in replacement
from jax import random  # note: JAX's random-number API differs from numpy's

x = random.uniform(random.PRNGKey(0), [5000, 5000])
%timeit np.matmul(x, x).block_until_ready()  # block: JAX dispatches work asynchronously

Run results:

1 loop, best of 3: 109 ms per loop

We can see that using standard numpy, the runtime is about 3.9s, while using JAX’s numpy, the runtime is only 0.109s, resulting in a speedup of over thirty times!
So what exactly is JAX?
JAX is Google’s open-source, numpy-compatible library designed as a high-performance automatic-differentiation computing framework for machine learning research. In simple terms, it is numpy accelerated on GPU and TPU, with automatic differentiation (autodiff) built in, and it also runs on plain CPU.
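
Beyond the speedup above, JAX’s autodiff and JIT compilation look like this (a minimal sketch using the public jax.grad and jax.jit APIs):

import jax
import jax.numpy as jnp

def loss(w, x, y):
    # Squared-error loss for a simple linear model.
    return jnp.mean((x @ w - y) ** 2)

grad_loss = jax.jit(jax.grad(loss))  # differentiate w.r.t. w, then JIT-compile

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (100, 3))
y = x @ jnp.array([1.0, -2.0, 0.5])  # targets from a known weight vector
w = jnp.zeros(3)

print(grad_loss(w, x, y))  # gradient of the loss at w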

7. On the Fourth Day of the New Year, It’s Time to Learn: MIT 6.S191 Video and PPT Updated! Users: This is One of the Best Introductory Courses on Deep Learning | Machine Heart


Course homepage: http://introtodeeplearning.com/

Editor: I have previously recommended this course, and the feedback after the course started has indeed been good.

With the Spring Festival just passed, let’s learn something simple first.
The more introductory the course, the harder it might be to teach.
In the field of deep learning, we can find various introductory courses, but very few can truly help people get started.
Among the few “true introductory” courses, MIT’s “Introduction to Deep Learning (6.S191)” is indeed worth mentioning. Recently, the update of this course has attracted a new wave of attention, with over 70,000 views on the released videos in just a few days.
Some even praised it as: “Among existing courses, this is definitely one of the best introductory courses on deep learning.”
Judging from the feedback, its strengths mainly come down to the following:
  1. Low threshold. Some students reported that this course does not require deep foundational knowledge, nor does it require proficiency in Python, and many students from non-computer science fields can benefit greatly.
  2. The instructor’s explanations are clear and easy to understand, and the PPT is very well made.

8. Set a Small Goal for the New Year! Write Code More Normatively! | Xixiaoyao’s Cute Selling House


If you are shy about showing your code to your colleagues, you might want to take a look…

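The post’s tips are shared as images; as a flavor of the kind of advice such style guides give, here is a hypothetical before/after in Python (our own illustration, not taken from the post):

# Before: terse names, no docstring, unexplained magic numbers.
def f(l):
    return [x * 9 / 5 + 32 for x in l]

# After: descriptive names, type hints, a docstring, named constants.
FAHRENHEIT_SCALE = 9 / 5
FAHRENHEIT_OFFSET = 32

def celsius_to_fahrenheit(temps_celsius: list[float]) -> list[float]:
    """Convert a list of temperatures from Celsius to Fahrenheit."""
    return [t * FAHRENHEIT_SCALE + FAHRENHEIT_OFFSET for t in temps_celsius]

print(celsius_to_fahrenheit([0.0, 100.0]))  # [32.0, 212.0]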

9. Others

  • This is what “Bad Apple!!” looks like running on a Raspberry Pi | Quantum Bit

  • Cutting Edge | Great fun: a programmer uses AI to tell his twin sons apart! “This Raspberry Pi-based facial recognition system is more accurate than I am” | Robot Lecture Hall

  • Surpassing NVIDIA’s A100: IBM announces the world’s first 7nm energy-efficient training and inference chip, debuting at the top conference ISSCC 2021 | Machine Heart

  • FastFormers: 223x Transformer inference acceleration on CPU | CVer

