Embedded AI Biweekly Newsletter – Issue 14

Click "Read the original" at the end of this article to jump to our biweekly newsletter homepage, where each item is hyperlinked.

Industry News

  • AAAI 2018 | Alibaba Proposes Extremely Low-Bit Neural Networks for Deep Model Compression and Acceleration | Machine Heart Review: AAAI 2018, one of the most prominent international AI conferences, will be held in New Orleans, USA in February. According to Machine Heart, Alibaba has had 11 papers accepted; in this one, Alibaba proposes using the ADMM algorithm to learn extremely low-bit neural networks for model compression and acceleration.
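
The item names the technique but not the formulation. As a hedged reconstruction (our own sketch of the standard ADMM splitting for constrained quantization, not an excerpt from the paper), training a low-bit network can be posed as minimizing the loss $f(W)$ subject to the weights lying in a discrete low-bit set $C$, then split with an auxiliary variable $G$ and scaled dual $\mu$:

$$
\min_{W} f(W)\ \ \text{s.t.}\ W \in C
\quad\Longrightarrow\quad
\min_{W,G} f(W) + I_C(G)\ \ \text{s.t.}\ W = G,
$$

$$
W^{k+1} = \arg\min_{W} f(W) + \tfrac{\rho}{2}\lVert W - G^{k} + \mu^{k}\rVert_2^2,\qquad
G^{k+1} = \Pi_C\!\left(W^{k+1} + \mu^{k}\right),\qquad
\mu^{k+1} = \mu^{k} + W^{k+1} - G^{k+1},
$$

where $I_C$ is the indicator function of $C$ and $\Pi_C$ projects onto the allowed low-bit values.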

  • Intel Halts CPU Patch Deployment as Linux Creator Calls the Spectre Patches "Complete Garbage" | Linuxer Review: Intel announced on Monday that users should stop deploying, on affected devices, the chip security patches for the Meltdown and Spectre vulnerabilities disclosed last month, after unexpected reboots and other "unpredictable" system behavior were found.

  • Amazon's Unmanned Store Opens: We Queued On-Site for a Hands-On Review | Machine Heart Review: The author visited Amazon's unmanned store in person and ran a variety of tests through the shopping app.

  • Samsung to Launch Its First AI Chip with an NPU, Claiming Performance Beyond Huawei and Apple, as the Smart-Device AI Chip Showdown Heats Up | New Intelligence Review: According to foreign media reports, Samsung is close to completing development of an AI chip whose performance is comparable to Apple's A11 and Huawei's Kirin 970. Samsung is likely to showcase its new AI capabilities alongside the Galaxy S9 at MWC 2018 on February 25.

  • Tsinghua Develops a Chip Supporting Neural Networks | Police Technology Review: A research team at Tsinghua University has made a significant breakthrough, developing a chip that supports neural networks and can be used in small battery-powered devices.

Research Papers

  • [1801.06287] What Does a TextCNN Learn? Review: TextCNN is a convolutional neural network for text and a useful deep learning algorithm for sentence classification tasks such as sentiment analysis and question classification. However, neural networks have long been treated as black boxes, because interpreting them is difficult. Researchers have built tools for understanding CNN image classification through deep visualization, but similar research on deep models for text remains insufficient. In this paper, we attempt to understand what a TextCNN learns on two classic NLP datasets. Our work focuses on the roles played by the different convolution kernels.
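
For readers unfamiliar with the model being analyzed, a minimal TextCNN sketch in PyTorch (our own illustration under standard assumptions: pre-tokenized integer inputs and hypothetical vocabulary, filter, and class counts; not code from the paper):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """Minimal TextCNN: embed tokens, apply parallel 1-D convolutions of
    several widths, max-pool over time, classify the concatenated features."""
    def __init__(self, vocab_size=10000, embed_dim=128,
                 num_filters=100, kernel_sizes=(3, 4, 5), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes])
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, tokens):                        # tokens: (batch, seq_len)
        x = self.embedding(tokens).transpose(1, 2)    # (batch, embed_dim, seq_len)
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))      # (batch, num_classes)

logits = TextCNN()(torch.randint(0, 10000, (4, 50)))  # e.g. 4 sentences of 50 tokens
```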

  • [1801.06434] EffNet: An Efficient Structure for Convolutional Neural Networks Review: As more convolutional neural networks need to run efficiently on embedded and mobile hardware, streamlined models have become a hot research topic, with approaches ranging from binary networks to modified convolutional layers. We contribute to the latter and propose a new convolution block that significantly reduces computational burden while surpassing the current state of the art. Our model, called EffNet, is optimized to be slim and addresses issues in existing models such as MobileNet and ShuffleNet.
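
A rough sketch in PyTorch of the kind of separable block this line of work uses. The exact EffNet block layout may differ from this; the pointwise bottleneck, spatially separable depthwise convolutions, and separable pooling shown here only illustrate the general idea, and all channel counts are hypothetical.

```python
import torch
import torch.nn as nn

class SeparableBlock(nn.Module):
    """Illustrative block in the spirit of EffNet: pointwise bottleneck,
    spatially separable depthwise convolutions (1x3 then 3x1), separable
    pooling, then a pointwise expansion that strides in the other dimension."""
    def __init__(self, in_ch, out_ch, bottleneck):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, bottleneck, 1, bias=False),                # pointwise bottleneck
            nn.Conv2d(bottleneck, bottleneck, (1, 3), padding=(0, 1),
                      groups=bottleneck, bias=False),                   # depthwise 1x3
            nn.MaxPool2d((2, 1)),                                       # separable pooling (halve height)
            nn.Conv2d(bottleneck, bottleneck, (3, 1), padding=(1, 0),
                      groups=bottleneck, bias=False),                   # depthwise 3x1
            nn.Conv2d(bottleneck, out_ch, 1, stride=(1, 2), bias=False) # pointwise, halve width
        )

    def forward(self, x):
        return self.block(x)

y = SeparableBlock(32, 64, 16)(torch.randn(1, 32, 56, 56))  # -> (1, 64, 28, 28)
```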

  • [1801.07606] Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning Review: Many interesting problems in machine learning are being revisited with new deep learning tools. A recent significant development in graph-based semi-supervised learning is the Graph Convolutional Network (GCN), which integrates local vertex features and the graph topology in its convolutional layers. While the GCN performs on par with other state-of-the-art methods, its mechanisms are not well understood, and it still requires a considerable amount of labeled data for validation and model selection. In this paper, we take a deeper look at the GCN model and address its fundamental limitations. First, we show that the GCN's graph convolution is actually a special form of Laplacian smoothing, which is the key reason GCNs work, but it also brings potential concerns of over-smoothing when many convolutional layers are stacked. Second, to overcome the limitations of shallow GCN architectures, we propose co-training and self-training approaches for GCNs. Our approach significantly improves GCN performance with very few labeled samples, alleviating the need for additional labels. Extensive experiments on benchmarks confirm our theory and proposals.
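
For reference, the layer-wise propagation rule of the GCN model under analysis (Kipf and Welling's formulation) is

$$
H^{(l+1)} = \sigma\!\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(l)}\,W^{(l)}\right),
\qquad \tilde{A} = A + I,\quad \tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}.
$$

Ignoring the nonlinearity and weights, each layer applies the operator $\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2} = I - \tilde{D}^{-1/2}\tilde{L}\tilde{D}^{-1/2}$ with $\tilde{L} = \tilde{D} - \tilde{A}$, i.e., a Laplacian smoothing step that mixes each vertex's features with those of its neighbors; stacking many such layers pushes connected vertices toward indistinguishable features, which is the over-smoothing concern raised above.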

  • [1801.06700] A Deep Reinforcement Learning Chatbot (Short Version) Review: MILABOT can converse with people on popular topics through both speech and text. The system consists of natural language generation and retrieval models, including neural-network-based and template-based models. Using reinforcement learning on crowdsourced data and real user interactions, the system was trained to select an appropriate response from the ensemble of models. It was evaluated through A/B testing with real-world users and significantly outperformed competing systems. The results highlight the potential of combining ensemble systems with deep reinforcement learning as a fruitful direction for real-world, open-domain conversational agents.

  • [1801.07829] Dynamic Graph CNN for Learning on Point Clouds Review: This paper proposes a new neural network module called EdgeConv, suited to high-level point cloud tasks including classification and segmentation. EdgeConv is differentiable and can be plugged into existing architectures. Compared with existing modules that operate mainly in extrinsic space or process each point independently, EdgeConv has several attractive properties: it incorporates local neighborhood information; it can be stacked or applied repeatedly to learn global shape features; and in multi-layer systems, affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. Beyond proposing the module, we provide extensive evaluation and analysis revealing how EdgeConv captures and exploits fine-grained geometric properties of point clouds. The method achieves state-of-the-art performance on standard benchmarks, including ModelNet40 and S3DIS.
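
A simplified EdgeConv-style layer in PyTorch, to make the construction concrete (our own reconstruction, not the authors' reference implementation; the kNN graph is rebuilt on each forward pass, mirroring the "dynamic" aspect, and all sizes are hypothetical):

```python
import torch
import torch.nn as nn

def knn(x, k):
    """x: (batch, num_points, dims). Return indices of the k nearest neighbors per point."""
    dist = torch.cdist(x, x)                                    # pairwise distances (B, N, N)
    return dist.topk(k + 1, largest=False).indices[:, :, 1:]    # drop self (B, N, k)

class EdgeConv(nn.Module):
    """EdgeConv sketch: build a kNN graph, form edge features [x_i, x_j - x_i],
    apply a shared MLP, then max-aggregate over the neighbors of each point."""
    def __init__(self, in_dim, out_dim, k=20):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU())

    def forward(self, x):                                       # x: (B, N, in_dim)
        B, N, D = x.shape
        idx = knn(x, self.k)                                    # (B, N, k)
        neighbors = torch.gather(
            x.unsqueeze(1).expand(B, N, N, D), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, D))          # (B, N, k, D)
        center = x.unsqueeze(2).expand_as(neighbors)
        edge_feat = torch.cat([center, neighbors - center], dim=-1)
        return self.mlp(edge_feat).max(dim=2).values            # (B, N, out_dim)

feats = EdgeConv(3, 64)(torch.randn(2, 1024, 3))  # e.g. 2 clouds of 1024 xyz points
```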

  • [1801.06867] Scene recognition with CNNs: objects, scales and dataset bias Review: This paper proposes an alternative approach that considers scale explicitly, resulting in significant recognition gains. By analyzing the responses of ImageNet CNNs and Places CNNs at different scales, we find that using the same network at all scales leads to performance limitations induced by dataset bias. Therefore, adopting a specific feature extractor for each scale (i.e., scale-specific CNNs) is key to improving recognition, since objects in scenes appear within particular ranges of scales. Experimental results show that recognition accuracy depends strongly on scale, and that a simple yet carefully chosen multi-scale combination of ImageNet CNNs and Places CNNs pushes state-of-the-art recognition accuracy on SUN397 to 66.26% (and even 70.17% with deeper architectures, comparable to human performance).
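
A small sketch of the multi-scale combination idea in PyTorch. The backbones, scales, and class count below are placeholders (the paper combines ImageNet-trained and Places-trained CNNs at carefully chosen scales); this only illustrates extracting scale-specific features and concatenating them for the scene classifier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

# Hypothetical setup: one backbone per scale (recent torchvision API assumed).
scales = [224, 448]
backbones = [nn.Sequential(*list(models.resnet18(weights=None).children())[:-1])
             for _ in scales]                     # drop the final fc layer
classifier = nn.Linear(512 * len(scales), 397)    # e.g. SUN397 classes

def scene_logits(image):                          # image: (B, 3, H, W)
    feats = []
    for net, s in zip(backbones, scales):
        x = F.interpolate(image, size=(s, s), mode="bilinear", align_corners=False)
        feats.append(net(x).flatten(1))           # (B, 512) scale-specific features
    return classifier(torch.cat(feats, dim=1))

logits = scene_logits(torch.randn(2, 3, 512, 512))
```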

Open Source Projects

  • romulus914/CNN_VGG19_verilog: Convolution Neural Network of VGG19 Model in Verilog Review: Convolutional neural network of the VGG19 model in Verilog.

  • cliffordwolf/picorv32: PicoRV32 – A Size-Optimized RISC-V CPU Review: PicoRV32 is a CPU core implementing the RISC-V RV32IMC instruction set. It can be configured as an RV32E, RV32I, RV32IC, RV32IM, or RV32IMC core, and optionally includes a built-in interrupt controller.

  • azonenberg/openfpga: Open FPGA tools Review: Updated the schematic for v0.2 of the heat sink.

  • Detectron Deep Dive Series: Learning Rate Adjustment and Pitfalls | Machine Heart Review: Now that Detectron is open source, the author shares hands-on experience and walks through adjusting the learning rate and the pitfalls involved.

  • Uber Proposes SBNet: Accelerating Convolutional Networks Using Activation Sparsity | Uber Review: Uber researchers proposed SBNet, an algorithm that delivers significant speedups while improving detection accuracy, and presented the work on the company's engineering blog. The project's code has also been released on GitHub.
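
To make the idea concrete, here is a much-simplified sparse-block convolution sketch in PyTorch (our own illustration of the gather, convolve, scatter pattern, not Uber's released SBNet code): a coarse mask selects active blocks, only those blocks are convolved, and the results are scattered back into the output.

```python
import torch
import torch.nn.functional as F

def sparse_block_conv(x, mask, weight, block=8):
    """Simplified sparse-block convolution. x: (1, C, H, W); mask: (H//block, W//block)
    marks active blocks; weight: (C_out, C, 3, 3). Gathers each active block with a
    1-pixel halo, convolves it, and scatters the result into the output tensor."""
    C_out = weight.shape[0]
    _, C, H, W = x.shape
    out = torch.zeros(1, C_out, H, W)
    xp = F.pad(x, (1, 1, 1, 1))                          # halo for a 'same' 3x3 conv
    for by, bx in mask.nonzero(as_tuple=False).tolist():  # iterate active blocks only
        y0, x0 = by * block, bx * block
        tile = xp[:, :, y0:y0 + block + 2, x0:x0 + block + 2]
        out[:, :, y0:y0 + block, x0:x0 + block] = F.conv2d(tile, weight)
    return out

x = torch.randn(1, 16, 64, 64)
mask = torch.rand(8, 8) > 0.7                            # ~30% of blocks active
w = torch.randn(32, 16, 3, 3)
y = sparse_block_conv(x, mask, w)                        # inactive blocks stay zero
```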

  • Models Ten Times Larger with Computation Time Up Only 20%: OpenAI Releases a Gradient Replacement Plugin | GitHub Review: A toolkit developed by OpenAI researchers Tim Salimans and Yaroslav Bulatov lets you trade computation for memory, so that your model's memory footprint becomes more manageable. For feed-forward models, the tool makes it possible to fit models over ten times larger on a GPU at the cost of only about a 20% increase in computation time.
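
The OpenAI toolkit targets TensorFlow graphs by swapping in a memory-saving gradient computation. Purely to illustrate the same recompute-instead-of-store trade-off, here is a sketch using PyTorch's built-in torch.utils.checkpoint utilities (a stand-in for the concept, not the OpenAI plugin): activations inside each checkpointed segment are dropped during the forward pass and recomputed during the backward pass.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A deep feed-forward stack; layer count and sizes are arbitrary for the example.
model = nn.Sequential(*[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
                        for _ in range(64)])

x = torch.randn(32, 1024, requires_grad=True)

# Split the stack into 4 segments: only segment-boundary activations are stored,
# everything in between is recomputed when gradients are needed.
out = checkpoint_sequential(model, 4, x)
out.sum().backward()
```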

  • TensorFlow Officially Releases 1.5.0, Supporting CUDA 9 and cuDNN 7, Doubling Speed Review: TensorFlow officially released version 1.5.0 today, supporting CUDA 9 and cuDNN 7, further speeding up performance. Starting from version 1.6, precompiled binaries will use AVX instructions, which may break TensorFlow on older CPUs.

Blog Posts

  • An In-Depth Reading of EETimes' AI Chip Article | StarryHeavensAbove Review: Starting from EETimes' article "AI Silicon Preps for 2018 Debuts", the author draws out a range of issues around AI chips.

  • Analysis of Flexibility in Image and Video AI Chips | StarryHeavensAbove Review: This article analyzes the flexibility range of AI chips by listing typical algorithms, network structures, platforms, and interfaces for image and video.

  • Demand Analysis for Voice and Text AI Chips | Machine Heart Review: The author analyzes in depth the requirements that voice and text deep learning workloads place on AI chips.

  • Overview of Image Classification, Localization, Detection, Semantic Segmentation, and Instance Segmentation Methods Review: The author, from the Machine Learning and Data Mining Institute (LAMDA) in the Department of Computer Science at Nanjing University, systematically organizes the applications of deep learning to the fundamental computer vision tasks of image classification, localization, detection, semantic segmentation, and instance segmentation.

  • PTGAN: Generative Adversarial Network for Person Re-Identification | PaperDaily #36 Review: This paper proposes PTGAN, a generative adversarial network for person re-identification that uses a GAN to transfer person images from one dataset to another.

  • TVM Optimization Tutorial | Quantum Bit Review: TVM addresses deployment across different hardware platforms by introducing a unified IR stack; with TVM/NNVM, one can generate efficient kernels for ARM Mali GPUs and perform end-to-end compilation.

  • Implementing Mobile Video Tagging Using Video Object Tracking Review: This article discusses using classic video object tracking algorithms from computer vision to build a lightweight video tagging feature and thereby produce richer, more personalized video content.
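
A minimal sketch of the kind of classic tracker such a feature could build on, assuming opencv-contrib-python and the KCF tracker (the article's actual algorithm and rendering code are not specified, and the input file name below is hypothetical):

```python
import cv2

cap = cv2.VideoCapture("input.mp4")                    # hypothetical input video
ok, frame = cap.read()
bbox = cv2.selectROI("select object to tag", frame)    # draw an initial box around the object

# Classic KCF tracker from opencv-contrib; in some builds it lives under cv2.legacy.
tracker = cv2.TrackerKCF_create()
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, bbox = tracker.update(frame)                # follow the object frame by frame
    if found:
        x, y, w, h = map(int, bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, "tag", (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("tagged video", frame)
    if cv2.waitKey(30) & 0xFF == 27:                   # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```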

Editor: Wang Jianzhang, Yuan Shuai

PerfXLab Pengfeng Technology
