Guide to Deploying Lightweight AI on STM32: Making Microcontrollers “Smart” with TinyFlow

This guide covers hardware selection, model optimization, toolchain operations, code implementation, and debugging techniques, using the STM32 series microcontrollers as an example: 1. Hardware Selection and Configuration (1) Clarify Requirements. Computational requirements: for simple classification tasks (e.g., binary classification of sensor data), a Cortex-M0+/M3 part (e.g., STM32G0/F1) is sufficient. For complex tasks (image recognition, speech processing), choose MCU models with hardware acceleration (e.g., … Read more

How to Run AI on Low-Power Embedded Devices?

Bringing intelligence to small devices. The application of AI in embedded systems is changing our understanding of small, low-power devices. From smartwatches to industrial sensors, embedded systems are leveraging AI to process data locally. This integration is fundamentally transforming how low-power devices operate in real time. This article will … Read more

How Huawei Tamed a Trillion-Parameter Sparse Model? Key Technical Breakthroughs in MoE Training on Ascend NPU

In the competition among large models, sparse models built on the Mixture of Experts (MoE) architecture are gradually becoming the new favorites in the AI field because of their efficiency. Recently, Huawei released a technical report titled “Pangu Ultra … Read more

What Level is the Ascend 910 NPU and How Does it Perform in the DeepSeek Integrated Machine?

The Ascend DeepSeek integrated machine is an AI solution based on self-developed Ascend AI chips (such as Ascend 910B and 910C) deeply integrated with the DeepSeek large model, aimed at providing a high-performance, low-cost, domestically produced AI computing power platform. This article provides a detailed analysis from various dimensions including the technology, products, architecture, specifications, … Read more

TinyML Breakthrough: Deploying 1KB Models with MicroTVM on LoRa

Hey, recently I’ve been tinkering with something fun — running machine learning on those tiny IoT devices! Seeing the number “1KB”, many people shake their heads: how is that possible? Indeed, a high-definition photo takes several MB, so where’s the magic that allows AI to fit into such a tiny space? Actually, TinyML is such … Read more

Java Edge AI Inference: Deploying TensorFlow Lite on Raspberry Pi

To be honest, when I first encountered edge AI, I went in completely the wrong direction. I thought that simply shrinking the model would be enough to make it run, but I ended up running into a lot of pitfalls. At … Read more

LoRA-Dash: A More Efficient Method for Task-Specific Fine-Tuning

Article Link: https://arxiv.org/abs/2409.01035 Code Link: https://github.com/Chongjie-Si/Subspace-Tuning Project Homepage: https://chongjiesi.site/project/2024-lora-dash.html Due to the rich content of the LoRA-Dash paper, compressing 30 pages of content into 10 pages is a highly challenging task. Therefore, we have made careful trade-offs between readability and content integrity. The starting point of this article may differ from the original paper, aligning … Read more

Implementing Neural Networks on FPGAs

Author | Shawn Ouyang, System Architect at Ruijun Micro UK R&D Center; Dr. Andrew, Fellow at Ruijun Micro UK Research Center 1. Introduction. An FPGA is a device for implementing programmable digital logic. Like CPUs, GPUs/NPUs, and dedicated ASICs, FPGAs have also begun to be widely used for implementing neural networks (NN). … Read more

Quantization and Precision Optimization of Neural Network Models in C++

1. Introduction: The Wonderful Collision of C++ and Neural Networks In today’s technological wave, neural networks are undoubtedly a shining star, driving the field of artificial intelligence forward at an astonishing pace. From accurately identifying various objects in image recognition to enabling smooth human-computer dialogue in natural language processing, and assisting doctors in detecting disease … Read more