What Level is the Ascend 910 NPU and How Does it Perform in the DeepSeek Integrated Machine?

The Ascend DeepSeek integrated machine is an AI solution that deeply integrates Huawei's self-developed Ascend AI chips (such as the Ascend 910B and 910C) with the DeepSeek large model, aiming to provide a high-performance, low-cost, domestically produced AI computing platform. This article provides a detailed analysis across several dimensions, including the technology, products, architecture, specifications, … Read more

TinyML Breakthrough: Deploying 1KB Models with MicroTVM on LoRa

Hey, recently I’ve been tinkering with something fun — running machine learning on those tiny IoT devices! Seeing the number “1KB”, many people shake their heads: how is that possible? Indeed, a high-definition photo takes several MB, so where’s the magic that allows AI to fit into such a tiny space? Actually, TinyML is such … Read more

Java Edge AI Inference: Deploying TensorFlow Lite on Raspberry Pi

To be honest, when I first encountered edge AI, I went in completely the wrong direction. I thought that simply shrinking the model would be enough to make it run, but I ended up falling into a lot of pitfalls. At … Read more

LoRA-Dash: A More Efficient Method for Task-Specific Fine-Tuning

Article Link: https://arxiv.org/abs/2409.01035
Code Link: https://github.com/Chongjie-Si/Subspace-Tuning
Project Homepage: https://chongjiesi.site/project/2024-lora-dash.html

Because the LoRA-Dash paper is rich in content, compressing 30 pages into 10 is highly challenging, so we have made careful trade-offs between readability and content integrity. The starting point of this article may differ from the original paper, aligning … Read more

Implementing Neural Networks on FPGAs

Author | Shawn Ouyang, System Architect at Ruijun Micro UK R&D Center; Dr. Andrew, Fellow at Ruijun Micro UK Research Center

1. Introduction

An FPGA is a device that implements programmable digital logic. Like CPUs, GPUs/NPUs, and dedicated ASICs, FPGAs have begun to be widely used for implementing neural networks (NN). … Read more

Quantization and Precision Optimization of Neural Network Models in C++

1. Introduction: The Wonderful Collision of C++ and Neural Networks

In today’s technological wave, neural networks are undoubtedly a shining star, driving the field of artificial intelligence forward at an astonishing pace. From accurately identifying various objects in image recognition to enabling smooth human-computer dialogue in natural language processing, and assisting doctors in detecting disease … Read more