NanoLLM Articles

Efficient ML Systems: TinyChat Engine and On-Device LLM Inference

2025-04-20 by boardor

Hi everyone, I am Lite. I recently shared articles one through nineteen of the Efficient Large Model Full-Stack Technology series, covering large model quantization and fine-tuning, efficient LLM inference, quantum computing, generative AI acceleration, and more. Here is the link: Efficient Large Model Full-Stack Technology (Nineteen): Efficient Training and … Read more

Efficient LLM Inference with Block Sparse Attention

2025-04-20 by boardor

Hi everyone, I am Lite. A while ago, I shared Parts 1 to 19 of Efficient Large Model Full-Stack Technology, which cover large model quantization and fine-tuning, efficient LLM inference, quantum computing, generative AI acceleration, and more. The content links are as … Read more

The Future of Accessible and Sustainable AI

2025-04-20 by boardor

The rise of edge AI is already underway, as we see AI deployed in smaller, more specialized models, particularly in the Internet of Things. This AI approach shifts processing tasks from centralized data centers to the “edge” of the network, … Read more

Stronger Small LLM: Zephyr-7B

2025-04-15 by boardor

Zephyr-7B is a next-generation large language model (LLM) that has gained significant popularity in the AI community. Created by Hugging Face, it is essentially a fine-tuned version of Mistral-7B, trained on public datasets and optimized through knowledge distillation. The model has achieved strong results, surpassing many larger models across various … Read more

Detailed Explanation of the Zephyr Model

2025-04-15 by boardor

Zephyr uses distilled direct preference optimization (dDPO) on AI feedback (AIF) preference data to significantly improve intent alignment, following steps similar to InstructGPT. Training method: distilled supervised fine-tuning (dSFT). Starting from the original LLM, the model is first trained to respond to user prompts, traditionally done … Read more

Overview of LoRA and Its Variants: LoRA, DoRA, AdaLoRA, Delta-LoRA

2025-04-15 by boardor

Source: Deephub Imba. This article is about 4,000 words long; the suggested reading time is 6 minutes. In this article, we explain the basic concepts of LoRA and then introduce some variants that improve on LoRA in different ways. LoRA can be said to be a major breakthrough for … Read more

Configuring Different Learning Rates: Can LoRA Improve Further?

2025-04-15 by boardor

©PaperWeekly Original · Author | Su Jianlin · Affiliation | Moonshot AI · Research Direction | NLP, Neural Networks. LoRA (Low-Rank Adaptation) is one of the current parameter-efficient fine-tuning methods for LLMs. Previously, we briefly discussed it in “LoRA from a Gradient Perspective: Introduction, Analysis, Speculation, and Promotion”. In this article, we will learn … Read more

Deep Dive into AI Agents with ERNIE SDK and Multi-tool Orchestration

2025-04-13 by boardor

In the past year, the rapid development of large language models (LLMs) has attracted global attention. Tech giants like Baidu have launched their own large models, continuously pushing the performance limits of language models. However, the industry’s goals for LLMs are no longer limited to basic Q&A functions but are seeking to utilize large models … Read more

A Brief History of AI Agent Development

2025-04-13 by boardor

A Brief History of AI Agent Development: From Philosophical Enlightenment to the Realization of AI Entities. Want to understand the development history of AI agents? This brief history of AI agents is a must-read! Finally, someone has clearly explained the history of AI agents; make sure to save it! A Brief History of AI Agents, … Read more

Detailed Explanation of AgentGPT Technology

2025-04-13 by boardor

Author: Yeyan, Master’s in Engineering, China University of Geosciences. 1. Background: With the development of ChatGPT, demand has emerged for using it to accomplish complex tasks, leading to many application frameworks for AI agents. The specific applications are shown in the figure below, including both open-source and commercial options. Image from … Read more