Efficient LLM Fine-Tuning Using GaLore on Local GPU

Source: DeepHub IMBA. This article is about 2,000 words; estimated reading time 8 minutes. GaLore saves VRAM, making it possible to train a 7B model on a consumer-grade GPU, but it is slower, taking almost twice as long as full fine-tuning or LoRA. Training large language models (LLMs), even those with “only” 7 billion parameters, is a computationally … Read more
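For orientation, here is a minimal sketch of how GaLore is typically wired up with the open-source galore-torch optimizer; the toy model and the rank/scale/update_proj_gap values are illustrative placeholders, not the article's settings.

```python
import torch
import torch.nn as nn
from galore_torch import GaLoreAdamW  # pip install galore-torch

# Toy stand-in for the model; in practice this would be the 7B LLM.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

# GaLore projects gradients of 2-D weight matrices into a low-rank
# subspace, so only those matrices go into the GaLore parameter group.
galore_params = [p for p in model.parameters() if p.dim() == 2]
other_params = [p for p in model.parameters() if p.dim() != 2]

optimizer = GaLoreAdamW(
    [
        {"params": other_params},
        {"params": galore_params,
         "rank": 128,             # projection rank (illustrative)
         "update_proj_gap": 200,  # recompute the projector every N steps
         "scale": 0.25,           # scale applied to the projected update
         "proj_type": "std"},
    ],
    lr=1e-4,
)

x = torch.randn(8, 512)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

The VRAM saving comes from optimizer states living in the low-rank subspace rather than the full weight dimension; the periodic re-projection is also part of why each step costs more time than LoRA.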

ICML 2024: New Fourier Fine-Tuning Method Reduces Parameters

This article introduces a paper from The Hong Kong University of Science and Technology (Guangzhou) on parameter-efficient fine-tuning of large models (LLM PEFT), titled “Parameter-Efficient Fine-Tuning with Discrete Fourier Transform”, which has been accepted by ICML 2024; the code has been open-sourced. Paper link: https://arxiv.org/abs/2405.03003 Project link: https://github.com/Chaos96/fourierft Background: Large foundation models have achieved remarkable successes … Read more
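The core idea, per the paper's title and abstract, is to learn only a small set of spectral coefficients and recover the dense weight update with an inverse discrete Fourier transform. Below is an illustrative PyTorch sketch of that idea; the class name and hyperparameters (n_coeffs, scale) are mine, not the fourierft reference implementation.

```python
import torch
import torch.nn as nn

class FourierFTLinear(nn.Module):
    """Illustrative FourierFT-style adapter: learn a few spectral
    coefficients, recover the dense weight update via inverse 2-D FFT."""
    def __init__(self, base: nn.Linear, n_coeffs: int = 1000, scale: float = 150.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the pretrained weight stays frozen
        out_f, in_f = base.weight.shape
        # Fixed random frequency positions; only their coefficients train.
        idx = torch.randperm(out_f * in_f)[:n_coeffs]
        self.register_buffer("rows", idx // in_f)
        self.register_buffer("cols", idx % in_f)
        self.coeffs = nn.Parameter(torch.zeros(n_coeffs))
        self.scale = scale

    def delta_weight(self) -> torch.Tensor:
        out_f, in_f = self.base.weight.shape
        spectrum = torch.zeros(out_f, in_f, dtype=torch.cfloat,
                               device=self.coeffs.device)
        spectrum[self.rows, self.cols] = torch.complex(
            self.coeffs, torch.zeros_like(self.coeffs))
        # The inverse DFT turns the sparse spectrum into a dense update.
        return torch.fft.ifft2(spectrum).real * self.scale

    def forward(self, x):
        return self.base(x) + x @ self.delta_weight().t()

layer = FourierFTLinear(nn.Linear(768, 768))
y = layer(torch.randn(4, 768))
```

Because only the n_coeffs scalars are trainable, the adapter's parameter count is independent of the weight matrix's shape, which is where the parameter reduction over low-rank methods comes from.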

NVIDIA Introduces New SOTA Fine-Tuning Method for LLMs: DoRA

The mainstream method for fine-tuning LLMs, LoRA, has a new variant. NVIDIA, in partnership with the Hong Kong University of Science and Technology, recently announced an efficient fine-tuning technique called DoRA, which decomposes the pre-trained weight matrices into magnitude and direction components and applies low-rank updates to the direction, enabling more fine-grained model updates and significantly improving fine-tuning efficiency. In a series of downstream tasks, both training … Read more
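As a rough sketch of the decomposition DoRA describes (not NVIDIA's implementation; the class and initialization details are illustrative), the pretrained weight is split into a magnitude vector and a direction matrix, and the LoRA update is applied only to the direction:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    """Sketch of DoRA's idea: split the pretrained weight into a magnitude
    vector and a direction matrix, and apply LoRA only to the direction."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        out_f, in_f = base.weight.shape
        self.register_buffer("w0", base.weight.detach().clone())  # frozen W0
        self.bias = base.bias
        # Low-rank factors that update the direction of the weight.
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))
        self.scaling = alpha / rank
        # Trainable magnitude, initialized to the column norms of W0.
        self.m = nn.Parameter(self.w0.norm(dim=0, keepdim=True))

    def forward(self, x):
        w = self.w0 + self.scaling * (self.B @ self.A)  # updated direction
        w = self.m * (w / w.norm(dim=0, keepdim=True))  # reattach magnitude
        return F.linear(x, w, self.bias)

layer = DoRALinear(nn.Linear(768, 768))
y = layer(torch.randn(4, 768))
```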

How to Code LoRA From Scratch: A Tutorial

The author states that among the many effective LLM fine-tuning methods, LoRA remains his preferred choice. LoRA (Low-Rank Adaptation) is a popular technique for fine-tuning LLMs (Large Language Models), initially proposed by researchers at Microsoft in the paper “LoRA: Low-Rank Adaptation of Large Language Models”. Unlike other techniques, LoRA does not adjust all parameters of the neural … Read more
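The tutorial builds LoRA up step by step; the gist fits in a few lines. Here is a minimal from-scratch sketch (the hyperparameters r and alpha are illustrative):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal from-scratch LoRA layer: a frozen pretrained weight plus a
    trainable low-rank update (alpha / r) * B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weights
        # A starts small and random, B starts at zero, so the initial
        # low-rank update is exactly zero.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(2, 768))
```

Because B is initialized to zero, the adapted layer initially behaves exactly like the frozen base layer; only the small A and B matrices receive gradients during fine-tuning.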

Exploring Intelligent Agents with Professor Andrew Ng

Andrew Ng, Stanford University professor and a leading figure in the field of artificial intelligence, recently released an open-source machine-translation agent project, translation-agent. The project implements a large-model translation application based on a reflective workflow. Currently, this project has already … Read more
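The reflective workflow the project describes boils down to three LLM calls: translate, critique, improve. Below is a hedged sketch of that loop; the `chat` helper is a hypothetical placeholder for whatever LLM client you actually use, and the prompts are illustrative rather than the project's own.

```python
# Hedged sketch of a translate -> reflect -> improve workflow.
def chat(prompt: str) -> str:
    # Hypothetical helper: send one prompt to an LLM, return its reply.
    raise NotImplementedError("wire this to your LLM client")

def translate_with_reflection(text: str, src: str, tgt: str) -> str:
    draft = chat(f"Translate this {src} text to {tgt}:\n{text}")
    critique = chat(
        f"Here is a {src} source text and a {tgt} translation.\n"
        f"Source:\n{text}\nTranslation:\n{draft}\n"
        "List concrete suggestions to improve accuracy, fluency, and style."
    )
    improved = chat(
        f"Rewrite the {tgt} translation of this {src} text, applying the "
        f"suggestions.\nSource:\n{text}\nDraft:\n{draft}\n"
        f"Suggestions:\n{critique}\nOutput only the improved translation."
    )
    return improved
```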

Fudan NLP Team Releases 86-Page Overview of LLM-based Agents

Will agents become the key to unlocking AGI? Recently, the Fudan University Natural Language Processing team (FudanNLP) released a survey paper on LLM-based agents, spanning 86 pages and citing over 600 references. The authors start from the history of AI agents and provide a comprehensive overview of … Read more

Understanding Agents Based on Large Models

Source: Datawhale. Author: Chen Andong, Datawhale member. In today's information age, the development speed and influence of large language models (LLMs) are increasingly significant. The powerful reasoning and generation capabilities of large models make them ideal components for building intelligent agents. This content is drawn from Datawhale's open-source course “Fundamentals … Read more
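To make "LLM as the agent's reasoning core" concrete, here is a minimal observe-think-act loop; `call_llm`, the prompt protocol, and the toy calculator tool are all hypothetical illustrations, not the course's code.

```python
# Minimal agent loop: the LLM decides whether to call a tool or answer.
def call_llm(prompt: str) -> str:
    # Hypothetical placeholder for any chat-completion client.
    raise NotImplementedError("plug in a real LLM client here")

# Toy tool; eval is for illustration only, never for untrusted input.
TOOLS = {"calculator": lambda expr: str(eval(expr))}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = f"Task: {task}"
    for _ in range(max_steps):
        reply = call_llm(
            history + "\nRespond either with 'TOOL calculator: <expr>' "
            "to use a tool, or 'FINAL: <answer>' when done."
        )
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
        if reply.startswith("TOOL calculator:"):
            result = TOOLS["calculator"](reply.split(":", 1)[1].strip())
            history += f"\n{reply}\nObservation: {result}"
    return "No answer within the step budget."
```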

An In-Depth Look at AI Agents

In the past year, the rapid development of general large language models (LLMs) has attracted global attention. Tech giants like Baidu have launched their own large models, continuously pushing the performance limits of language models. However, the industry is no longer content with basic Q&A functions and is seeking to utilize large … Read more

How to Write LoRA Code From Scratch: A Tutorial

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, reaching audiences that include NLP graduate students, university professors, and industry researchers. The community's vision is to promote communication and progress between academia and industry in natural language processing and machine learning, especially for beginners. Reprinted from … Read more

Understanding LLM, SLM, and TinyML in AI

LLM (Large Language Model). Definition: Large Language Models are AI models designed to understand and generate natural-language text; they are typically based on deep learning techniques and trained on vast amounts of text data. Examples: GPT-3 and GPT-4 (OpenAI), BERT (Google), T5 (Google). Application scenarios: text generation, translation, sentiment … Read more