With the rapid development of large models, the field has gone through major technological iteration in just a year: LoRA, QLoRA, AdaLoRA, ZeroQuant, Flash Attention, KTO, distillation techniques, model incremental learning, data processing, and a steady stream of new open-source models. Almost every day brings new developments.
As algorithm engineers, do you feel your learning pace lagging behind this rapid progress? Does your understanding of these emerging techniques stay at the application level, without a concrete grasp of the underlying principles? If you want to stay competitive in the large model race, a deeper understanding of the technology itself may be essential.
In response to these pain points and to the pace of technological development, Greedy Technology has once again launched the "Large Model Fine-Tuning Algorithm Practical Camp", enabling participants to fully master the mainstream technologies in the large model field, and their essence, over a three-month period, significantly reducing the cost of learning.

Detailed Outline
- Introduce course objectives, arrangements, and expected outcomes
- Clarify requirements and expectations for students
- Overview of the projects and technologies explored in the course
- Discuss the current industry status of large model technologies
- Recommend tools and open-source projects worth following
- Definition and importance of large models
- Development history and key milestones of large models
- Basic concepts of pre-training and fine-tuning
- Pre-training, data processing, fine-tuning, and alignment of large models
- Infrastructure and resource requirements for training large models
- Current challenges and future development directions
- Basic architecture of the Transformer model
- Principles and computational process of the Self-Attention mechanism (see the sketch after this topic block)
- Design and role of Multi-Head Attention
- Calculation and visualization of attention weights
- Role and advantages of Self-Attention in the model
- Concept and implementation methods of Positional Encoding
- Rotary Positional Embedding (RoPE)
- BPE tokenizer, SentencePiece encoding
- Feed-Forward Networks in the Transformer
- Principles and importance of Layer Normalization
- Residual connections in the Transformer model
- Structural differences between encoder and decoder
- Training strategies and optimization methods for Transformers
- Parameter initialization and learning rate scheduling
- Regularization techniques for Transformer models
- Variants and improvements of the Attention mechanism
- Greedy Decoding, Beam Search
- Top-K Sampling, Top-p Sampling
- Source code interpretation of the Transformer
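To make the Self-Attention items above concrete, here is a minimal single-head scaled dot-product attention sketch in PyTorch; tensor names and shapes are illustrative, not taken from the course code.

```python
import math
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v, causal=True):
    """Single-head scaled dot-product self-attention.

    x: (batch, seq_len, d_model); w_q / w_k / w_v: (d_model, d_head).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v                         # project into query/key/value spaces
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))    # (batch, seq, seq)
    if causal:                                                   # decoder-style mask: no attending to future tokens
        mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)                          # attention weights, each row sums to 1
    return weights @ v, weights                                  # weighted sum of values + weights for visualization

# toy usage
x = torch.randn(2, 5, 16)
w = [torch.randn(16, 8) for _ in range(3)]
out, attn = self_attention(x, *w)
print(out.shape, attn.shape)   # torch.Size([2, 5, 8]) torch.Size([2, 5, 5])
```

Multi-Head Attention repeats this computation in parallel over several smaller heads and concatenates the results.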
- Differences between full fine-tuning and efficient fine-tuning
- Common strategies for fine-tuning Transformer models
- Selecting appropriate fine-tuning tasks and datasets
- Challenges and best practices in fine-tuning
- Standards and tools for evaluating fine-tuning effectiveness
- Installation of PEFT
- Usage instructions for PEFT and explanation of its core modules (see the usage sketch after this topic block)
- Techniques for preparing and preprocessing instruction data
- Detailed steps for implementing fine-tuning
- Performance evaluation and analysis of the fine-tuning project
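As a companion to the PEFT items above, the sketch below shows one common way to wrap a causal LM with a LoRA adapter using the Hugging Face PEFT library; the model name and target module names are placeholders that depend on the architecture you actually fine-tune.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base_id = "your-org/your-base-model"          # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor (alpha / r)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; names vary by model
)

model = get_peft_model(model, lora_cfg)   # freezes the base weights, injects trainable A/B matrices
model.print_trainable_parameters()        # typically well under 1% of total parameters
```

The wrapped model can then be trained with an ordinary training loop or the Transformers `Trainer`, since only the adapter parameters receive gradients.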
- Development history of the GPT series models
- Analysis of the models from GPT-1 through GPT-4, including GPT-3
- Interpretation of GPT code
- Analysis of the InstructGPT model
- Zero-shot Prompting
- Few-shot Prompting
- Limitations and challenges of GPT models
- Features and technological innovations of the LLaMA model
- Principles of the LLaMA model
- Source code interpretation of LLaMA
- Comparison of LLaMA with other large models
- Training and fine-tuning strategies for the LLaMA model
- Future development directions for the LLaMA model
- Architecture and design philosophy of ChatGLM
- Interpretation of the ChatGLM model
- Technical iterations from ChatGLM1 to ChatGLM3
- Advantages and application areas of the ChatGLM model
- Practical guide for fine-tuning and deploying the ChatGLM model
- Evaluation and performance optimization of the ChatGLM model
- Overview and core technologies of the Baichuan model
- Principles and source code interpretation of Baichuan
- Comparison of the Baichuan model with other models
- Application of the Baichuan model in specific tasks
- Strategies and techniques for fine-tuning the Baichuan model
- Limitations of the Baichuan model
- Definition and application background of instruction fine-tuning
- Comparison of instruction fine-tuning with traditional fine-tuning
- Importance of instruction fine-tuning for large models
- Overview of the instruction fine-tuning process
- Challenges and strategies in instruction fine-tuning
- Basic concepts of matrices and vectors
- Matrix operations and properties
- Eigenvalues and eigenvectors
- Introduction to matrix decomposition (SVD) techniques
- Application of matrices in the LoRA algorithm
- Principles and motivations of the LoRA algorithm
- Low-rank assumptions in LoRA
- Key technical components of LoRA
- Implementation steps of the LoRA algorithm (see the from-scratch sketch after this topic block)
- Optimization and debugging of the LoRA algorithm
- Source code interpretation of the LoRA algorithm
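The LoRA items above come down to learning a low-rank update B·A on top of a frozen weight matrix, scaled by alpha/r. Here is a minimal from-scratch sketch of a LoRA-wrapped linear layer; the layer sizes and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)        # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # up-projection, zero-init
        self.scaling = alpha / r

    def forward(self, x):
        # y = x W^T + scaling * x A^T B^T  -- only A and B receive gradients
        return self.base(x) + self.scaling * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(nn.Linear(768, 768))
y = layer(torch.randn(4, 768))
print(y.shape)   # torch.Size([4, 768])
```

Because B starts at zero, the model's behavior is unchanged at the beginning of fine-tuning, and at inference time B·A can be merged back into the base weight.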
- Importance and sources of instruction data
- Methods for automated and manual collection of instruction data
- Preprocessing and standardization of instruction data
- Techniques for generating high-quality instruction data
- Maintenance and updating of instruction datasets
- Manual and automated quality assessment of instruction data
- Design and objectives of the Alpaca fine-tuning project
- Preparing the instruction data needed for Alpaca fine-tuning (see the example record format after this topic block)
- Detailed steps for implementing Alpaca fine-tuning
- Methods for evaluating the effectiveness of Alpaca fine-tuning
- Analyzing and solving problems encountered in Alpaca fine-tuning
- Interpreting the source code of the Alpaca project
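Alpaca-style instruction data is typically stored as instruction/input/output triples and rendered into a prompt template before tokenization. The sketch below shows one common template variant, not necessarily the exact wording used in the course materials.

```python
record = {
    "instruction": "Summarize the following paragraph in one sentence.",
    "input": "Large language models are pre-trained on web-scale corpora ...",
    "output": "LLMs learn general language ability from massive pre-training data.",
}

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that provides "
    "further context. Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)

def build_example(rec):
    """Render one record into a (prompt, target) pair for supervised fine-tuning."""
    prompt = PROMPT_WITH_INPUT.format(**rec)
    return prompt, rec["output"]

prompt, target = build_example(record)
print(prompt + target)
```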
- Comparison of AdaLoRA and LoRA
- Significance of dynamically changing matrix weights
- SVD and AdaLoRA
- Training AdaLoRA
- Source code interpretation of AdaLoRA
- Explanation of an AdaLoRA case study
- Background and application scenarios of the Vicuna fine-tuning project
- Data collection from ShareGPT
- Implementation process and technical details of Vicuna fine-tuning
- Evaluation and analysis of the effects of Vicuna fine-tuning
- Experience summary and outlook based on the Vicuna fine-tuning project
Stage 3: Instruction Fine-tuning of Large Models – Quantization
- Role and principles of quantization in deep learning
- Common quantization techniques and their classification
- Impact of model quantization on performance and accuracy
- Practical steps and tools for quantization (see the minimal int8 sketch after this topic block)
- Challenges and solutions for model quantization
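To ground the quantization items above, here is a toy sketch of symmetric per-tensor int8 quantization and dequantization; it illustrates the idea rather than a production quantizer.

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor quantization: map float weights to int8 with one scale."""
    scale = w.abs().max() / 127.0                        # largest magnitude maps to +/-127
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize_int8(q: torch.Tensor, scale: torch.Tensor):
    return q.to(torch.float32) * scale                   # approximate reconstruction

w = torch.randn(4, 4)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
print((w - w_hat).abs().max())    # quantization error, roughly bounded by scale / 2
```

Real quantization schemes refine this with per-channel or per-group scales, calibration data, and careful handling of outlier values.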
- Definition and background of the QLoRA algorithm
- Key differences and improvements of QLoRA compared to LoRA
- Detailed implementation process of the QLoRA algorithm
- 4-bit NormalFloat and double quantization (see the loading sketch after this topic block)
- Optimization and debugging techniques for the QLoRA algorithm
- Source code interpretation of QLoRA
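In practice, the 4-bit NormalFloat and double-quantization pieces above are exposed through bitsandbytes. A minimal loading sketch with Hugging Face Transformers is shown below; the checkpoint name is a placeholder, and exact argument names may shift across library versions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,                      # store base weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type from the QLoRA paper
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants as well
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-base-model",             # placeholder checkpoint
    quantization_config=bnb_cfg,
    device_map="auto",
)
# LoRA adapters (via PEFT) are then attached on top of the frozen 4-bit base model.
```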
- Design of the technical solution
- Collection and preprocessing of instruction data
- Fine-tuning the QLoRA large model based on PEFT
- Evaluating the effects after QLoRA fine-tuning
- Analyzing problems encountered during QLoRA fine-tuning and their solutions
- Necessity and technical background of model compression
- Overview of common model compression methods
- Relationship between model compression and quantization
- Steps and precautions for implementing model compression
- Latest research progress in model compression techniques
- Basic concepts and working principles of model distillation
- Application of model distillation in model optimization
- Comparison and selection of different distillation techniques
- Specific methods for implementing model distillation (see the loss-function sketch after this topic block)
- Challenges and solutions faced by model distillation techniques
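For the distillation items above, the classic recipe combines a cross-entropy loss on the hard labels with a KL term between temperature-softened teacher and student logits. A minimal sketch, with illustrative hyperparameters:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with soft-label KL distillation."""
    ce = F.cross_entropy(student_logits, labels)                      # hard targets
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                                                       # rescale gradient magnitude
    return alpha * ce + (1 - alpha) * kl

student_logits = torch.randn(8, 100)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
print(distillation_loss(student_logits, teacher_logits, labels))
```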
- Basic principles and application background of the ZeroQuant algorithm
- Innovations of ZeroQuant in model quantization
- Key steps and technical requirements for implementing ZeroQuant
- Source code interpretation of ZeroQuant
- Limitations and future directions of ZeroQuant technology
- Design philosophy and core technologies of the SmoothQuant algorithm
- Differences between SmoothQuant and traditional quantization methods
- Specific process for implementing the SmoothQuant algorithm (see the smoothing-scale sketch after this topic block)
- Source code interpretation of SmoothQuant
- Technical challenges and improvement paths for SmoothQuant
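The core of SmoothQuant is migrating quantization difficulty from activations to weights using a per-channel smoothing scale s_j = max|X_j|^alpha / max|W_j|^(1-alpha). A toy sketch of computing and applying it, with illustrative alpha and shapes:

```python
import torch

def smooth_scales(act_absmax: torch.Tensor, weight: torch.Tensor, alpha: float = 0.5):
    """Per-input-channel smoothing scales balancing activation and weight ranges."""
    w_absmax = weight.abs().amax(dim=0)                  # per-input-channel max over output dim
    s = act_absmax.pow(alpha) / w_absmax.pow(1 - alpha)
    return s.clamp(min=1e-5)

# Identity: X @ W^T == (X / s) @ (W * s)^T, so the rescaling is mathematically lossless.
act_absmax = torch.rand(768) * 20          # calibration statistic: per-channel activation max
weight = torch.randn(3072, 768)            # linear layer weight (out_features, in_features)
s = smooth_scales(act_absmax, weight)
weight_smoothed = weight * s               # fold s into the weight before quantizing it
# at runtime the activation is divided by s (often fused into the preceding LayerNorm)
```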
- Origins and background of RLHF
- Role and importance of RLHF in artificial intelligence
- Advantages of combining reinforcement learning with human feedback
- Main application areas and case studies of RLHF
- From InstructGPT to GPT-4
- Role of human feedback in reinforcement learning
- Different forms of human feedback: annotations, preferences, guidance
- Learning from human feedback: methods and strategies
- Collection and processing of human feedback data
- Challenges and solutions for reinforcement learning from human feedback
- Origins and motivations of PPO
- Comparison of PPO with other policy gradient methods
- Core concepts and principles of the algorithm
- Advantages and limitations of PPO
- Application areas and cases of PPO
- Introduction to basic concepts of reinforcement learning
- Role and importance of data in reinforcement learning
- Data structures of states, actions, and rewards
- Methods for data collection, processing, and utilization
- Generating and testing data using simulation environments
- Introduction to policy gradient methods
- Advantage functions and returns
- Concept and role of baselines
- Cumulative returns and discounted returns
- Trade-offs between exploration and exploitation
- Objective functions and KL divergence
- Principles of clipping the objective function (see the clipped-loss sketch after this topic block)
- Multiple iterations of optimization strategies
- Generalized Advantage Estimation (GAE)
- Importance sampling and policy updates
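To make the clipped objective above concrete, here is a minimal sketch of the PPO clipped surrogate loss over per-token log-probabilities and advantages; tensor names and the epsilon value are illustrative.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective: -E[min(r * A, clip(r, 1-eps, 1+eps) * A)]."""
    ratio = torch.exp(logp_new - logp_old)                   # importance-sampling ratio
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()             # negate: we minimize the loss

logp_old = torch.randn(32)
logp_new = logp_old + 0.1 * torch.randn(32)
advantages = torch.randn(32)
print(ppo_clip_loss(logp_new, logp_old, advantages))
```

The clipping removes the incentive to push the new policy far from the old one within a single update; RLHF pipelines typically add a KL penalty against the reference model on top of this.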
- Building a neural network model
- Implementing the optimization loop of PPO
- Adaptive learning rate adjustment
- Debugging and performance analysis techniques
- Evaluating the aligned large model
- Variants and improvement strategies of PPO
- Handling high-dimensional inputs and model generalization
- PPO applications in multi-agent environments
- Transfer learning and multi-task learning in reinforcement learning
- Safety and interpretability in reinforcement learning
- Project requirement analysis and technical solution design
- Environment setup and task definition
- Collection and preprocessing of alignment data
- Implementing the PPO training process
- Result analysis and performance optimization
- Introduction to DPO (Direct Preference Optimization)
- Comparison with the PPO algorithm
- Application scenarios and importance of DPO
- Basic principles and working mechanisms
- Advantages and challenges of the DPO algorithm
- Role of preferences and ranking problems in AI
- Data representation: pairwise comparisons and preference matrices
- Challenges in preference learning
- Evaluation metrics for ranking and preference prediction
- Overview of classic preference learning algorithms
- Mathematical framework for preference modeling
- Comparison of direct and indirect preference optimization
- Key algorithm components in DPO
- Methods for processing pairwise comparison data
- Loss functions and optimization strategies in DPO (see the loss sketch after this topic block)
- Data organization and preprocessing
- Steps for building a preference learning model
- Using Python to implement a basic DPO model
- Testing DPO performance on benchmarks
- Advantages and disadvantages of DPO
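The DPO loss referenced above reduces to a single binary-classification-style objective over sequence log-probabilities from the policy and a frozen reference model. A minimal sketch, with illustrative beta and tensor names:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """-log sigmoid(beta * [(logpi_w - logref_w) - (logpi_l - logref_l)])."""
    chosen_reward = policy_chosen_logp - ref_chosen_logp        # implicit reward of preferred answer
    rejected_reward = policy_rejected_logp - ref_rejected_logp  # implicit reward of rejected answer
    logits = beta * (chosen_reward - rejected_reward)
    return -F.logsigmoid(logits).mean()

# toy batch of summed per-sequence log-probabilities
pc, pr = torch.randn(16), torch.randn(16)
rc, rr = torch.randn(16), torch.randn(16)
print(dpo_loss(pc, pr, rc, rr))
```

Unlike PPO-based RLHF, no explicit reward model or sampling loop is needed; the preference pairs themselves drive the update.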
- Preference learning in recommendation systems
- Designing DPO-driven recommendation algorithms
- Handling real-time user feedback
- Implementing DPO for fine-tuning recommendation models
- Evaluating the performance of recommendation systems
- Combining multi-task learning with DPO
- Application of DPO in unsupervised learning
- Deep learning methods and DPO
- Interactive preference learning
- Variants of DPO technology
- Basic principles of Prefix Tuning
- Key steps for implementing Prefix Tuning
- Source code interpretation of Prefix Tuning
- Comparison of Prefix Tuning with other fine-tuning methods
- Case studies of applying Prefix Tuning in NLP tasks
- Limitations and challenges of Prefix Tuning
- Basic principles of Adapter Tuning
- How to insert Adapter layers in large models (see the minimal module sketch after this topic block)
- Advantages and application scenarios of Adapter Tuning
- Source code interpretation of Adapter Tuning
- Practical case: application of Adapter Tuning in classification tasks
- Efficiency and scalability issues of Adapter Tuning
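A minimal sketch of the bottleneck adapter referenced above: a small down-project / nonlinearity / up-project block added with a residual connection while the surrounding pretrained weights stay frozen. Dimensions are illustrative.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: d_model -> r -> d_model, added back via a residual."""
    def __init__(self, d_model: int = 768, r: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, r)     # down-projection
        self.act = nn.GELU()
        self.up = nn.Linear(r, d_model)       # up-projection
        nn.init.zeros_(self.up.weight)        # start as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden):
        return hidden + self.up(self.act(self.down(hidden)))

# Adapters are typically inserted after the attention and/or feed-forward sub-layers
# of each Transformer block; only adapter parameters are trained.
h = torch.randn(2, 10, 768)
print(Adapter()(h).shape)    # torch.Size([2, 10, 768])
```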
- Design philosophy and algorithm principles of Flash Attention
- Optimizing the attention mechanism in Transformer models
- Role of Flash Attention in improving processing speed and efficiency (see the usage sketch after this topic block)
- Case analysis of improving large models with Flash Attention
- Challenges and solutions for implementing Flash Attention
- Introduction to the differences between Flash Attention 2 and previous versions
- In-depth exploration of technical improvements in Flash Attention 2
- Application examples of Flash Attention 2 in complex task processing
- Evaluating the performance and applicability of Flash Attention 2
- Implementation details and tuning suggestions for Flash Attention 2
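As a practical entry point to the Flash Attention items above, recent PyTorch versions can dispatch scaled dot-product attention to a FlashAttention-style fused kernel. The sketch below uses that built-in path rather than the standalone flash-attn package; shapes are illustrative, and whether the fused kernel is actually used depends on your GPU, dtype, and PyTorch build.

```python
import torch
import torch.nn.functional as F

# (batch, num_heads, seq_len, head_dim); fp16/bf16 on GPU is required for the fused kernels
q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)

# Computes softmax(QK^T / sqrt(d)) V without materializing the full attention matrix
# when a fused (FlashAttention-style) backend is available.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)   # torch.Size([1, 8, 1024, 64])
```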
- Background and theoretical foundation of the KTO algorithm
- Application of Kahneman-Tversky Optimization in fine-tuning
- Key technical steps for implementing KTO
- Role of KTO in improving decision quality
- Application cases and performance analysis of KTO
- Fine-tuning strategy combining QLoRA and Flash Attention
- Task selection and data preparation
- Detailed fine-tuning process: from preprocessing to model evaluation
- Analyzing performance improvements of the model after fine-tuning
- Challenges faced and solutions shared
- Importance of incremental learning (continual learning)
- Comparison with traditional training from scratch
- Application scenarios of incremental learning
- Task selection and data preparation
- Detailed fine-tuning process: from preprocessing to model evaluation
- What is catastrophic forgetting
- Ideas for mitigating catastrophic forgetting
- Regularization, dynamic network architectures, meta-learning (see the EWC-style sketch after this topic block)
- Mixed training of general data and vertical-domain data
- Information analysis in data
- Adjusting learning rates
- Application of incremental learning on large-scale datasets
- Multimodal and cross-domain incremental learning
- Adaptive learning and online learning techniques
- Combination of reinforcement learning and incremental learning
- Future directions for incremental learning
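For the regularization item above, Elastic Weight Consolidation (EWC) is one representative approach to mitigating catastrophic forgetting: it penalizes movement of parameters that were important for the previous task. A minimal sketch; the Fisher estimate and the lambda value are illustrative.

```python
import torch

def ewc_penalty(model, old_params, fisher, lam=1000.0):
    """Quadratic penalty keeping important parameters close to their old-task values."""
    loss = 0.0
    for name, p in model.named_parameters():
        if p.requires_grad and name in old_params:
            loss = loss + (fisher[name] * (p - old_params[name]).pow(2)).sum()
    return lam / 2.0 * loss

# usage inside the new-task training loop:
#   total_loss = task_loss + ewc_penalty(model, old_params, fisher)
# where old_params is a snapshot of the parameters after the previous task and fisher
# holds per-parameter importance (e.g., squared gradients accumulated on old-task data).
```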
Main Instructor

- Postdoctoral researcher in Computer Science and Artificial Intelligence at Tsinghua University
- Long-term engagement in the development and commercialization of dialogue systems and pre-trained language models at large companies
- Mainly engaged in pioneering research and commercialization in the natural language processing and dialogue fields
- Published over ten high-level papers at international conferences and in journals such as AAAI, NeurIPS, ACM, and EMNLP

- Technical strategy advisor for several listed companies
- Former chief scientist at a fintech unicorn company
- Former chief scientist at a quantitative investment startup
- Former recommendation system engineer at Amazon (USA)
- Deeply engaged in artificial intelligence for over ten years; has trained tens of thousands of AI students
Registration Consultation
