Introduction to ChatGLM-6B: The Open-Source Alternative to ChatGPT

Introduction to ChatGLM-6B

ChatGLM-6B is an open-source, bilingual (Chinese and English) dialogue language model based on the General Language Model (GLM) architecture, with 6.2 billion parameters.

https://huggingface.co/THUDM/chatglm-6b

ChatGLM-6B uses the same technology as ChatGLM and is optimized for Chinese Q&A and dialogue. Trained on approximately 1 trillion tokens of Chinese and English text, and further refined with supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback, the 6.2-billion-parameter ChatGLM-6B can generate responses that align closely with human preferences.

Quantization Level     | Minimum GPU Memory (Inference) | Minimum GPU Memory (Efficient Parameter Fine-Tuning)
FP16 (No Quantization) | 13 GB                          | 14 GB
INT8                   | 8 GB                           | 9 GB
INT4                   | 6 GB                           | 7 GB
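
To pick one of these quantization levels when loading the full-precision checkpoint, the repository's remote code exposes a quantize() helper. The sketch below assumes the current THUDM/chatglm-6b repository, where only 4-bit and 8-bit quantization are supported; the exact API may differ between versions:

from transformers import AutoModel

# FP16 (no quantization): about 13 GB of GPU memory for inference
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()

# INT8 quantized on the fly: about 8 GB of GPU memory
# model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).quantize(8).half().cuda()

# INT4 quantized on the fly: about 6 GB of GPU memory
# model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).quantize(4).half().cuda()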

ChatGLM-6B-INT8

https://huggingface.co/THUDM/chatglm-6b-int8

The 28 GLM Blocks in ChatGLM-6B are quantized to INT8, with no quantization applied to the Embedding or LM Head.

In theory, the quantized model needs only 8 GB of memory (GPU memory, or system RAM when running on CPU) for inference, making it possible to run on embedded devices such as a Raspberry Pi.

ChatGLM-6B-INT4

https://huggingface.co/THUDM/chatglm-6b-int4

The 28 GLM Blocks in ChatGLM-6B are quantized to INT4, with no quantization applied to the Embedding or LM Head.

In theory, the quantized model needs only 6 GB of memory (GPU memory, or system RAM when running on CPU) for inference, making it possible to run on embedded devices such as a Raspberry Pi.
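
Alternatively, the pre-quantized checkpoints above can be loaded directly, skipping the on-the-fly quantization step. A minimal sketch, assuming the THUDM/chatglm-6b-int4 and THUDM/chatglm-6b-int8 repositories linked above (CPU execution of the quantized kernels may additionally require a local C compiler):

from transformers import AutoTokenizer, AutoModel

# Swap in "THUDM/chatglm-6b-int8" for the INT8 variant
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)

# On a GPU with roughly 6 GB of memory
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).half().cuda()

# CPU-only inference (slower)
# model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).float()

model = model.eval()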

Basic Usage of ChatGLM-6B

ChatGLM-6B is hosted on Hugging Face and can be called directly through the Hugging Face Transformers library:

from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and model; a different version such as chatglm-6b-int8 or chatglm-6b-int4 can be specified
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# First conversation turn
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)

# Second conversation turn, reusing the accumulated history
response, history = model.chat(tokenizer, "What should I do if I can't sleep at night?", history=history)
print(response)

During usage, response is the model's reply for the current turn, and history is the accumulated conversation history, which is passed back in on the next call.
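
Building on this, a simple interactive loop only needs to keep passing history back into chat(). The sketch below assumes the tokenizer and model loaded above; the prompt text and exit command are arbitrary choices:

# Minimal REPL-style chat loop around model.chat
history = []
while True:
    query = input("User: ")
    if query.strip().lower() in {"exit", "quit"}:
        break
    response, history = model.chat(tokenizer, query, history=history)
    print("ChatGLM-6B:", response)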

Examples of ChatGLM-6B Usage

ChatGLM-6B handles a variety of everyday tasks, including the following (example conversation screenshots omitted):

  • Self-Awareness
  • Outline Writing
  • Copywriting
  • Email Writing Assistant
  • Information Extraction
  • Role Playing
  • Comment Comparison
  • Travel Guide
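
As a concrete illustration of the information-extraction case, a structured prompt can be passed through the same chat() interface used above; the prompt below is an illustrative example, not one taken from the original screenshots:

# Ask the model to pull structured fields out of free text
prompt = (
    "Extract the person's name, job title, and company from the following "
    "sentence and return them as JSON: "
    "'Zhang Wei joined ABC Technology in 2021 as a senior data engineer.'"
)
response, _ = model.chat(tokenizer, prompt, history=[])
print(response)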

Limitations of ChatGLM-6B

Due to the small scale of ChatGLM-6B, its capabilities still have many limitations. Here are some issues we have currently found:

  • Small model capacity: The 6B-parameter capacity limits the model's memory and language abilities. On tasks that require substantial factual knowledge, ChatGLM-6B may generate incorrect information, and it struggles with logic-heavy problems (such as mathematics and programming).
  • Generation of harmful or biased content: ChatGLM-6B is only a language model preliminarily aligned with human intent and may generate harmful or biased content. (Such content may be offensive and is not shown here.)
  • Insufficient English capability: The instructions and responses used during training were mostly in Chinese, with only a small portion in English. As a result, responses to English instructions are of far lower quality than those to Chinese instructions, may contradict the answers given under Chinese instructions, and may mix Chinese and English.
  • Prone to being misled, weaker dialogue capability: ChatGLM-6B's dialogue ability is still relatively weak, and it has "self-awareness" issues, so it is easily misled into generating incorrect statements. For example, the current version of the model exhibits self-awareness problems when misled.