Efficient ML Systems: TinyChat and Edge AI 2.0

Hi everyone, I am Lite. I recently shared Parts 1 through 19 of the Efficient Large Model Full-Stack Technology series, covering large model quantization and fine-tuning, efficient LLM inference, quantum computing, generative AI acceleration, and more. The content links are as follows: Efficient Large Model Full …

Efficient ML Systems: TinyChat Engine and On-Device LLM Inference

Hi everyone, I am Lite. I recently shared the first through nineteenth articles on efficient large model full-stack technology, including large model quantization and fine-tuning, efficient LLM inference, quantum computing, generative AI acceleration, and more. Here is the link: Efficient Large Model Full-Stack Technology (Nineteen): Efficient Training and …