RKLLama: LLM Server and Client for Rockchip 3588/3576 Chips

Repository: https://github.com/NotPunchnox/rkllama

RKLLama is an open-source server and client for running large language models (LLMs) optimized for the Rockchip RK3588(S) and RK3576 platforms, and for interacting with them. Unlike solutions such as Ollama or Llama.cpp, RKLLama fully utilizes the Neural Processing Unit (NPU) on these devices, providing an efficient, high-performance way to deploy artificial intelligence and deep learning models on Rockchip hardware.

RKLLama comes equipped with a REST API, allowing you to easily build custom clients tailored to specific needs. It also provides an integrated command-line interface (CLI) client, simplifying the process of testing and interacting with the API.
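
For example, once the server is running (see below), you can exercise the API with curl. This is a minimal sketch assuming the default port (8080) and the /models and /generate endpoints as documented in the project README; consult the README for the current API:

# List the models available on the server
curl http://localhost:8080/models

# Ask the loaded model a question (requires a model to be loaded first;
# payload shape assumed from the README)
curl http://localhost:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}], "stream": false}'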

Based on this video:

https://www.youtube.com/watch?v=Kj8U1OGqGPc

Download

git clone https://github.com/notpunchnox/rkllama

To install, run setup.sh from the cloned directory:

bash setup.sh

The script prints a confirmation once installation succeeds.

Run rkllama to see available commands

Available commands (an example session follows the list):

  • help: Displays this help menu.
  • update: Checks for available updates and upgrades.
  • serve: Starts the server.
  • list: Lists all available models on the server.
  • pull hf/model/file.rkllm: Downloads a model file from Hugging Face.
  • rm model.rkllm: Deletes a model.
  • load model.rkllm: Loads a specific model.
  • unload: Unloads the currently loaded model.
  • run: Enters a dialogue mode with the model.
  • exit: Exits the program.
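
A sketch of a typical session, based on the commands above (the model filename is illustrative):

rkllama serve                # start the server; keep this terminal open
rkllama list                 # show the models available on the server
rkllama load model.rkllm     # load a specific model
rkllama run                  # enter dialogue mode with the loaded model
rkllama unload               # unload the model when finished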

Start the server:

rkllama serve

You should see output confirming the server has started.

In another terminal, download a model:

rkllama pull
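
Per the command list above, pull takes a Hugging Face path of the form hf/model/file.rkllm. For example (the .rkllm filename here is a placeholder; substitute a real one from the repository's file list):

rkllama pull c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4/model-file.rkllm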

Alternatively, download the model manually and place it in the RKLLAMA/models directory. For example, the Qwen2.5 model converted for RK3588 is available at https://huggingface.co/c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4/tree/main
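
A sketch of the manual route with wget (the .rkllm filename is a placeholder: substitute one from the repository's file list, and adjust the models path if setup.sh installed RKLLAMA elsewhere):

mkdir -p ~/RKLLAMA/models
wget -P ~/RKLLAMA/models https://huggingface.co/c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4/resolve/main/model-file.rkllm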

At this point, run

rkllama list

to see the available models

Run this model:
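
Per the command list above, load the model first and then enter dialogue mode (the filename is illustrative; use the name reported by rkllama list):

rkllama load Qwen2.5-3B-Instruct.rkllm
rkllama run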

After the model starts, the interactive chat prompt appears.

At the 'You:' prompt, you can enter commands or prompts.

You can also set system prompts to steer the model's behavior.

For more detailed instructions, refer to the GitHub README.
