Deployment of vLLM Enterprise Large Model Inference Framework (Linux)

Introduction: Compared to traditional LLM inference frameworks (such as HuggingFace Transformers and TensorRT-LLM), vLLM demonstrates significant advantages in performance, memory management, and concurrency, reflected in the following five core dimensions: 1. Revolutionary Improvement in Memory Utilization: By utilizing PagedAttention technology (inspired by the memory paging mechanism of operating systems), the KV Cache (Key-Value …

Quick Deployment of Single Node Single Disk Architecture MinIO on Linux

Download MinIO Server Files: This step describes how to deploy MinIO in a single node single disk (SNSD) configuration for early development and evaluation. The SNSD deployment does not provide any additional reliability or availability beyond what is offered by the underlying storage volume implementation (RAID, LVM, ZFS, etc.). The SNSD deployment uses a zero-parity …

Detailed Explanation of Pod Controller – Canary Release

“Learning k8s from Scratch” Canary Release: The Deployment controller lets you control the update process, for example by pausing or resuming a rollout. For instance, the update can be paused immediately after the first batch of new Pods is created. At this point, only a portion of the application runs the new version, while the majority …

Essential Operations: Automated Docker Deployment with Ansible, Understandable for Beginners

Source: https://cloud.tencent.com/developer/article/2123531 Ansible is an automation tool written in Python that can manage clusters automatically and perform common operational tasks. Many companies today deploy services on clusters, ranging from a few virtual machines to hundreds or thousands. Sometimes, it is necessary to perform operational tasks on a single cluster or multiple clusters, and this is …

Detailed Explanation of Pod Controllers – Introduction to Pods

“Learning Kubernetes from Scratch” 01: Introduction to Pod Controllers. A Pod is the smallest management unit in Kubernetes. Pods can be classified into two categories based on how they are created: 1. Standalone Pods: Pods that are created directly in Kubernetes rather than through a controller. Once deleted, these Pods are gone and will not …

DevOps: Deploying Java on Linux

Environment Requirements: Four Ways to Deploy Java Services on Linux (Including Environment Dependencies and Start/Stop Instructions). This article provides four methods for deploying Java services, compatible with JDK 21, detailing the environment dependencies and the start, stop, and restart operations for each method. The provided JAR file is automatically placed in the /root/resume/ directory, with the filename: …

Detailed Explanation of Pod Controller – Introduction to Deployment

“Learning Kubernetes from Scratch” Deployment (Deploy): To better address the issue of service orchestration, Kubernetes introduced the Deployment controller starting from version 1.2. Notably, this controller does not manage Pods directly; it manages ReplicaSets, which in turn manage Pods. In other words, the Deployment manages ReplicaSets, and ReplicaSets manage Pods, making …

DevOps: Deploying Java on Linux

Environment Requirements: Four Ways to Deploy Java Services on Linux (Including Environment Dependencies and Start/Stop Instructions). This article provides four methods for deploying Java services, compatible with JDK 21, detailing the environment dependencies and the start, stop, and restart operations for each method. The JAR file provided in this example has been automatically placed in the /root/resume/ …

Gunicorn: A Practical Python Library for WSGI HTTP Servers!

Gunicorn: The Tool That Launches Your Python Web Applications! When writing web applications, we often encounter a frustrating problem: how to efficiently and stably deploy our Python applications? Today, I want to introduce you to a super powerful …

Unveiling: Practical Implementation Process of Python Microservices Architecture from Requirements to Deployment

At three in the morning that day, the online service suddenly crashed. After checking the logs, we discovered that our seemingly rock-solid monolithic application could not withstand high concurrency. The entire team stayed up all night, and we barely managed to get through the crisis by temporarily scaling up the servers. This incident completely solidified …