Understanding Linux Process Descriptors

Processes are the entities that the operating system schedules. The data structure that describes a process and the resources it owns is called the Process Control Block (PCB).

Describing Processes with task_struct

In the kernel, a process is described by the task_struct structure, known as the process descriptor, which stores all the information the kernel needs to manage a process. task_struct is large, so only a few of its members are listed here. Interested readers can find the full definition in the include/linux/sched.h header file.

struct task_struct {

#ifdef CONFIG_THREAD_INFO_IN_TASK
  /*
   * For reasons of header soup (see current_thread_info()), this
   * must be the first element of task_struct.
   */
  struct thread_info        thread_info;
#endif
  volatile long state;
  void *stack;
  ......
  struct mm_struct *mm;
  ......
  pid_t pid;
  ......
  struct task_struct *parent;
  ......
  char comm[TASK_COMM_LEN];
  ......
  struct files_struct *files;
  ......
  struct signal_struct *signal;
};

The main information categories in task_struct are:

1. Identifier: The unique identifier pid that describes this process, used to distinguish it from other processes.

2. State: Task state, exit code, exit signal, etc.

3. Priority: The priority relative to other processes.

4. Program Counter: The address of the next instruction to be executed in the program.

5. Memory Pointer: Includes pointers to program code and process-related data, as well as pointers to shared memory blocks with other processes.

6. Context Data: Data in the processor’s registers during process execution.

7. I/O State Information: Includes outstanding I/O requests, the I/O devices allocated to the process, and the list of files used by the process.

8. Accounting Information: May include total processor time, total clock time used, time limits, account numbers, etc.

The members shown in the excerpt above are:

  • struct thread_info thread_info: Low-level information used when scheduling and running the process, such as its flags and preemption count.
  • volatile long state: The task state: -1 means unrunnable, 0 means runnable, and >0 means stopped.
    [Figure: important process states and the transitions between them]
  • void *stack: A pointer to the kernel stack; the kernel allocates kernel stack space for each process through dup_task_struct and records it here.
  • struct mm_struct *mm: Information related to the process address space.
  • pid_t pid: The process identifier.
  • struct task_struct *parent: The parent process.
  • char comm[TASK_COMM_LEN]: The name of the process.
  • struct files_struct *files: The open file table.
  • struct signal_struct *signal: Signal handling information.
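
Some of these fields can be observed from user space. The sketch below is a user-space illustration, not kernel code: it reads the calling process's pid, its parent's pid, and its comm name through getpid(), getppid(), and prctl(PR_GET_NAME). The task_view struct, the COMM_LEN macro, and the helper names are made up for this example; COMM_LEN mirrors the kernel's TASK_COMM_LEN of 16 bytes. It assumes Linux and a single-threaded process, where getpid() matches the pid of the main thread.

```c
#include <assert.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

#define COMM_LEN 16 /* assumption: mirrors the kernel's TASK_COMM_LEN */

/* Hypothetical user-space view of a few task_struct fields. */
struct task_view {
    pid_t pid;           /* corresponds to task_struct.pid */
    pid_t ppid;          /* pid of the task that .parent points to */
    char comm[COMM_LEN]; /* corresponds to task_struct.comm */
};

/* Fill in the view for the calling process; returns 0 on success. */
static int task_view_read(struct task_view *v)
{
    memset(v, 0, sizeof(*v));
    v->pid = getpid();
    v->ppid = getppid();
    /* PR_GET_NAME copies the NUL-terminated comm into a 16-byte buffer. */
    return prctl(PR_GET_NAME, v->comm, 0, 0, 0);
}

/* Rename the calling thread; names longer than 15 bytes are truncated. */
static int task_set_name(const char *name)
{
    return prctl(PR_SET_NAME, name, 0, 0, 0);
}
```

Setting a name longer than 15 characters with task_set_name() and reading it back with task_view_read() shows the truncation imposed by the fixed-size comm array.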

Relationship Between task_struct, thread_info, and Kernel Stack sp

Next, let’s look at the thread_info structure:

struct thread_info {
        unsigned long           flags;          /* low level flags */
        mm_segment_t            addr_limit;     /* address limit */
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
        u64                     ttbr0;          /* saved TTBR0_EL1 */
#endif
        union {
                u64             preempt_count;  /* 0 => preemptible, <0 => bug */
                struct {
#ifdef CONFIG_CPU_BIG_ENDIAN
                        u32     need_resched;
                        u32     count;
#else
                        u32     count;
                        u32     need_resched;
#endif
                } preempt;
        };
#ifdef CONFIG_SHADOW_CALL_STACK
        void                    *scs_base;
        void                    *scs_sp;
#endif
};
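
The union in thread_info lets the kernel treat count and need_resched either as two 32-bit fields or as a single 64-bit preempt_count, and the CONFIG_CPU_BIG_ENDIAN switch keeps count in the low 32 bits on both byte orders. The user-space mock below illustrates only this layout property. The names mock_preempt and mock_combined are hypothetical, and the compiler's __BYTE_ORDER__ macro (GCC and Clang) stands in for the kernel config option.

```c
#include <assert.h>
#include <stdint.h>

/* Mock of the anonymous union inside thread_info (illustration only). */
union mock_preempt {
    uint64_t preempt_count;
    struct {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        uint32_t need_resched; /* first in memory = high half on big endian */
        uint32_t count;
#else
        uint32_t count;        /* first in memory = low half on little endian */
        uint32_t need_resched;
#endif
    } preempt;
};

/* The 64-bit view must be exactly as wide as the two 32-bit fields. */
_Static_assert(sizeof(union mock_preempt) == sizeof(uint64_t),
               "both views of the union cover the same 8 bytes");

/* Write the two 32-bit fields, then read them back as one 64-bit value. */
static uint64_t mock_combined(uint32_t count, uint32_t need_resched)
{
    union mock_preempt p = { .preempt_count = 0 };

    p.preempt.count = count;
    p.preempt.need_resched = need_resched;
    return p.preempt_count;
}
```

With this layout, the combined value is zero only when both fields are zero, so a single 64-bit comparison can check both at once.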

Next, let’s look at the definition of the kernel stack:

union thread_union {
#ifndef CONFIG_ARCH_TASK_STRUCT_ON_STACK
        struct task_struct task;
#endif
#ifndef CONFIG_THREAD_INFO_IN_TASK
        struct thread_info thread_info;
#endif
        unsigned long stack[THREAD_SIZE/sizeof(long)];
};

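When the CONFIG_THREAD_INFO_IN_TASK configuration is enabled, thread_info is no longer a member of thread_union; it lives at the start of task_struct instead. If CONFIG_ARCH_TASK_STRUCT_ON_STACK is also set, only the stack member remains in the thread_union structure.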

During kernel startup, the kernel initializes the kernel stack in head.S through __primary_switched:

SYM_FUNC_START_LOCAL(__primary_switched)
        adrp    x4, init_thread_union
        add     sp, x4, #THREAD_SIZE
        adr_l   x5, init_task
        msr     sp_el0, x5                      // Save thread_info

The address of init_thread_union is loaded into x4, and THREAD_SIZE is added to it to initialize sp, so sp starts at the top of the stack area and grows downward from there. The address of the init_task process descriptor is loaded into x5 and written to sp_el0.
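
The four instructions above can be mirrored in plain C to make the resulting values concrete. This is a user-space sketch, not kernel code: every mock_ name is hypothetical, the two register values are returned in an ordinary struct, and MOCK_THREAD_SIZE is assumed to be 16 KiB purely for illustration (the real THREAD_SIZE depends on the kernel configuration).

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_THREAD_SIZE 16384UL /* assumption: illustrative value only */

/* Simplified stand-in for union thread_union: just the stack array. */
union mock_thread_union {
    unsigned long stack[MOCK_THREAD_SIZE / sizeof(long)];
};

/* Simplified stand-in for struct task_struct. */
struct mock_task {
    unsigned long flags; /* plays the role of thread_info (first member) */
    void *stack;         /* points at the base of the kernel stack */
};

static union mock_thread_union mock_init_thread_union;
static struct mock_task mock_init_task = {
    .stack = &mock_init_thread_union,
};

/* The two register values that __primary_switched sets up. */
struct mock_regs {
    uintptr_t sp;
    uintptr_t sp_el0;
};

static struct mock_regs mock_primary_switched(void)
{
    struct mock_regs r;
    uintptr_t x4 = (uintptr_t)&mock_init_thread_union; /* adrp x4, init_thread_union */
    uintptr_t x5 = (uintptr_t)&mock_init_task;         /* adr_l x5, init_task */

    r.sp = x4 + MOCK_THREAD_SIZE; /* add sp, x4, #THREAD_SIZE */
    r.sp_el0 = x5;                /* msr sp_el0, x5 */
    return r;
}
```

After the call, sp sits exactly MOCK_THREAD_SIZE bytes above the base of the stack area, and sp_el0 equals the address of the mock init_task.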

Next, let’s look at the definitions of init_thread_union and init_task:

/* include/linux/sched/task.h */
extern union thread_union init_thread_union;

/* init/init_task.c */
struct task_struct init_task
        __aligned(L1_CACHE_BYTES)
= {
#ifdef CONFIG_THREAD_INFO_IN_TASK
        .thread_info    = INIT_THREAD_INFO(init_task),
        .stack_refcount = REFCOUNT_INIT(1),
#endif
.....
 };

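Thus, the relationship between these three can be summarized as follows: sp_el0 holds the address of the task_struct, thread_info sits at the very start of that task_struct, the stack member of task_struct points to the kernel stack, and sp initially points to the top of that stack (init_thread_union + THREAD_SIZE).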

How to Get the Current Process

In the kernel, the current macro is used to obtain the struct task_struct of the currently running process. With the background introduced above, its arm64 implementation is easy to follow:

static __always_inline struct task_struct *get_current(void)
{
    unsigned long sp_el0;
 
    asm ("mrs %0, sp_el0" : "=r" (sp_el0));
 
    return (struct task_struct *)sp_el0;
}
 
#define current get_current()

The code is quite simple: it reads the value of sp_el0 and casts it to a struct task_struct pointer. Architecturally, sp_el0 is the EL0 (user space) stack pointer, but while the CPU runs in the kernel the register is used to hold the address of the current task_struct. At boot that address is init_task. Because thread_info is the first member of task_struct, the same address is also the address of the current thread_info.
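
The same idea can be tried out in user space with a plain variable standing in for the register. This is a minimal sketch under stated assumptions: all mock_ names are hypothetical, mock_sp_el0 replaces the real sp_el0 register, and mock_switch_to stands in for the context-switch code that updates the register when another task starts running.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mock_thread_info {
    unsigned long flags;
};

struct mock_task_struct {
    struct mock_thread_info thread_info; /* must remain the first member */
    int pid;
};

/* The cast in mock_current_thread_info() relies on this offset being 0. */
_Static_assert(offsetof(struct mock_task_struct, thread_info) == 0,
               "thread_info must be the first member");

static uintptr_t mock_sp_el0; /* stands in for the sp_el0 register */

/* Hypothetical stand-in for the context switch updating sp_el0. */
static void mock_switch_to(struct mock_task_struct *next)
{
    mock_sp_el0 = (uintptr_t)next;
}

/* Mirrors get_current(): read the "register" and cast it. */
static struct mock_task_struct *mock_get_current(void)
{
    return (struct mock_task_struct *)mock_sp_el0;
}

/* Same address, viewed as thread_info, because it is the first member. */
static struct mock_thread_info *mock_current_thread_info(void)
{
    return (struct mock_thread_info *)mock_get_current();
}
```

Switching between two mock tasks and calling mock_get_current() after each switch returns the task that was switched to, without any lookup or search.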
