1. Getting Input
When building a simple shell, the first step is to get input. This has two parts: obtaining environment variables, so that our program can display a prompt like a real shell, and obtaining the command the user types.
Getting Environment Variables
A real shell displays some environment information in its prompt when it runs, and we can do the same in our custom-built shell.
Username: pxt
Hostname: hecs-198213
Current Directory: myshell
// Get environment variables: user, hostname, pwd, home
#include <stdio.h>
#include <stdlib.h>

char *homepath()
{
    char *home = getenv("HOME");
    if(home) return home;
    else return (char*)".";
}

const char *getUsername()
{
    const char *name = getenv("USER");
    if(name) return name;
    else return "none";
}

const char *getHostname()
{
    const char *hostname = getenv("HOSTNAME");
    if(hostname) return hostname;
    else return "none";
}

const char *getCwd()
{
    const char *cwd = getenv("PWD");
    if(cwd) return cwd;
    else return "none";
}

// Inside a function, the prompt is then printed like this:
printf("[%s@%s %s]$ ", getUsername(), getHostname(), getCwd());
Here we display the absolute path rather than only the last directory name; this does not affect how the shell works.
We used the function getenv(), which returns the value of an environment variable, or NULL if the variable is not set. Its header file is <stdlib.h>. In shell scripts, by contrast, the value of an environment variable is obtained by writing the variable name prefixed with $ (for example $HOME), without any special function.
After completing this most basic step, we can start imitating how a real shell is used. The next step is to obtain user input.
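One caveat with relying on getenv() alone: on some systems HOSTNAME is a shell variable that is not exported, so a child program may see it as unset. The following is a minimal sketch, not part of the original code, of how the lookups could fall back to the gethostname() and getcwd() calls. The helper names envOrDefault, hostnameOrDefault, and cwdOrDefault are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   // expose POSIX declarations under strict -std=c11
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper: return the variable's value, or fallback if it is unset.
const char *envOrDefault(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value ? value : fallback;
}

// Hypothetical helper: prefer $HOSTNAME, then ask the system via gethostname().
const char *hostnameOrDefault(void)
{
    static char buf[256];
    const char *value = getenv("HOSTNAME");
    if (value) return value;
    if (gethostname(buf, sizeof(buf)) == 0) {
        buf[sizeof(buf) - 1] = '\0';   // the name may be unterminated if truncated
        return buf;
    }
    return "none";
}

// Hypothetical helper: ask the system for the working directory instead of reading $PWD.
const char *cwdOrDefault(void)
{
    static char buf[4096];
    return getcwd(buf, sizeof(buf)) ? buf : "none";
}
```

With these helpers, the prompt could be printed with envOrDefault("USER", "none"), hostnameOrDefault(), and cwdOrDefault() in place of the three getters.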
Getting User Input
When obtaining user input, we can create a character array to store the user’s input.
#define NUM 1024
char usercommand[NUM];
The command string we store should not contain the newline character.
However, the user always presses Enter to confirm a command, and fgets stores that '\n' at the end of the buffer, so we need to overwrite the last character with '\0'.
getUserCommand:
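// Encapsulate user input (requires <stdio.h> and <string.h>)
int getUserCommand(char *command, int num)
{
    printf("[%s@%s %s]$ ", getUsername(), getHostname(), getCwd());
    fflush(stdout);                         // Make sure the prompt is shown before we wait for input
    char *r = fgets(command, num, stdin);   // The stored line normally still ends with '\n' (the Enter key)
    if(r == NULL) return -1;                // End of input or read error

    // Replace the trailing '\n', if there is one, with '\0'
    size_t len = strlen(command);
    if(len > 0 && command[len-1] == '\n') command[--len] = '\0';
    return (int)len;
}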
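The line returned by fgets does not always end with '\n': the newline is missing when the input ends without one or when the line is longer than the buffer. As a small sketch of one common way to handle both cases, strcspn can locate the newline if it exists. The helper name stripNewline is hypothetical.

```c
#include <string.h>

// Hypothetical helper: cut the line at the first '\n' (if any) and return the new length.
size_t stripNewline(char *line)
{
    // strcspn returns strlen(line) when there is no '\n', so this never
    // writes outside the string and never removes a real character.
    line[strcspn(line, "\n")] = '\0';
    return strlen(line);
}
```

Calling stripNewline(command) right after a successful fgets would give the same result as the manual check, including for an empty line.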
2. Splitting Strings
In Shell, splitting strings is a common operation that involves dividing a string containing multiple substrings (possibly separated by spaces, commas, colons, etc.) into separate parts for further processing or assigning to different variables.
After we finish reading the user input command, we need to split the string so that our command can be correctly read and executed. Usually, our delimiter is ‘ ‘ (space).
#define SIZE 64     // Set the size of argv
#define SEP " "     // Delimiter " "
#define debug 1     // Used to test whether string splitting is successful
char *argv[SIZE];   // Used to store split strings
For splitting strings, the C standard library provides strtok(), declared in <string.h>. It splits a string on any of a set of delimiter characters and returns a pointer to the next token, or NULL when no tokens remain. The first call passes the string to split; later calls pass NULL to continue with the same string.
char *strtok(char *str, const char *delim);
commandSplit:
void commandSplit(char *in, char *out[])   // in -> usercommand, out -> argv
{
    int argc = 0;
    out[argc] = strtok(in, SEP);
    // Keep reading tokens until strtok returns NULL or argv is full
    while(out[argc] != NULL && argc < SIZE - 1)
        out[++argc] = strtok(NULL, SEP);
    out[argc] = NULL;   // argv must end with NULL for execvp

    // Used to test whether string splitting is successful
#ifdef debug
    int i = 0;
    for(i = 0; out[i]; i++) {
        printf("%d:%s\n", i, out[i]);
    }
#endif
}
After completing the string splitting, we could already run the command in a child process using what we know about processes. However, some commands cannot work this way: cd, for example, must change the working directory of the shell itself, and a change made in a child process is lost when the child exits. So we need one more step: checking for built-in commands!
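Two properties of strtok are easy to miss: it modifies the input buffer by writing '\0' over delimiters, and it treats a run of delimiters as one. The sketch below illustrates both with a bounded splitter; the name splitArgs and its max parameter are hypothetical and not part of the original code.

```c
#include <string.h>

// Hypothetical helper: split line on spaces into out[], storing at most max-1 tokens.
// Returns the token count; out[count] is always NULL. Requires max >= 1.
int splitArgs(char *line, char *out[], int max)
{
    int count = 0;
    char *token = strtok(line, " ");
    while (token != NULL && count < max - 1) {
        out[count++] = token;        // token points into line, no copy is made
        token = strtok(NULL, " ");
    }
    out[count] = NULL;
    return count;
}
```

For the input "ls  -l /tmp" (two spaces after ls) this yields the three tokens ls, -l, and /tmp, and the first space in the buffer has been replaced by '\0'. Because the tokens point into the original buffer, that buffer must stay alive for as long as argv is used.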
3. Checking Built-in Commands
Built-in commands (also known as internal commands) are commands provided by the Shell (such as Bash) itself, rather than an executable file in the filesystem. These commands usually execute faster than external commands because they do not require disk I/O to load external programs and do not require creating new processes.
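To see why a command such as cd has to run inside the shell process itself, here is a small demonstration sketch (the function name childChdirAffectsParent is hypothetical): a child process changes its working directory, and the parent then checks whether its own directory moved.

```c
#define _POSIX_C_SOURCE 200809L   // expose POSIX declarations under strict -std=c11
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo: chdir inside a child, then report whether the parent's
// working directory changed. Returns 1 if changed, 0 if not, -1 on error.
int childChdirAffectsParent(const char *path)
{
    char before[4096], after[4096];
    if (getcwd(before, sizeof(before)) == NULL) return -1;

    pid_t id = fork();
    if (id < 0) return -1;
    if (id == 0) {
        // Child: this changes only the child's working directory
        _exit(chdir(path) == 0 ? 0 : 1);
    }
    if (waitpid(id, NULL, 0) < 0) return -1;

    if (getcwd(after, sizeof(after)) == NULL) return -1;
    return strcmp(before, after) != 0;
}
```

The function returns 0: the working directory belongs to each process separately, so the child's chdir has no effect on the parent.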
We will implement three simple built-in commands: cd, export, echo.
doBuildin:
char cwd[1024];     // Holds "PWD=<path>"; putenv keeps a pointer to it, so it must outlive cd()
char enval[1024];   // Holds the string passed to export (for testing)
int lastcode = 0;   // Store the return information of the last command

void cd(const char *path)
{
    if(chdir(path) != 0) {   // chdir fails if, for example, the directory does not exist
        perror("cd");
        lastcode = 1;
        return;
    }
    char tmp[1024];
    if(getcwd(tmp, sizeof(tmp)) == NULL) return;
    snprintf(cwd, sizeof(cwd), "PWD=%s", tmp);   // Build the new value of PWD
    putenv(cwd);                                  // Change environment variables
}

int doBuildin(char *argv[])
{
    if(strcmp(argv[0], "cd") == 0) {
        char *path = NULL;
        if(argv[1] == NULL) path = homepath();   // cd without arguments returns to the home directory
        else path = argv[1];
        cd(path);
        return 1;
    }
    else if(strcmp(argv[0], "export") == 0) {
        if(argv[1] == NULL) return 1;
        // putenv stores the pointer, not a copy. enval is reused,
        // so only the most recent export is kept.
        snprintf(enval, sizeof(enval), "%s", argv[1]);
        putenv(enval);   // Used to add environment variable content
        return 1;
    }
    else if(strcmp(argv[0], "echo") == 0) {
        if(argv[1] == NULL) {
            printf("\n");
            return 1;
        }
        if(*(argv[1]) == '$' && strlen(argv[1]) > 1) {
            char *val = argv[1]+1;   // argv[1][0] is '$'; argv[1]+1 points at the text after it
            if(strcmp(val, "?") == 0) {
                // lastcode stores the exit code of the last command, initialized to 0.
                // Print it, then reset it to 0.
                printf("%d\n", lastcode);
                lastcode = 0;
            }
            else {   // '$' is not followed by "?": look up the environment variable
                const char *value = getenv(val);
                if(value) printf("%s\n", value);
                else printf("\n");
            }
            return 1;
        }
        else {   // Directly output characters
            printf("%s\n", argv[1]);
            return 1;
        }
    }
    return 0;
}
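The echo built-in above prints only argv[1], so echo hello world would print just hello. As one possible extension, sketched under the assumption that arguments should simply be joined with single spaces, a helper could assemble the whole output line first. The name joinArgs is hypothetical.

```c
#include <stdio.h>

// Hypothetical helper: join argv[1], argv[2], ... with single spaces into buf.
// argv[0] must not be NULL. Returns the length written, or -1 if buf is too small.
int joinArgs(char *argv[], char *buf, size_t size)
{
    size_t used = 0;
    if (size == 0) return -1;
    buf[0] = '\0';
    for (int i = 1; argv[i] != NULL; i++) {
        // Write a separating space before every argument except the first
        int n = snprintf(buf + used, size - used, "%s%s", i > 1 ? " " : "", argv[i]);
        if (n < 0 || (size_t)n >= size - used) return -1;   // error or truncated
        used += (size_t)n;
    }
    return (int)used;
}
```

The echo branch could then print the joined buffer followed by a newline instead of printing argv[1] alone.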
In checking built-in commands, we will also use some functions:
getcwd: Used to get the absolute path of the current working directory.
chdir: Used to change the current working directory. The shell's cd command is usually implemented by calling chdir. In programming languages (such as C, PHP, etc.), chdir is the function that changes the current working directory at run time.
sprintf: Used to write formatted data into a character array (snprintf is the variant that also takes the buffer size, which prevents overflow).
putenv: Used to change or add an environment variable. It stores the pointer it is given rather than a copy, so the string must remain valid afterwards.
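Since putenv keeps a pointer to the caller's buffer, reusing one buffer for several variables makes earlier values disappear. A different POSIX function, setenv, copies its arguments instead. The following is a sketch of an export helper built on setenv rather than the article's putenv; the name exportVar and the 256-byte limit on the variable name are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L   // expose POSIX declarations under strict -std=c11
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: handle an argument of the form "NAME=value".
// Returns 0 on success, -1 if the argument is malformed or setenv fails.
int exportVar(const char *assignment)
{
    const char *eq = strchr(assignment, '=');
    if (eq == NULL || eq == assignment) return -1;   // no '=' or empty name

    size_t nameLen = (size_t)(eq - assignment);
    char name[256];                                  // assumed maximum name length
    if (nameLen >= sizeof(name)) return -1;
    memcpy(name, assignment, nameLen);
    name[nameLen] = '\0';

    // setenv copies both strings, so the caller's buffer can be reused afterwards
    return setenv(name, eq + 1, 1);
}
```

With this approach, exporting A=1 and then B=2 from the same input buffer leaves both variables set.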
4. Executing Programs
After completing the previous groundwork, we can create a child process to run the external program. As when we learned about processes earlier, we need three header files.
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
execute:
int execute(char *argv[])
{
    pid_t id = fork();
    if(id < 0) return -1;
    else if(id == 0) {
        // Child process
        execvp(argv[0], argv);
        perror(argv[0]);   // Only reached if execvp failed
        exit(1);
    }
    else {
        // Parent process
        int status = 0;
        pid_t rid = waitpid(id, &status, 0);
        if(rid > 0 && WIFEXITED(status)) {
            // Provide the child's exit code to lastcode
            lastcode = WEXITSTATUS(status);
        }
    }
    return 0;
}
execvp is a C library function on Unix and Unix-like systems such as Linux (a wrapper around the execve system call) that replaces the current process image with an external program. If the file name contains no slash, it searches the directories listed in the PATH environment variable for the file, and it passes the argument list to that program. It returns only if an error occurs.
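As a complement, here is a sketch of a variant that returns the status to the caller instead of storing it in a global. It follows two conventions used by common shells, 127 when the command cannot be started and 128 plus the signal number when the child is killed by a signal. The name runCommand is hypothetical and this is not the article's execute function.

```c
#define _POSIX_C_SOURCE 200809L   // expose POSIX declarations under strict -std=c11
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run argv[0] with its arguments and return a shell-style status.
int runCommand(char *const argv[])
{
    pid_t id = fork();
    if (id < 0) return -1;

    if (id == 0) {
        execvp(argv[0], argv);   // only returns if the program could not be started
        perror(argv[0]);
        _exit(127);              // conventional "command not found" status
    }

    int status = 0;
    if (waitpid(id, &status, 0) < 0) return -1;
    if (WIFEXITED(status))   return WEXITSTATUS(status);       // normal exit
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);    // killed by a signal
    return -1;
}
```

The child uses _exit rather than exit so that buffered output inherited from the parent is not flushed a second time.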
Above is the encapsulation of some basic operations. Let’s take a look at the main function.
Main:
int main()
{
    // A shell is a program that runs in a loop
    while(1) {
        char usercommand[NUM];
        char *argv[SIZE];
        // Get input
        int n = getUserCommand(usercommand, sizeof(usercommand));
        if(n < 0) break;        // End of input (for example Ctrl + D): leave the loop
        if(n == 0) continue;    // Empty line: show the prompt again
        // Split string (strtok)
        commandSplit(usercommand, argv);
        if(argv[0] == NULL) continue;   // The line contained only spaces
        // Check built-in
        n = doBuildin(argv);
        // A built-in command was handled, so there is nothing left to execute
        if(n) continue;
        // Execute command
        execute(argv);
    }
    return 0;
}
Note: depending on the terminal settings, the Backspace key may not delete characters while we are typing a command in our shell. In that case, use the Ctrl + Backspace key combination to delete instead.
Let’s think about the similarity between functions and processes.
exec/exit is like call/return.
A C program consists of many functions. One function can call another function while passing some parameters to it. The called function performs certain operations and then returns a value. Each function has its local variables, and different functions communicate through the call/return system.
This pattern of communication through parameters and return values among functions with private data is the foundation of structured program design. Linux encourages extending this pattern applied within programs to between programs.
A C program can fork/exec another program and pass it some parameters. The called program performs certain operations and then returns a value via exit(n). The calling process can obtain the exit return value through wait(&ret).
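The analogy can be made concrete with a short sketch: the child passes a number to _exit, and the parent reads it back through the wait family (waitpid here, so that exactly this child is waited for). The function name exitValueSeenByParent is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   // expose POSIX declarations under strict -std=c11
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo: the child "returns" n with _exit(n); the parent reads it back.
// Returns the value seen by the parent, or -1 on error.
int exitValueSeenByParent(int n)
{
    pid_t id = fork();
    if (id < 0) return -1;
    if (id == 0) _exit(n);                        // child: like "return n" in a function

    int ret = 0;
    if (waitpid(id, &ret, 0) < 0) return -1;      // parent: like receiving the return value
    return WIFEXITED(ret) ? WEXITSTATUS(ret) : -1;
}
```

One difference from a function call is that only the low 8 bits of the exit value reach the parent this way, so a child that exits with 259 is seen as 3.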
5. Summary
As our journey of writing a simple shell in Linux comes to an end, we can look back on what we built: reading input, splitting the command line, checking built-in commands, and executing external programs. The shell, as an indispensable part of the Linux system, is a valuable assistant for system administrators, developers, and anyone looking to improve work efficiency, and writing a small one is a good way to understand how it works.