
Writing a Shell in C

Do you have dreams about writing C shells by the sea shore? That makes two of us.

One of the many cool projects you can make while learning to program in the Linux environment is a command-line interpreter like the Bash shell or the Command Prompt in Windows. In the process you will learn how to handle fork + exec calls and the other system calls a shell relies on.

We will first implement a very basic shell and then improve on it by adding features like redirection, pipelines, globbing, etc.

Prerequisites:

  • Basic knowledge of the Linux operating system, the Bash shell and the C programming language.

  • A Linux machine, or the Windows Subsystem for Linux (WSL), with the required packages.

  • The gcc and readline development packages

    Install the packages by typing

    sudo apt-get install gcc libreadline-dev

Let's begin by writing a simple version of the whole shell:


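#include <stdio.h>
#include <stdlib.h>
#include <string.h>             // strlen, strcpy, strcmp
#include <readline/readline.h>

int main(void) {
    char buf[1024];
    char *input;

    while (1) {
        input = readline(" > ");

        // readline returns NULL on end-of-file (Ctrl-D): leave the shell
        if (input == NULL) {
            break;
        }

        if (strlen(input) == 0) {
            strcpy(buf, "\n");
        } else {
            // copy at most sizeof(buf)-1 bytes so a long line cannot overflow buf
            snprintf(buf, sizeof(buf), "%s", input);
        }

        if (strcmp(input, "exit") == 0) {
            free(input);
            exit(0);
        }

        free(input);    // readline allocates every line with malloc
    }
    return 0;
}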

Note: We are not executing the command, or even checking whether it is valid; we just show the prompt and read a command. We will add those features later.

If you are familiar with C, the whole program looks easy enough: we read a command from the user at a prompt, and if the command is ‘exit’ we close the shell.

The only thing that is different is the way we take input: it is done through readline.

If you look at the man page for readline (‘man readline’), it says:

“Readline will read a line from the terminal and return it, using prompt as a prompt. If prompt is NULL or the empty string, no prompt is issued. The line returned is allocated with malloc; the caller must free it when finished. The line returned has the final newline removed, so only the text of the line remains. readline offers editing capabilities while the user is entering the line. By default, the line editing commands are similar to those of emacs. A vi-style line editing interface is also available.”

We use readline instead of using the standard input library functions because of the line editing capabilities of readline and the auto-complete feature that comes with the prompt (also bash uses readline sooooooo).

Let's save and run the program:

  • Save the file as ‘myshell.c’.
  • Open up a terminal and compile the file using the command
    • ‘gcc myshell.c -L/usr/local/lib -I/usr/local/include -lreadline -o myshell’
  • Execute it by typing ‘./myshell’.

Initial implementation of the shell

If you are wondering why the ‘ls’ command is not working, it's because we have not written the related code yet. Our basic shell currently just takes a command and checks whether it is ‘exit’: if so it terminates, otherwise it prints the ‘>’ prompt again.

You can also see readline's auto-complete working on the filenames in the current directory: if you press the [tab] key after typing my, it auto-completes with the filenames starting with my.

Let's see how our shell will work in its entirety using a flowchart:

Flowchart

The Parser:

The parser takes the input from the prompt and separates it into the command and arguments.

The rules we followed for writing the parser are:

  • The command and the various arguments are separated by spaces.
  • If an argument is inside " " it is treated as a single argument even if it contains spaces.
    • E.g. a directory name may contain spaces: for ‘ls "Program Files"’ the argument should not be split into ‘Program’ and ‘Files’.
#include <ctype.h>   // isspace
#include <stdio.h>   // printf
#include <string.h>  // memset

#define MAX_ARGS 30

// Splits buffer in place into tokens and returns the number of tokens found.
// argv_size is the maximum number of tokens the caller wants.
int parse(char *buffer, int argv_size)
{
    char *p, *start_of_word = NULL;
    char *argv[MAX_ARGS];
    int debug = 1;
    int c, argc = 0;
    enum states { START, IN_WORD, IN_STRING } state = START;

    memset(argv, 0, sizeof(argv));
    // never store more than argv can hold; one slot stays NULL as a terminator
    if (argv_size > MAX_ARGS - 1) {
        argv_size = MAX_ARGS - 1;
    }

    // the buffer will contain the command the user has typed
    // eg: ls -a /root "directory with spaces"
    for (p = buffer; argc < argv_size && *p != '\0'; p++)
    {
        c = (unsigned char) *p;

        switch (state)
        {
        case START:
            // we ignore the spaces we encounter
            // like those haters in your life
            if (isspace(c)) {
                continue;
            }
            // when we encounter opening quotes
            if (c == '"') {
                state = IN_STRING;
                start_of_word = p + 1;
                continue;
            }
            state = IN_WORD;
            start_of_word = p;
            continue;

        case IN_STRING:
            // when we encounter closing quotes
            if (c == '"') {
                *p = 0;
                argv[argc++] = start_of_word;
                state = START;
            }
            continue;

        case IN_WORD:
            if (isspace(c)) {
                *p = 0;
                argv[argc++] = start_of_word;
                state = START;
            }
            continue;
        }
    }
    // the input ended inside a word (or an unclosed string): keep that last token
    if (state != START && argc < argv_size) {
        argv[argc++] = start_of_word;
    }
    if (debug) {
        printf("No of tokens:%d\n", argc);
        for (int i = 0; i < argc; i++)
            printf("[%s]", argv[i]);
    }
    return argc;
}

We make the necessary changes to our ‘myshell.c’ program:

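#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <readline/readline.h>
#include "parse.c"
......
        if (strcmp(input, "exit") == 0) {
            free(input);
            exit(0);
        }
        // add these lines and include parse.c
        // (argc has to be declared as an int at the top of main)
        if (strcmp(buf, "\n") != 0)     // skip empty input
            argc = parse(buf, 1024);
        memset(buf, '\0', 1024);
        free(input);                    // readline allocates every line with malloc
    }
}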

Let's compile the file and test the parser to see if it works.

Parser Working

The parser works as expected: the different tokens are shown inside [ ] along with the number of tokens. You can turn off the debug output by setting debug=0 and recompiling.
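The same check can be scripted instead of read off the terminal. The following is only a sketch under assumptions: `tokenize` is a hypothetical variant of the parser (not part of the original code) that follows the same two rules but writes the tokens into an array supplied by the caller and ends it with NULL, so the result can be inspected or passed on.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical variant of parse(): splits `buffer` in place and stores
 * pointers to the tokens in the caller's `argv`. At most max_args - 1
 * tokens are stored so that argv can always end with a NULL entry.
 * Returns the number of tokens. */
int tokenize(char *buffer, char *argv[], int max_args)
{
    enum { START, IN_WORD, IN_STRING } state = START;
    char *start_of_word = NULL;
    int argc = 0;

    assert(buffer != NULL && argv != NULL && max_args > 0);

    for (char *p = buffer; *p != '\0' && argc < max_args - 1; p++) {
        unsigned char c = (unsigned char)*p;

        switch (state) {
        case START:
            if (isspace(c))
                break;                     /* skip spaces between tokens */
            if (c == '"') {
                state = IN_STRING;
                start_of_word = p + 1;     /* token starts after the quote */
            } else {
                state = IN_WORD;
                start_of_word = p;
            }
            break;
        case IN_STRING:
            if (c == '"') {                /* closing quote ends the token */
                *p = '\0';
                argv[argc++] = start_of_word;
                state = START;
            }
            break;
        case IN_WORD:
            if (isspace(c)) {              /* a space ends the token */
                *p = '\0';
                argv[argc++] = start_of_word;
                state = START;
            }
            break;
        }
    }
    /* the input ended in the middle of a token: keep it */
    if (state != START && argc < max_args - 1)
        argv[argc++] = start_of_word;

    argv[argc] = NULL;
    return argc;
}
```

Called on a writable copy of ‘ls -a "Program Files" /root’ with an array of 8 entries, this sketch should yield four tokens, with "Program Files" kept as one.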

What's next?

We have taken the input from the prompt and separated it into the command and its arguments. Now we write the execute module, which handles the various fork and exec calls for external programs and is the place to check whether the command is a shell built-in (like cd).

Let's implement the execute module, which takes the command and the arguments, forks a child process, and either lets it run in the background or waits for it to end.

We use a trailing ‘&’ character if we do not want to wait for the command to complete.

Let's look at the code:

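#include <errno.h>
#include <stdbool.h>
#include <stdio.h>     // for fprintf
#include <string.h>    // for strlen, strerror
#include <sys/wait.h>  // for waitpid
#include <unistd.h>    // for fork, execvp, _exit

int execute(char *argv[], int argc) {
    int status;
    bool background = false;

    if (argc == 0) {
        return 1;   // nothing to run
    }

    // if the command has a trailing '&' it is a background process
    char *last = argv[argc - 1];
    size_t len = strlen(last);
    if (len > 0 && last[len - 1] == '&') {
        background = true;
        last[len - 1] = '\0';       // strip the '&' (also handles "sleep 5&")
        if (len == 1) {             // the token was only "&": drop it
            argv[argc - 1] = NULL;
            argc--;
        }
    }
    if (argc == 0) {
        return 1;   // the line was just "&"
    }

    // child_process will contain the pid of the child
    pid_t child_process = fork();
    if (child_process < 0) {
        fprintf(stderr, "can't fork process: %s\n", strerror(errno));
        return 1;
    }
    else if (child_process == 0) {
        // execution of child starts here
        execvp(argv[0], argv);
        // execvp only returns on failure; leave the child with _exit so it
        // does not fall back into the shell's own loop
        fprintf(stderr, "can't run program: %s\n", strerror(errno));
        _exit(127);
    }
    else if (!background) {
        // the process is not running in the background: wait for it, then
        // give terminal access back to the user
        while (waitpid(child_process, &status, 0) < 0 && errno == EINTR)
            ;
    }
    return 0;
}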

We save the above code in a file called execute.c and make the following changes to our parse.c file.

#include "execute.c"
....

            printf("[%s]", argv[i]);
    }
    // call execute with the command and arguments
    if (argc > 0)
        execute(argv, argc);
    return argc;
}

Save all the files and recompile.

Running the shell

It works. :)
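The execute module stores the child's status but never looks at it. If the shell should report how a command ended, the status can be decoded with the standard wait macros. This is a minimal sketch, not the project's code: `run_and_wait` is a hypothetical helper, and returning 128 plus the signal number for a killed child is borrowed from Bash's convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with its arguments, waits for it and
 * returns its exit code, or -1 if fork or waitpid failed. */
int run_and_wait(char *const argv[])
{
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0) {
        fprintf(stderr, "can't fork: %s\n", strerror(errno));
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        /* only reached when execvp failed */
        fprintf(stderr, "can't run %s: %s\n", argv[0], strerror(errno));
        _exit(127);
    }
    /* retry when a signal interrupts the wait */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);        /* normal exit: 0..255 */
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);     /* killed by a signal */
    return -1;
}
```

With this shape, a command that cannot be found comes back as 127, which is the value the child passes to `_exit` after a failed `execvp`.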

Note: Every command you execute in the command line in Linux is either a shell built-in or an external binary forked using fork+exec. Our shell will run external binaries but will not run any shell built-ins like cd (change directory) etc. The shell built-ins as the name suggests are a part of the shell and have to be implemented inside the shell.
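To make the point about built-ins concrete: cd cannot be an external program, because a forked child that calls chdir only changes its own working directory and the shell stays where it was. A minimal sketch of how cd might be handled inside the shell follows. `run_builtin` is a hypothetical name, and falling back to $HOME when no argument is given is an assumption modelled on Bash.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if argv[0] was handled as a built-in,
 * 0 if the caller should fork and exec it as an external program. */
int run_builtin(char *argv[], int argc)
{
    assert(argv != NULL);

    if (argc == 0 || strcmp(argv[0], "cd") != 0)
        return 0;

    /* plain "cd" goes to $HOME (assumption: same as Bash) */
    const char *target = (argc > 1) ? argv[1] : getenv("HOME");
    if (target == NULL) {
        fprintf(stderr, "cd: HOME is not set\n");
    } else if (chdir(target) != 0) {
        fprintf(stderr, "cd: %s: %s\n", target, strerror(errno));
    }
    return 1;
}
```

The shell would call such a function before `execute` and only fork when it returns 0.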

Additional Features

  • Shell Built-ins
  • Globbing
  • Redirection / Pipes
  • Signal Handling
  • Color Schemes and Prompt.
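Of the features listed above, output redirection is a small step from the execute module: the child reopens its standard output on a file before calling exec. The sketch below is an illustration under assumptions, not the project's implementation. `run_with_stdout_to` is a hypothetical helper, and detecting the ‘>’ token in the parsed arguments is left out.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv with stdout redirected to `path`
 * (like `cmd > path`). Returns the exit code, or -1 on failure. */
int run_with_stdout_to(char *const argv[], const char *path)
{
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL && path != NULL);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* child: create or truncate the file and make it fd 1 */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0) {
            fprintf(stderr, "can't redirect to %s: %s\n", path, strerror(errno));
            _exit(126);
        }
        if (fd != STDOUT_FILENO)
            close(fd);                 /* fd 1 now refers to the file */
        execvp(argv[0], argv);
        _exit(127);                    /* only reached when execvp failed */
    }
    /* parent: wait for the child, retrying if a signal interrupts */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the redirection happens in the child after fork, the shell's own standard output is left untouched.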

See the entire project here: Z-Shell