Friday, August 1, 2014

Zombie Processes Explained

What is a zombie process?

A zombie process (or defunct process, as some systems call it) is a process that has finished executing but whose exit status has not yet been collected by its parent, so the system cannot remove it from the process table. In other words, a zombie is just an entry in the process table, not a process that is still running.

What causes a zombie process?

Take this scenario of how a process can turn into a zombie:
  1. A process forks a child process.
  2. The child process finishes its execution, and the kernel holds its exit status until the parent asks for it.
  3. The parent is busy handling the exit status of another child, or it is poorly programmed and never responds when its children finish (we will look at how to handle finished children later).
  4. The kernel keeps the child, whose exit status has not yet been collected, in the process table in case the parent calls wait() to get that status.
  5. The zombie stays in the process table until its parent collects its exit status or the parent itself terminates.
  6. When the parent of a zombie terminates, whether normally or abnormally, its children (zombies included) are inherited by process ID 1 (i.e. init).
  7. init calls wait() on the children it inherits, so the adopted zombies are removed from the process table.
Note:
  • When a process terminates, normally or abnormally, all of its remaining children are inherited by process ID 1 (i.e. init).
  • When a child finishes, the kernel sends the SIGCHLD signal to its parent. Poorly programmed processes do not handle this signal or do not handle their child processes correctly. (We will look at how to handle zombies in our own program in a later section.)

Why Does the Kernel Keep Zombies?

Zombies cannot be deleted arbitrarily because their parents may still need their exit status.

The programmer may plan to use the exit status of a child later to make some decision, rather than immediately after it finishes. If the system got rid of zombies at its own discretion, that status would be lost and the parent's later wait() call would fail.

Only when the parent terminates (normally or abnormally) can the system take care of deleting its zombies.
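
As a rough sketch of this deferred collection (not code from the original program), the parent below forks a child, is free to do other work while the child sits in the process table as a zombie, and only later asks for the exit status with waitpid(). The helper names spawn_child and collect_exit_code are made up for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits at once with `code`.
 * Returns the child's PID in the parent, or -1 if fork() failed. */
pid_t spawn_child(int code) {
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);   /* child: terminate immediately */
    return pid;        /* parent: the child is a zombie until it is reaped */
}

/* Hypothetical helper: collect the exit status of `pid` whenever the
 * parent is ready for it. Returns the exit code, or -1 if the child
 * could not be waited for or did not exit normally. */
int collect_exit_code(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, a parent could call spawn_child(7), carry on with other work, and later get 7 back from collect_exit_code() for that PID. A second call for the same PID returns -1, because the process table entry is gone once the status has been collected.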

Why Do Programmers Tend to Avoid Zombie Processes?

Even though zombies consume almost no resources and use no CPU time, so they have no real effect on the performance of your system, programmers tend to avoid them for the following reason.

The number of process IDs on a Linux system is limited (32768 by default on many systems), and every zombie still occupies one. Accumulated zombies can therefore eventually, when their number is high, prevent the system from starting new processes.

To see the limit on your system, use this command:

    cat /proc/sys/kernel/pid_max
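
If you want the same value inside a program, a minimal sketch could read that file directly. This assumes a Linux system with /proc mounted; the function name read_pid_max is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: return the value stored in
 * /proc/sys/kernel/pid_max, or -1 if the file cannot be read
 * (for example on a system without a Linux-style /proc). */
long read_pid_max(void) {
    long value = -1;
    FILE *f = fopen("/proc/sys/kernel/pid_max", "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```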

A zombie's PID cannot be reused until the zombie is completely removed. So some programmers deliberately keep a few zombies around to make sure the children they create get unique PIDs (no new child is allowed to take the PID of a previously terminated child). In this situation zombies can be useful.


How to Kill Zombie Processes

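You cannot use kill or similar commands to get rid of a zombie, because it has already terminated and there is nothing left to signal. You need to kill the parent of the zombie process (or get the parent to call wait()).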

Commands to Deal with Zombie Processes

To show all zombie processes on your system, use ps and then grep for the lines containing Z.

    # ps -aux | grep Z

The output on Fedora 17 is:

    1000 6746 0.0 0.0 0 0 pts/4 Z+ 15:59 0:00 [a.out] <defunct>

Note that the status column uses Z to indicate a zombie, and that ps also prints <defunct> after the command name to show that it is indeed a zombie process.

To kill the parent of the zombie, use kill with the PID of the parent. Note that ps -aux does not print a PPID (Parent Process ID) column; you can get the parent's PID with ps -o ppid= -p followed by the zombie's PID, or from the PPID column of ps -ef.

    # kill <PPID>
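
On Linux, the state letter and the parent PID that ps reports can also be read from /proc/<pid>/stat. The following is a minimal sketch under that assumption, not a replacement for ps; the function name read_proc_stat is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read the state letter ('Z' for a zombie) and the
 * parent PID of `pid` from /proc/<pid>/stat (Linux only).
 * Returns 0 on success, -1 if the file is missing or cannot be parsed. */
int read_proc_stat(pid_t pid, char *state, pid_t *ppid) {
    char path[64], line[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    if (ok == NULL)
        return -1;

    /* The command name is wrapped in parentheses and may itself contain
     * ')', so the fields we want start after the LAST ')'. */
    char *p = strrchr(line, ')');
    int parent;
    if (p == NULL || sscanf(p + 1, " %c %d", state, &parent) != 2)
        return -1;
    *ppid = (pid_t)parent;
    return 0;
}
```

Called with the PID of a zombie, this fills in 'Z' as the state and the PID of the process you would have to kill (or fix) as the parent.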

In KDE System Activity, the "CPU %" column shows the word "zombie" to indicate a zombie process.

Example of a Program That Generates Zombie Child Processes

In this section I will show you an example in C of a process with a zombie child process.

If the child finishes its execution before the parent, and the parent has not called wait(), the child becomes a zombie process.

Let's see how to build it in C.

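    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        unsigned int n = 20;  /* seconds the parent stays alive without calling wait() */

        if (pid == -1) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* Child: print and exit right away; it stays a zombie until it is reaped. */
            printf("I am a child which will become a zombie process\n");
        } else {
            /* Parent: sleep without collecting the child's exit status. */
            sleep(n);
        }
        return 0;
    }

We declare the return type of main explicitly as int. Older C compilers assumed int when the type was omitted, but implicit int was removed from the language in C99, and modern compilers warn about it or reject it.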

fork() returns 0 in the child process, the child's PID (a positive value) in the parent, and -1 in the parent if no child could be created. This means:

  • If pid == 0, the current process is the child, and we execute the commands that we want to run in the child process.
  • If pid > 0, the current process is the parent, and we execute the commands that we want to run in the parent process.
You can download the C file of the code from here.


How to Handle Zombie Processes in Code?

In the main function we tell the kernel that the handler for the SIGCHLD signal is the function deleteChild(). This means that when the child finishes (and hence SIGCHLD is sent to its parent process), the kernel interrupts the parent and runs deleteChild() in it.

    #include <signal.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* SIGCHLD handler: reap the finished child so it does not stay a zombie. */
    void deleteChild(int signal_value) {
        int exit_status;
        pid_t child_pid = wait(&exit_status);
        (void) signal_value;
        /* printf() is not async-signal-safe; it is used here only for demonstration. */
        printf("Child of PID: %d exited with status: %d\n", (int) child_pid, WEXITSTATUS(exit_status));
    }

    int main(void) {
        /* Set deleteChild() as the handler of the SIGCHLD signal. */
        (void) signal(SIGCHLD, deleteChild);
        pid_t pid = fork();
        unsigned int n = 20;

        if (pid == 0) {
            printf("I am a child which will become a zombie process\n");
            sleep(10);
        } else {
            sleep(n);
            printf("No Zombies Are Allowed\n");
        }
        return 0;
    }


You can download the C file of the code from here.

Notes About the Last Program:

We asked the child process to sleep for 10 seconds to let you check ps (or any GUI tool for watching processes) and ensure that the PID of the child is the same as the PID printed by the program.

You might expect the parent process to terminate after 20 seconds (when sleep finishes), but it actually finishes after about 10 seconds. This is because sleep returns under two conditions:

  1. The specified amount of time (e.g. 20 seconds) elapses.
  2. A signal arrives at the program and its handler runs (in our case the SIGCHLD signal).
After the signal arrives, sleep(20) returns early and the parent continues its execution. As a demonstration, we printed "No Zombies Are Allowed".

You can keep a map from child PIDs to exit statuses, so that you get rid of zombies immediately and still hold their exit statuses for later use. The values would be saved inside the deleteChild() function.

If you have any questions, please comment.
Thanks for reading.