Linux Scheduling Algorithm
Caixue Lin and Vaibhav Bhandari
Problem Statement: Experiment with kernel-level tasks scheduling, including replacing the current Linux task scheduler and improving it.
Introduction: Linux uses a simple priority based scheduling algorithm to choose between the current processes in the system. Linux distinguishes three classes of processes for scheduling purposes, which are :
1) Real-time FIFO processes are the highest priority and not preemptable.
2) Real-time round robin processes are the same as Real-time FIFO thread except for its preemptibility;
3) Normal Timesharing processes have lower priority than the previous two.
The processes of the first and second classes belong to Real-Time processes, and the third kinds of processes belong to Normal processes. Actually, Real time processes will always run before normal processes and they may have either of two types of policy: round robin or first in first out. As Linux uses preemptive scheduling (but it's not fully preemptive???), every process is given a fixed time slice of 200ms to run.
Each process has scheduling priority and a quantum associated with it. Linux schedules processes via a GOODNESS algorithm, which chooses to run the process with highest goodness and the process's quantum is decremented by one as it runs.
Problems: The scheduling algorithm of Linux is both self-contained and relatively easy to follow. For that reason, many kernel hackers love to try to make improvements. However, the scheduler is a rather mysterious component of the kernel. While you can change its performance significantly by modifying just a few key parameters, there is usually no theoretical support to justify the results obtained. Furthermore, you can't be sure that the positive (or negative) results obtained will continue to hold when the mix of requests submitted by the users (real-time, interactive, I/O-bound, background, etc.) varies significantly. Actually, for almost every proposed scheduling strategy, it is possible to derive an artificial mix of requests that yields poor system performances.
Let us try to outline some pitfalls of the Linux scheduler, e.g. Linux is not a fully pre-emptable design.. As it will turn out, some of these limitations become significant on large systems with many users. On a single workstation that is running a few tens of processes at a time, the Linux scheduler is quite efficient. Since Linux was born on an Intel 80386 and continues to be most popular in the PC world, we consider the current Linux scheduler quite appropriate.
1). The algorithm does not scale well
If the number of existing processes is very large, it is inefficient to recompute all dynamic priorities at once.
In old traditional Unix kernels, the dynamic priorities were recomputed every second, thus the problem was even worse. Linux tries instead to minimize the overhead of the scheduler. Priorities are recomputed only when all runnable processes have exhausted their time quantum. Therefore, when the number of processes is large, the recomputation phase is more expensive but is executed less frequently.
This simple approach has the disadvantage that when the number of runnable processes is very large, I/O-bound processes are seldom boosted, and therefore interactive applications have a longer response time.
2). The predefined quantum is too large for high system loads
The system responsiveness experienced by users depends heavily on the system load, which is the average number of processes that are runnable, and hence waiting for CPU time.
As mentioned before, system responsiveness depends also on the average time-quantum duration of the runnable processes. In Linux, the predefined time quantum appears to be too large for high-end machines having a very high expected system load.
3). I/O-bound process boosting strategy is not optimal
The preference for I/O-bound processes is a good strategy to ensure a short response time for interactive programs, but it is not perfect. Indeed, some batch programs with almost no user interaction are I/O-bound. For instance, consider a database search engine that must typically read lots of data from the hard disk or a network application that must collect data from a remote host on a slow link. Even if these kinds of processes do not need a short response time, they are boosted by the scheduling algorithm.
On the other hand, interactive programs that are also CPU-bound may appear unresponsive to the users, since the increment of dynamic priority due to I/O blocking operations may not compensate for the decrement due to CPU usage.
4). Support for real-time applications is weak
Nonpreemptive kernels are not well suited for real-time applications, since processes may spend several milliseconds in Kernel Mode while handling an interrupt or exception. During this time, a real-time process that becomes runnable cannot be resumed. This is unacceptable for real-time applications, which require predictable and low response times.
Future versions of Linux will likely address Linux weak Real-Time problem, either by implementing SVR4's "fixed preemption points" or by making the kernel fully preemptive.
However, kernel preemption is just one of several necessary conditions for implementing an effective real-time scheduler. Several other issues must be considered. For instance, real-time processes often must use resources also needed by conventional processes. A real-time process may thus end up waiting until lower-priority process releases some resource. This phenomenon is called priority inversion. Moreover, a real-time process could require a kernel service that is granted on behalf of another lower-priority process (for example, a kernel thread). This phenomenon is called hidden scheduling. An effective real-time scheduler should address and resolve such problems.
1) One idea is to create a RTOS kernel and running Linux as an unscheduled thread under the RTOS. In this duel structure Linux, high-priority thread that need true RTOS performance would run under the host operating environment at a higher priority than would the Linux thread. This kind of Linux kernel provides a nominally generalized RTOS API and threading model, while relegating the main Linux kernel to the status of a low priority (idle) thread under their control. 
2) The second effort is to replace the current Linux kernel scheduling algorithm with a new scheduling algorithm and tune Linux device drivers, and hence to make Linux more responsive and deterministic.
Our approach will evolve as we experiment with present Kernel (2.4) and research through the relevant material.
The plan of approach:
1. Find out the relevant experiments and simulations
2. Compare various plausible solutions in performance, over available data.
1. Catherine Lingxia Wang, Bo Yao, Yang Yang, Zhengyong Zhu. A Survey of Embedded Operating System, Department of Computer Science, UCSD, 2001.
2. Daniel P. Bovet & Marco Cesati. Understanding the Linux Kernel, Chapter 10, Processing Scheduling, October 2000.
1. The uptime program returns the system load for the past 1, 5, and 15 minutes. The same information can be obtained by reading the /proc/loadavg file.
2. The Linux kernel has been modified in several ways so it can handle a few hard real-time jobs if they remain short. Basically, hardware interrupts are trapped and kernel execution is monitored by a kind of "superkernel." These changes do not make Linux a true real-time system, though.