The New O(1) Scheduler in Linux 2

The O(1) scheduler in Linux 2.6

The new scheduler has one runqueue per processor.

Address two big issues: multiprocessor scalability and CPU affinity.

A runqueue per-processor eliminates the need to serialize one global runqueue and eliminates waiting for the scheduler: Multiple processors can now execute the scheduler concurrently, each on their own local queue.

One runqueue per processor naturally improved CPU affinity

Because runqueues are local to each processor, an interrupted task will resume on the same processor it was running on before the interruption, and any performance improvement realized by local processor and memory caches is maintained.

To avoid the problem of imbalance, the scheduler will, when appropriate, migrate processes between CPUs to balance the workload. This migration is now, however, the exception rather than the rule.

Scheduler O(n) issue

Moving to multiple queues does not in itself solve the O(n) scalability issue.

The runqueues for each processor still grow linearly with the number of tasks in the system and, even if nothing else has changed, the scheduler still has to scan all n entries, and the O(n) behavior would still occur (albeit now divided across available CPUs).

New O(1) Scheduler

It implements a priority-based array of task entries that enables the highest-priority task to be found quickly (by using a priority bitmap with a fast instruction).

It eliminates the recalculation of priorities and timeslices by maintaining two queues: an active queue and an expired queue.

It recalculates the timeslice of an expired task before it places it on the expired queue. When all the tasks expire, the scheduler simply needs to swap the active and expired queue pointers and schedule the next task.

Long scans of runqueues are, thus, eliminated.

This process takes the same amount of processing, irrespective of the number of tasks in the system. It no longer depends on the value of n, but is a fixed constant. In the same notation, this constant behavior is designated as O(1), and the new scheduler is often referred to as the “O(1) scheduler.”

For responsiveness, the kernel 2.6 is now pre-emptible

Higher priority task can now interrupt the lower-priority task even during the processing of a system call. The high-priority task can, therefore, get control of the CPU faster than was previously possible and deliver more responsive service for time-critical tasks.