Operating Systems and Concurrency
Content
- Introduction to operating systems/computer design
- Processes, process scheduling, threading, ...
- Concurrency (deadlocks)
- Coursework clarification/revision
- Memory management, swapping, virtual memory, ...
- File Systems, file structures, management, ...
- Virtualisation
- Revision
Book Resources
- Operating System Concepts (8th Edition) – Abraham Silberschatz
- Modern Operating Systems – Andrew Tanenbaum
Introduction
Defining Operating Systems
What Can an OS Do For Me?
```java
import java.io.FileWriter;
```
- File systems: where is the file physically written on the disk and how is it retrieved?
- Abstraction: why does the write instruction look the same regardless of the underlying device?
- Concurrency: what if multiple programs access the same file simultaneously?
- Security: why is access denied? (a C sketch of such a file write follows below)
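The same questions arise for a C equivalent of the Java snippet above. A minimal sketch, where `log.txt` is an arbitrary example file:

```c
#include <stdio.h>

// Writing a file looks trivial, but behind this one call the OS
// resolves the path, allocates disk blocks, checks permissions, and
// serialises concurrent access.
int main(void)
{
    FILE *f = fopen("log.txt", "w"); // security: may fail if access is denied
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    fprintf(f, "hello\n");           // abstraction: same call for disk, SSD, pipe, ...
    fclose(f);                       // file system: the OS decides where the bytes land
    return 0;
}
```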
OS - A Virtual Machine Providing Abstractions (i.e. an Extended Machine)
- In the early days, programmers had to deal directly with the hardware
- Real computer hardware is ugly
- Hardware is extremely difficult to manipulate/program
- An operating system is a layer of indirection on top of the hardware
- It provides abstractions for application programs (e.g., file systems)
- It provides a cleaner and easier interface to the hardware and hides the complexity of “bare metal”
- It allows the programmer to be lazy by using common routines :-)
OS - A Resource Manager
- Many modern operating systems use multi-programming to improve user experience and maximise resource utilisation
- without multi-programming, CPU time is wasted while waiting for I/O requests
- The operating system must allocate/share resources (including CPU, memory, I/O devices) fairly and safely between competing processes:
- In time, e.g. CPUs and printers
- In space, e.g., memory and disks
- The execution of multiple programs (processes) needs to be interleaved with one another:
- This requires context switches and process scheduling ⇒ mutual exclusion, deadlock avoidance, protection, ...
Computer Hardware
Simplified Computer Model
CPU Design
- A CPU’s basic cycle consists of fetch, decode, and execute; modern CPUs pipeline these stages and may be superscalar
- Every CPU has its own instruction set
- A CPU has a set of registers (extremely fast memory close to the CPU “core”)
- Registers are used to store data and for special functions (e.g. program counter, program status word – mode bit)
- The compiler/programmer decides what to keep in the registers
- Context switching must save and restore the CPU’s internal state, including its registers
CPU Design: Memory Management Unit
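A minimal sketch of the kind of demo program assumed here: print the address and value of a local variable, then run the program twice (with ASLR disabled):

```c
#include <stdio.h>

// Run this program twice: both runs print the same *logical* address
// for x, even though the OS maps it to different physical addresses.
int main(void)
{
    int x = 42;
    printf("address of x: %p, value: %d\n", (void *) &x, x);
    return 0;
}
```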
- Will the same or different addresses be displayed for `x` in both runs? – The same
- Will the value of `x` in the first run influence the value of `x` in the second run? – No
Note that this does not work on Mac (which uses Address Space Layout Randomization – ASLR)
- There are two different address spaces:
- the logical address space seen by the process and used by the compiler
- the physical address space seen by the hardware/OS
- When compiling code, memory addresses must be assigned to variables and instructions; the compiler does not know which addresses will be available in physical memory
- It will just assume that the code will start running at address 0 when generating the machine code
- On some rare occasions, the process may run at physical address 0
- On other occasions, it will be running at a completely different location in physical memory and an offset is added
```
physical address = logical address + offset
```
In the case of our example:
- The address printed on the screen is the logical address
- The logical address is translated into two different physical addresses using different offsets
Modern computers use logical and physical memory addresses:
- Every process has a logical address space – [0,MAX_64] (theoretically)
- MAX_64 = 2^64 bytes = 16 exbibytes; 1 exbibyte = 2^60 bytes = 1,152,921,504,606,846,976 bytes = 1,024 pebibytes
- The machine has a physical address space – [0, MAX] (MAX determined by the amount of physical memory)
A context switch between processes invalidates the MMU state (as well as registers, cache contents, ...)
- The CPU’s internal state must be saved and restored for every context switch (by the OS in the process control block)
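A user-space sketch of this translation, with made-up base and limit values; a real MMU performs this in hardware on every memory access:

```c
#include <stdio.h>
#include <stdlib.h>

// Schematic simulation of base+limit relocation: the MMU adds the
// process's base register (the "offset") to every logical address
// and faults if the address exceeds the limit register.
typedef struct {
    size_t base;   // where the process was loaded in physical memory
    size_t limit;  // size of the process's address space
} mmu_t;

size_t translate(const mmu_t *mmu, size_t logical)
{
    if (logical >= mmu->limit) {
        fprintf(stderr, "segmentation fault: address %zu out of range\n", logical);
        exit(1);
    }
    return mmu->base + logical;  // physical = logical + offset
}

int main(void)
{
    mmu_t run1 = { 0x40000, 0x10000 };  // the same program, loaded twice
    mmu_t run2 = { 0x90000, 0x10000 };  // at two different physical bases
    printf("logical 0x200 -> physical 0x%zx (run 1)\n", translate(&run1, 0x200));
    printf("logical 0x200 -> physical 0x%zx (run 2)\n", translate(&run2, 0x200));
    return 0;
}
```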
Timer Interrupts
- Interrupts temporarily pause a process’s normal operation
- Types of interrupts
- Timer interrupts by CPU clock
- Context switches (i.e. switching between processes) can be initiated by timer interrupts after a “set time”
- I/O interrupts for I/O completion or error codes
- Software generated, e.g. errors and exceptions
- Timer interrupts by the CPU clock proceed roughly as follows:

```
1. Timer generates an interrupt
2. The CPU finishes the current instruction and saves the process state (program counter, registers)
3. The interrupt handler runs in kernel mode and invokes the scheduler
4. The scheduler selects the next process, whose saved state is restored
```
Moore’s Law
The number of transistors on an integrated circuit (chip) doubles roughly every two years.
- Closely linked to, but not the same as, performance (performance roughly doubled along with transistor counts until about 2003)
- Moore’s law still continues, but the “power wall” slows performance improvements of single core/single processor systems
Multi-core, Hyperthreaded Processors
- Modern CPUs contain multiple cores and are often hyper-threaded
- Evolution in hardware has implications on operating system design
- Process scheduling needs to account for load balancing and CPU affinity
- Cache coherency becomes important
Memory
- Memory hierarchies used to balance cost and performance
- Fast and expensive memory is used for caching
- Slow and inexpensive memory is used for long term storage
- Memory includes registers, L1/L2 cache, main/core memory, disk, etc.
- L2 Cache can be shared or dedicated to individual cores
- Cache management is mainly done by hardware
- The CPU can only access main memory directly (i.e. files have to be brought into memory first)
I/O Devices
- The device driver interacts with the controller; the controller interacts with the device (e.g., a disk controller)
- The operating system/device driver typically communicates with the controller through registers
- I/O can take place through:
  - Busy waiting (polling) – see the sketch below
  - Interrupts (interrupt-driven I/O)
  - Direct memory access (using a DMA chip)
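A schematic sketch of the busy-waiting approach; the register addresses and the ready bit are hypothetical, so this compiles only as an illustration and does not target real hardware:

```c
#include <stdint.h>

// Hypothetical memory-mapped device registers; the addresses and bit
// masks below are illustrative only.
#define DEV_STATUS ((volatile uint32_t *) 0xFFFF0000)
#define DEV_DATA   ((volatile uint32_t *) 0xFFFF0004)
#define DEV_READY  0x1u

// Busy waiting: the CPU spins on the status register until the device
// is ready, wasting cycles that interrupt-driven I/O would free up.
uint32_t read_word_polling(void)
{
    while ((*DEV_STATUS & DEV_READY) == 0)
        ;  // spin
    return *DEV_DATA;
}
```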
Operating System Structure
- Operating Systems contain a lot of functionality, including
- Processes, process management, process scheduling, context switching, etc.
- Memory management, virtual memory, etc.
- File Systems
- I/O Management
- How are operating systems structured?
- Micro kernels
- Monolithic
Micro Kernels
- All non-essential functionality is extracted from the kernel
- Communication, memory management and CPU scheduling are likely to be included in the kernel
- The file system, GUI, device drivers are likely to be user processes
- Micro kernels are easier to extend, more portable, and usually more reliable
- Frequent system calls and kernel traps cause significant overhead (mode switches)
- Some Unix versions, Mac OS X, Minix, and early versions of Windows (NT 4.0) were (partially) micro kernels
Monolithic Systems
- All procedures are linked together into one single executable running in kernel mode
- Monolithic kernels are difficult to maintain
- Current versions of Windows and Linux are implemented as monolithic kernels
Processes and Threads
Processes
Definition
- A process is a running instance of a program
- A program is passive and “sits” on a disk
- A process has control structures associated with it, may be active, and may have resources assigned to it (e.g. I/O devices, memory, processor)
- A process is registered with the OS using its “control structures”
  - i.e. an entry in the OS’s process table pointing to a process control block (PCB)
- The process control block contains all information necessary to administer the process and is essential for context switching in multiprogrammed systems
Memory Image
- A process’ memory image contains:
- The program code (could be shared between multiple processes running the same code)
- A data segment, stack and heap
- Every process has its own logical address space, in which the stack and heap are placed at opposite sides to allow them to grow
- Some OSes use address space layout randomisation
Process States and Transitions
States
- A new process has just been created (has a PCB) and is waiting to be admitted (it may not yet be in memory)
- A ready process is waiting for CPU to become available (e.g. unblocked or timer interrupt)
- A running process “owns” the CPU
- A blocked process cannot continue, e.g. is waiting for I/O
- A terminated process is no longer executable (its data structures – the PCB – may be temporarily preserved)
- A suspended process is swapped out (not discussed further)
Transitions
- New → Ready: admit the process and commit to execution
- Running → Blocked: e.g. the process is waiting for input or has carried out a blocking system call
- Ready → Running: the process is selected by the process scheduler
- Blocked → Ready: event happens, e.g. I/O operation has finished
- Running → Ready: the process is preempted, e.g., by a timer interrupt or by pause
- Running → Exit: process has finished, e.g. program ended or exception encountered
Interrupts, traps, and system calls underlie these transitions.
OS Queues
Context Switching
Multi-programming
- Modern computers are multi-programming systems
- Assuming a single processor system, the instructions of individual processes cannot truly run in parallel; their execution is interleaved
- Multi-programming goes back to the “MULTICS” age
- Multi-programming is achieved by alternating processes and context switching
- So true parallelism requires multiple processors
- When a context switch takes place, the system saves the state of the old process and loads the state of the new process (creates overhead)
- Saved ⇒ the process control block is updated
- (Re-)started ⇒ the process control block is read
- A trade-off exists between the length of the time-slice and the context switch time (see the sketch after this list)
  - Short time slices result in good response times but low effective “utilisation”: e.g. with 100 processes, a 1 ms slice, and a 1 ms switch cost, the last process waits 99 * (1 + 1) = 198 ms, but half of all CPU time goes to switching
  - Long time slices result in poor response times but better effective “utilisation”: e.g. with a 100 ms slice, the last process waits 99 * (100 + 1) = 9999 ms
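A small sketch of the arithmetic behind this trade-off, assuming 100 ready processes and a 1 ms context switch cost (matching the example numbers above):

```c
#include <stdio.h>

// Worst-case response time for the last of n ready processes, and the
// effective CPU utilisation, for a given time slice and switch cost.
static void evaluate(int n, double slice_ms, double switch_ms)
{
    double response    = (n - 1) * (slice_ms + switch_ms);
    double utilisation = slice_ms / (slice_ms + switch_ms);
    printf("slice %7.1f ms -> response %8.1f ms, utilisation %5.1f%%\n",
           slice_ms, response, utilisation * 100.0);
}

int main(void)
{
    evaluate(100, 1.0, 1.0);    // 99 * (1 + 1)   = 198 ms,  50% utilisation
    evaluate(100, 100.0, 1.0);  // 99 * (100 + 1) = 9999 ms, ~99% utilisation
    return 0;
}
```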
- A process control block contains three types of attributes:
- Process identification (PID, UID, Parent PID)
- Process control information (process state, scheduling information, etc.)
- Process state information (user registers, program counter, stack pointer, program status word, memory management information, files, etc.)
- Process control blocks are kernel data structures
- i.e. they are protected and only accessible in kernel mode!
- Allowing user applications to access them directly could compromise their integrity
- The operating system manages them on the user’s behalf through system calls (e.g. to set process priority)
Process Implementation
Tables and Control Blocks
- An operating system maintains information about the status of “resources” in tables
- Process tables (process control blocks)
- Memory tables (memory allocation, memory protection, virtual memory)
- I/O tables (availability, status, transfer information)
- File tables (location, status)
- The process table holds a process control block for each process, allocated upon process creation
- Tables are maintained by the kernel and are usually cross referenced
Switching Processes
```
1. Save process state (program counter, registers)
2. Update the PCB of the running process (running → ready/blocked)
3. Move the PCB to the appropriate queue (ready, blocked, ...)
4. Select another process to run (scheduling)
5. Update the PCB of the selected process (→ running)
6. Update the memory management unit
7. Restore the process state (program counter, registers) of the selected process
```
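A schematic sketch of the data involved; the field names below are made up, and the actual register save/restore happens in kernel assembly:

```c
#include <stdint.h>

// Illustrative PCB layout: just enough state to suggest what a
// context switch saves and restores.
typedef enum { NEW, READY, RUNNING, BLOCKED, TERMINATED } pstate_t;

typedef struct pcb {
    int        pid;
    pstate_t   state;
    uint64_t   program_counter;
    uint64_t   registers[16];
    uint64_t   page_table_base;  // MMU state to reload on restore
    struct pcb *next;            // ready/blocked queue linkage
} pcb_t;

// Conceptual shape of a switch, mirroring the numbered steps above.
void context_switch(pcb_t *current, pcb_t *chosen)
{
    current->state = READY;      // 1-3: save state, update PCB, requeue
    chosen->state  = RUNNING;    // 4-5: the scheduler picked 'chosen'
    // 6-7: reload the MMU from chosen->page_table_base,
    //      restore registers and program counter (in assembly)
}
```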
System Calls
Process Creation
- The true system calls are “wrapped” in the OS libraries (e.g. `libc`) following a well-defined interface (e.g. `POSIX`, the `WIN32` API)
- System calls for process creation:
  - Unix: `fork()` generates an exact copy of the parent ⇒ `exec()` replaces the memory image
  - Windows: `NtCreateProcess()`
  - Linux: `clone()`
fork()
- `fork()` creates an exact copy of the current process
- `fork()` returns the process identifier of the child process to the parent process
- `fork()` returns 0 to the child process
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int iStatus;
    int iPID = fork();  // parent calls fork()
    if (iPID < 0)
    {
        printf("fork error\n");
        exit(1);
    }
    else if (iPID == 0)
    {
        // child: starts as an identical copy of the parent's code...
        printf("hello from child process\n");
        // ...until execl() overwrites its memory image with /bin/ls
        execl("/bin/ls", "ls", "-l", (char *) NULL);
    }
    else
    {
        // parent: fork() returned the child's PID
        waitpid(iPID, &iStatus, 0);
        printf("hello from parent process\n");
    }
    return 0;
}
```
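Note that the parent blocks in `waitpid()` until the child (by then running `ls`) terminates, so “hello from parent process” is always printed after the directory listing.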
Process Termination
- System calls are necessary to notify the OS that the process has terminated
- Resources must be de-allocated
- Output must be flushed
- Process admin may have to be carried out
- System calls for process termination:
  - UNIX/Linux: `exit()`, `kill()`
  - Windows: `TerminateProcess()`
Process Scheduling
Context
- The OS is responsible for managing and scheduling processes
- Decide when to admit processes to the system (new → ready)
- Decide which process to run next (ready → run)
- Decide when and which processes to interrupt (running → ready)
- The OS relies on the scheduler (dispatcher) to decide which process to run next, which uses a scheduling algorithm to do so
- The type of algorithm used by the scheduler is influenced by the type of operating system (e.g., real time vs. batch)
Classification by Time Horizon
- Long term: applies to new processes and controls the degree of multiprogramming by deciding which processes to admit to the system, and when
- A good mix of CPU and I/O bound processes is favourable to keep all resources as busy as possible
- Usually absent in popular modern OS
- Medium term: controls swapping and the degree of multi-programming
- Short term: decides which process to run next
- Manages the ready queue
- Invoked very frequently, hence must be fast
- Usually called in response to clock interrupts, I/O interrupts, or blocking system calls
Classification by Approach
- Non-preemptive: processes are only interrupted voluntarily (e.g. by an I/O operation or a “nice” system call such as `yield()`)
  - Windows 3.1 and DOS were non-preemptive
- Preemptive: processes can be interrupted forcefully or voluntarily
- This requires context switches, which generate overhead; too many of them should be avoided (recall the time slice trade-off above)
- Prevents processes from monopolising the CPU
- Most popular modern operating systems are preemptive
Performance Assessment
Criteria
- User oriented criteria:
- Response time: minimise the time between creating the job and its first execution
- Turnaround time: minimise the time between creating the job and finishing it
- Predictability: minimise the variance in processing times
- System oriented criteria:
- Throughput: maximise the number of jobs processed per hour
- Fairness:
- Are processing power/waiting time equally distributed?
- Are some processes kept waiting excessively long (starvation)?
- Evaluation criteria can be conflicting, i.e., improving the response time may require more context switches, and hence worsen the throughput and increase the turnaround time
Scheduling Algorithms
Overview
- Algorithms considered:
- First Come First Served (FCFS)/ First In First Out (FIFO)
- Shortest job first
- Round Robin
- Priority queues
- Performance measures used:
- Average response time: the average of the time taken for all the processes to start
- Average turnaround time: the average time taken for all the processes to finish
First Come First Served
- Concept: a non-preemptive algorithm that operates as a strict queueing mechanism and schedules the processes in the same order that they were added to the queue
- Advantages: positional fairness and easy to implement
- Disadvantages:
- Favours long processes over short ones (think of the supermarket checkout!)
- Could compromise resource utilisation, i.e., CPU vs. I/O devices
- Average response time = (0+7+9+15)/4 = 31/4 = 7.75
- Average turnaround time = (7+9+15+20)/4 = 51/4 = 12.75
- (This example assumes four processes arriving together at time 0 with burst times 7, 2, 6, and 5; the same job set is used for SJF and round robin below)
Shortest Job First
- Concept: a non-preemptive algorithm that starts processes in order of ascending processing time, using a provided/known estimate of the processing time
- Advantages: always results in the optimal average turnaround time (for jobs that are all available at once)
- Disadvantages:
- Starvation might occur
- Fairness and predictability are compromised
- Processing times have to be known beforehand
- Average response time = (0+2+7+13)/4 = 22/4 = 5.5
- Average turnaround time = (2+7+13+20)/4 = 42/4 = 10.5 (reproduced by the sketch below)
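A minimal sketch reproducing the FCFS and SJF averages above, assuming the job set already introduced (burst times 7, 2, 6, 5, all arriving at time 0); sorting the bursts turns FCFS into SJF:

```c
#include <stdio.h>
#include <stdlib.h>

// Average response and turnaround time for non-preemptive scheduling
// of jobs that all arrive at time 0, executed in array order.
static void averages(const char *name, const int *burst, int n)
{
    int start = 0, response = 0, turnaround = 0;
    for (int i = 0; i < n; i++) {
        response   += start;             // time until first execution
        turnaround += start + burst[i];  // time until completion
        start      += burst[i];
    }
    printf("%s: response %.2f, turnaround %.2f\n",
           name, (double) response / n, (double) turnaround / n);
}

static int ascending(const void *a, const void *b)
{
    return *(const int *) a - *(const int *) b;
}

int main(void)
{
    int jobs[] = {7, 2, 6, 5};
    averages("FCFS", jobs, 4);              // 7.75 and 12.75

    qsort(jobs, 4, sizeof jobs[0], ascending);
    averages("SJF ", jobs, 4);              // 5.50 and 10.50
    return 0;
}
```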
Round Robin
- Concept: a preemptive version of FCFS that forces context switches at periodic intervals or time slices
- Processes run in the order that they were added to the queue
- Processes are forcefully interrupted by the timer
- Advantages:
- Improved response time
- Effective for general purpose time sharing systems
- Disadvantages:
- Increased context switching and thus overhead
- Favours CPU bound processes (which usually run long) over I/O bound processes (which do not run long)
  - This can be mitigated by working with multiple queues (see feedback queues below)
- With large time slices, round robin reduces to FCFS
- The length of the time slice must be carefully considered!
- For instance, assuming a multi-programming system with preemptive scheduling and a context switch time of 1ms:
- E.g., a good (low) response time is achieved with a small time slice (e.g. 1ms) ⇒ low throughput
- E.g., a high throughput is achieved with a large time slice (e.g. 1000ms) ⇒ high response time
- If a time slice is only used partially, the next process starts immediately
- Average response time = (0+1+2+3)/4 = 6/4 = 1.5
- Average turnaround time = (6+17+19+20)/4 = 62/4 = 15.5 (reproduced by the sketch below)
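A sketch of this round robin run under the same assumptions (bursts 7, 2, 6, 5 arriving together, a time slice of 1, and context switch costs ignored); it reproduces the averages above:

```c
#include <stdio.h>

#define N 4

// Round robin with quantum 1 and zero switch cost: cycle over the
// jobs, giving each one time unit per turn until it completes.
int main(void)
{
    int remaining[N] = {7, 2, 6, 5};
    int first_run[N], completion[N];
    int time = 0, left = N;

    for (int i = 0; i < N; i++) first_run[i] = -1;

    while (left > 0) {
        for (int i = 0; i < N; i++) {
            if (remaining[i] == 0) continue;
            if (first_run[i] < 0) first_run[i] = time;  // response time
            time++;                                     // run one quantum
            if (--remaining[i] == 0) {
                completion[i] = time;                   // turnaround time
                left--;
            }
        }
    }

    double response = 0, turnaround = 0;
    for (int i = 0; i < N; i++) {
        response   += first_run[i];
        turnaround += completion[i];
    }
    printf("response %.2f, turnaround %.2f\n", response / N, turnaround / N);
    return 0;
}
```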
Priority Queues
- Concept: A preemptive algorithm that schedules processes by priority (high → low)
- The process priority is saved in the process control block
- Advantages: can prioritise I/O bound jobs
- Disadvantages: low priority processes may suffer from starvation (with static priorities)
- Average response time = (0+1+11+13)/4 = 25/4 = 6.25
- Average turnaround time = (10+11+13+20)/4 = 54/4 = 13.5
Threads
Threads from an OS Perspective
- A process consists of two fundamental units:
- Resources
- A logical address space containing the process image (program, data, heap, stack)
- Files, I/O devices, I/O channels, . . .
- Execution trace: an entity that gets executed
- A process can share its resources between multiple execution traces
- i.e., multiple threads running in the same resource environment
- Every thread has its own execution context (e.g. program counter, stack, registers)
- All threads have access to the process’ shared resources
- E.g. files: if one thread opens a file, all threads of the same process can access it
- Global variables, memory, etc. (⇒ synchronisation!)
- Some CPUs (hyperthreaded ones) have direct hardware support for multi-threading
- They can offer up to 8 hardware threads per core
Threads vs Processes
- Similar to processes, threads have:
- States
- Transitions
- A thread control block
- Threads incur less overhead to create/terminate/switch
- Address space remains the same for threads of the same process
- Inter-thread communication is easier/faster than interprocess communication
- (threads share memory by default)
- No protection boundaries are required in the address space
- (threads are cooperating, belong to the same user, and have a common goal)
- Synchronisation has to be considered carefully!
Why Use Threads
- Multiple related activities apply to the same resources, these resources should be accessible/shared
- Processes will often contain multiple blocking tasks
- I/O operations (thread blocks, interrupt marks completion)
- Memory accesses (page faults result in blocking)
- Such activities should be carried out in parallel/concurrently
OS Implementations of Threads
- User threads
- Kernel threads
- Hybrid implementations
3 Types of Threads
User Threads (Many-to-One)
- Thread management is carried out in user space with the help of a user library
- (creating, destroying, scheduling, thread control block manipulation)
- The process maintains a thread table managed by the runtime system without the kernel’s knowledge
- Similar to process table
- Used for thread switching
- Tracks thread related information
- Advantages:
- Threads are in user space
- (no mode switches required)
- Full control over the thread scheduler
- OS independent
- (threads can run on OS that do not support them)
- Disadvantages:
- Blocking system calls suspend the entire process
- (user threads are mapped onto a single process, managed by the kernel)
- No true parallelism
- (a process is scheduled on a single CPU)
- Clock interrupts are non-existent
- (user threads are non-preemptive)
- Page faults result in blocking the process
Kernel Threads (One-to-One)
- The kernel manages the threads, user application accesses threading facilities through API and system calls
- Thread table is in the kernel, containing thread control blocks (subset of process control blocks)
- If a thread blocks, the kernel chooses a thread from the same or a different process (↔ user threads)
- Advantages:
- True parallelism can be achieved
- No run-time system needed
- Disadvantages:
- Frequent mode switches take place, resulting in lower performance
Windows and Linux apply this approach
User Threads vs. Kernel Threads vs. Processes on Performance
- Null fork: the overhead in creating, scheduling, running and terminating a null process/thread
- Signal wait: overhead in synchronising threads
Hybrid Implementations (Many-to-Many)
- User threads are multiplexed onto kernel threads
- Kernel sees and schedules the kernel threads (a limited number)
- User application sees user threads and creates/schedules these (an “unrestricted” number)
Comparison
Thread Management
Libraries
- Thread libraries provide an API/interface for managing threads
- (e.g. creating, running, destroying, synchronising, etc.)
- Thread libraries can be implemented:
- Entirely in user space
- (i.e. user threads)
- Based on system calls
- (i.e. rely on the kernel for thread implementations)
Examples of thread APIs include POSIX’s PThreads, Windows Threads, and Java Threads.
POSIX Threads
POSIX threads are a specification that “anyone” can implement, i.e., it defines a set of APIs (function calls, over 60 of them) and what they do.
Core functions of PThreads include `pthread_create`, `pthread_exit`, `pthread_join`, `pthread_yield`, `pthread_attr_init`, and `pthread_attr_destroy`.
More detailed descriptions can be found using `man function_name` on the command line.
```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUMBER_OF_THREADS 10

void * hello_world(void * tid)
{
    printf("HELLO from thread %d\n", *((int *) tid));
    pthread_exit(NULL);
}

int main(int argc, char * argv[]) {
    int aiIds[] = {1,2,3,4,5,6,7,8,9,10};
    pthread_t threads[NUMBER_OF_THREADS];
    int i;
    for(i = 0; i < NUMBER_OF_THREADS; i++) {
        // pthread_create returns 0 on success, an error number otherwise
        if(pthread_create(&threads[i], NULL, hello_world, (void *) &aiIds[i]) != 0)
        {
            printf("Creating thread %d failed\n", i);
            exit(-1);
        }
    }
    for(i = 0; i < NUMBER_OF_THREADS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}
```
Scheduling in Windows 7: Multi-level Feedback Queues
Moving Beyond Priority Queues
Different scheduling algorithms can be used for the individual queues (e.g., round robin, SJF, FCFS).
- Feedback queues allow priorities to change dynamically, i.e., jobs can move between queues:
- Move to lower priority queue if too much CPU time is used (prioritise I/O and interactive processes)
- Move to higher priority queue to prevent starvation and avoid priority inversion (illustrated below)
Priority Inversion
```
Process A (low)     Process B (high)     Process C (high)
request X
receive X
...                 RUN
...                 request X
...                 blocked              RUN
...                 ...                  ...
```
Here the low-priority process A holds resource X, the high-priority process B blocks waiting for X, and C keeps the CPU; A never gets to run and release X, so B is effectively held up by C – its priority has been “inverted”. Temporarily boosting A’s priority lets it release X.
- Characteristics of feedback queues:
- The number of queues
- The scheduling algorithms used for the individual queues
- Migration policy between queues
- Initial access to the queues
- Feedback queues are highly configurable and offer significant flexibility
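A minimal sketch of a multi-level feedback queue, assuming two levels, illustrative quanta, and demotion after a job uses its full quantum; within a level, jobs are picked in array order for brevity (real schedulers use round robin per queue and many more levels):

```c
#include <stdio.h>

#define LEVELS 2
#define JOBS   3

// Illustrative quantum per level: the high-priority queue gets short
// slices (good response); the low-priority queue gets longer ones.
static const int quantum[LEVELS] = {1, 4};

typedef struct { int id; int remaining; int level; } job_t;

int main(void)
{
    job_t jobs[JOBS] = { {0, 3, 0}, {1, 6, 0}, {2, 2, 0} };
    int time = 0, left = JOBS;

    while (left > 0) {
        // pick the first runnable job in the highest-priority level
        job_t *j = NULL;
        for (int lvl = 0; lvl < LEVELS && !j; lvl++)
            for (int i = 0; i < JOBS && !j; i++)
                if (jobs[i].remaining > 0 && jobs[i].level == lvl)
                    j = &jobs[i];

        int q   = quantum[j->level];
        int run = j->remaining < q ? j->remaining : q;
        printf("t=%2d: job %d runs %d unit(s) at level %d\n",
               time, j->id, run, j->level);
        time += run;
        j->remaining -= run;

        if (j->remaining == 0)
            left--;
        else if (run == q && j->level < LEVELS - 1)
            j->level++;  // used its full quantum: demote (CPU bound)
    }
    return 0;
}
```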
Windows 7
- An interactive system using a preemptive scheduler with dynamic priority levels
- Two priority classes with 16 different priority levels exist
- “Real time” processes/threads have a fixed priority level
- “Variable” processes/threads can have their priorities boosted temporarily
- A round robin algorithm is used within the queues
- Priorities are based on the process base priority (between 0-15) and thread base priority (±2 relative to the process priority)
- A thread’s priority dynamically changes during execution between its base priority and the maximum priority within its class
- Interactive I/O bound processes (e.g. keyboard) receive a larger boost
- Boosting priorities prevents priority inversion
Scheduling in Linux
The Completely Fair Scheduler
- Process scheduling has evolved over different versions of Linux to account for multiple processors/cores, processor affinity, and load balancing between cores
- Linux distinguishes between two types of tasks for scheduling:
- Real time tasks (to be POSIX compliant), divided into:
- Real time FIFO tasks
- Real time Round Robin tasks
- Time sharing tasks using a preemptive approach (similar to variable in Windows)
- The most recent scheduling algorithm in Linux for time sharing tasks is the “completely fair scheduler” (CFS), which replaced the earlier O(1) scheduler in kernel 2.6.23
Real-Time Tasks
- Real time FIFO tasks have the highest priority and are scheduled using a FCFS approach, using preemption if a higher priority job shows up
- Real time round robin tasks are preemptable by clock interrupts and have a time slice associated with them
- Neither approach can guarantee hard deadlines
Time Sharing Tasks (equal priority)
- The CFS divides the CPU time between all processes
- If all N processes have the same priority:
- They will be allocated a “time slice” equal to 1/N times the available CPU time
- I.e., if N equals 5, every process will receive 20% of the processor’s time
- The length of the time slice and the “available CPU time” are based on the targeted latency (⇒ every process should run at least once during this interval)
- If N is very large, the context switch time will be dominant, hence a lower bound on the “time slice” is imposed by the minimum granularity
- A process’s time slice can be no less than the minimum granularity (response time will deteriorate)
Time Sharing Tasks (different priority)
- A weighting scheme is used to take different priorities into account
- If processes have different priorities:
- Every process \(i\) is allocated a weight \(W_i\) that reflects its priority
- The “time slice” allocated to process \(i\) is then proportional to \(\frac{W_i}{\sum_{j \in N} W_j}\)
- The tasks with the lowest proportional amount of “used CPU time” are selected first
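A sketch of this weighting arithmetic with made-up weights (the real CFS derives weights from nice values via a fixed table and picks tasks by lowest virtual runtime rather than computing slices this way):

```c
#include <stdio.h>

#define N 3

// Targeted latency: every runnable process should get a slice within
// this window. The weights are illustrative, not real nice-to-weight
// values.
int main(void)
{
    double targeted_latency_ms = 20.0;
    double weight[N] = {1.0, 2.0, 5.0};  // higher weight = higher priority

    double total = 0.0;
    for (int i = 0; i < N; i++) total += weight[i];

    // slice_i = targeted latency * W_i / sum of all W_j
    for (int i = 0; i < N; i++) {
        double slice = targeted_latency_ms * weight[i] / total;
        printf("process %d: slice = %.2f ms\n", i, slice);
    }
    return 0;
}
```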
Multi-processor Scheduling
Scheduling Decisions
- Single processor machine: which process (thread) to run next (one dimensional)
- Scheduling decisions on a multi-processor/core machine include:
- Which process (thread) to run where, i.e., which CPU?
- Which process (thread) to run when?
Shared Queues
- A single or multi-level queue shared between all CPUs
- Advantage: automatic load balancing
- Disadvantages:
- Contention for the queues (locking is needed)
- “All CPUs are equal, but some are more equal than others” : does not account for processor affinity:
- Cache becomes invalid when moving to a different CPU
- Translation lookaside buffers (TLBs – part of the MMU) become invalid
- Windows will allocate the highest priority threads to the individual CPUs/cores
Private Queues
- Each CPU has a private (set) of queues
- Advantages:
- CPU affinity is automatically satisfied
- Contention for shared queue is minimised
- Disadvantages: less load balancing
- Push and pull migration between CPUs is possible
Related vs. Unrelated Threads
Thread Types
- Related: multiple threads that communicate with one another and ideally run together (e.g. search algorithm)
- Unrelated: threads that are independent of one another, e.g. belonging to processes started by different users running different programs
Scheduling Related Threads
Working Together
- E.g. threads belonging to the same process that cooperate, e.g. by exchanging messages or sharing information:
  - Process A has threads A0 and A1, which cooperate
  - Process B has threads B0 and B1, which cooperate
- Suppose the scheduler selects A0 and B1 to run first, then A1 and B0: the cooperating pairs (A0/A1 and B0/B1) run on different CPUs but never at the same time
  - When they try to send messages to their partner threads, those threads are still in the ready state, so the senders end up waiting
- The aim is to get threads running, as much as possible, at the same time across multiple CPUs
- Approaches include:
- Space sharing
- Gang scheduling
Space Sharing
- Approach:
- N threads are allocated to N dedicated CPUs
- N threads are kept waiting until N CPUs are available
- Non-preemptive, i.e. blocking calls result in idle CPUs (less context switching overhead but results in CPU idle time)
- The number N can be dynamically adjusted to match processor capacity
Gang Scheduling
- Time slices are synchronised and the scheduler groups threads together to run simultaneously (as much as possible)
- A preemptive algorithm
- Blocking threads result in idle CPUs