OS Note Chapter 3: Processes

2023-03-27

5 min read

Note , Learning

Process Concept

Process

a program in execution in memory
execution must progress in sequential fashion

There can be several processes for one program

Consider multiple users executing the same program

Memory [Process (active)]
A
| Mouse click or command input
| Loader loads program
|
Disk [Program: Executable file <= Object file <= Source file (passive)]

ELF Object File Format: ELF (Executable and Linkable Format , Generic name: ELF binaries)

A process includes multiple parts

the program code (also called text section)
stack containing temporary data
- e.g., function parameters, return addresses, local variables
data section containing global variables and static variables
heap containing memory dynamically allocated during run time
program counter, processor registers (include all current data of the program inside the CPU)

Process State

As a process executes, it changes state

New: The process is being created
Running: Instructions are being executed by CPU
Waiting: The process is waiting for some event to occur
Ready: The process is waiting to be assigned to a processor
Terminated: The process has finished execution

Page 8, object 53

Representation of Process Scheduling

Page 8, object 56

Process Control Block (PCB)

All the PCBs together is how the kernel keeps track of which processes exist in memory, where they are in memory, what they are currently doing (executing, waiting, …), etc.

Threads

If process has a single thread of execution

One program counter

If a process has multiple threads of execution

Kernel will keep the control information for each thread
Multiple program counters
Explored in detail in Chapter 4

Object 75

Process Scheduling

Several processes want to use one CPU (or CPU core)
Process scheduler (调度) (algorithm inside the kernel, software) selects among available processes(i.e. in ready state) for next execution on CPU core
- Maintains scheduling queues of processes
  - Ready queue – set of all processes residing in main memory, ready and waiting to execute
  - Wait queues – set of processes waiting for an event (i.e. I/O)
Processes migrate among the various queues
Scheduling purpose: maximize CPU use, quickly switch processes

Context Switch

A context switch occurs when the CPU switches from one process to another.

Page 14, object 85

When CPU switches to another process, the system must
- save the state of the old process (the one is running on CPU)
- load the saved state (CPU registers, program counter in PCB) for the new process (the one will run on CPU) via a context switch (i.e., switch PCB)
Context-switch time is overhead, the system does no useful work while switching
- dependent on the complexity of OS. The more complex the OS and the PCB -> the longer time the context switch
- also dependent on hardware support
  - some hardware provides multiple sets of registers per CPU -> multiple contexts loaded at once, i,e, no need to switch context when load another process to run

Multitasking in Mobile Systems

iOS

At the beginning, supported only single task, but now multi-tasking

Due to screen (UI) real estate limitation, iOS provides

Single foreground process- controlled via user interface (process with active window focus / input)
Multiple background processes – in memory, running, but not on the display, and with limits
- Types limits include single, short task, receiving notification of events, specific long-running tasks like audio playback

Android

Always supports multi-tasking runs foreground and background, with fewer limits.

Background process uses a service to

perform tasks
keep running even if background process is suspended

Service is a separate application component that runs on behalf of the background process, has no user interface, small memory use.

No constraints on background applications types.

Operations on Processes

OS must provide mechanisms for:

process creation
process termination

Process Creation

Parent process

create children processes,
then, in turn create other processes,
finally forming a tree of processes

Generally, process is identified and managed via a process identifier (pid)

Page 19, object 96

Parent and children

Resource sharing options
1. Parent and children share all resources
2. Children share subset of parent’s resources
3. Parent and child share no resources
Execution options
1. Parent and children execute concurrently
2. Parent waits until children terminate

UNIX/Linux examples

fork() system call creates new process, create a duplicate of the parent
exec() system call used after a fork() to replace the process’ memory space with a new program
Parent process calls wait() for the child to terminate

Page 21, object 103

fork: Child duplicate of parent

exec: Child has a program loaded into it

Process Creation Procedure

……

Process Termination

Child process asks OS to terminate itself using the exit() system call.

Returns status data from child to parent (via wait())
Process’ resources are deallocated by operating system

Parent may terminate the execution of children processes (use abort() system call) for some reasons:

Child has exceeded allocated resources
Task assigned to child is no longer required
Parent terminated (some OSes do not allow children to be alive)

Cascading (倾泻式的) termination

Some operating systems do not allow child to exist if its parent has terminated.
When parent is terminated, all children, grandchildren, etc. are terminated.
The termination is initiated by the operating system.

wait() system call

The parent process may wait for termination of a child process
The call returns status information and the pid of the terminated process

Zombie and Orphan

A zombie (僵尸) process is

living corpse, half alive and half dead
terminated, but still consumes system resources
- still has an entry in the process table where the entry is still needed to allow the parent process to read its child’s exit status.
- once the exit status is read by parent via the wait system call, the zombie’s entry is removed from the process table (“reaped“).

An orphan (孤儿) process is

child process that is still running
but parent process has finished or terminated.

Zombie

Reaping (回收)
- Performed by parent on terminated child
- Parent is given exit status information (by OS)
- Kernel discards process
What if parent doesn’t reap?
- If any parent terminates without reaping a child, then child will be reaped by init or system process
- So, only reaping is needed explicitly reaping when parent is a
  - long-running processes
    - e.g., shells and servers

Inter-process Communication (IPC)

Processes within a system may be independent or cooperating

Independent process cannot affect or be affected by another process
Cooperating process can affect or be affected by other processes, because they share data

Advantages/Reasons of process cooperation

Information sharing
Computation speed-up
Modularity
Convenience

Disadvantages

Added complexity
Deadlocks(死锁) possible
Starvation(饥饿) possible

Communications Models

Cooperating processes need interprocess communication (IPC)

Two models of IPC:

Shared memory //user processes control
Message passing //kernel control

Page 38, object 155

Interprocess Communication – Shared Memory

Processes communicate through a shared memory
User processes control
Major issues
- Synchronization (同步)
  - Discussed in Chapters 6 & 7.

Producer-Consumer Problem

Paradigm (范例) for cooperating processes

Producer process produces information that is consumed by a consumer process

unbounded-buffer: no practical limit on the size of the buffer
- Producer never has to wait because there is always extra space available for new information;
- Only consumer might have to wait if no information is available to read
bounded-buffer: buffer size is fixed
- Producer might have to wait if there is no space available to store new information;
- Consumer might have to wait if no information is available to read

Interprocess Communication – Message Passing

Message passing

Kernel control
two operations (message size is either fixed or variable):
- send (message)
- receive(message)

Implementation of communication link

A communication link between processes must be created before communication.

Physical level:
- Shared memory
- Hardware bus
- Network
Logical level:
- Direct (process to process) or indirect (mail box)
- Synchronous (blocking) or asynchronous (non-blocking)
- Automatic or explicit buffering

Message Passing: Direct Communication

Processes must name each other explicitly.
Send() and receive() primitives (原语) are defined as
- send(P, message) – send a message to process P
- receive(Q, message) – receive a message from process Q
Properties of communication link
- Links are established automatically
- A link is associated with exactly one pair of communicating processes
- Between each pair there exists exactly one link
- The link may be unidirectional, but is usually bi-directional

Message Passing: Indirect Communication

Messages are directed and received from mailboxes (also referred to as ports)
- Each mailbox has a unique id
- Processes can communicate only if they share a mailbox
The send() and receive() primitives are defined as
- send(A, message) – send a message to mailbox A
- receive(A, message) – receive a message from mailbox A
Operations
- create a new mailbox (port)
- send and receive messages through mailbox
- destroy a mailbox
Properties of communication link
- Link established only if processes share a common mailbox
- A link may be associated with many processes
- Each pair of processes may share several communication links
- Link may be unidirectional or bi-directional

Message Passing: Synchronization(同步)

Message passing may be either blocking or non-blocking
Blocking is considered synchronous
- Blocking send – the sender is blocked until the message is received
- Blocking receive – the receiver is blocked until a message is available
Non-blocking is considered asynchronous
- Non-blocking send – the sender sends the message and continues without waiting for the message to be received
- Non-blocking receive – the receiver receives:
  - A valid message, or
  - Null message
Different combinations possible
- If both send and receive are blocking, this case is called rendezvous (会合)

Message Passing: Buffering

Queue of messages attached to the link, in kernel memory
Implemented in one of three ways
1. Zero capacity – no messages are queued on a link.
  
  Sender must wait for receiver (rendezvous)
2. Bounded capacity – finite length of n messages
  
  Sender must wait if link full
3. Unbounded capacity – infinite length
  
  Sender never waits

Examples of IPC Systems – Windows

Message-passing centric via advanced local procedure call (ALPC) facility

Only works between processes on the same system
Uses ports (like mailboxes) to establish and maintain communication channels
Communication works as following steps:
1. The client opens a handle (句柄,标示资源) to the subsystem’s connection port object.
2. The client sends a connection request.
3. The server creates two private communication ports and returns the handle to one of them to the client.
4. The client and server use the corresponding port handle to send messages or callbacks and to listen for replies.

Pipes

Acts as a conduit (管道) allowing two processes to communicate on the same computer

Ordinary (anonymous(匿名)) pipes – cannot be accessed from outside the process that created it. Typically, a parent process creates a pipe and uses it to communicate with a child process that it created.
Named pipes – can be accessed without a parent-child relationship.

Ordinary Pipes

Ordinary Pipes allow communication in standard producer-consumer style
Producer writes to one end (the write-end of the pipe)
Consumer reads from the other end (the read-end of the pipe)
Ordinary pipes are therefore unidirectional (单向)
- create two separate pipes if bidirectional communication is necessary
Require parent-child relationship between communicating processes

Page 56, object 202

Named Pipes

Named Pipes are more powerful than ordinary pipes
Communication is bidirectional
No parent-child relationship is necessary between the communicating processes
Several (>=2) processes can use the named pipe for communication
Provided on both UNIX and Windows systems

Communication in Client-Server Systems

Sockets

A socket (套接字)

endpoint for communication
a data structure inside the kernel that represents the local end of a network connection
a number included at start of message packet to differentiate network services on a host、
- Concatenation of IP address and port
  - E.g., 161.25.19.8:1625
  - port 1625 on host 161.25.19.8
- All ports below 1024 are well known, used for standard services
- Special IP address 127.0.0.1 (loopback) to refer to system on which process is running
Communication happens between a pair of sockets, one on the local host and one on the remote host

Remote Procedure Calls (RPC)

Remote procedure call (RPC) abstracts procedure calls between processes on networked systems.

Looks like a normal function call but done through the network
Uses ports for service differentiation

Remote communication has more failure scenarios than local

Messages can be delivered exactly once rather than at most once

OS typically provides a rendezvous (or matchmaker) service to connect client and server

Stubs

manages the network connection between client and server
extra code on the client side and server side
Typically, a separate stub exists for each separate remote procedure
On Windows, stub code compile from specification written in Microsoft Interface Definition Language (MIDL)

Marshal

Use External Data Representation (XDR) format to account for different CPU architectures
Data representation can be different in different CPU
- e.g., data: 0xAABBEEFF address: 0x1000, 0x1001, 0x1002, 0x1003

Big-endian (大端法)

e.g., ARM AA BB EE FF

0x1000 0x1001 0x1002 0x1003

Little-endian (小端法, intel),

e.g., Intel AA BB EE FF

0x1003 0x1002 0x1001 0x1000

On the client side, parameter marshalling involves converting the machine-dependent data into XDR before they are sent to server.
On the server side, the XDR data are un-marshalled and converted to the machine-dependent representation for the server.

RPC communication steps

The client-side stub (server proxy)
1. locates the server,
2. marshals the parameters,
3. sends parameters to server-side stub in a network message
The server-side stub (client proxy)
1. receives this message,
2. unpacks the marshalled parameters,
3. performs the procedure call on the server, marshals the result of the call, and
4. sends it back to the client-side stub in another message
The client-side stub
1. receives this second message,
2. unpacks the marshalled result, and
3. gives it back to the client that did the RPC

Page 69, object236