The Abstraction: Address Spaces
Early Systems
In early systems, the OS was a set of routines that sat in memory, and a single running program occupied the rest of physical memory.
Multiprogramming and Time Sharing
With multiprogramming and time sharing, processes are left in memory while the OS switches between them, which lets the OS implement time sharing efficiently. Each process has a small part of the 512KB physical memory carved out for it. With multiple programs residing concurrently in memory, protection becomes an important issue.
The Address Space
The OS creates an easy-to-use abstraction of physical memory called the address space, which is the running program's view of memory in the system.
The address space of a process contains all of the memory state of the running program, like the code, stack and heap.
In Figure 13.3, the heap and the stack are placed at opposite ends of the address space so that both can grow, in opposite directions, into the free space between them.
The program really isn't in memory at physical addresses 0 through 16KB; rather it is loaded at some arbitrary physical addresses.
Crux: How to virtualize memory?
We say the OS is virtualizing memory because the running program thinks it is loaded into memory at a particular address (say 0) and has a potentially very large address space (say 32 bits or 64 bits).
When process A in Figure 13.2 tries to perform a load at address 0 (which we call a virtual address), the OS, in tandem with some hardware support, will make sure that the load doesn't actually go to physical address 0 but rather to physical address 320KB (where A is loaded into memory).
Goals
- Transparency: The OS should implement virtual memory in a way that is invisible to the running program
- Efficiency: The OS should strive to make the virtualization as efficient as possible, both in terms of time and space
- Protection: The OS should make sure to protect processes from one another, as well as the OS itself from processes. Protection enables us to deliver the property of isolation among processes; each process should be running in its own isolated cocoon
Memory API
Crux: How to allocate and manage memory?
Types of Memory
In running a C program, there are two types of memory that are allocated.
The first is called stack memory, and allocations and deallocations of it are managed implicitly by the compiler for the programmer; for this reason it is also called automatic memory. The program, while it is running, uses a stack to keep track of where it is in the function call chain as well as to allocate local variables and pass parameters and return values to and from routines.
The second is called heap memory, where all allocations and deallocations are explicitly handled by the programmer. The heap is used for dynamically-allocated, user-managed memory, such as memory obtained from a call to malloc() in C or new in C++ or Java. Statically-initialized variables are not part of the heap; they occupy their own region of the address space.
The malloc() Call
Pass it a size asking for some room on the heap, and it either succeeds and gives back a pointer to the newly-allocated space, or fails and returns NULL.
The free() Call
To free heap memory that is no longer in use, call free(). The routine takes one argument, a pointer that was returned by malloc().
Underlying OS Support
malloc() and free() are not system calls, but rather library calls. However, they are built on top of some system calls.
brk is a system call that is used to change the location of the program's break: the location of the end of the heap. It takes one argument (the address of the new break), and thus either increases or decreases the size of the heap based on whether the new break is larger or smaller than the current break.
You can also obtain memory from the operating system via the mmap() call. By passing in the correct arguments, mmap() can create an anonymous memory region within the program: a region which is not associated with any particular file but rather with swap space.
Other Calls
calloc() allocates memory and also zeroes it before returning.
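realloc() is useful when an allocated region needs to grow: it makes a new, larger region of memory, copies the old region into it, and returns a pointer to the new region (which may or may not be at the same address as the old one).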