Pages Project

For this project you will use pseudo files from Linux’s /proc filesystem to determine the current virtual to physical page mapping for the stack and the heap of a process.

You will write 3 programs for this project. In stack_allocate.c and heap_allocate.c you will write programs that allocate a specified number of pages on the stack and heap respectively. The purpose of these two programs is so that you have controlled processes that you can use to test the program that you write in pages.c, which will print out the current page mappings for the stack and the heap of a process.

Be sure to read all of these instructions before you begin.

/proc Files

Pseudo files that expose information about a specific process are stored in /proc/<pid>/ where <pid> is the PID of the process we care about. For example, /proc/582/ contains information about the process with the PID 582.

The maps Pseudo File

Reading /proc/<pid>/maps gives you the current ranges of virtual addresses that are mapped for that process. Here is an example:

00400000-00401000 r-xp 00000000 00:30 29253                              /home/WOOAD/nsommer/maps
00600000-00601000 r--p 00000000 00:30 29253                              /home/WOOAD/nsommer/maps
00601000-00602000 rw-p 00001000 00:30 29253                              /home/WOOAD/nsommer/maps
0216d000-0218e000 rw-p 00000000 00:00 0                                  [heap]
7f181bd65000-7f181bf03000 r-xp 00000000 00:20 245330                     /lib64/libc-2.19.so
7f181bf03000-7f181c103000 ---p 0019e000 00:20 245330                     /lib64/libc-2.19.so
7f181c103000-7f181c107000 r--p 0019e000 00:20 245330                     /lib64/libc-2.19.so
7f181c107000-7f181c109000 rw-p 001a2000 00:20 245330                     /lib64/libc-2.19.so
7f181c109000-7f181c10d000 rw-p 00000000 00:00 0
7f181c10d000-7f181c12e000 r-xp 00000000 00:20 245322                     /lib64/ld-2.19.so
7f181c312000-7f181c315000 rw-p 00000000 00:00 0
7f181c32b000-7f181c32d000 rw-p 00000000 00:00 0
7f181c32d000-7f181c32e000 r--p 00020000 00:20 245322                     /lib64/ld-2.19.so
7f181c32e000-7f181c32f000 rw-p 00021000 00:20 245322                     /lib64/ld-2.19.so
7f181c32f000-7f181c330000 rw-p 00000000 00:00 0
7ffe195dd000-7ffe195fe000 rw-p 00000000 00:00 0                          [stack]
7ffe195fe000-7ffe19600000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Note the first 3 lines contain mappings to the program that the process is running. This is where the process accesses the instructions of the program. Most of the other mappings are for shared libraries, such as libc which provides functions from the C standard library.

The only lines we care about for this project are the [stack] and [heap] lines. In the example above, the virtual address range 7ffe195dd000-7ffe195fe000 is currently being used for the stack. The first address in the range is the lowest address of the stack region, and the second address is the first byte past the end of the region, so the end address itself is not part of the stack.

Note that you’ll need to work with page numbers, not addresses, when determining the mappings. The address 7ffe195dd000 is the first address in page number 7ffe195dd. If the page size is 4096 bytes (2 to the power of 12), then the 12 least significant bits of the address (all zeros here) are the offset from the beginning of the page. To go from an address to a page number, divide the address by the page size, which is equivalent to shifting the address 12 bits to the right.
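The conversion described above can be sketched in a few lines of C. The function names address_to_page and system_page_size are made up for this illustration and are not part of the assignment; the sketch assumes you query the page size with sysconf() rather than hard-coding 4096.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: convert a virtual address to its virtual page
 * number by dividing by the page size (a 12-bit right shift when the
 * page size is 4096). */
uint64_t address_to_page(uint64_t address, uint64_t page_size)
{
    return address / page_size;
}

/* Hypothetical helper: ask the system for its page size in bytes. */
uint64_t system_page_size(void)
{
    return (uint64_t)sysconf(_SC_PAGESIZE);
}
```

With a 4096-byte page size, address_to_page(0x7ffe195dd000, 4096) gives 0x7ffe195dd, matching the example above.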

There will always be only one entry in this file for the stack, since the stack is always located in a contiguous area of virtual memory. It is possible that there will be multiple lines for the heap, since the heap does not have the same requirement.
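One possible way to pull the address range out of a single line of the maps file is sketched below. The function name parse_region is hypothetical, and the sketch assumes you read the file one line at a time (for example with fgets()) and pass each line in along with the label you are looking for.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` mentions `label` (such as "[heap]" or
 * "[stack]"), parse the two hexadecimal addresses at the start of the
 * line into *start and *end and return 1. Otherwise return 0.
 * Simplification: a file path that happens to contain the label text
 * would also match. */
int parse_region(const char *line, const char *label,
                 uint64_t *start, uint64_t *end)
{
    if (strstr(line, label) == NULL) {
        return 0;
    }
    return sscanf(line, "%" SCNx64 "-%" SCNx64, start, end) == 2;
}
```

Because there may be several [heap] lines, a caller would likely keep checking every line of the file rather than stopping at the first match.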

The pagemap Pseudo File

The pseudo file /proc/<pid>/pagemap is used to get the current virtual to physical page mapping for a specific page. The information for each virtual page is packed into 8 bytes (64 bits). To read the mapping information for a specific virtual page you must seek to byte virtual_page * 8.

For example, to access the information about virtual page 256, you need to seek to position 256 * 8 in the file and then read the next 64 bits.
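The seek-then-read step can be sketched as follows. The function name read_pagemap_entry is hypothetical, and the sketch assumes the caller has already opened the pagemap file with open() and passes in the file descriptor. It also assumes a 64-bit off_t, which is the case on 64-bit Linux.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read the 64-bit entry for `virtual_page` from an
 * already-open pagemap file descriptor. Returns 0 on success and -1 if
 * the seek fails or fewer than 8 bytes could be read. */
int read_pagemap_entry(int fd, uint64_t virtual_page, uint64_t *entry)
{
    /* Each entry is 8 bytes, so entry N starts at byte N * 8. */
    off_t offset = (off_t)(virtual_page * sizeof(uint64_t));

    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        return -1;
    }
    if (read(fd, entry, sizeof(*entry)) != (ssize_t)sizeof(*entry)) {
        return -1;
    }
    return 0;
}
```

Reading the entry straight into a uint64_t works here because the file holds the values in the machine's native byte order.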

On some kernel versions (Linux 4.2 and later, according to the kernel documentation linked below) this pseudo file must be read as root, or it will always give you a physical page frame number of 0.

More information about the structure of the information is here: https://www.kernel.org/doc/Documentation/vm/pagemap.txt

You will need to use that documentation to figure out how to extract the required information from the 64 bits representing the virtual page.
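As a starting point, here is a minimal sketch of the bit manipulation involved. The macro and function names are made up for this illustration. The bit positions (bit 63 for present, bit 62 for swapped, bits 0-54 for the page frame number when the page is present) are taken from the kernel documentation linked above, and you should confirm them there yourself.

```c
#include <stdint.h>

/* Bit positions as described in the kernel's pagemap documentation. */
#define PAGEMAP_PRESENT  (1ULL << 63)
#define PAGEMAP_SWAPPED  (1ULL << 62)
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)   /* bits 0-54 */

/* Hypothetical helpers that pick one field out of a 64-bit entry. */
int entry_is_present(uint64_t entry)
{
    return (entry & PAGEMAP_PRESENT) != 0;
}

int entry_is_swapped(uint64_t entry)
{
    return (entry & PAGEMAP_SWAPPED) != 0;
}

/* Only meaningful when the present bit is set. */
uint64_t entry_frame_number(uint64_t entry)
{
    return entry & PAGEMAP_PFN_MASK;
}
```

Masking instead of shifting keeps the frame number in the low bits, so it can be printed directly as a hexadecimal page number.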

Memory Allocating Programs

To help test your pages.c program, write programs in stack_allocate.c and heap_allocate.c. Both of these take a single command line argument which is the number of pages worth of memory to allocate. For example, if your page size is 4 KB and you run stack_allocate like this:

./stack_allocate 2

Then the program must allocate 8 KB of memory on the stack by creating an array.
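Turning the command line argument into a byte count might look something like the sketch below. The function name bytes_for_pages is hypothetical, and the sketch assumes the argument must be a positive decimal integer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse `arg` as a page count and store the
 * matching number of bytes in *bytes. Returns 0 on success and -1 if
 * the argument is not a positive integer or the size would overflow. */
int bytes_for_pages(const char *arg, size_t page_size, size_t *bytes)
{
    char *end = NULL;

    errno = 0;
    long pages = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0' || pages <= 0) {
        return -1;
    }
    if (page_size == 0 || (size_t)pages > SIZE_MAX / page_size) {
        return -1;
    }
    *bytes = (size_t)pages * page_size;
    return 0;
}
```

In stack_allocate the resulting size could then be used to declare a local array such as char buffer[bytes]; (a C99 variable-length array). Keep in mind that the stack has a size limit, so very large page counts will crash the program.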

heap_allocate must work in a similar way, but allocate memory on the heap rather than on the stack. I have found that you need to malloc() page-by-page in order to get the results we expect. That is, if you run heap_allocate like this:

./heap_allocate 2

You should allocate 2 buffers, each being the size of a single page.
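A rough sketch of the page-by-page approach is shown below. The names allocate_heap_pages and free_heap_pages are hypothetical, and the sketch assumes the page count is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `pages` separate buffers of `page_size`
 * bytes each and return an array of pointers to them, or NULL if any
 * allocation fails. Assumes pages > 0. */
char **allocate_heap_pages(size_t pages, size_t page_size)
{
    char **buffers = calloc(pages, sizeof(*buffers));
    if (buffers == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < pages; i++) {
        buffers[i] = malloc(page_size);
        if (buffers[i] == NULL) {
            /* Undo the allocations made so far. */
            while (i > 0) {
                free(buffers[--i]);
            }
            free(buffers);
            return NULL;
        }
    }
    return buffers;
}

/* Hypothetical helper: release everything allocate_heap_pages() returned. */
void free_heap_pages(char **buffers, size_t pages)
{
    if (buffers == NULL) {
        return;
    }
    for (size_t i = 0; i < pages; i++) {
        free(buffers[i]);
    }
    free(buffers);
}
```

Note that the array of pointers is itself a small heap allocation, so it may account for some extra heap usage beyond the page buffers.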

Each page has a “present” bit associated with that page, which is 1 if the page is in main memory and the page table has a mapping for that page. The pages won’t have their present bits set until you access them. Have the program write something to the array or buffer. Write to at least 1 element per page so that every page is present.
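Writing to one element per page could be done with a loop like the one sketched here. The function name touch_pages is hypothetical. The buffer is declared volatile so that the compiler is less likely to optimize the writes away.

```c
#include <stddef.h>

/* Hypothetical helper: write one byte at the start of each page-sized
 * chunk of `buffer` so that every chunk has been accessed. Returns the
 * number of writes performed. */
size_t touch_pages(volatile char *buffer, size_t pages, size_t page_size)
{
    size_t touched = 0;

    for (size_t i = 0; i < pages; i++) {
        buffer[i * page_size] = 1;
        touched++;
    }
    return touched;
}
```

For heap_allocate, where each buffer is a single page, the same idea applies with a page count of 1 per buffer.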

Have stack_allocate and heap_allocate print out their PIDs so that you can then run pages with that PID and ensure that you see the appropriate number of mapped pages. End both of these programs with a call to getchar() so that they wait until you press enter to exit.

The pages Program

This program must print the current virtual to physical page mappings for the stack and the heap of a process. If the program is run like this:

sudo ./pages

it must print out the mappings for itself. If given a PID like so:

sudo ./pages 5329

it must print out the mappings for the process with that PID. If given a PID that does not exist, exit gracefully.
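Choosing which PID to inspect and building the path to its pseudo files might be sketched like this. The names choose_pid and proc_path are hypothetical and only serve this illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return the PID given in argv[1], or this
 * process's own PID when no argument was given. Returns -1 if the
 * argument is not a positive integer. */
long choose_pid(int argc, char *argv[])
{
    if (argc < 2) {
        return (long)getpid();
    }

    char *end = NULL;
    errno = 0;
    long pid = strtol(argv[1], &end, 10);
    if (errno != 0 || end == argv[1] || *end != '\0' || pid <= 0) {
        return -1;
    }
    return pid;
}

/* Hypothetical helper: build "/proc/<pid>/<name>" in `out`.
 * Returns 0 on success and -1 if the buffer is too small. */
int proc_path(char *out, size_t size, long pid, const char *name)
{
    int n = snprintf(out, size, "/proc/%ld/%s", pid, name);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}
```

If no process has the requested PID, opening the resulting path fails (fopen() returns NULL), which gives you a natural place to print an error message and exit.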

For each virtual page number in each range, check to see if the present bit is set. If it is not, skip the page. If it is set, output the virtual page number and the physical page number, or "swapped" if the physical page is swapped out to disk.

Here is some example output:

Heap starting at 0x1b1f000, ending at 0x1b40000
0x1b1f -> 0x51e41
0x1b20 -> 0x51e3f
Stack starting at 0x7ffe5f3b4000, ending at 0x7ffe5f3d5000
0x7ffe5f3d3 -> 0x51e51
0x7ffe5f3d4 -> 0x86337

C Suggestions

Submission

Push all 3 of your programs to git-keeper. For full credit you must meet all of the following requirements: