Pages Project

For this project you will use pseudo files from Linux’s /proc filesystem to determine the current virtual to physical page mapping for the stack and the heap of a process.

Development Environment

You will write 3 C programs for this project. While it’s possible to edit the C files on the host system and then copy them to the VM, it will be easier to write your code in an editor inside the VM.

The first editor I used in class is called gedit, and you can access it in Ubuntu by clicking on the icon in the upper left and searching for “text editor”. You can launch the terminal, and any other applications, through this interface as well. Other Linux distributions will likely come with an editor as well.

You can clone the project directly into your VM with git. It’s likely that your VM came with git already installed. If not, you can install it from the terminal. In Ubuntu and other Debian-based distributions, you can install software with the command apt-get. You need to run apt-get as root (the admin user) to install software. You can run a command as root using the sudo command:

sudo apt-get install git

Other distributions will have similar means of installing applications.

You will also probably already have the GCC compiler installed. Try running gcc in the terminal. If it says the command is not found, you will need to install it as well. On Ubuntu and Debian you can install a bundle of development tools that includes GCC like so:

sudo apt-get install build-essential

Once you have everything installed you can edit in your editor and compile and run your programs in the terminal. Here is how to compile the program in stack_allocate.c to the executable stack_allocate:

gcc stack_allocate.c -o stack_allocate

You can use the up and down arrows in the terminal to cycle through your command history so you don’t have to type the entire command every time you want to compile. To then run the program stack_allocate:

./stack_allocate

To read documentation about different functions you can use the man command. To pull up documentation about fread(), do this:

man fread

To quit the man viewer, press q. To search through a man page, press /, type what you want to search for, and press enter. Skip to the next search result with n.

Of course you can refer to online documentation about these functions as well, but if you just need a quick lookup to remind yourself what the parameters for a specific function are, it’s often faster to use the command line.

/proc Files

Pseudo files that expose information about a specific process are stored in /proc/<pid>/ where <pid> is the process ID (PID) of the process we care about. For example, /proc/582/ contains information about the process with the PID 582.

The maps Pseudo File

Reading /proc/<pid>/maps gives you the current ranges of virtual addresses that are mapped for that process. Here is an example:

00400000-00401000 r-xp 00000000 00:30 29253                              /home/WOOAD/nsommer/maps
00600000-00601000 r--p 00000000 00:30 29253                              /home/WOOAD/nsommer/maps
00601000-00602000 rw-p 00001000 00:30 29253                              /home/WOOAD/nsommer/maps
0216d000-0218e000 rw-p 00000000 00:00 0                                  [heap]
7f181bd65000-7f181bf03000 r-xp 00000000 00:20 245330                     /lib64/libc-2.19.so
7f181bf03000-7f181c103000 ---p 0019e000 00:20 245330                     /lib64/libc-2.19.so
7f181c103000-7f181c107000 r--p 0019e000 00:20 245330                     /lib64/libc-2.19.so
7f181c107000-7f181c109000 rw-p 001a2000 00:20 245330                     /lib64/libc-2.19.so
7f181c109000-7f181c10d000 rw-p 00000000 00:00 0
7f181c10d000-7f181c12e000 r-xp 00000000 00:20 245322                     /lib64/ld-2.19.so
7f181c312000-7f181c315000 rw-p 00000000 00:00 0
7f181c32b000-7f181c32d000 rw-p 00000000 00:00 0
7f181c32d000-7f181c32e000 r--p 00020000 00:20 245322                     /lib64/ld-2.19.so
7f181c32e000-7f181c32f000 rw-p 00021000 00:20 245322                     /lib64/ld-2.19.so
7f181c32f000-7f181c330000 rw-p 00000000 00:00 0
7ffe195dd000-7ffe195fe000 rw-p 00000000 00:00 0                          [stack]
7ffe195fe000-7ffe19600000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

The only lines that we care about for this project are the [stack] and [heap] lines. In the example above, the virtual address range 7ffe195dd000-7ffe195fe000 is currently reserved for the stack. The first address in the range is the first address for the stack and the second address is the first byte past the stack addresses.

The pagemap Pseudo File

The pseudo file /proc/<pid>/pagemap is used to get the current virtual to physical page mapping for a specific page. The information for each virtual page is packed into 8 bytes (64 bits). To read the mapping information for a specific virtual page, you must seek to byte offset virtual_page * 8.

For example, to access the information about virtual page 256, you need to seek to position 256 * 8 in the file and then read the next 64 bits.

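On Linux 4.2 and later, this pseudo file must be read as root (more precisely, with the CAP_SYS_ADMIN capability) or it will always give you a physical page frame number of 0. Kernels 4.0 and 4.1 refuse to let unprivileged users open the file at all.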

More information about the structure of the information is here: https://www.kernel.org/doc/Documentation/vm/pagemap.txt

Memory Allocating Programs

To help test your pages.c program, write programs in stack_allocate.c and heap_allocate.c. Both of these take a single command line argument, which is the number of pages' worth of memory to allocate. For example, if your page size is 4 KB and you run stack_allocate like this:

./stack_allocate 2

Then the program must allocate 8 KB of memory on the stack by creating an array.

heap_allocate must work in a similar way, but allocate memory on the heap rather than on the stack. I have found that you need to malloc() page-by-page in order to get the results we expect. That is, if you run heap_allocate like this:

./heap_allocate 2

You should allocate 2 buffers, each getpagesize() bytes large.

Recall that pages have a “present” bit, which is 1 if the page is in main memory and the page table has a mapping for that page. The pages won’t have their present bits set until you access them. Have the program write something to the array or buffer. Write to at least 1 element per page so they are all present.

Have stack_allocate and heap_allocate print out their PID so that you can then run pages with that PID and ensure that you see the appropriate number of mapped pages. End both of these programs with a call to getchar() so that they wait until you press enter to exit.

The pages Program

This program must print the current virtual to physical page mappings for the stack and the heap of a process. If the program is run like this:

sudo ./pages

it must print out the mappings for itself. If given a PID like so:

sudo ./pages 5329

it must print out the mappings for the process with that PID. If given a PID that does not exist, exit gracefully.

For each virtual page number in each range, check to see if the present bit is set. If not, skip the page. If it is set, output the virtual page number and the physical page number, or "swapped" if the physical page is swapped out to disk.

Here is some example output by running sudo ./pages 40504 for a process with PID 40504:

Heap starting at 559500827000, ending at 559500848000
559500827 -> 1a9ff
Stack starting at 7fffdbd77000, ending at 7fffdbd98000
7fffdbd95 -> 20d44
7fffdbd96 -> 39ae9
7fffdbd97 -> 20d02

Here is another example output for sudo ./pages (using its own PID):

Heap starting at 564b7cc47000, ending at 564b7cc68000
564b7cc47 -> 30cc5
564b7cc48 -> 34cb1
Stack starting at 7ffdfafbf000, ending at 7ffdfafe0000
7ffdfafdd -> fde9
7ffdfafde -> 27c63
7ffdfafdf -> 3261c

C Suggestions

Submission

Push all 3 of your programs to git-keeper. For full credit you must push some work on the project by the end of the day on Wednesday, December 1, and you must meet all of the following requirements: