For this project you will use pseudo files from Linux’s /proc filesystem to
determine the current virtual-to-physical page mapping for the stack and the
heap of a process.
You will write 3 C programs for this project. While it’s possible to edit the C files within the host system and then copy them to the VM, it will be easier to write your code within an editor within the VM.
The first editor I used in class is called gedit, and you can access it in Ubuntu by clicking on the icon in the upper left and searching for “text editor”. You can launch the terminal, and any other applications, through this interface as well. Other Linux distributions will likely come with an editor as well.
You can clone the project directly into your VM with git. It’s likely that your
VM came with git already installed. If not, you can install it from the
terminal. In Ubuntu and other Debian-based distributions, you can install
software with the command apt-get. You need to run apt-get as root (the
admin user) to install software. You can run a command as root using the sudo
command:
sudo apt-get install git
Other distributions will have similar means of installing applications.
You will probably also have the GCC compiler installed already. Try running
gcc in the terminal. If it says the command is not found, you will need to
install it as well. On Ubuntu and Debian you can install a bundle of
development tools that includes GCC like so:
sudo apt-get install build-essential
Once you have everything installed you can edit in your editor and compile and
run your programs in the terminal. Here is how to compile the program in
stack_allocate.c to the executable stack_allocate:
gcc stack_allocate.c -o stack_allocate
You can use the up and down arrows in the terminal to cycle through your
command history so you don’t have to type the entire command every time you
want to compile. To then run the program stack_allocate:
./stack_allocate
To read documentation about different functions you can use the man
command. To pull up documentation about fread(), do this:
man fread
To quit the man viewer, press q. To search through a man page, press /,
type what you want to search for, and press enter. Skip to the next search
result with n.
Of course you can refer to online documentation about these functions as well, but if you just need a quick lookup to remind yourself what the parameters for a specific function are, it’s often faster to use the command line.
/proc Files
Pseudo files that expose information about a specific process are stored in
/proc/<pid>/ where <pid> is the process ID (PID) of the process we care
about. For example, /proc/582/ contains information about the process with
the PID 582.
maps Pseudo File
Reading /proc/<pid>/maps gives you the current ranges of virtual addresses
that are mapped for that process. Here is an example:
00400000-00401000 r-xp 00000000 00:30 29253 /home/WOOAD/nsommer/maps
00600000-00601000 r--p 00000000 00:30 29253 /home/WOOAD/nsommer/maps
00601000-00602000 rw-p 00001000 00:30 29253 /home/WOOAD/nsommer/maps
0216d000-0218e000 rw-p 00000000 00:00 0 [heap]
7f181bd65000-7f181bf03000 r-xp 00000000 00:20 245330 /lib64/libc-2.19.so
7f181bf03000-7f181c103000 ---p 0019e000 00:20 245330 /lib64/libc-2.19.so
7f181c103000-7f181c107000 r--p 0019e000 00:20 245330 /lib64/libc-2.19.so
7f181c107000-7f181c109000 rw-p 001a2000 00:20 245330 /lib64/libc-2.19.so
7f181c109000-7f181c10d000 rw-p 00000000 00:00 0
7f181c10d000-7f181c12e000 r-xp 00000000 00:20 245322 /lib64/ld-2.19.so
7f181c312000-7f181c315000 rw-p 00000000 00:00 0
7f181c32b000-7f181c32d000 rw-p 00000000 00:00 0
7f181c32d000-7f181c32e000 r--p 00020000 00:20 245322 /lib64/ld-2.19.so
7f181c32e000-7f181c32f000 rw-p 00021000 00:20 245322 /lib64/ld-2.19.so
7f181c32f000-7f181c330000 rw-p 00000000 00:00 0
7ffe195dd000-7ffe195fe000 rw-p 00000000 00:00 0 [stack]
7ffe195fe000-7ffe19600000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
The only lines that we care about for this project are the [stack] and
[heap] lines. In the example above, the virtual address range
7ffe195dd000-7ffe195fe000 is currently reserved for the stack. The first
address in the range is the first address for the stack and the second address
is the first byte past the stack addresses.
pagemap Pseudo File
The pseudo file /proc/<pid>/pagemap is used to get the current virtual to
physical page mapping for a specific page. The information for each virtual
page is packed into 8 bytes (64 bits). To read the mapping information for a
specific virtual page you must seek to byte virtual_page * 8. For example, to
access the information about virtual page 256, you need to seek to position
256 * 8 in the file and then read the next 64 bits.
For certain versions of the kernel this pseudo file must be read as root or it will always give you a physical page frame number of 0.
More information about the structure of the information is here: https://www.kernel.org/doc/Documentation/vm/pagemap.txt
To help test your pages.c program, write programs in stack_allocate.c and
heap_allocate.c. Both of these take a single command line argument which is
the number of pages worth of memory to allocate. For example, if your page
size is 4 KB and you run stack_allocate like this:
./stack_allocate 2
then the program must allocate 8 KB of memory on the stack by creating an array.
heap_allocate must work in a similar way, but allocate memory on the heap
rather than on the stack. I have found that you need to malloc() page-by-page
in order to get the results we expect. That is, if you run heap_allocate like
this:
./heap_allocate 2
you should allocate 2 buffers, each getpagesize() bytes large.
Recall that pages have a “present” bit, which is 1 if the page is in main memory and the page table has a mapping for that page. The pages won’t have their present bits set until you access them. Have the program write something to the array or buffer. Write to at least 1 element per page so they are all present.
Have stack_allocate and heap_allocate print out their PID so that you can
then run pages with that PID and ensure that you see the appropriate number
of mapped pages. End both of these programs with a call to getchar() so that
they wait until you press enter to exit.
pages Program
This program must print the current virtual to physical page mappings for the stack and the heap of a process. If the program is run like this:
sudo ./pages
it must print out the mappings for itself. If given a PID like so:
sudo ./pages 5329
it must print out the mappings for the process with that PID. If given a PID that does not exist, exit gracefully.
For each virtual page number in each range, check to see if the present bit is
set. If not, skip the page. If it is set, output the virtual page number and
the physical page number, or "swapped" if the physical page is swapped out to
disk.
Here is some example output from running sudo ./pages 40504 for a process with PID 40504:
Heap starting at 559500827000, ending at 559500848000
559500827 -> 1a9ff
Stack starting at 7fffdbd77000, ending at 7fffdbd98000
7fffdbd95 -> 20d44
7fffdbd96 -> 39ae9
7fffdbd97 -> 20d02
Here is another example output for sudo ./pages (using its own PID):
Heap starting at 564b7cc47000, ending at 564b7cc68000
564b7cc47 -> 30cc5
564b7cc48 -> 34cb1
Stack starting at 7ffdfafbf000, ending at 7ffdfafe0000
7ffdfafdd -> fde9
7ffdfafde -> 27c63
7ffdfafdf -> 3261c
- Use getpid() to get your current PID
- Use snprintf() to insert the PID into the paths to the /proc files
- Use getline() to read lines of /proc/<pid>/maps
- Use strstr() to see if [heap] or [stack] are in each line
- Use sscanf() to read the start and end of the address range on one of the lines. The format specifier %lx will work to read a hex representation into a uint64_t
- Open the pagemap file in "rb" mode since you’ll be reading binary data rather than text
- Use fseek() and fread() to read data from /proc/<pid>/pagemap
- Read the pagemap information into a uint64_t so you are sure you have a 64-bit variable. This requires stdint.h
- Use getpagesize() to get the page size of your system
Push all 3 of your programs to git-keeper. For full credit you must push some work on the project by the end of the day on Wednesday, December 1, and you must meet all of the following requirements:
stack_allocate.c
- Takes n as a command line argument
- Allocates n * getpagesize() bytes on the stack and ensures all the pages are present
heap_allocate.c
- Takes n as a command line argument
- Allocates n buffers on the heap, each with a size of getpagesize(), and ensures all the pages are present
pages.c
- If [heap] is in the process’s maps file, it prints out the address range and then prints out the virtual to physical page mapping for all pages from the range that are present in memory.
- If [stack] is in the process’s maps file, it prints out the address range and then prints out the virtual to physical page mapping for all pages from the range that are present in memory.