Memory Layout
in Program Execution
By
Frédérick Giasson
October 2001
Copyrights, 2001, Frédérick Giasson
Table of Contents:
------
Preface
Introduction
Introduction Chapter 1:
1:... Program execution: The route map.
1.1:.... The pseudo-shell code.
1.2:.... Create a child Process.
1.2.1:..... FORKS system call: the nine steps.
1.3:.... Execute the program.
1.3.1:..... C run-time, start-off procedure.
1.3.2:..... EXEC system call, the nine steps.
Chapter 2:
2:... Memory layout in executed program.
2.1:.... Dissection of the ELF executable file.
Chapter 3:
3:... The Stack and the Heap.
3.1:.... Where are they?
3.2:.... How to know what's the size of the user stack frame at compilation?
3.3:.... Registers.
3.3.1:..... Generals Registers.
3.3.2:..... Segments register.
3.3.3:..... Offsets Registers.
3.4:.... The stack.
3.4.2:..... Stack management during procedure calling.
3.4.2.1:...... The call.
3.4.2.2:...... The Prologue.
3.4.2.3:...... The return.
3.5:.... The Heap.
Conclusion
Annex 1
Annex 2
Annex 3
Bibliography
Preface:
------
This is my first article published on the web. This article is for beginners
and intermediates systems administrators, programmers, kernel developers, hobbyists,
or any other computer enthusiasts interesting in the subject.
I'm sorry for my bad English but, if you find grammar errors and are
willing to report them, please contact me and I'll change the text with pleasures.
If you wish to discuss of this article with me and other readers, please,
contact us on the discussion forum dedicated for this article:
-
Introduction:
------
Memory management is a hot topic in the operating system development.
This is an important resource of the system and need to be carefully managed.
This paper won't discuss of the whole process of the memory management
system. No, we'll see the point of view of the user in this MM system. We'll
see how a program file is executed and mapped into the memory.
Yes, there are many other parts in the MM system like: swapping, virtual
memory, page replacement algorithms, segmentation, etc. So, if you wish
understand the whole process of the MM system in an operating system, and then
look at the bibliography at the bottom of this document. There are many useful
resources about the Linux/Unix and Minix Memory Management system.
I'll use the Minix and Linux Memory management system to explain you
how this process works. This is practically the same schemas with other
ELF-based A-32 Operating Systems like NetBSD, FreeBSD, etc. Every demonstration
programs will be compiled with GCC under Linux and I'll also use GDB to debug
the Assembly code of our demonstration programs to show you how it works in a
low level environment.
Chapter 1: Program execution: The route map:
------
It's interesting to know how an executed program is mapped in the memory
but, how is he executed? In the first part of this paper I'll show you how the
whole execution process of a program work by tracing you the route map. The starting
point is when the user type the program name in a shell and strike < Enter >. The
final step of the route is when the program is mapped in memory and ready to start.
------
[Root@Seldon prog]# helloworld < Enter >
Hello World!
------
Okay, I typed the name of my program to execute( helloworld ) and I pressed
the key < Enter >. What append between the time that I strike < Enter > and the
apparition of the "Hello World!" string in my console screen? Is this magic? Certainly
not!
1.1: The pseudo-shell code:
------
There is the pseudo-code of a very basic shell program:
------
#define TRUE 1
ARRAY command
ARRAY parameters
while(TRUE)
{
read_command_line(command,parameters); /* We are waiting to read an input from the terminal.
In our example, command = "helloworld"
and parameters = "" */
pid_t = fork(); /* pid_t contain the process ID of the child process */
if( pid_t != 0 ) /* If the PID isn't 0 then...
{ Note: The PID == 0 when it's the child's thread
of execution. */
/* Parent code. */
waitpid(-1,&status,0); /* The program is waiting the end of the child
execution ( -1 ) to continue the parent process
}else{ ( the Shell ) */
/* Child Code. */
execve(command,parameters,0); /* Finally, we execute our helloworld program! */
}
}
------
Note 1: The execve() function is called in the child process ( when fork() == 0 )
and the waitpid() command is called in the parent process ( when fork() == pid_t ).
------
You don't understand? Don't worry, I'll explain you every part
of this pseudo-shell bellow.
1.2: Create a child process:
------
First, the program needs to create a new process to handle the execution of
our program. There is the ways to do this under Minix and Linux:
------
Minix:
------
do_fork()
Linux:
------
fork()
vfork()
clone()
------
Under Linux, vfork() and clone() have same functions as fork() but with some
difference in the process management. See at the Linux Man pages for more information
about these two functions ( "man clone" or "man vfork" ). We'll concentrate our efforts
on the fork() function. The fork() function do the same thing under Minix and Linux.
This function will create an exact duplicate of the parent process, including all the
file descriptions, variables, registers, everything. After the fork() call, the child
and the parent process go their separate ways. The values of all variables are the same
at the call of the fork() but, after him, the values of parent and child variables will
changes and the ones done on the parent process will not affect the ones on the child
process. The only thing that is shared between the parent and the child process is the
text section which is unchangeable.
Okay, the FORK system call is sent, the child process is created. Now before the
end of the fork() function, the system call will return to the program the value of the
child process identification ( the PID ). The PID is a signed integer variable defined
in types.h as:
Note 1: The fork() procedure will send the FORK system call. Dont get
muddled by these two concepts.
------
Minix:Source: minix/include/sys/types.h
--
typedef int pid_t;
Linux:Source: /posix/sys/types.h
--
ifndef __pid_t_defined
typedef int __pid_t pid_t;
------
1.2.1: FORKS system call: the nine steps:
------
There is the nine things, which the FORK system call do:
Note 1: You'll find every of these steps in the do_fork() source code
in Annex 1.
Note 2: The code for the Linux fork() function is in glibc in the
fork.c file.
Note 3: The descriptions of these steps are only applicable for the Minix
operating system. The base is the same for Linux. The only thing
which really differ is how the process is created by the kernel.
1- Check to see if process table is full. ( Lines 14 to 16 in Annex 1 )
Okay, What's the hell is this process table?
The process table is a part of the kernel. The declaration
of the table is in Annex 2. This table contains all
process' registers, memory map, accounting, and message to
send and receive.
Note 1: The number of slots in the process table is defined
by NR_PROCS in " /include/minix/config.h ":
------
#define NR_PROCS 32
------
Note 2: In Linux, the maximum number of process is the size
of the task vector, by default he have 512 entries.
2- Try to allocate memory for the child's data and stack. ( Lines 21 to 26 in Annex 1 )
3- Copy the parent's data and stack to the child's memory. ( Lines 28 to 34 in Annex 1 )
4- Find a free process slot and copy parent's slot to it. ( Lines 36 to 38 in Annex 1 )
5- Enter child's memory map in process table. ( Lines 41 to 57 in Annex 1 )
6- Choose a pid for the child. ( Lines 60 to 69 in Annex 1 )
Note 1: Don't forget, the pid_t must be a signed integer.
7- Tell kernel and file system about child. ( Lines 71 and 72 in Annex 1 )
Why the FORK system call tell to the kernel AND the file system
about the newly created process? Because in Minix the process
management, memory management and file management are each
handled by separate modules. So, the process table is
partitioned in 3 parts, and each of these parts have fields that
it needs. The part of the process table involve in the memory
management is defined in the file "/src/mm/mproc.h". The part
involve in the file system is defined in the file "/src/fs/fproc.h".
Finally, the one involve with the kernel is defined in the Annexe 2.
8- Report child's memory map to kernel. ( Lines 74 to Lines 77 in Annex 1 )
9- Send reply messages to parent and child. ( Lines 80 and 81 in Annex 1 )
Note 1: The return value to the child process is 0 and the return
value to the parent process is the PID of the child.
As we can see, the first part of a program execution called from a shell isn't
so simple. The FORK system call will only create a new process to handle the execution
of our new program started by the execve() command. So, the whole protocol of process
management is implicated when we call the do_fork() procedure, and, at every level of
the system (to the process management system, to the I/O tasks, to the server processes
( FS, MM and network ) and finally to the user processes). I'll not discuss of this
protocol in this paper because it's not is goal.
1.3: Execute the program:
------
Okay, we created a new process, now; we'll use this process to execute our
program. The execve() function call a new system call know as "EXEC system call".
What this system call does? He replace the current memory image with a new one and
setup a new stack for this new memory image.
There is the ways to execute a program under Minix and Linux:
Minix:
------
do_exec()
Note 1: In the /src/mm/exec.c library. There is other do_exec() functions in the
/src/fs/misc.c and /src/kernel/system.c library but I'll talk about them
later in this section.
Linux:
------
execve()
There is other variants of the exec family procedures, see man pages for more
information:
execl()
execlp()
execle()
execv()
execvp()
Okay, we'll take in consideration that there is a hole, of the size of our
new image, in memory. I'll first show you how the program is handled by the EXEC
system call and after, show you all steps that the do_exec() function perform.
Same as the FORK system call, take in consideration that the following explanation
is only fully compatible for the Minix operating system but the base is the same
under the Linux OS. The handling of the EXEC system call is the same, but, under
Linux, the Kernel, MM and FS can handle the problem in other ways with many other
features specific to the Linux system but the process is the same.
There is the memory schemas that we'll use to understand the whole process
of EXEC when we pass the " mv hw pg " command to our shell.
Note 1: This command will rename the file "hw" to "pr".
Arrays passed to execve()
------
Argument
Array
------
| 0 |
|------|
| pr |
|------|
| hw |
|------|
| mv |
------
Figure 1.0
Environment
Array
------
| 0 |
|------|
| HOME=/root |
------
Figure 1.1
The stack build by execve()
------
3 2 1 0
------
40 | \0| t | o | o |
|------|
36 | r | / | = | E |
|------|
32 | M | O | H | \0|
|------|
28 | r | p | \0| w |
|------|
24 | h | \0| v | m |
|------|
20 | 0 |
|------|
16 | 33 |
|------|
12 | 0 |
|------|
8 | 30 |
|------|
4 | 27 |
|------|
0 | 24 |
------
Figure 1.2
The stack after relocation by the memory manager:
------
3 2 1 0
------
6532 | \0| t | o | o |
|------|
6528 | r | / | = | E |
|------|
6524 | M | O | H | \0|
|------|
6520 | r | p | \0| w |
|------|
6516 | h | \0| v | m |
|------|
6512 | 0 |
|------|
6508 | 6525 |
|------|
6504 | 0 |
|------|
6500 | 6522 |
|------|
6496 | 6519 |
|------|
6492 | 6516 |
------
Figure 1.3
The stack as it appears to main() at the start of execution:
------
3 2 1 0
------
6532 | \0| t | o | o |
|------|
6528 | r | / | = | E |
|------| 6524 | M | O | H | \0|
|------|
6520 | r | p | \0| w |
|------|
6516 | h | \0| v | m |
|------|
6512 | 0 |
|------|
6508 | 6525 |
|------|
6504 | 0 |
|------|
6500 | 6522 |
|------|
6496 | 6519 |
|------|
6492 | 6516 |
|------|
6488 | 6508 | <-- envp
|------|
6484 | 6492 | <-- argv
|------|
6480 | 3 | <-- argc
|------|
6476 | return |
------
Figure 1.4
I'll now explain you how these stacks representations works and after
I'll show you all EXEC system call steps to finally have our program mapped in
memory.
There are two important arrays in the EXEC process. The environment
array (figure 1.0) and the argument array (figure 1.1). The environment array
is an array of string, which is passed as environment to the new program. The
argument array is an array of argument strings passed to the new program. These
two arrays need to be terminated with a NULL character ("\0"). The do_exec()
procedure will now build the initial stack within the shell's address space
(Figure 1.2, Annex 3 lines 049 to 056). Next, the procedure will call the MM and
this one will allocate new memory for the new created stack and release the old
one (Annex 3 lines 062 to 066). After the procedure will patch up the pointers
(Annex line 077) and now the memory from Figure 1.2 have the look of the memory
of the Figure 1.3. Finally, we'll save the offset to initial argc (Annex 3 line
112). The initial stack argument is a part of the procedures table in the memory
management system. There is a pointer on the initial stack in the MPROC structure
in the "/src/mm/mproc.h" file. The memory finally looks like the Figure 1.4. This
is the stack representation which will appears to main() procedure at the start
of execution!
1.3.1: C run-time, start-off procedure:
------
Okay, now, the program is mapped and executed. However, we have a little
problem. For the C compiler, the main() is just another function. The compiler
doesn't know that this function is the entry point of our program to execute! So
the compiler will compile the main() function code to access the three parameters
considering the standard C calling convention, last parameter first. In this case,
there is supposed to have three parameters ( one integer and 2 pointers ) before
the return address but this is not the case in our Figure 1.3. In this case, how
can we pass the three parameters to the main() function? We'll create a small
assembly code, which will be insert in the front head of our program. The code is
called C run-time, start-off procedure, CRTSO, and his general goal is to put three
more dwords on the stack and call the main() function with standard call instruction:
-DWord 1:
ARGC: The number of parameters passed to the function.
Type: Integer
Note 1: Adress 0x6476 on Figure 1.4
-DWord 2:
ARGV: Pointer on parameters array string.
Type: Pointer
Note 1: Adress 0x6484 and pointer on 0x6492 on Figure 1.4
-DWord 3:
ENVP: Pointer on the environment array string.
Type: Pointer
Note 1: Adress 0x6488 and pointer on 0x6508 on Figure 1.4
These three dwords are represented in the Figure 1.4. Okay, there is an assembly
procedure called CRTSO, but what look like this procedure? Let the hunt begin! The
GDB hunting ground is now open!
Let us first import our test program ("crtso") in GDB.
-----
(gdb) file crtso
Reading symbols from crtso...done.
-----
Okay, now, we don't have any ideas of where to start to find this legend in the
ground. In this case, let us start a point 0, this is the only point that we know
his existence.
-----
(gdb) disassemble main
Dump of assembler code for function main:
0x80481e0 <main>: push %ebp
...
0x80481eb <main+11>: call 0x8048498 <exit>
End of assembler dump.
-----
There is no information about the CRTSO location.
I have an idea. We'll track him in the program by following each function addresses
-1 dword, in this case, we'll have the address of the previous function. If we do
this to each functions, we'll probably find the root procedure, the CRTSO!
Let's begin the tracking with this new method.
-----
(gdb) disassemble main-1
Dump of assembler code for function init_dummy:
0x80481d0 <init_dummy>: push %ebp
...
0x80481da <init_dummy+10>: lea 0x0(%esi),%esi
End of assembler dump.
-----
We found the frame_dummy function at adress 0x80481d0 -1.
-----
(gdb) disassemble init_dummy-1
Dump of assembler code for function frame_dummy:
0x80481a0 <frame_dummy>: push %ebp
...
0x80481c9 <frame_dummy+41>: lea 0x0(%esi,1),%esi
End of assembler dump.
-----
We found the fini_dummy fonction at 0x80481a0 -1.
-----
(gdb) disassemble frame_dummy-1
Dump of assembler code for function fini_dummy:
0x8048190 <fini_dummy>: push %ebp
...
0x804819a <fini_dummy+10>: lea 0x0(%esi),%esi
End of assembler dump.
-----
We found the __do_global_dtors_aux fonction at 0x8048190 -1
-----
(gdb) disassemble fini_dummy-1
Dump of assembler code for function __do_global_dtors_aux:
0x8048130 <__do_global_dtors_aux>: push %ebp
...
0x804818d <__do_global_dtors_aux+93>: lea 0x0(%esi),%esi
End of assembler dump.
-----
We found the call_gmon_start fonction at 0x8048130 -1
-----
(gdb) disassemble __do_global_dtors_aux-1
Dump of assembler code for function call_gmon_start:
0x8048104 <call_gmon_start>: push %ebp
...
0x804812f <call_gmon_start+43>: nop
End of assembler dump.
-----
We found the _start fonction at 0x8048104 -1
Hummm, that's an interesting function, that's not?
We found it!
-----
(gdb) disassemble call_gmon_start-1
Dump of assembler code for function _start:
0x80480e0 <_start>: xor %ebp,%ebp
0x80480e2 <_start+2>: pop %esi
0x80480e3 <_start+3>: mov %esp,%ecx
0x80480e5 <_start+5>: and $0xfffffff0,%esp
0x80480e8 <_start+8>: push %eax
0x80480e9 <_start+9>: push %esp
0x80480ea <_start+10>: push %edx
0x80480eb <_start+11>: push $0x808e220
0x80480f0 <_start+16>: push $0x80480b4
0x80480f5 <_start+21>: push %ecx
0x80480f6 <_start+22>: push %esi
0x80480f7 <_start+23>: push $0x80481e0
0x80480fc <_start+28>: call 0x80481f0 <__libc_start_main>
0x8048101 <_start+33>: hlt
0x8048102 <_start+34>: mov %esi,%esi
End of assembler dump.
-----
The CRTSO will put the three parameters on the stack by performing three
push commands:
0x80480e8 <_start+8>: push %eax ! push argc ( integer )
0x80480ea <_start+10>: push %edx ! push argv ( pointer )
0x80480f5 <_start+21>: push %ecx ! push envp ( pointer )
after, the CRTSO will execute the __libc_start_main function (visible in the
libc.so.6 library). Than, the __libc_start_main will call the __libc_init_first
function and this function will call _init. Then it arrange _fini to be called
when the program exit.
finally, 0x8048101 <_start+33>: hlt, is called to force a trap if exit fails.
After this, all parameters will be on the stack and the main() function of
our program will have access to these parameters as shown in the Figure 1.4.
1.3.2: EXEC system call, the nine steps:
------
Now, I'll explain every steps of the EXEC system call. There are nine
important steps to have the program mapped in memory and executed.
1- First, check for memory and check if the file is executable.( Lines 24 to 37 in Annex 3 )
The file execution is in relation with the MM then, the MM
will inform the FS by the tell_fs() procedure to switch to
the user's working directory rather than to MM's. The
execution of the program will be done by the MM allowed()
function.
2- Read the header to get the segment and total size.( Lines 40 to 46 in Annex 3 )
3- Fetch the arguments and environment from the caller.( Lines 49 to 56 in Annex 3 )
4- Allocate new memory and release unneeded old memory.( Lines 62 to 66 in Annex 3 )
Before doing this, we'll check, with the find_shared() procedure ( line 59 in
Annex 3 ), if the text version is not already loaded in memory and able to
be shared with another process. After, we'll call the new_mem() function.
This function will check in the memory to find a hole big enought for our
new memory image( there, the data and stack section of our application if
we find an accesseble text section to share in the memory ). After, memory
maps are updated and the sys_newmap() function will report chages to the
Kernel.
Note 1: If the new_mem() procedure don't find one hole big enough