
2008/09/21

[0x04]. Notes on Assembly - The fairytale of an x86 CPU

EFLAGS  EIP  ESP  EBP
CS      DS   ES   SS
FS      GS   ESI  EDI
EAX     EBX  ECX  EDX
  • A 32bit x86 has 16 registers, divided into 6 groups:
    • 1 x EFLAGS register
    • 1 x Instruction Pointer
    • 2 x Stack Pointing Registers
    • 6 x Segment Registers
    • 2 x Index Registers
    • 4 x General Purpose Registers
  • The registers are assigned specific roles:
    • EFLAGS register (Extended FLAGS, the 32bit version of the 16bit FLAGS register) holds the current state of the processor. Only 18 of its 32 bits have a meaning assigned.
    • EIP - Extended Instruction Pointer - points to the memory address of the next instruction in the Fetch-Execute cycle.
    • ESP - Extended Stack Pointer - points to the top of the stack. You can see how the stack grows down (towards lower addresses) on an x86 architecture in the following example: stack_pointer.c
    • EBP - Extended Base Pointer - points to the base of the current Stack Frame. If you compile func.c to assembly as follows:
      $ gcc -S func.c -o func.s
      and take a look into the resulting func.s file, you will find the f() function translated to something like this:
      f:
      pushl %ebp
      movl %esp, %ebp
      subl $16, %esp
      movl $11, -16(%ebp)
      movl $22, -12(%ebp)
      movl $33, -8(%ebp)
      movl $44, -4(%ebp)
      leave
      ret
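      1. pushl saves the caller's (old) EBP on the stack.
      2. movl copies ESP into EBP: the old top of the stack becomes the base of the new frame.
      3. subl reserves 16 bytes (1 paragraph) for local variables by lowering ESP, because the stack grows down.
      4-7. The four movl instructions store the local variables in stack frame locations relative to EBP.
      8. leave tears the frame down: ESP is reset to EBP and the saved EBP is popped.
      9. ret pops the return address into EIP.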


    • ?S - Segment Registers
      • CS, Code Segment
      • DS, Data Segment
      • SS, Stack Segment
      • ES, Extra Segment
      • FS, another Extra Segment
      • GS, another Extra Segment
    • Extended Index Registers, used for array operations (e.g. strings, which are arrays of bytes)
      • ESI, Source Index
      • EDI, Destination Index

    • Extended General Purpose Registers
      • EAX - accumulator, used for storing intermediate results of I/O access, interrupts or arithmetic.
      • EBX - base register, used for addressing
      • ECX - counter, used in loops and countdowns.
      • EDX - data register

2008/09/14

[0x03]. Notes on Assembly - Memory from a process' point of view

In-depth memory layout is specific to both the CPU architecture and the OS itself. I'm going to describe how a process sees its own memory share during execution.

Memory Layout from a process perspective

When a program is executed it is read into memory* where it resides until termination. The code allocates a number of special purpose memory blocks for different data types. A very common scheme, but not the only one, is depicted in the following table.

*that's why the statement that the size of your binary does not influence memory use is not true. The program's static code is read into the lower part of memory.



Region                              Description
----------------------------------  ------------------------------------------------
Stack                               A very dynamic kind of memory, located at the top
                                    of the process's memory (high addresses) and
                                    growing downwards.
Memory not allocated yet            Memory that the stack will claim as it grows
                                    down. The stack can grow until it hits its
                                    (predefined) administrative limit.
Administrative limit for the stack
Shared Libraries
Administrative limit for the heap
Memory not allocated yet            Memory that the heap will claim as it grows up
                                    from underneath.
Heap                                It is said that this is the most dynamic part of
                                    memory. It is dynamically allocated and freed in
                                    big chunks. The allocation process is rather
                                    complex (e.g. a slab or buddy system) and is more
                                    time consuming than putting things on the stack.
BSS                                 Global and static variables of known
                                    (predeclared) size; strictly speaking, the
                                    uninitialized ones.
Constant data                       All constants used in a program.
Static program code
Reserved / other stuff

In order to prove that things work this way (on many systems anyway) I wrote a C program, mem_sequence.c, that allocates 5 types of data, finds the (virtual) memory address of each, sorts the addresses in descending order and prints them in a layout similar to the table above. mem_sequence.c was tested on Linux, FreeBSD, Mac OS X, WinXP and DOS. All UNIX-like systems follow a similar model with slight differences in address thresholds; the output from Microsoft systems is different and hence interesting.

This is how you use mem_sequence:
$ gcc mem_sequence.c -o mem_sequence
$ ./mem_sequence
1.(0xbf828124) stack
2.(0x0804a008) heap
3.(0x080497d4) bss
4.(0x08048688) const's
5.(0x08048557) code
^Z
[1]+ Stopped ./mem_sequence
$ cat /proc/`pidof mem_sequence`/maps
08048000-08049000 r-xp 00000000 fd:01 313781 mem_sequence
08049000-0804a000 rw-p 00000000 fd:01 313781 mem_sequence
0804a000-0806b000 rw-p 0804a000 00:00 0 [heap]
b7dda000-b7ddb000 rw-p b7dda000 00:00 0
b7ddb000-b7efe000 r-xp 00000000 fd:01 4872985 /lib/libc-2.5.so
b7efe000-b7eff000 r--p 00123000 fd:01 4872985 /lib/libc-2.5.so
b7eff000-b7f01000 rw-p 00124000 fd:01 4872985 /lib/libc-2.5.so
b7f01000-b7f04000 rw-p b7f01000 00:00 0
b7f18000-b7f1b000 rw-p b7f18000 00:00 0
b7f1b000-b7f35000 r-xp 00000000 fd:01 4872978 /lib/ld-2.5.so
b7f35000-b7f36000 r--p 00019000 fd:01 4872978 /lib/ld-2.5.so
b7f36000-b7f37000 rw-p 0001a000 fd:01 4872978 /lib/ld-2.5.so
bf816000-bf82b000 rw-p bffeb000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
$
Let's analyze it:
  • The code (5) and constants (4) fall into the readable and executable (non-writable!) mapping of the binary.
  • BSS (3) lies in the readable and writable, but not executable, mapping.
  • Heap sits on top of them and is denoted by "[heap]".
  • ...long long nothing...
  • Stack at the very top, described as "[stack]". Yahtzee!
It works, great news, but it works differently on different x86-based operating systems. Check it out yourself and please let me know if you make an interesting discovery on some other exotic system.
top  Linux  FreeBSD  MacOSX x86 / PPC  WinXP 32  DOS    AmigaOS 4.1  Vista Home 32bit
1    stack  stack    stack             heap      heap   code         bss
2    heap   heap     heap              bss       stack  heap         const's
3    bss    bss      bss               const     bss    bss          code
4    const  const    const             code      const  const's      heap
5    code   code     code              stack     code   stack        stack

Thanks to Harald Monihart for providing MacOSX PPC data.
Thanks to Anonymous for AmigaOS 4.1 data.