phiral.net

|=----------------------=[ Windows 16Bit Assembly Intro ]=--------------------=|
|=----------------------------------------------------------------------------=|


---[ Windows 16Bit Intel Syntax

1:  .model small
2:  .stack 100h
3:  .data
4:  helloWorld db 'Hello, World!', 0dh, 0ah, '$'
5:  .code 
6:  main proc
7:     mov ax, @data
8:     mov ds, ax
9:     mov ah, 09h
10:    mov dx, offset helloWorld
11:    int 21h
12:    mov ax, 4c00h
13:    int 21h
14: main endp
15: end main

Assemble, link and run with:

tasm hello_win16
tlink hello_win16
hello_win16

Description:

1: .model small; Initializes the program memory model. Models include:

        tiny:    Older .com files. A single segment containing code, data and
                 stack. Code plus data must be less than or equal to 64k.
        small:   One code segment and one data segment. All code and data are near.
                 Code must be less than or equal to 64k, as must data.
        medium:  Multiple code segments, a single data segment.
                 Data must be less than or equal to 64k, code any size.
        compact: One code segment, multiple data segments.
                 Code must be less than or equal to 64k, data any size.
        large:   Multiple code and data segments.
                 Both code and data can be greater than 64k.
        huge:    Same as large except individual data items, such as arrays, may be
                 bigger than a single 64k segment.
        flat:    Protected mode, uses 32bit offsets for both code/text and data.
                 All data and code are in a single 32bit segment.
        
        Note: all but flat are now essentially obsolete.

2: .stack 100h; Reserve 256 bytes of stack space. The stack starts in upper memory
                and grows down, hence if SP were ever to reach 0 the stack would be full.
                The cpu has a Stack Segment register (SS) which contains the segment
                address of the stack. The Stack Pointer register (SP) contains the offset
                of the top of the stack, where the last value was pushed.
                Two operations are done on the stack:
                   push: A push operation decrements the stack pointer by two and stores a
                         new 16bit value at the new top of the stack. The stack grows
                         downward as each push operation takes place. Memory is byte
                         addressed, so each 16bit row below is two addresses apart.
                   eg.
                   A: ax=5
                      Lower memory               FFF8 [0000000000000000]
                                                 FFFA [0000000000000000]
                                                 FFFC [0000000000000000] <- SP = FFFC
                      Higher Memory              FFFE [0000000000000000]

                   B: push ax; Push value in ax onto the stack. Think: SP -= 2; memory[SP] = ax;
        
                   C: ax=5
                      Lower memory               FFF8 [0000000000000000]
                                                 FFFA [0000000000000101] <- SP = FFFA
                                                 FFFC [0000000000000000]
                      Higher Memory              FFFE [0000000000000000]

                   pop: A pop operation removes data from the stack by copying a word (16bit)
                        into memory or a register and incrementing the stack pointer by two.
                   eg.
                   A: cx=0
                      Lower memory               FFF8 [0000000000000000]
                                                 FFFA [0000000000000101] <- SP = FFFA
                                                 FFFC [0000000000000000]
                      Higher Memory              FFFE [0000000000000000]

                   B: pop cx; Pop top value off the stack into cx. Think: cx = memory[SP]; SP += 2;

                   C: cx=5
                      Lower memory               FFF8 [0000000000000000]
                                                 FFFA [0000000000000101]
                                                 FFFC [0000000000000000] <- SP = FFFC
                      Higher Memory              FFFE [0000000000000000]

                   The value that was popped off the stack remains in memory until another
                   push overwrites it.
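
The push and pop behaviour described above can be modelled in a few lines of
Python. This is only a sketch: the class name Stack16, its method names and the
starting SP of FFFCh are assumptions made for the illustration, not part of any
assembler or of dos. It assumes byte-addressed memory with 16bit words stored
low byte first, as on the x86.

```python
# Sketch of a 16bit stack: byte-addressed memory, SP moves by 2 per word.
# Stack16 and its method names are hypothetical, chosen for this illustration.

class Stack16:
    def __init__(self, sp=0xFFFC):
        self.mem = bytearray(0x10000)  # one 64K stack segment
        self.sp = sp                   # offset of the current top of stack

    def push(self, value):
        # SP is decremented first, then the word is stored low byte first.
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = value & 0xFF
        self.mem[(self.sp + 1) & 0xFFFF] = (value >> 8) & 0xFF

    def pop(self):
        # The word is read first, then SP is incremented; memory is not cleared.
        value = self.mem[self.sp] | (self.mem[(self.sp + 1) & 0xFFFF] << 8)
        self.sp = (self.sp + 2) & 0xFFFF
        return value


stack = Stack16()
ax = 5
stack.push(ax)              # push ax
sp_after_push = stack.sp
cx = stack.pop()            # pop cx
print(hex(sp_after_push), cx, hex(stack.sp))
```

After the pop, the byte at sp_after_push still holds 5, which matches the
remark that a popped value stays in memory until something overwrites it.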

                   Stack Frame: A stack frame is the area of the stack set aside for a 
                                procedure's return address, passed parameters, any saved
                                registers and all local variables. The stack frame is 
                                created by the following sequential steps:
                    
                   1: Arguments are pushed onto the stack.
                   2: The procedure is called, causing the return address to be pushed.
                   3: As the procedure prolog begins, [e]bp is pushed onto the stack.
                   4: [e]bp is set equal to [e]sp. From this point on [e]bp acts as a 
                      fixed reference for the procedure's parameters and local variables.
                   5: A value is subtracted from [e]sp to reserve space for local variables.
                      Locals are found at negative offsets from [e]bp, function arguments
                      at positive offsets from [e]bp.
                   6: When the procedure is finished it goes through the procedure epilog,
                      which sets [e]sp equal to [e]bp, pops [e]bp and executes ret (return).
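
To make the frame steps concrete, here is a small Python model of a 16bit near
call with one argument and one local variable. It is a sketch under stated
assumptions: the starting SP of 0100h, the return address 0123h and the values
7 and 99 are all made up for the illustration, and the stack is held in a dict
keyed by byte offset rather than in real memory.

```python
# Sketch of building and tearing down a stack frame on a 16bit stack.
# All concrete numbers here are hypothetical; only the order of steps matters.

mem = {}                               # word-sized stack cells keyed by offset
regs = {"sp": 0x0100, "bp": 0x0000}

def push(value):
    regs["sp"] -= 2
    mem[regs["sp"]] = value

def pop():
    value = mem[regs["sp"]]
    regs["sp"] += 2
    return value

RETURN_ADDRESS = 0x0123                # assumed offset of the instruction after the call
caller_bp = regs["bp"]

push(7)                                # 1: argument pushed by the caller
push(RETURN_ADDRESS)                   # 2: near call pushes the return address
push(regs["bp"])                       # 3: prolog, push bp
regs["bp"] = regs["sp"]                # 4: prolog, mov bp, sp
regs["sp"] -= 2                        # 5: sub sp, 2 reserves one local word
mem[regs["bp"] - 2] = 99               #    the local lives at [bp-2]
argument = mem[regs["bp"] + 4]         #    the argument lives at [bp+4]

regs["sp"] = regs["bp"]                # 6: epilog, mov sp, bp discards the locals
regs["bp"] = pop()                     #    pop bp restores the caller's bp
return_to = pop()                      #    ret pops the return address

print(argument, hex(return_to), hex(regs["sp"]))
```

Note that after the return the argument is still on the stack. Removing it is
left to the caller or to a "ret n" form, depending on the calling convention.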

3: .data; Defines the start of the data segment. A 16bit program's memory is divided into three
          primary segments. The stack segment is pointed to by SS and described above. The code
          segment, pointed to by the CS register, is where the machine instructions are located.
          CS contains its segment address and IP, the instruction pointer, holds the offset of
          the next instruction to execute. The data segment usually holds the program's variables.
          When an EXE starts, DS does not point to this segment (dos sets it to the Program
          Segment Prefix), which is why we include the following in all EXE's. A segment register
          cannot be loaded with an immediate value, so the address goes through ax.
           
          mov ax, @data; Move into ax the segment address of the data segment.
          mov ds, ax;    Copy ax into the data segment register ds.

4:  helloWorld db 'Hello, World!', 0dh, 0ah, '$'; 
         Defines bytes, hence "db". Since the cpu has no conception of strings, everything
         after the "db" directive until the end of the statement is defined byte by byte. Each
         character or value following "db" gets one byte of memory reserved for it. The
         assembler determines the length it needs to reserve by finding the end of the statement.
         This is what it looks like in memory (one byte per cell):
         
          Ascii:   
      
                    FFEE -> [H][e][l][l][o][,][ ][W][o][r][l][d][!][CR][LF][$]
                             ^
          helloWorld = FFEE  |

          Hexadecimal:
       
                    FFEE -> [48][65][6c][6c][6f][2c][20][57][6f][72][6c][64][21][0d][0a][24]
                             ^
          helloWorld = FFEE  |
                   
         The variable name "helloWorld" stands for the address of the first byte defined in the
         "db" statement. So the char at helloWorld+2 is 'l' and the char at helloWorld+4 is 'o'.
         If the address of helloWorld is FFEE (65518 decimal), the address of helloWorld+2 is FFF0
         (65520 decimal). On DOS and Windows a newline is defined as Carriage Return followed
         by Line Feed. Carriage Return is 13 decimal, 0xd hexadecimal; Line Feed is 10 decimal,
         0xa hexadecimal, which is what follows the string "Hello, World!". For the dos function
         09h String Output, which is what is called to display the string, the '$' is the 
         terminating character of the string. Meaning: when dos starts to output the string it:
         
                    1: Reads the character at the current address, starting at the label helloWorld.
                    2: Increments the current address by one.
                    3: Checks to see if the character read is the '$' character.
                       If it is, the function returns. 
                       If it is not, it outputs the character and returns to step 1.
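
The read, increment and compare loop can be sketched in Python as follows. This
is not how dos is implemented internally; print_dollar_string is a hypothetical
stand-in for function 09h, and the offset FFEE is the illustrative address used
in the text, not one the assembler is guaranteed to pick.

```python
# Sketch of dos function 09h walking a '$'-terminated string in memory.
# print_dollar_string is a hypothetical name used only for this illustration.

HELLO_WORLD = 0xFFEE                    # illustrative offset of the label

memory = bytearray(0x10000)             # one 64K data segment
data = b"Hello, World!" + bytes([0x0D, 0x0A]) + b"$"   # what "db" lays out
memory[HELLO_WORLD:HELLO_WORLD + len(data)] = data

def print_dollar_string(mem, offset):
    out = bytearray()
    while True:
        ch = mem[offset]                # 1: read the byte at the current address
        offset += 1                     # 2: advance the address by one
        if ch == ord("$"):              # 3: terminator reached, stop without printing it
            break
        out.append(ch)                  #    otherwise output it and loop
    return out.decode("ascii")

text = print_dollar_string(memory, HELLO_WORLD)
print(repr(text))
print(chr(memory[HELLO_WORLD + 2]), chr(memory[HELLO_WORLD + 4]))
```

One consequence of this scheme is that function 09h cannot print a literal '$',
since that byte always ends the string.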

5: .code; Defines the start of the code segment, which is where the program's executable
          instructions are located in memory when it is being executed.

6: main proc ; Uses the "proc" directive to declare the "main" procedure; any name might
               have been chosen. The first executable instruction following this is the
               "program entry point", which is the point at which the program begins to
               execute. The proc and endp directives mark the beginning and ending of a
               procedure. An example of an executable in memory will better show the
               Program Entry Point:

               Segment      Absolute Address   Segment Register Value
               ______________________________________________________
    20h Bytes  |Code        20000              2000                 | <- Program Entry Point
               |____________________________________________________| 
    10h Bytes  |Data        20020              2002                 |
               |____________________________________________________| 
    100h Bytes |Stack       20030              2003                 |
               |____________________________________________________| 

    Overlapping Segments:
    The program segments for code, data and stack appear to overlap but only the first part
    ('X') of each segment is in use.

    00   20    30    130
    [XXXX][OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 64K Code
          [XXXX][OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 64K Data
                [XXXX][OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 64K Stack
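
The relation between the "Segment Register Value" and "Absolute Address" columns
in the table above is the real mode rule absolute = segment * 16 + offset. A
short Python check, using the segment values from the table (the function name
linear_address is an assumption for this sketch):

```python
# Sketch: real mode address arithmetic for the example layout in the table.
# linear_address is a hypothetical helper name.

def linear_address(segment, offset):
    # A segment value is a paragraph number, so it is shifted left by 4 bits.
    return (segment << 4) + offset

segments = {"code": 0x2000, "data": 0x2002, "stack": 0x2003}
starts = {name: linear_address(seg, 0) for name, seg in segments.items()}

for name, start in starts.items():
    print(name, hex(start))

# The segments overlap: the first data byte is also reachable through the
# code segment at offset 20h.
same_byte = linear_address(0x2000, 0x20) == linear_address(0x2002, 0x00)
```

This is why the 64K windows in the diagram overlap: each segment register can
reach 64K from its base, but the bases are only 20h and 10h bytes apart.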

7:     mov ax, @data; Move into ax the segment address of the data segment. As said beforehand,
                      since DS does not point to the data segment at the start of execution we
                      must initialize the DS, data segment, register. Steps 7 and 8.

8:     mov ds, ax; Move into the data segment register the value in ax.

9:     mov ah, 09h; Move into ah the number 09h. This is the number of the function that
                    we are about to call, function 09h String Output. To call function 09h the 
                    offset of the string must be in the dx register (the segment is taken from
                    ds), and the string must be terminated by the '$' sign. Control characters
                    are recognized.

10:    mov dx, offset helloWorld; Move the address (offset from ds) of helloWorld into the register dx.

11:    int 21h; Interrupt operating system. The "int" instruction calls an operating system
                subroutine identified by a number in the range of 0-FFh. For int 21h, the ah
                register must contain, before the instruction is executed, a function number
                that identifies the desired dos subroutine. The cpu processes an interrupt using
                the Interrupt Vector Table, a table of addresses in the lowest 1024 bytes of
                memory. Each entry in this table is a 32 bit segment:offset address that points
                to an interrupt handler. The steps it takes are illustrated below:
      
                1: The number following the "int" tells the cpu which entry to locate in the IVT.
                2: The cpu pushes the flags and the return address (CS and IP) onto the stack
                   and jumps to the address stored in the IVT.
                3: The interrupt handler begins executing and finishes when an "iret" is reached.
                4: iret, Interrupt Return, pops the return address and flags, causing the program
                   to resume execution at the next instruction in the original calling program.

                In this case we are calling the String Output function 09h.
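
The IVT lookup in step 1 is simple arithmetic: entry n sits at absolute address
n * 4 and holds a 16bit offset followed by a 16bit segment. The Python sketch
below models that lookup. The handler address 0019:40F8 written into the table
is invented for the illustration; the real value depends on the dos version and
on what has been loaded. read_vector is a hypothetical helper name.

```python
# Sketch of locating an interrupt handler through the Interrupt Vector Table.
# The stored handler address is made up; only the table layout is the point.
import struct

IVT_ENTRY_SIZE = 4                      # 16bit offset followed by 16bit segment

def read_vector(memory, number):
    offset, segment = struct.unpack_from("<HH", memory, number * IVT_ENTRY_SIZE)
    return segment, offset

memory = bytearray(0x100000)            # 1 MB real mode address space
# Assumed handler location for int 21h: segment 0019h, offset 40F8h.
struct.pack_into("<HH", memory, 0x21 * IVT_ENTRY_SIZE, 0x40F8, 0x0019)

segment, offset = read_vector(memory, 0x21)
handler_linear = (segment << 4) + offset
print(hex(0x21 * IVT_ENTRY_SIZE), hex(segment), hex(offset), hex(handler_linear))
```

With 256 possible interrupt numbers and 4 bytes per entry the table occupies
256 * 4 = 1024 bytes, which is where the "lowest 1024 bytes" figure comes from.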

12:    mov ax, 4c00h; Move 4ch into ah, the dos function number for exit program, and 00h
                      into al, the return code passed back to dos.

13:    int 21h;       Call dos and execute function 4ch.
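
Lines 9 and 12 of the listing both rely on ah and al being the high and low
bytes of ax. The Python sketch below shows that relationship; split_ax and
set_ah are hypothetical helper names, and the starting value 1234h is arbitrary.

```python
# Sketch: ah is the high byte of ax, al is the low byte.
# split_ax and set_ah are hypothetical helpers for this illustration.

def split_ax(ax):
    ah = (ax >> 8) & 0xFF
    al = ax & 0xFF
    return ah, al

def set_ah(ax, value):
    # Writing ah replaces only the high byte and leaves al untouched.
    return (ax & 0x00FF) | ((value & 0xFF) << 8)

ah, al = split_ax(0x4C00)               # mov ax, 4c00h
ax_after = set_ah(0x1234, 0x09)         # mov ah, 09h with ax previously 1234h
print(hex(ah), hex(al), hex(ax_after))
```

So "mov ax, 4c00h" is a compact way of writing "mov ah, 4ch" and "mov al, 0"
in a single instruction.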

14:    main endp;     endp is the directive for End Procedure, closing "main".

15:    end main;      The "end" directive is the last line to be assembled, and the label next
                      to it "main" identifies the Program Entry Point.