phiral.net

ASCII file encoder in assembly.

This program opens a file that contains the code table, opens the file to
be encoded, encodes it and writes the result to STDOUT. Example:

codetable.txt (this needs real nulls, see below)
00000000000000000000000000000000000000000000000045908213670000000\
GVHZUSOBMIKPJCADLFTYEQNWXR000000gvhzusobmikpjcadlftyeqnwxr0000000\
00000000000000000000000000000000000000000000000000000000000000000\
00000000000000000000000000000000000000000000000000000000000000000\
000000000000000

toencode.txt
This is the text to encode.

This is all one long line; the lines that end with \ just mean it's a single line
that is broken up. I only changed the numbers and the upper and lower case letters
of the ASCII table, and set every non-alphanumeric entry to 0 so it's easy to
check whether the current char is encodable. It is laid out this way so I can use
the char read from the file toencode.txt as the index into the code table.
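
To make the indexing idea concrete, here is a minimal C sketch of the lookup. It
is not part of the original program, which is all assembly. The name translate
and the 256-entry table parameter are my own choices for illustration; a 0 entry
is taken to mean "not encodable", as described above.

```c
/* Translate one byte through a 256-entry code table.
 * The byte itself is the index; a 0 entry means "leave it alone". */
unsigned char translate(const unsigned char table[256], unsigned char c)
{
    unsigned char t = table[c];

    return t != 0 ? t : c;
}
```

With the example table, translate(table, 'T') gives 'Y', while a space comes
back unchanged because its entry is 0.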

Start of the asm.

We know we're going to have to call open, read and write, so check what
arguments they take. man 2 open shows us:

SYNOPSIS
       #include <sys/types.h>
       #include <sys/stat.h>
       #include <fcntl.h>

       int open(const char *pathname, int flags);
       int open(const char *pathname, int flags, mode_t mode);
       int creat(const char *pathname, mode_t mode);

The first form takes no mode and is the one to use when the file already exists.
The second takes a mode, which only matters when the flags include O_CREAT. The
third, creat(), is equivalent to open() with flags equal to
O_CREAT|O_WRONLY|O_TRUNC. We want to open the code table file read only, which
the man page says is O_RDONLY. So we have to find the value of O_RDONLY; it must
be in one of the includes.

entropy@phalaris {~/asm/encode} grep O_RDONLY /usr/include/sys/types.h

entropy@phalaris {~/asm/encode} grep O_RDONLY /usr/include/sys/stat.h

entropy@phalaris {~/asm/encode} grep O_RDONLY /usr/include/fcntl.h

entropy@phalaris {~/asm/encode} grep O_RDONLY /usr/include/asm/fcntl.h
#define O_RDONLY             00

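That's not too obvious: the define lives in asm/fcntl.h, not in any of the three
headers the man page lists, and 00 is just octal for 0.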
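
For reference, here is the same read-only open as a small C sketch. The libc
open() returns -1 and sets errno on failure, whereas the raw int $0x80 call
leaves a negative errno value in eax, which is why the assembly later tests for
"less than 0". The helper name open_readonly is made up for this example and
imitates the kernel convention.

```c
#include <errno.h>
#include <fcntl.h>

/* Open path read-only. Returns the file descriptor (>= 0) on success,
 * or a negative errno value on failure, like the raw syscall does. */
int open_readonly(const char *path)
{
    int fd = open(path, O_RDONLY);   /* two-argument form: no mode needed */

    return fd >= 0 ? fd : -errno;
}
```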

Now the man pages for the rest of them.

read()
------

SYNOPSIS
       #include <unistd.h>

       ssize_t read(int fd, void *buf, size_t count);

write()
-------

SYNOPSIS
       #include <unistd.h>

       ssize_t write(int fd, const void *buf, size_t count);

close()
-------

SYNOPSIS
       #include <unistd.h>

       int close(int fd);

exit()
------

SYNOPSIS
       #include <unistd.h>

       void _exit(int status);

Now that we know how to open a file, let's start coding: open the file to be
encoded, get the file descriptor and close it. First find the syscall numbers
for open, read, write, close and exit:

entropy@phalaris {~/asm/encode} more /usr/include/asm/unistd.h
#ifndef _ASM_I386_UNISTD_H_
#define _ASM_I386_UNISTD_H_

/*
 * This file contains the system call numbers.
 */

#define __NR_restart_syscall      0
#define __NR_exit                 1
#define __NR_fork                 2
#define __NR_read                 3
#define __NR_write                4
#define __NR_open                 5
#define __NR_close                6

[...snip...]
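
The numbers above are the i386 ones, which is what int $0x80 expects. From C the
same kind of call can be made through glibc's syscall() wrapper, and the
SYS_write macro from <sys/syscall.h> picks the right number for whatever
architecture you compile on (on x86-64, for instance, write is 1, not 4). A small
sketch, with raw_write being a made-up name:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* makes glibc declare syscall() */
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke the write system call by number instead of through write(). */
long raw_write(int fd, const void *buf, unsigned long count)
{
    return syscall(SYS_write, fd, buf, count);
}
```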

You'll end up with something like:

entropy@phalaris {~/asm/encode} cat encode.s
.section .rodata
.equ ARGC, 0
.equ ARGV1, 8
.equ LINUX_KERNEL, 0x80
.equ STDOUT, 1
.equ STDERR, 2
.equ EOF, 0
.equ O_RDONLY, 00
.equ MODE, 0666
.equ SYS_EXIT, 1
.equ SYS_READ, 3
.equ SYS_WRITE, 4
.equ SYS_OPEN, 5
.equ SYS_CLOSE, 6
use:
   .ascii "Usage: encode <file to encode>\n\0"
open_err:
   .ascii "Error opening file.\n\0"
.section .data
fd:
   .int 0

.section .bss

.section .text
.globl _start
_start:
   nop

   movl %esp, %ebp         # so we can use ebp as a reference

   xorl %eax, %eax         # set eax to 0
   movl ARGC(%ebp), %eax   # get the argument # count
   cmpl $2, %eax           # compare it to 2
   jne  usage              # if its not equal to 2 jump to the usage symbol

   movl $SYS_OPEN, %eax    # move SYS_OPEN(5) into eax
   movl $MODE, %edx        # move MODE(0666) into edx
   movl $O_RDONLY, %ecx    # move O_RDONLY(0) into ecx
   movl ARGV1(%ebp), %ebx  # move argv[1] (filename) into ebx
   int  $LINUX_KERNEL      # call linux to open the file
   movl %eax, fd           # save the file descriptor in fd

   cmpl $0, %eax           # compare eax to 0
   jl   open_error         # if it's less than 0 there was an open error

   movl $SYS_CLOSE, %eax   # move SYS_CLOSE(6) into eax
   movl fd, %ebx           # move the file descriptor to close into ebx
   int $LINUX_KERNEL       # have linux close the file

exit:                      # exit symbol
   movl $SYS_EXIT, %eax    # move SYS_EXIT(1) into eax
   movl $0, %ebx           # move 0, the exit status, into ebx
   int $LINUX_KERNEL       # call linux to exit

usage:                     # usage symbol
   movl $SYS_WRITE, %eax   # move SYS_WRITE(4) into eax
   movl $31, %edx          # string length
   movl $use, %ecx         # string address
   movl $STDERR, %ebx      # write to STDERR
   int  $LINUX_KERNEL      # call the kernel
   jmp exit                # jump to exit symbol

open_error:                # open_error symbol
   movl $SYS_WRITE, %eax   # move SYS_WRITE(4) into eax
   movl $20, %edx          # string length
   movl $open_err, %ecx    # string address
   movl $STDERR, %ebx      # write to STDERR
   int $LINUX_KERNEL       # call the kernel
   jmp exit                # jump to exit symbol


First check the argument count; if it's not correct, print the usage message and
exit. Then try to open the file; if that fails, print an error message and exit.
Otherwise close the file and exit. The next step is to read the file char by
char, checking for EOF, and to write each char to STDOUT until EOF is hit. I got:

entropy@phalaris {~/asm/encode} cat encode.s
.section .rodata
.equ ARGC, 0
.equ ARGV1, 8
.equ LINUX_KERNEL, 0x80
.equ STDOUT, 1
.equ STDERR, 2
.equ EOF, 0
.equ O_RDONLY, 00
.equ MODE, 0666
.equ SYS_EXIT, 1
.equ SYS_READ, 3
.equ SYS_WRITE, 4
.equ SYS_OPEN, 5
.equ SYS_CLOSE, 6
use:
   .ascii "Usage: encode <file to encode>\n\0"
open_err:
   .ascii "Error opening file.\n\0"

.section .data
fd:
   .int 0

.section .bss
.lcomm buf, 8

.section .text
.globl _start
_start:
   nop

   movl %esp, %ebp        # so we can use ebp as a reference

   xorl %eax, %eax        # set eax to 0
   movl ARGC(%ebp), %eax  # get the argument # count
   cmpl $2, %eax          # compare it to 2
   jne  usage             # if its not equal to 2 jump to the usage symbol

   movl $SYS_OPEN, %eax   # move SYS_OPEN(5) into eax
   movl $MODE, %edx       # move MODE(0666) into edx
   movl $O_RDONLY, %ecx   # move O_RDONLY(0) into ecx
   movl ARGV1(%ebp), %ebx # move argv[1] (filename) into ebx
   int  $LINUX_KERNEL     # call linux to open the file
   movl %eax, fd          # save the file descriptor in fd

   cmpl $0, %eax          # compare eax to 0
   jl   open_error        # if it's less than 0 there was an open error

read_char:                # read_char symbol
   movl $SYS_READ, %eax   # move SYS_READ(3) into eax
   movl $1, %edx          # length to read
   movl $buf, %ecx        # move the address of the buffer into ecx
   movl fd, %ebx          # move the file descriptor to read from into ebx
   int $LINUX_KERNEL      # call kernel

   cmpl $EOF, %eax        # read returns 0 at end of file, negative on error
   jle hit_eof            # in either case we're done

   movl $SYS_WRITE, %eax  # move SYS_WRITE(4) into eax
   movl $1, %edx          # length to write
   movl $buf, %ecx        # address of buffer to write out
   movl $STDOUT, %ebx     # write to STDOUT
   int $LINUX_KERNEL      # call kernel

   jmp read_char          # go back and read another char

hit_eof:
   movl $SYS_CLOSE, %eax  # move SYS_CLOSE(6) into eax
   movl fd, %ebx          # move the file descriptor to close into ebx
   int $LINUX_KERNEL      # have linux close the file

exit:                     # exit symbol
   movl $SYS_EXIT, %eax   # move SYS_EXIT(1) into eax
   movl $0, %ebx          # move 0, the exit status, into ebx
   int $LINUX_KERNEL      # call linux to exit

usage:                    # usage symbol
   movl $SYS_WRITE, %eax  # move SYS_WRITE(4) into eax
   movl $31, %edx         # string length
   movl $use, %ecx        # string address
   movl $STDERR, %ebx     # write to STDERR
   int  $LINUX_KERNEL     # call the kernel
   jmp exit               # jump to exit symbol

open_error:               # open_error symbol
   movl $SYS_WRITE, %eax  # move SYS_WRITE(4) into eax
   movl $20, %edx         # string length
   movl $open_err, %ecx   # string address
   movl $STDERR, %ebx     # write to STDERR
   int $LINUX_KERNEL      # call the kernel
   jmp exit               # jump to exit symbol
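
For comparison, the read_char loop above corresponds roughly to this C sketch.
copy_bytes is a name made up for illustration. One difference: the assembly
uses jle, so a read error is treated like end of file, while the sketch reports
it separately.

```c
#include <unistd.h>

/* Copy in_fd to out_fd one byte at a time.
 * Returns 0 at end of file, -1 on a read or write error. */
int copy_bytes(int in_fd, int out_fd)
{
    char c;
    ssize_t n;

    while ((n = read(in_fd, &c, 1)) > 0) {
        if (write(out_fd, &c, 1) != 1)
            return -1;
    }
    return n == 0 ? 0 : -1;
}
```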

Assemble, link and test it. (This is 32-bit code; on an x86-64 system pass --32
to as and -m elf_i386 to ld.)
entropy@phalaris {~/asm/encode} as -g encode.s -o encode.o

entropy@phalaris {~/asm/encode} ld encode.o -o encode

entropy@phalaris {~/asm/encode} echo "This is the file to encode.">toencode.txt

entropy@phalaris {~/asm/encode} ./encode toencode.txt
This is the file to encode.

Ok that works, now for the encode part. It has to open a second file, which
holds the character translation table. Each char we read from toencode.txt is
used as an index into the translation table: if the char is alphanumeric we
output its translation, else we just output the char itself.

Here is what I used for my translation table.

entropy@phalaris {~/asm/encode} cat table.txt
G V H Z U S O B M I K P J C A D L F T Y E Q N W X R
O H N P U R A C J M K Q I W G L V Z F S E B X Y T D
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

4 5 9 0 8 2 1 3 6 7
3 6 5 7 0 1 8 9 4 2
0 1 2 3 4 5 6 7 8 9

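It reads like this: the bottom row is the char you give me, the top row is what
it encodes to and the middle row is what it decodes to.
(letters)
to encode, if you give me an 'A' I give you a 'G', so 'G' goes at the top above 'A'.
to decode, if you give me a 'G' I give you an 'A', so above 'G' goes 'A'.
(numbers)
to encode, if you give me a '0' I give you a '4'.
to decode, if you give me a '4' I give you a '0', so above '4' goes '0'.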
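
The decode row does not have to be worked out by hand, since it is just the
inverse of the encode row. A minimal C sketch, assuming both tables are
256-byte arrays indexed by the input char with 0 meaning "not encodable";
invert_table is a hypothetical helper, not something the program itself uses.

```c
/* Build the decode table as the inverse of an encode table:
 * if enc[p] == c (and c is not 0) then dec[c] = p. */
void invert_table(const unsigned char enc[256], unsigned char dec[256])
{
    int i;

    for (i = 0; i < 256; i++)
        dec[i] = 0;
    for (i = 0; i < 256; i++) {
        if (enc[i] != 0)
            dec[enc[i]] = (unsigned char)i;
    }
}
```

Fed the top rows of table.txt, this produces the middle rows: 'G' maps back to
'A' and '4' maps back to '0'.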

Now we need to put this in a file as ASCII with real nulls so we can tell a 0
byte from the character '0'. To get this into a file I used perl -e print like:

entropy@phalaris {~/asm/encode} perl -e 'print "\x00"x48;print "4590821367";\
print "\x00"x7;print "GVHZUSOBMIKPJCADLFTYEQNWXR";print "\x00"x6;\
print "gvhzusobmikpjcadlftyeqnwxr";print "\x00"x133' > encodetable

entropy@phalaris {~/asm/encode} perl -e 'print "\x00"x48;print "3657018942";\
print "\x00"x7;print"OHNPURACJMKQIWGLVZFSEBXYTD";print "\x00"x6;\
print "ohnpuracjmkqiwglvzfsebxytd";print "\x00"x133' > decodetable

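Then to see if it's what we want (by default hexdump prints 16-bit words, which
on x86 are little-endian, so 3534 is the bytes 0x34 0x35, i.e. '4' '5'):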

entropy@phalaris {~/asm/encode} hexdump -v encodetable
0000000 0000 0000 0000 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
0000030 3534 3039 3238 3331 3736 0000 0000 0000
0000040 4700 4856 555a 4f53 4d42 4b49 4a50 4143
0000050 4c44 5446 4559 4e51 5857 0052 0000 0000
0000060 6700 6876 757a 6f73 6d62 6b69 6a70 6163
0000070 6c64 7466 6579 6e71 7877 0072 0000 0000
0000080 0000 0000 0000 0000 0000 0000 0000 0000
0000090 0000 0000 0000 0000 0000 0000 0000 0000
00000a0 0000 0000 0000 0000 0000 0000 0000 0000
00000b0 0000 0000 0000 0000 0000 0000 0000 0000
00000c0 0000 0000 0000 0000 0000 0000 0000 0000
00000d0 0000 0000 0000 0000 0000 0000 0000 0000
00000e0 0000 0000 0000 0000 0000 0000 0000 0000
00000f0 0000 0000 0000 0000 0000 0000 0000 0000
0000100

entropy@phalaris {~/asm/encode} hexdump -v decodetable
0000000 0000 0000 0000 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
0000030 3633 3735 3130 3938 3234 0000 0000 0000
0000040 4f00 4e48 5550 4152 4a43 4b4d 4951 4757
0000050 564c 465a 4553 5842 5459 0044 0000 0000
0000060 6f00 6e68 7570 6172 6a63 6b6d 6971 6777
0000070 766c 667a 6573 7862 7479 0064 0000 0000
0000080 0000 0000 0000 0000 0000 0000 0000 0000
0000090 0000 0000 0000 0000 0000 0000 0000 0000
00000a0 0000 0000 0000 0000 0000 0000 0000 0000
00000b0 0000 0000 0000 0000 0000 0000 0000 0000
00000c0 0000 0000 0000 0000 0000 0000 0000 0000
00000d0 0000 0000 0000 0000 0000 0000 0000 0000
00000e0 0000 0000 0000 0000 0000 0000 0000 0000
00000f0 0000 0000 0000 0000 0000 0000 0000 0000
0000100

The reason for the placement of the nulls will be clear if you look at an
extended ASCII chart: it has 48 chars we leave alone (control chars, space and
punctuation), then the 10 numerals, then 7 punctuation chars and the 26 uppercase
alphas, then 6 punctuation chars, the 26 lowercase alphas and finally 133 more
chars we leave alone ({ | } ~, DEL and everything from 128 to 255).
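
The same layout can be produced from C instead of perl. This is only a sketch
under the assumptions above (digits at 48-57, uppercase at 65-90, lowercase at
97-122, everything else a real null); layout_table and write_table are names
invented for the example.

```c
#include <stdio.h>
#include <string.h>

/* Place the substitutions at their ASCII positions; all other entries are 0.
 * digits must hold 10 chars, upper and lower 26 chars each. */
void layout_table(unsigned char table[256], const char *digits,
                  const char *upper, const char *lower)
{
    memset(table, 0, 256);
    memcpy(table + '0', digits, 10);
    memcpy(table + 'A', upper, 26);
    memcpy(table + 'a', lower, 26);
}

/* Write the 256 raw bytes to path. Returns 0 on success, -1 on failure. */
int write_table(const char *path, const unsigned char table[256])
{
    FILE *f = fopen(path, "wb");
    size_t n;

    if (f == NULL)
        return -1;
    n = fwrite(table, 1, 256, f);
    if (fclose(f) != 0 || n != 256)
        return -1;
    return 0;
}
```

Calling layout_table with "4590821367", "GVHZUSOBMIKPJCADLFTYEQNWXR" and
"gvhzusobmikpjcadlftyeqnwxr" and then write_table should give a file with the
same 256 bytes as the encodetable dumped above.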

And finally the code.

entropy@phalaris {~/asm/encode} cat encode.s
.section .rodata
.equ ARGC, 0
.equ ARGV1, 8
.equ ARGV2, 12
.equ LINUX_KERNEL, 0x80
.equ STDOUT, 1
.equ STDERR, 2
.equ EOF, 0
.equ O_RDONLY, 00
.equ MODE, 0666
.equ SYS_EXIT, 1
.equ SYS_READ, 3
.equ SYS_WRITE, 4
.equ SYS_OPEN, 5
.equ SYS_CLOSE, 6
use:
   .ascii "Usage: encode <file to encode> <file with code table>\n\0"
open_err:
   .ascii "Error opening file.\n\0"
read_err:
   .ascii "Error reading file.\n\0"

.section .data
fd0:
   .int 0
fd1:
   .int 0

.section .bss
.lcomm buf, 8
.lcomm trans, 256         # one entry per byte value, so index 255 stays in bounds

.section .text
.globl _start
_start:
   nop

   movl %esp, %ebp        # so we can use ebp as a reference

check_argc:               # check_argc symbol
   xorl %eax, %eax        # set eax to 0
   movl ARGC(%ebp), %eax  # get the argument # count
   cmpl $3, %eax          # compare it to 3
   jne  usage             # if its not equal to 3 jump to the usage symbol

open_codefile:            # open_codefile, the file with the translation table
   movl $SYS_OPEN, %eax   # move SYS_OPEN(5) into eax
   movl $MODE, %edx       # move MODE(0666) into edx
   movl $O_RDONLY, %ecx   # move O_RDONLY(0) into ecx
   movl ARGV2(%ebp), %ebx # move argv[2] (codefile) into ebx
   int  $LINUX_KERNEL     # call linux to open the file
   movl %eax, fd1         # save the file descriptor in fd1

   cmpl $0, %eax          # compare eax to 0
   jl   open_error        # if it's less than 0 there was an open error

read_codefile:            # read_codefile symbol, read the file into a buffer
   movl $SYS_READ, %eax   # move SYS_READ(3) into eax
   movl $255, %edx        # length to read in edx
   movl $trans, %ecx      # move the address of the buffer into ecx
   movl fd1, %ebx         # move the file descriptor to read from into ebx
   int $LINUX_KERNEL      # call kernel

   cmpl $0, %eax          # check for read error
   jl   read_error        # if so jump to read_error symbol

close_codefile:           # close_codefile symbol, its now in the buf trans
   movl $SYS_CLOSE, %eax  # move SYS_CLOSE(6) into eax
   movl fd1, %ebx         # move the file descriptor to close into ebx
   int $LINUX_KERNEL      # have linux close the file

open_file_to_encode:      # open_file_to_encode symbol
   movl $SYS_OPEN, %eax   # move SYS_OPEN(5) into eax
   movl $MODE, %edx       # move MODE(0666) into edx
   movl $O_RDONLY, %ecx   # move O_RDONLY(0) into ecx
   movl ARGV1(%ebp), %ebx # move argv[1] (filename) into ebx
   int  $LINUX_KERNEL     # call linux to open the file
   movl %eax, fd0         # save the file descriptor in fd0

   cmpl $0, %eax          # compare eax to 0
   jl   open_error        # if it's less than 0 there was an open error

read_char:                # read_char symbol
   movl $0, buf           # move 0 into buf
   movl $SYS_READ, %eax   # move SYS_READ(3) into eax
   movl $1, %edx          # length to read
   movl $buf, %ecx        # move the address of the buffer into ecx
   movl fd0, %ebx         # move the file descriptor to read from into ebx
   int $LINUX_KERNEL      # call kernel

   cmpl $EOF, %eax        # read returns 0 at end of file, negative on error
   jle hit_eof            # in either case we're done

   leal trans, %esi       # load the address of the trans buf into esi
   add  buf, %esi         # add the char read to esi as an index (buf was
                          # zeroed first, so only its low byte is set)
   lodsb                  # load the byte esi points to into al; eax held
                          # read's return value of 1, so its upper bytes are 0

   cmpl $0, %eax          # a table entry of 0 means the char is not encodable
   je write_char          # if it's zero just use the char read

   movl %eax, buf         # else use the encoded char

write_char:               # write_char symbol, writes either encoded or regular char
   movl $SYS_WRITE, %eax  # move SYS_WRITE(4) into eax
   movl $1, %edx          # length to write
   movl $buf, %ecx        # address of buffer to write out
   movl $STDOUT, %ebx     # write to STDOUT
   int $LINUX_KERNEL      # call kernel

   jmp read_char          # go back and read another char

hit_eof:                  # found an eof
   movl $SYS_CLOSE, %eax  # move SYS_CLOSE(6) into eax
   movl fd0, %ebx         # move the file descriptor to close into ebx
   int $LINUX_KERNEL      # call kernel

exit:                     # exit symbol
   movl $SYS_EXIT, %eax   # move SYS_EXIT(1) into eax
   movl $0, %ebx          # move 0, the exit status, into ebx
   int $LINUX_KERNEL      # call linux to exit

usage:                    # usage symbol
   movl $SYS_WRITE, %eax  # move SYS_WRITE(4) into eax
   movl $54, %edx         # string length
   movl $use, %ecx        # string address
   movl $STDERR, %ebx     # write to STDERR
   int  $LINUX_KERNEL     # call the kernel
   jmp exit               # jump to exit symbol

open_error:               # open_error symbol
   movl $SYS_WRITE, %eax  # move SYS_WRITE(4) into eax
   movl $20, %edx         # string length
   movl $open_err, %ecx   # string address
   movl $STDERR, %ebx     # write to STDERR
   int $LINUX_KERNEL      # call the kernel
   jmp exit               # jump to exit symbol

read_error:               # read_error symbol
   movl $SYS_WRITE, %eax  # move SYS_WRITE(4) into eax
   movl $20, %edx         # string length
   movl $read_err, %ecx   # string address
   movl $STDERR, %ebx     # write to STDERR
   int $LINUX_KERNEL      # call the kernel
   jmp exit               # jump to exit symbol

Assemble it and test it out. We already have the files encodetable and decodetable 
generated by the perl prints earlier.

entropy@phalaris {~/asm/encode} as -g encode.s -o encode.o

entropy@phalaris {~/asm/encode} ld encode.o -o encode

entropy@phalaris {~/asm/encode} echo "This is the SECRET Message, testing 0123456789" > message

entropy@phalaris {~/asm/encode} ./encode message encodetable
Ybmt mt ybu TUHFUY Juttgou, yutymco 4590821367

entropy@phalaris {~/asm/encode} ./encode message encodetable > encoded_message

entropy@phalaris {~/asm/encode} ./encode encoded_message decodetable
This is the SECRET Message, testing 0123456789

It works. Making it a rotating key would make this much better.
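
A quick way to check an encode/decode table pair without running any files
through the program is to walk all 256 entries. A sketch in C, assuming both
tables are 256-byte arrays where 0 means "not encodable"; tables_round_trip is a
hypothetical helper, not part of the program.

```c
/* Return 1 if dec undoes enc for every byte that enc translates, else 0. */
int tables_round_trip(const unsigned char enc[256],
                      const unsigned char dec[256])
{
    int c;

    for (c = 0; c < 256; c++) {
        unsigned char e = enc[c];

        if (e != 0 && dec[e] != (unsigned char)c)
            return 0;
    }
    return 1;
}
```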