Exploring Assembly Code for Pentium Ones

heman · Apr 17, 2006

Can anyone explain how this simple assembly code is working:
This is specifically for pentium ones

I am actually lacking codes for assembly,,didn;t get any...if you have any please give..
----------------------------------------------------------------
#include<asm/unistd.h>
#include<syscall.h>

.data
buffer: .int 0
fileread:
.string "fileread.S"
filewrite:
.string "filewrite.S"

.text
.globl _start
_start:
movl $(SYS_open), %eax
movl $(fileread), %ebx
movl $(0) ,%ecx
int $0x80

push %eax

movl $(SYS_open), %eax
movl $(filewrite), %ebx
movl $(1), %ecx
int $0x80

push %eax
c1:
movl $(SYS_read) ,%eax
movl 4(%esp), %ebx
movl $buffer, %ecx
movl $4 ,%edx
int $0x80

cmp $0, %eax
jle end

movl $(SYS_write) ,%eax
movl (%esp),%ebx
movl $buffer,%ecx
movl $4,%edx
int $0x80
jmp c1

end:

addl $8, %esp
movl $(SYS_exit),%eax
xorl %ebx,%ebx
int $0x80
ret

-Job- · Apr 18, 2006

It looks as if it is copying a file to another. I'm not familiar with this specific assembly language but from my assembly experience then i assume:
. main memory locations are preceded by $
. registers are preceded by %
. movl A, B moves the content in A to B, not sure how data is addressed, maybe by word or word*2, but doesn't matter much i guess.
. push would push something onto the stack. typically you use this to backup your registers so that you can reuse them without overwriting data
. labels are followed by ":". So c1: ... is a line labeled "c1"
. jmp clearly jumps to a line of code
. jle probably is "jump if less than or equal to"
. cmp A, B, compares A and B and stores 1 or 0 in a specific register depending on whether A<=B or not, this is the value used by jle, i think
. %0 should be the 0 register which is the register storing the value 0

If I'm not mistaken in the above definitions then the first code block:

Code:

_start:
movl $(SYS_open), %eax
movl $(fileread), %ebx
movl $(0) ,%ecx
int $0x80

push %eax

movl $(SYS_open), %eax
movl $(filewrite), %ebx
movl $(1), %ecx
int $0x80

... opens an input file and saves the location of the first word in register %ecx. It then opens the file it will write to and saves the location of the first word in that file (the first word we'll write data to).
Without getting into specifics it looks like the following code establishes and populates a buffer which will be used in reading from the file:

Code:

push %eax
c1:
movl $(SYS_read) ,%eax
movl 4(%esp), %ebx
movl $buffer, %ecx
movl $4 ,%edx
int $0x80

The next block of code clearly is a loop. We use cmp to compare the word we just read from the input file with 0. If it's 0 then it's an EOF (end of file) and so we exit the loop by jumping to the line labeled "end". If it's not the EOF, then we write the contents of the buffer to the output file and we jump to the beginning of the loop at the line "jmp c1" and follow the process all over again, until the whole input file is read:

Code:

cmp $0, %eax
jle end

movl $(SYS_write) ,%eax
movl (%esp),%ebx
movl $buffer,%ecx
movl $4,%edx
int $0x80
jmp c1

The remaining block of code just seems to clean everything up and exit the program:

Code:

end:

addl $8, %esp
movl $(SYS_exit),%eax
xorl %ebx,%ebx
int $0x80
ret

So it looks very much like this assembly program creates a copy of a file.

rhinovirus · Apr 21, 2006

-Job- has pretty much covered what is going on. I just thought I'd explain that each block ending with 'int $0x80' is calling a C style system call. 'int $0x80' itself generates a software interrupt giving control to the (I assume Linux) kernel. If you look at the file /usr/include/asm/unistd.h (assuming you are on a UNIX system) you will see a list of system call names along with a corresponding number. These system calls are C style functions, and the kernel takes function parameters from registers. You can look up the details and function parameters of these system calls using the 'man' command, i.e. 'man 2 open' gives you information about the open() function (note that most of the programmer man pages are on the second page, hence the 2).

movl $(SYS_open), %eax

The above line basically moves the number corresponding to the open system call (checking my unistd.h, it's number 5) into the %eax. When calling system functions in assembly, the call number must go into register %eax.

Checking the 'man 2 open' page the function prototype is:

int open(const char *pathname, int flags);

The first parameter is 'const char *pathname' this corresponds to the 'fileread' string. As it is the first parameter it must be stored in %ebx:

movl $(fileread), %ebx

Again from the prototype, the second parameter is 'int flags' and this must be stored in register %ecx (basically parameters must be stored in the same order as the protoype). Check the manpage for information on what 'flags' does.

movl $(0) ,%ecx

The final line of each block is:

int $0x80

Again this calls the kernel which looks at the registers and executes the system call for you.

I'll leave it as an exercise to check the manpages for the other calls and figure out exactly what's going on.

rhinovirus

-Job- · Apr 21, 2006

Welcome around Rhino.

rhinovirus · Apr 22, 2006

-Job- said:

Welcome around Rhino.

Thank you -Job-.

rhinovirus

Exploring Assembly Code for Pentium Ones

FAQ: Exploring Assembly Code for Pentium Ones

What is Assembly Code for Pentium Ones?

Why is it important to explore Assembly Code for Pentium Ones?

How is Assembly Code for Pentium Ones different from other programming languages?

Can anyone learn how to code in Assembly for Pentium Ones?

Are there any benefits to using Assembly Code for Pentium Ones?

Similar threads

Hot Threads

Recent Insights