Assembler with LCC-Win32

In Primer 4 we shall look at some of the x86 assembler instructions in detail.

Intricacies Of Multiply

I've read that the multiply instructions are a mess - you can decide for yourself.

Using MUL is not so bad: there is only one operand, either a memory address or another register. The other factor is always the contents of AL, AX or EAX, and the result always goes to AX (byte multiply), DX:AX (word multiply) or EDX:EAX (long multiply).

Here are some macros that make printing the results to the console window easier. The various registers are saved to stack before using printf() and restored afterward. You can of course look in the debug window for the CPU registers but this way is quicker for these particular tests.

#define eax_prt _asm("pushl %eax");_asm("pushl %ecx");_asm("pushl %edx");\
                _asm("movl %eax, %value");printf("%d\n", value);\
                _asm("popl %edx");_asm("popl %ecx");_asm("popl %eax")

#define ecx_prt _asm("pushl %eax");_asm("pushl %ecx");_asm("pushl %edx");\
                _asm("movl %ecx, %value");printf("%d\n", value);\
                _asm("popl %edx");_asm("popl %ecx");_asm("popl %eax")

#define edx_prt _asm("pushl %eax");_asm("pushl %ecx");_asm("pushl %edx");\
                _asm("movl %edx, %value");printf("%d ", value);\
                _asm("popl %edx");_asm("popl %ecx");_asm("popl %eax")

#define nprt _asm("pushl %eax");_asm("pushl %ecx");_asm("pushl %edx");\
             printf("\n");\
             _asm("popl %edx");_asm("popl %ecx");_asm("popl %eax")

MUL (unsigned multiply)

First the instruction MUL using a memory operand.

Declare some global variables that we will use as the memory operands.

char bvalue = 4;
short wvalue = 300;     // bigger than a byte
long lvalue = 80000;   // bigger than a word
long value   = 0;       // store

And now the function. Note: to refer to a variable by its address inside an _asm statement, prefix its name with the % character.

int __declspec(naked) mul(int a, int b)
{
    _asm("xorl %edx,%edx");    // clear edx for later use
    _asm("movl 4(%esp),%eax"); // 3
    _asm("movl 8(%esp),%ecx"); // 7

    _asm("mulb %bvalue");
    eax_prt;
    _asm("mulw %wvalue");
    eax_prt;
    _asm("mull %lvalue");
    eax_prt;
    _asm("mull %ecx");
    edx_prt;
    eax_prt;
    _asm("mull %lvalue");
    edx_prt;
    eax_prt;

    _asm("ret");
}

int main(void) { mul(3,7); return 0; }

From the above you can see that the first multiply is a byte operation: the number stored in bvalue is multiplied with the value in AL and the 16 bit result is always stored in AX. If the product fits in a byte, AH is simply zero. The other multiplies work the same way with word and long operands. Only the last operation, mull %lvalue, produces a product too large for EAX, so its upper half lands in EDX.

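When AL overflows the overflow is stored in AH, so the full result is in AX.

When AX overflows the overflow is stored in DX, so the full result is in DX:AX (not in the upper half of EAX).

When EAX overflows the overflow is stored in EDX, so the full result is in EDX:EAX.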

The MUL %ECX instruction is straightforward and shows how to use another register with mul. Again any overflow will be stored in EDX.
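The placement of the result halves described above can be sketched in portable C. This is only a model of the arithmetic, not LCC-Win32 inline assembler, and the helper names mul8, mul16 and mul32 are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical helpers that model where MUL leaves its result. */

/* mulb: AL * src -> AX (AH receives the high byte). */
uint16_t mul8(uint8_t al, uint8_t src)
{
    return (uint16_t)((unsigned)al * src);
}

/* mulw: AX * src -> DX:AX (DX receives the high word). */
void mul16(uint16_t ax, uint16_t src, uint16_t *dx_out, uint16_t *ax_out)
{
    uint32_t product = (uint32_t)ax * src;
    *dx_out = (uint16_t)(product >> 16);
    *ax_out = (uint16_t)product;
}

/* mull: EAX * src -> EDX:EAX (EDX receives the high dword). */
void mul32(uint32_t eax, uint32_t src, uint32_t *edx_out, uint32_t *eax_out)
{
    uint64_t product = (uint64_t)eax * src;
    *edx_out = (uint32_t)(product >> 32);
    *eax_out = (uint32_t)product;
}
```

For the last multiply in the listing, mul32(2016000000, 80000, ...) gives EDX = 37550 and EAX = 3978035200. The eax_prt macro prints with %d, so that EAX value shows up as -316932096.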

IMUL (signed multiply)

That's all there is to using MUL but when using IMUL things can become tricky.

Using the same function and main as before delete the previous instructions and insert the following code.

    _asm("xorl %edx,%edx");    // clear edx for later use
    _asm("movl 4(%esp),%eax"); // 3
    _asm("movl 8(%esp),%ecx"); // 7

    _asm("imulb %bvalue"); // 4*al, (4*3) byte multiply overflows into AX
    eax_prt;
    _asm("imulw %wvalue"); // 300*ax, (300*12) word multiply overflows into DX:AX
    eax_prt;
    _asm("imull %lvalue"); // (80000*3600) long multiply overflows into EDX:EAX
    eax_prt;
    _asm("imull %lvalue"); // (80000*288000000) long multiply overflows into EDX:EAX
    edx_prt;
    eax_prt;

The first form of instruction using imul is the same as for mul. Using a memory operand as above results in the same behavior as mul but with signed results.
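The values used above are all positive, so MUL and IMUL print the same numbers. A small C model (hypothetical helper names, not LCC-Win32 code) shows where the two differ once the byte in AL has its top bit set.

```c
#include <stdint.h>

/* Hypothetical model of mulb: operands are unsigned, result in AX. */
uint16_t mul8_ax(uint8_t al, uint8_t src)
{
    return (uint16_t)((unsigned)al * src);
}

/* Hypothetical model of imulb: operands are signed, result in AX. */
int16_t imul8_ax(int8_t al, int8_t src)
{
    return (int16_t)((int)al * src);
}
```

The bit pattern 0xFD is 253 unsigned and -3 signed. mul8_ax(0xFD, 4) yields 1012 (0x03F4) while imul8_ax(-3, 4) yields -12 (0xFFF4): the low byte is identical, only the upper half of the result differs.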

The next form of instruction uses the memory operand plus the register.

    // clear and start again
    _asm("xorl %edx,%edx"); // clear edx for later use
    _asm("movl 4(%esp),%eax"); // 3
    _asm("movl 8(%esp),%ecx"); // 7

    _asm("imulw %wvalue,%ax"); // 300*ax, (300*3)
    eax_prt;
    _asm("imull %lvalue,%eax"); // 80000*eax, (80000*900)
    eax_prt;
    _asm("imull %lvalue,%eax"); // 80000*eax, (80000*72000000)
    edx_prt;
    eax_prt;

    nprt;

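In this case the memory operand is multiplied with the register and only the low half of the product is kept, truncated to the size of the destination register. Nothing spills into EDX; if the product does not fit, the upper bits are simply lost and the CPU sets the carry and overflow flags.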
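The truncation can be modelled in plain C as follows. This is a sketch under the assumption that only the low 32 bits survive and that the carry/overflow flags are set when bits are lost; the struct and function names are invented for this example.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical result type: the register contents plus an overflow indicator. */
typedef struct {
    int32_t value;     /* what ends up in the destination register */
    bool    overflow;  /* models CF/OF being set */
} imul32_result;

/* Model of the two-operand form: dest = low 32 bits of dest * src. */
imul32_result imul32_trunc(int32_t dest, int32_t src)
{
    imul32_result r;
    int64_t full = (int64_t)dest * src;      /* exact 64 bit product */
    int64_t wrapped = (uint32_t)full;        /* keep the low 32 bits */
    if (wrapped > INT32_MAX)
        wrapped -= 4294967296LL;             /* reinterpret as signed */
    r.value = (int32_t)wrapped;
    r.overflow = (full != wrapped);          /* upper bits were lost */
    return r;
}
```

With the values from the listing, imul32_trunc(80000, 900) returns 72000000 without overflow, while imul32_trunc(80000, 72000000) returns 448856064 with the overflow indicator set.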

Next.

Three operands are used, an immediate value ($200), a memory operand and the register.

    _asm("imulw $200,%wvalue,%ax"); // 200*300->ax
    eax_prt;
    _asm("imull $100,%lvalue,%eax");// 100*80000->eax
    edx_prt;
    eax_prt;

Again, if an overflow occurs it is lost.
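One detail worth watching with the word form: it writes only AX, so the upper 16 bits of EAX keep whatever they held before, and that is what eax_prt will print. A minimal C model of this, with hypothetical helper names:

```c
#include <stdint.h>

/* Hypothetical model of a 16 bit imul that writes AX inside a 32 bit EAX. */
uint32_t imul16_into_eax(uint32_t eax_before, int16_t imm, int16_t src)
{
    int32_t full = (int32_t)imm * src;           /* exact product */
    uint32_t low16 = (uint32_t)full & 0xFFFFu;   /* the bits that fit in AX */
    return (eax_before & 0xFFFF0000u) | low16;   /* upper half is untouched */
}

/* Reinterpret the low 16 bits of EAX as a signed word. */
int32_t ax_as_signed(uint32_t eax)
{
    int32_t ax = (int32_t)(eax & 0xFFFFu);
    return ax >= 0x8000 ? ax - 0x10000 : ax;
}
```

For 200*300 the product 60000 fits in 16 bits as an unsigned number but not as a signed one: AX holds 0xEA60, which reads as -5536 when treated as a signed word. If EAX held 0x12340000 beforehand, it holds 0x1234EA60 afterwards.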

Next.

Three operands are used, an immediate value, another register and the accumulator (eax) register.

    _asm("imulw $10, %cx,%ax"); // 10*7->ax
    eax_prt;
    _asm("imull $24,%ecx,%eax"); // 24*7->eax
    eax_prt;

The immediate value is multiplied with the other register and the result is stored in EAX (or AX for the word form). Again, if there is an overflow it is not stored.

A mess?

Entanglement with Divide

Divide, like multiply, has two versions, signed and unsigned. First we will look at unsigned.

DIV (unsigned divide)

The DIV instruction divides a value held in the accumulator by its operand. For an 8 bit operand the quotient is left in AL and the remainder in AH; for a 16 or 32 bit operand the quotient is left in AX or EAX and the remainder in DX or EDX.

For an 8 bit divide, the dividend is the 16 bit value in AX and the divisor is, say, CL. The result is in two parts: AL will contain the quotient and AH will contain any remainder (modulo).

Here is an example.

#include <stdio.h>

char __declspec(naked) div(char a, char b)
{
    _asm("movl 4(%esp),%eax"); // 0x0000000a
    _asm("movl 8(%esp),%ecx"); // 0x0000000b
    _asm("divb %cl,%al");
    _asm("ret");
}

int main(void)
{
    char a = 7, b = 3;
    div(a,b);
    return 0;
}

Run the above code whilst looking at the CPU registers. You will see that the quotient in AL will be 2 (7/3) and the remainder (1) will be in AH. It is necessary to clear the bits in AH before using DIV, otherwise the outcome will range from a wrong result to an exception.

Try the same code but adding a value in AH, watch the eax register as you step through the code. The result will be wrong.

_asm("movl 4(%esp),%eax"); // 0x0000000a
_asm("movb $1,%ah"); // 1 into the upper part of AX
_asm("movl 8(%esp),%ecx"); // 0x0000000b
_asm("divb %cl,%al");
_asm("ret");

Now instead of moving 1 into AH try 4 and see what happens.

Note: After seeing the results it should be obvious that one should zero extend the value into the appropriate upper part (AH, DX or EDX) for all DIV operations.
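The two experiments above can be reproduced without a debugger using a small C model of the byte divide. The function name div8 is hypothetical, and a return value of false stands in for the divide error exception the CPU raises when the divisor is zero or the quotient does not fit in AL.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of divb: AX / divisor -> AL (quotient), AH (remainder).
   Returns false where the CPU would raise a divide error. */
bool div8(uint16_t ax, uint8_t divisor, uint8_t *al_out, uint8_t *ah_out)
{
    if (divisor == 0)
        return false;                 /* division by zero */
    uint16_t quotient = (uint16_t)(ax / divisor);
    if (quotient > 0xFF)
        return false;                 /* quotient does not fit in AL */
    *al_out = (uint8_t)quotient;
    *ah_out = (uint8_t)(ax % divisor);
    return true;
}
```

With AH clear, AX is 0x0007 and div8 gives 2 remainder 1. With 1 in AH, AX is 0x0107 (263) and the result is 87 remainder 2. With 4 in AH, AX is 0x0407 (1031), the quotient would be 343, and the model reports the fault.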

For a 16 bit divide, the dividend is the 32 bit value in DX:AX (not EAX) and the divisor is, say, CX. The result is in two parts: AX will contain the quotient and DX will contain any remainder (modulo).

Try this and watch for the remainder in DX, the lower half of EDX.

unsigned short __declspec(naked) div(unsigned short a, unsigned short b)
{
    _asm("movl 4(%esp),%eax"); // 0x000000a
    _asm("movl 8(%esp),%ecx"); // 0x000000b
    _asm("xorl %edx,%edx");    // 0x00000000
    _asm("divw %cx,%ax");
    _asm("ret");
}

For a 32 bit divide, the dividend is the 64 bit value in EDX:EAX and the divisor is, say, ECX. The result is in two parts: EAX will contain the quotient and EDX will contain any remainder (modulo).

int __declspec(naked) div(unsigned long a, unsigned long b)
{
    _asm("movl 4(%esp),%eax"); // 0x000000a
    _asm("movl 8(%esp),%ecx"); // 0x000000b
    _asm("xorl %edx,%edx");    // 0x00000000
    _asm("divl %ecx,%eax");
    _asm("ret");
}
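To see why EDX must be cleared first, here is a C sketch of the long divide that treats EDX:EAX as one 64 bit dividend. The name div32 is made up for this illustration, and false again stands in for the divide error exception.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of divl: EDX:EAX / divisor -> EAX (quotient), EDX (remainder).
   Returns false where the CPU would raise a divide error. */
bool div32(uint32_t edx, uint32_t eax, uint32_t divisor,
           uint32_t *quotient_out, uint32_t *remainder_out)
{
    if (divisor == 0)
        return false;                             /* division by zero */
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    uint64_t quotient = dividend / divisor;
    if (quotient > UINT32_MAX)
        return false;                             /* quotient does not fit in EAX */
    *quotient_out = (uint32_t)quotient;
    *remainder_out = (uint32_t)(dividend % divisor);
    return true;
}
```

div32(0, 10, 3, ...) gives 3 remainder 1. Any stale value in EDX becomes part of the dividend, and as soon as EDX is greater than or equal to the divisor the quotient can no longer fit, so the model reports the fault.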

All divides take their dividend implicitly from the accumulator (AX, DX:AX or EDX:EAX), so all the above divide instructions can be shortened to the following.

_asm("divl %ecx");
_asm("divw %cx");
_asm("divb %cl");

One can also use a memory operand instead of a register for dividing.

_asm("divb 8(%esp)");
_asm("divw 8(%esp)");
_asm("divl 8(%esp)");

IDIV (signed divide)

In the previous section there is a note about zero extending the appropriate register. When using IDIV (signed divide) one needs to sign extend the value instead.

CBW extends the byte in AL to a word in AX by extending the sign of AL throughout register AH.

CWD extends the sign of the word in register AX throughout register DX, forming a dword quantity in DX:AX.

CDQ extends the signed dword in EAX to a signed quad word in EDX:EAX by extending the high order bit of EAX throughout EDX.
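What these three instructions do to the registers can be written out in C. The functions below are a hypothetical model that works on raw bit patterns; they are not part of LCC-Win32.

```c
#include <stdint.h>

/* Hypothetical model of CBW: returns the new AX for a given AL. */
uint16_t cbw_ax(uint8_t al)
{
    return (al & 0x80u) ? (uint16_t)(0xFF00u | al) : (uint16_t)al;
}

/* Hypothetical model of CWD: returns the new DX for a given AX. */
uint16_t cwd_dx(uint16_t ax)
{
    return (ax & 0x8000u) ? 0xFFFFu : 0x0000u;
}

/* Hypothetical model of CDQ: returns the new EDX for a given EAX. */
uint32_t cdq_edx(uint32_t eax)
{
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0x00000000u;
}
```

For AL = 0xF9 (-7 as a signed byte) cbw_ax returns 0xFFF9, which is still -7 as a signed word. For a positive value such as 7 the upper part is simply filled with zeros, the same as zero extension.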

When describing DIV I used the long form, div %ecx,%eax, but because EAX/AX is implicit this example will use the short form.

char __declspec(naked) idiv(char a, char b)
{
    _asm("movl 4(%esp),%eax");   // load the 32 bit stack slot holding a
    _asm("cbw");                 // sign extend AL into AH
    _asm("idivb 8(%esp)");       // AX / b -> quotient in AL, remainder in AH
    _asm("ret");
}

int main(void)
{
    char a = 7, b = 2;
    printf("%d\n", idiv(a,b));
    return 0;
}
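The example prints 3 for 7/2. For negative values IDIV truncates the quotient toward zero and gives the remainder the sign of the dividend, which matches the C operators / and %. A C sketch of the byte form, with the hypothetical name idiv8 and false standing in for the divide error exception:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of idivb: AX / divisor -> AL (quotient), AH (remainder).
   Returns false where the CPU would raise a divide error. */
bool idiv8(int16_t ax, int8_t divisor, int8_t *al_out, int8_t *ah_out)
{
    if (divisor == 0)
        return false;                      /* division by zero */
    int quotient = ax / divisor;           /* truncates toward zero */
    int remainder = ax % divisor;          /* takes the sign of the dividend */
    if (quotient < -128 || quotient > 127)
        return false;                      /* quotient does not fit in AL */
    *al_out = (int8_t)quotient;
    *ah_out = (int8_t)remainder;
    return true;
}
```

idiv8(7, 2, ...) gives 3 remainder 1 and idiv8(-7, 2, ...) gives -3 remainder -1. A dividend such as 1000 with divisor 2 would need a quotient of 500, so the model reports the fault.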

The division of words and longs is left to the reader as an exercise.
