Let's Talk Assembler

Started by Donald Darden, April 16, 2007, 08:52:27 AM


Charles Pegge

Yes, there are Darwinian forces at work in the world of coding standards. Whatever is best supported and most convenient to use will win in the end.


x86 calling conventions

http://en.wikipedia.org/wiki/X86_calling_conventions

I came across this article the other day, and found it very informative. Of particular interest were the new calling conventions of the 64-bit x86 in Long Mode. Instead of pushing all your parameters onto the stack, the strategy is to take advantage of some of the extra registers and use them instead.

In many instances, it is possible to avoid using the stack altogether and achieve very efficient functions.

From a coding point of view the x86 begins to look like a RISC processor.
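As a rough illustration of the register-first idea, here is a small Python sketch of how integer or pointer arguments are assigned under the two common 64-bit conventions described in that article. The function name place_int_args is made up for this example, and the model ignores floating-point arguments, structures and stack alignment, so treat it as a simplification only.

```python
# Integer/pointer argument registers, in order, for the two common 64-bit ABIs.
SYSV_AMD64 = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]   # Linux and other Unix-like systems
MS_X64 = ["rcx", "rdx", "r8", "r9"]                      # 64-bit Windows


def place_int_args(arg_names, reg_order):
    """Map each argument name to a register, or to 'stack' once registers run out.

    Hypothetical helper: only integer/pointer arguments are modelled.
    """
    placement = {}
    for i, name in enumerate(arg_names):
        placement[name] = reg_order[i] if i < len(reg_order) else "stack"
    return placement


args = ["a", "b", "c", "d", "e"]
sysv = place_int_args(args, SYSV_AMD64)   # all five arguments fit in registers
msx64 = place_int_args(args, MS_X64)      # the fifth argument spills to the stack
print(sysv)
print(msx64)
```

With five integer arguments, the first convention needs no stack slots for them at all, while the second passes four in registers and the fifth on the stack.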



Donald Darden

Huh!  Passing parameters in registers was also how Interrupts were called under DOS on the 8086/8088 architecture.  But as they wanted the Interrupts to do more and more, they moved to the concept of passing input parameters via the stack, or by setting up a buffer or external memory area to be referenced, with the return values put either in AX or in a combination of DX and AX.  Since the stack is another way of using main memory, it is slow compared to relying on registers, but more flexible in that you can put whatever you need on there.  With COM programming, the concept changed yet again in that you pass a pointer to a structure, and the structure can be set up any way that you want.

You watch, because there will be a split between the camp that wants to do everything in a way that works best with 64-bit architecture, and another group that will strive to maintain compatibility with 32-bit architecture as well, for the greater market share, or just to continue with prescribed methodology.

Charles Pegge

Well, Linux was first in with a 64-bit version of their operating system. With a bit of smart compilation, it can't be that difficult to make the transition. Only a few of the original opcodes are reassigned to new instructions, and smart optimisation by the compiler will exploit the new registers.

But to rewrite the operating systems from scratch seems impractical in view of the massive bulk of code involved. We will have to use AI to clean up the knotted tangle, and put the code into a more adaptable form for future architectures.

Charles Pegge

#33
A Framework for Assembler Functions

When doing any substantial amount of assembler, it is essential to have a
regular framework that can be consistently applied and meets the needs of the task.

Here is a simple framework I am planning to use for 3D OpenGL, involving real-time graphics. Often you have to work with three- or four-term vectors, but the functions in most high-level languages only return single values, and it is not always convenient to pass pointers.

In this scheme, parameters and local variables are referenced by the ESI index register, and each function has a set workspace of sixteen 8-byte slots, the bottom four being reserved for parameters shared with the parent function and the top four being reserved for child functions.

The 8-byte slots can be used to accommodate LONG, QUAD, SINGLE or DOUBLE datatypes.

This is how parameters are set up before a call

mov  [esi+&h60], ..
mov  [esi+&h68], ..
mov  [esi+&h70], ..
call fun


The callee function then adds &h60 to ESI and accesses the parameters as [esi+0], [esi+8], [esi+&h10] and so forth.  The locations [esi+&h20] to [esi+&h58] are available for an additional 8 local variables. Above this, there are four slots available for passing down to the next level of functions.

Prior to returning, all that has to be done is deduct &h60 from the ESI register.
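To make the sliding workspace easier to picture, here is a small Python model of it. The class name Workspace, the callee add3 and the idea of handing the result back through the first shared slot are my own assumptions for illustration; only the &h60 step and the slot offsets come from the scheme described above.

```python
FRAME_STEP = 0x60   # amount the callee adds to ESI (12 slots of 8 bytes)


class Workspace:
    """Toy model: memory is a dict keyed by byte address, esi is an integer."""

    def __init__(self, base=0):
        self.mem = {}
        self.esi = base

    def store(self, offset, value):      # mov [esi+offset], value
        self.mem[self.esi + offset] = value

    def load(self, offset):              # mov value, [esi+offset]
        return self.mem[self.esi + offset]

    def enter(self):                     # callee entry: add esi, &h60
        self.esi += FRAME_STEP

    def leave(self):                     # callee exit: sub esi, &h60
        self.esi -= FRAME_STEP


def add3(ws):
    """Hypothetical callee: sums its three parameters.

    Assumption: the result is handed back in the first shared slot.
    """
    ws.enter()
    total = ws.load(0x00) + ws.load(0x08) + ws.load(0x10)
    ws.store(0x00, total)
    ws.leave()


ws = Workspace()
ws.store(0x60, 1.5)      # caller writes parameters into its top four slots
ws.store(0x68, 2.5)
ws.store(0x70, 3.0)
add3(ws)
result = ws.load(0x60)   # caller reads the shared slot back: 7.0
print(result, ws.esi)
```

The caller's offsets &h60 to &h78 and the callee's offsets 0 to &h18 are the same memory, which is why no copying is needed and why ESI must be back at its old value on return.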

For calls outside this scheme, cdecl or stdcall can still be used since we have not committed the EBP register to any specific use, and we have not made any assumptions about the stack frame.

In addition to the ESI and EBP registers,  we can make use of the EBX and EDI registers. I intend to reserve the EBX register  to hold the base address for the application's shared or global variables and the EDI register as the base address of an object if one is being used.

With this scheme there are no absolute addresses so the code can be loaded and executed anywhere.

Because it uses a fixed frame size and minimises the use of the stack, verifying, testing and debugging should be a lot easier.



Donald Darden

#34
I want to get into the x86 instruction set a bit more, but I have to first explore some of the tools available to me.  First, I want to see if I can provide a link to the ASM.HLP file I have zipped and attached to this post below.  It's now in the download section as well if you need to find it easily in the future.

This file really does a lot to help you understand assembler and how to program the x86, but it can be daunting to look at, and I want to discuss one of the commands in some detail to help you understand the way to interpret the Help File.

Okay, that worked as expected.  Unfortunately, I can't control where the ADC bitmap goes; this forum software insists on putting it at the bottom and making it so small that you can't read it at first glance.  But if you click on the image, it will expand, and if you click on it again, it will toggle back to small size.

The ADC instruction was selected because it works with several of the most important flags.  It also is influenced by the current setting of the Carry Flag.  Note that the mnemonic format is ADC dest,src.  This means you use ADC, and you follow it with any valid destination, then a comma, and any valid source.

The rest of the bitmap page attempts to explain what those valid references might be, how the instruction behaves on the different x86 platforms (that's what the 286, 386, and 486 columns are for), the number of clock cycles needed for the instruction (a way of gauging performance), and then the portion of the code that determines the destination, and finally the portion of the code that determines the source.

The use of "reg" refers to one of the named registers.  If it says reg8, then it is an 8-bit register.  If it says reg16, then it is a 16-bit register.

The use of "mem" says that this reads from a memory address.  The "imm" refers to a byte, word, or dword portion of the instruction itself.  For instance, if you wrote an instruction ADC al,3, which means add three, plus the carry, to al, then the "3" would be a constant embedded in the instruction itself, actually taking up one of the 8-bit bytes that form the instruction.  We know it is just 8 bits (a byte), because al itself is just 8 bits, or one byte in size, and in all cases, we match the same size operands.
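For anyone who wants to see what ADC al,3 does with the Carry Flag, here is a short Python sketch. The function names adc8 and encode_adc_al_imm8 are invented for this example; it models only the 8-bit result and the outgoing Carry Flag, not the other flags the instruction affects. The byte pattern shown is the short form of the instruction for the AL register (opcode 14h followed by the immediate byte).

```python
def adc8(dest, src, carry_in):
    """Return (result, carry_out) for an 8-bit ADC dest, src."""
    total = (dest & 0xFF) + (src & 0xFF) + (1 if carry_in else 0)
    return total & 0xFF, total > 0xFF


def encode_adc_al_imm8(imm):
    """Machine code for ADC AL, imm8: opcode 14h, then the immediate byte."""
    return bytes([0x14, imm & 0xFF])


no_carry = adc8(0x10, 3, False)     # (0x13, False)
with_carry = adc8(0x10, 3, True)    # (0x14, False): the carry adds one more
wrapped = adc8(0xFE, 3, False)      # (0x01, True): result overflows 8 bits
code = encode_adc_al_imm8(3)        # two bytes: 14h 03h
print(no_carry, with_carry, wrapped, code.hex())
```

The "3" really is the second byte of the instruction, which is what the help file means by an immediate operand.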

For brevity, you may also see something like "r/m8" which means that either a register or memory location of 8 bits may be specified.  You should be able to understand r16, m8, or m32 at this point as well.

When you write or read assembly code, you will sometimes see something like MOV al,"a".  This again would be a constant, and treated as an immediate part of the instruction, but the assembler would understand that the double quotes mean the character "a", which has an ASCII code of 97 in decimal.

Note that the vast majority of modern computers use binary (two-state) circuitry for the greatest efficiency and reliability, built around what we refer to as "bits".  Two states can be interpreted as any condition where only one of two outcomes is possible, such as true/false, yes/no, go/nogo, 1/0, on/off, plus/minus, set/clear, up/down and so on.  By grouping multiple bits together into groups of four (nybbles), 8 (bytes), 16 (words), 32 (double-words or dwords), and 64 (quads), we can represent quantities as well.  The most common grouping for presentation purposes is in nybbles of 4 bits, which we see as the digits 0 to 9, followed by the letters A to F.  These correspond to a count range from 0 to 15 decimal, and each represents one digit of a binary number shown in Base 16.
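Here is a small Python sketch of that grouping, splitting a value into nybbles and showing each one as a hex digit. The helper names nybbles and to_hex are made up for this illustration.

```python
def nybbles(value, bits):
    """Split an unsigned value of the given width into 4-bit groups, high first."""
    return [(value >> shift) & 0xF for shift in range(bits - 4, -1, -4)]


def to_hex(value, bits):
    """Show each nybble as one of the digits 0-9 or the letters A-F."""
    return "".join("0123456789ABCDEF"[n] for n in nybbles(value, bits))


byte_groups = nybbles(0xAB, 8)    # [10, 11]
letter_a = to_hex(97, 8)          # "61": ASCII "a" as a byte
word_255 = to_hex(255, 16)        # "00FF": the same 255 held in a 16-bit word
print(byte_groups, letter_a, word_255)
```

A byte always comes out as two hex digits, a word as four, and a dword as eight, which is why hex is so convenient for reading memory dumps.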

The x86 assembler probably expects most numbers to be entered in decimal form, rather than hexadecimal, octal, binary, or some other representation.  That is because we humans are usually taught to work with base 10 numbers first, and many of the things we do involve base 10 values.  So if you just enter a number like 100, that is one hundred in decimal.  If you want it to represent 100 in hex, you would trail it with an "h" character, so 100h = 256 decimal.  Now there is an area of possible confusion, where you might have a number like ab, which, since this is hexadecimal (base 16), we would represent as abh.  Now how is abh to be understood to be a number, and not the name of some variable?  The answer is that all numbers must start with a digit 0 to 9, which tells the assembler that this is clearly a number.  So to the assembler, abh would be the name of something, and 0abh would be a hexadecimal number.
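The leading-digit rule is easy to express in code.  Below is a Python sketch of how an assembler might tell a number from a name.  The function classify_token is hypothetical and handles only plain decimal and the trailing-h hex form discussed here; real assemblers accept more suffixes.

```python
def classify_token(token):
    """Return ('number', value) or ('name', token) using the leading-digit rule."""
    if not token or not token[0].isdigit():
        return ("name", token)          # no leading digit: treat as an identifier
    lower = token.lower()
    if lower.endswith("h"):
        return ("number", int(lower[:-1], 16))   # trailing h: hexadecimal
    return ("number", int(lower, 10))            # otherwise plain decimal


decimal_100 = classify_token("100")     # ('number', 100)
hex_100 = classify_token("100h")        # ('number', 256)
a_name = classify_token("abh")          # ('name', 'abh')
hex_ab = classify_token("0abh")         # ('number', 171)
print(decimal_100, hex_100, a_name, hex_ab)
```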

Donald Darden


push ebp
mov ebp, esp
push ebx
push esi
push edi
sub esp, 64h
push offset sub_4010C7
xor esi, esi
push esi
push esi
push esi
push esi
nop

The code above is a small extract from an ASM file that I produced from a small EXE file.  How did I do this?  With a product called IDA Pro, freeware version 4.3, available from the SimTel network.  I downloaded and installed the product, then ran it and gave a path to the Test.exe file I wanted it to process.  When I had the file, I could examine it in depth, seeing how PowerBasic put it together.  But since I was really only interested in analyzing my small piece of it, I performed a search for the first NOP in the file (this was generated by my use of a !NOP mnemonic in the inline assembler when I used PowerBasic to create Test.exe).

But for this example, I wanted you to look at the sequence of instructions above the point where the NOP (No OPeration) instruction occurred.  These instructions are generated by the PowerBasic compiler.  If you had looked at the code in the window under the IDE, you would have seen a label named Start and another line identifying this as the beginning of a Near Proc, or near procedure.  A near procedure is one that is called with only the return offset, without the segment register, placed on the stack, and that returns to that point when a RETN (RETurn Near) instruction is executed later on.  Start in this case corresponds to your PBMAIN or WINMAIN entry point.  We don't see Start here, because the ASM file lacks the naming conventions used in our source file.

Most of these initial instructions involve the stack pointer in some way. By pushing ebp, we put the current contents of ebp onto the stack, thereby freeing it up for other uses.  In this case, we next put the current contents of esp (the stack pointer) in ebp to serve as a fixed reference to the stack, so that despite other changes we might make involving the stack or esp, we can get back to this same point on the stack, using the contents of ebp to help us.

Next we push ebx, esi, and edi onto the stack.  This saves their contents to be restored later as well.  ESI and EDI can now be used for integer REGISTER variables.  The ebx register is the most powerful memory referencing register, and it appears that PowerBasic uses it as well, and is ensuring that its contents come back intact later, in case you end up using it yourself.

The "sub esp, 64h" instruction brings the esp pointer down an additional 100 decimal addresses.  This must correlate to a reserved area on the stack that PowerBasic employs.  Next, the "xor esi, esi" instruction effectively sets the contents of the esi register to zero.  The xor instruction is often shorter and faster than using an instruction like "mov esi, 0".  You often see a lot of xor operations doing this, and you can tell that this is the purpose any time the destination and source are the same, since zero will always be the outcome in that case.  Then you see the four "push esi" instructions, which serve the purpose of putting four 32-bit values of zero onto the stack.  If you look at the normal requirements for defining FUNCTION WINMAIN, and compare it to the simplistic use of FUNCTION PBMAIN, you may find some parity between what you could manually define for WINMAIN, and what PowerBasic's shortcut approach under PBMAIN does for you instead, with certain defaults being supplied by automatic coding.
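To follow where ESP and EBP end up, here is a Python sketch that traces the listed instructions one by one.  The starting ESP value of 1000h and the function name run_prologue are arbitrary choices for the illustration, and the two byte strings at the end are the usual 32-bit encodings of "xor esi, esi" and "mov esi, 0", included only to show the size difference.

```python
def run_prologue(esp_at_entry):
    """Trace ESP/EBP through the prologue; returns (ebp, esp, layout)."""
    esp = esp_at_entry
    layout = {}                      # address -> what is stored there

    def push(label):
        nonlocal esp
        esp -= 4                     # each 32-bit push moves ESP down 4 bytes
        layout[esp] = label

    push("saved ebp")                # push ebp
    ebp = esp                        # mov ebp, esp
    push("saved ebx")                # push ebx
    push("saved esi")                # push esi
    push("saved edi")                # push edi
    esp -= 0x64                      # sub esp, 64h (100 bytes of work area)
    push("offset sub_4010C7")        # push offset sub_4010C7
    for _ in range(4):               # xor esi, esi, then push esi four times
        push(0)
    return ebp, esp, layout


XOR_ESI_ESI = bytes([0x31, 0xF6])          # xor esi, esi: 2 bytes
MOV_ESI_0 = bytes([0xBE, 0, 0, 0, 0])      # mov esi, 0:   5 bytes

ebp, esp, layout = run_prologue(0x1000)
print(hex(ebp), hex(esp), ebp - esp)       # 0xffc 0xf78 132
```

In this trace the saved ebx, esi and edi sit at [ebp-4], [ebp-8] and [ebp-12], with the 100-byte reserved area directly below them.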

And then there is the first NOP instruction.  I put that there, so what follows would be normally what I would be interested in.  If I don't have a second NOP somewhere, then the little bit of program I put under Tricks and Tips would just keep printing out all the code that follows.  That program uses two toggles, one called Flip and the other called Flop, to control what output appears when I examine the ASM file.




Theo Gottwald

#36
Donald --
I tried the Link from your post:

http://members.cox.net/pc_doer/assembler/asm.hlp

it did not work. Even directly on that site:

http://members.cox.net/pc_doer/Assembler/asm.htm

the download was not possible.

I found this page with the most often used commands:
http://www.jegerlehner.ch/intel/

inside this useful PDF:
http://www.jegerlehner.ch/intel/IntelCodeTable.pdf

and finally this Text:
http://members.save-net.com/jko@save-net.com/asm/r_x86.txt

as alternative.

Donald Darden

Hi Theo, thanks for letting me know of the problem.  I put the asm.hlp file in a zip and linked it to my post above, and added it to the download section, which should make things easier for all concerned.  I looked at your link, and it was a really good alternative, so I am going to include it in the download section as well.

I think we are making progress here.  I'm looking forward to more participation by our members, and more members as we gain visitors, and more visitors as we expand our posts.

Donald Darden

#38
When you begin using any language, you need to know what is possible, what is
provided for you, and how to construct valid statements within that language.  Looking at existing code, and reading available help files and other documentation can help you in each of these areas.

The problem is that general or prior knowledge sometimes gets in the way of going from here to there.  You end up making unexpected detours as you suddenly find that an assumption has proved incorrect, and time has to be taken to research the matter further.

Take the matter of numeric expressions for instance.  As I said earlier, many Assemblers assume that a string of digits together represents a decimal number.  This is despite the fact that the computers themselves are essentially binary devices.  But programmers want the added convenience of being able to enter numeric data in other forms as well.

If you read on the subject, you find that it is not uncommon to represent a hexadecimal (base 16) number with a leading digit and a trailing small "h".  If you wanted to represent the hexadecimal quantity of "FFFF", you would write it as 0FFFFh.  The leading zero is necessary, since FFFFh would also be a valid name for something.  Now in fact, most assemblers are case insensitive, so you could write this quantity as 0ffffh, 0FFFFH, or 0ffffH as well.

However, none of these forms are acceptable to the PowerBasic inline assembler.  I could not find an explanation of what the PowerBasic syntax should be, but in looking through the Help file, I found some clear examples.  For the PowerBasic inline assembler, you need to use a leading ampersand (&) symbol, a character to indicate what base you are using ("h" for hexadecimal, "o" for octal, or "b" for binary), then the digits of the number itself.  If you do not know this, then trying to enter numbers in a different base would be pretty hard to do.
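As a quick check on this notation, here is a Python sketch that evaluates literals written with the ampersand prefix.  The function pb_literal is hypothetical and models only the three prefixes named above plus plain decimal; it is not a description of everything the PowerBasic compiler accepts.

```python
def pb_literal(text):
    """Evaluate a decimal or &h / &o / &b prefixed literal to an integer."""
    bases = {"h": 16, "o": 8, "b": 2}
    if text.startswith("&") and len(text) > 2 and text[1].lower() in bases:
        return int(text[2:], bases[text[1].lower()])
    return int(text, 10)


from_hex = pb_literal("&h32")          # 50
from_octal = pb_literal("&o62")        # 50
from_binary = pb_literal("&b110010")   # 50
frame_step = pb_literal("&h60")        # 96
print(from_hex, from_octal, from_binary, frame_step)
```

All three prefixed forms name the same value, fifty, which the assembler stores the same way no matter how you typed it.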

If you enter a command like !MOV eax, &h32, then in all likelihood, when you look at the conversion performed by a disassembler, you would see mov eax, 50 in the resulting assembler file.  The reason is that the disassembler is reconstructing an assembler source file based on the finished EXE file, and there is no direct correlation between your original source file and the resulting ASM file.  That means you have to look at the general form of the instructions, the sequence of instructions, the uniqueness of certain instructions, and have some awareness of the fact that an exact recreation will not be possible.  The biggest loss of information will be the loss of names, labels, and comments that were included in the original source file.  But as also shown, PowerBasic's method of constructing higher level processes in Assembler will represent another great challenge to our understanding.

What we have learned is that PowerBasic first sets EBP from the current ESP value, then moves ESP further down by a range of addresses.  The portion of the stack above EBP holds the passed parameters and other stack entries, and the area of memory between ESP and EBP serves as a temporary work area, suitable for local and static variables.  The parameters that were passed therefore require a positive offset from EBP, and the local and static area requires a negative offset from EBP.  References to both regions are via EBP, which stays fixed for the life of the procedure, while ESP continues to be used for things like CALL and RETN statements, and for any PUSH and POP statements you apply yourself.

Technically then, we should be able to do something like this:

#REGISTER NONE

SUB Test(aa AS STRING)
  LOCAL a, b, c, d AS LONG, bb AS STRING
  bb="Test String Two"
  ! mov a, esp
  ! mov b, ebp
  FOR c = a TO b+5*8 STEP 2
    IF c=a OR c=b THEN ?STRING$(70,"*")
    d = PEEK(LONG, c)
    ? USING$("  ###  ",c-a) d;
    COLOR 14
    IF c=VARPTR(a) THEN ?" a lives here";:GOTO pass
    IF c=VARPTR(b) THEN ?" b lives here";:GOTO pass
    IF c=VARPTR(c) THEN ?" c lives here";:GOTO pass
    IF c=VARPTR(d) THEN ?" d lives here";:GOTO pass
    IF c=STRPTR(aa) THEN ?" aa starts here";:GOTO pass
    IF c=STRPTR(bb) THEN ?" bb starts here";:GOTO pass
    IF d=a THEN ?" VAL a?";
    IF d=b THEN ?" VAL b?";
    IF d=c THEN ?" VAL c?";
    IF d=VARPTR(aa) THEN ?" VARPTR aa?";
    IF d=VARPTR(bb) THEN ?" VARPTR bb?";
    IF d=VARPTR(a) THEN ?" VARPTR a?";
    IF d=VARPTR(b) THEN ?" VARPTR b?";
    IF d=VARPTR(c) THEN ?" VARPTR c?";
    IF d=VARPTR(d) THEN ?" VARPTR d?";
    IF d=STRPTR(aa) THEN ?" STRPTR aa?";
    IF d=STRPTR(bb) THEN ?" STRPTR bb?";
    IF d=LEN(aa) THEN ?" LEN aa?";
    IF d=LEN(bb) THEN ?" LEN bb?";
pass:
    COLOR 15
    ?
  NEXT
END SUB

FUNCTION PBMAIN
  COLOR 15,1
  CLS
  ?"Looking at memory from ESP to EBP, plus EBP to EBP + 20"
  test "Test String 1"
  DO:LOOP
END FUNCTION

Now I guess I would say that this method of setting a mark in memory, where your passed parameters appear above EBP, and your temporary use of memory, including locals and statics, appears between ESP and EBP, has to be considered somewhat inspired.  To discard the temporary area, all PowerBasic needs to do is execute a mov esp, ebp statement, and the locals and statics are just gone.  At the same time, not only does the called procedure have the area set aside for temporary memory and procedures, but this happens each time the procedure is called - even if the procedure calls itself.  This fully supports the concept of recursion, with each call having its own parameters and temporary variables to work with.
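To see why this supports recursion, here is a Python sketch that imitates the stack by hand.  The Stack class, the single 4-byte local, and the callee removing its own parameter on return are assumptions made for the illustration, not a description of what PowerBasic actually emits.

```python
class Stack:
    """Toy 32-bit stack: memory is a dict keyed by address."""

    def __init__(self, top=0x1000):
        self.mem = {}
        self.esp = top
        self.ebp = 0

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


def factorial(stack, n):
    """Recursive factorial: n is a stack parameter, one local lives at [ebp-4]."""
    stack.push(n)                    # caller pushes the parameter
    stack.push("return address")     # call
    stack.push(stack.ebp)            # push ebp
    stack.ebp = stack.esp            # mov ebp, esp
    stack.esp -= 4                   # sub esp, 4 (room for one local)
    frame = stack.ebp
    param = stack.mem[frame + 8]     # parameters sit above EBP
    if param <= 1:
        stack.mem[frame - 4] = 1     # locals sit below EBP
    else:
        stack.mem[frame - 4] = param * factorial(stack, param - 1)
    result = stack.mem[stack.ebp - 4]
    stack.esp = stack.ebp            # mov esp, ebp (locals discarded)
    stack.ebp = stack.pop()          # pop ebp
    stack.pop()                      # ret
    stack.esp += 4                   # assumed: callee removes its parameter
    return result


s = Stack()
answer = factorial(s, 5)             # 120
print(answer, hex(s.esp), s.ebp)
```

Each nested call gets its own EBP and its own [ebp+8] and [ebp-4], and once the outermost call returns, ESP and EBP are back exactly where they started.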