Structuring Inline Assembler

Started by Charles Pegge, October 03, 2007, 01:13:43 PM


Charles Pegge

Once you are familiar with some of the instructions, inline assembler is actually very easy to write, and you wonder why people bother with high-level languages when there is such a sophisticated CPU at your disposal.

But after 100 or so lines of assembler, the reason becomes apparent: you begin to lose track of the code among all the conditional jumps and the new labels that have to be created. This is the same problem that early forms of BASIC had: a lack of block structure.

There are two ways round this:

One is to embed the inline assembler between BASIC block structures, which means dipping in and out of assembler - sharing integer variables with the host system.

The other is to write the assembler sections with block structure labels and then put the source code through a preprocessing stage to turn these structures into unique labels acceptable to the compiler.

The second approach has the advantage of producing more efficient code, since you can use the CPU registers from start to finish without saving them to variables and handing back to the host BASIC every time you reach the end of a block.

Because the compiler no longer sees the source exactly as you wrote it, the preprocessor can make debugging more difficult, but this is a small price to pay for structured programming where it is needed.

Here is part of the word parser I am using for R$. It scans the text for the next word but ignores comments (which start with ASCII 124, the vertical bar, and are terminated by a line feed).

The block notation supports unlimited nesting. In the assembler source each keyword is prefixed with a backquote (`):

do
   exit
   repeat
   if
   endif
end

BEFORE

   asm
  xor eax,eax
  mov edx,[lsi]
  sub edx,[i]
  mov esi,[i]
  add esi,[p]
  dec esi
  `do
   inc esi
   dec edx
   jl `exit
   mov al,[esi]
   cmp al,32
   jle `repeat
   cmp al,124
   jnz `exit
   `do
    inc esi
    dec edx
    jl `exit
    mov al,[esi]
    cmp al,10
    jnz `repeat
   `end
   jmp `repeat
  `end
  sub esi,[p]
  mov [i],esi
  mov [c],al
end asm



AFTER

  asm
  xor eax,eax
  mov edx,[lsi]
  sub edx,[i]
  mov esi,[i]
  add esi,[p]
  dec esi
  do0:
   inc esi
   dec edx
   jl end0
   mov al,[esi]
   cmp al,32
   jle do0
   cmp al,124
   jnz end0
   do1:
    inc esi
    dec edx
    jl end1
    mov al,[esi]
    cmp al,10
    jnz do1
   end1:
   jmp do0
  end0:
  sub esi,[p]
  mov [i],esi
  mov [c],al
end asm
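
The label substitution itself can be sketched outside R$. The following Python is only an illustration of the idea, not a translation of the R$ preprocessor: the function name expand_blocks and the two stacks are assumptions of this sketch. It mirrors the scheme of one shared counter plus separate pending lists for if-blocks and do-blocks, and it assumes every `exit, `repeat, `end and `endif sits inside a matching open block.

```python
import re

def expand_blocks(lines):
    """Replace backquoted block keywords with unique assembler labels."""
    counter = 0      # shared label counter for both kinds of block
    do_stack = []    # numbers of the open `do blocks, innermost last
    if_stack = []    # numbers of the open `if blocks, innermost last

    def substitute(match):
        nonlocal counter
        word = match.group(1)
        if word == "do":                 # open a loop: emit its start label
            do_stack.append(counter)
            counter += 1
            return f"do{do_stack[-1]}:"
        if word == "if":                 # jump target: the matching endif label
            if_stack.append(counter)
            counter += 1
            return f"endif{if_stack[-1]}"
        if word == "exit":               # jump target: end of innermost loop
            return f"end{do_stack[-1]}"
        if word == "repeat":             # jump target: start of innermost loop
            return f"do{do_stack[-1]}"
        if word == "end":                # close innermost loop: emit end label
            return f"end{do_stack.pop()}:"
        if word == "endif":              # close innermost if: emit endif label
            return f"endif{if_stack.pop()}:"
        return match.group(0)            # unknown keyword: leave untouched

    return [re.sub(r"`(\w+)", substitute, line) for line in lines]
```

Feeding it a loop in which every jump uses `exit or `repeat should give numbered labels such as do0: and end0: for the outer loop and do1: and end1: for the nested one.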


The preprocessor is itself written in R$, so I am using R$ reflexively to develop R$ :)

The original BASIC routine, using the string pointer technique:

' do                                 ' skip spaces
'  if i>=lsi then c=0:exit do        '
'  c=p[i]                            '
'  if c=124 then                     ' skip line comment
'   do
'    i+=1:if i>=lsi then c=0:exit do '
'    if p[i]=10 then c=10:exit do    '
'   loop                             '
'   continue do                      '
'  end if                            '
'  if c>32 then exit do              '
'  i+=1                              '
' loop                               '
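
For readers who do not follow either the BASIC or the assembler, here is the same scan written as a small Python function. It is a sketch under assumptions: the name skip_to_word and the returned (index, character code) pair are inventions of this example, standing in for the variables i and c in the routine above.

```python
def skip_to_word(text, i):
    """Skip spaces and '|' line comments starting at index i.

    Returns (index, code): the index of the next word character and its
    character code, or (index, 0) if the text runs out first.
    """
    n = len(text)
    while True:
        if i >= n:
            return i, 0
        c = ord(text[i])
        if c == 124:                  # '|' opens a comment
            while True:               # skip to the line feed that ends it
                i += 1
                if i >= n:
                    return i, 0
                if ord(text[i]) == 10:
                    break
            continue                  # the line feed is then skipped as white space
        if c > 32:                    # first character of the next word
            return i, c
        i += 1                        # space or control character: keep scanning
```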


And here is the R$ code for the preprocessor so far. It also performs other tasks such as date stamping and program statistics.


|
|   R$  POSTFIX LANGUAGE
|
|   Charles E V Pegge
|
|   Oct 2007

(
13 chr ( 10 chr ) , : crlf
"r$.dat" filein : fi
if ne then
  "r$.dat not accessible " ?
  0 returns
endif


"rr$.bas" fileout : fo
0 : dimsc
0 : funsc
0 : subsc
0 : asmsc
0 : lopsc
0 : linsc

"" : iflis
"" : dolis
0 : csy

| FOR GENERATING STRUCTURED ASSEMBLER LABELS
| if .. endif   do .. exit .. repeat .. end
|------------------------------------------
def block_strucs
     1 + : p
     s p 10 mid read word : w len : lw
     w
     (
      "if" cmp if eq then
       "endif" ( csy str ) , " " , iflis , to iflis
       "endif" ( csy str ) , : sw
       p 1 - : p p lw + 1 + : q
       ref s sw p q insert s ?
       incr csy
      endif
     )
     (
      "do" cmp if eq then
       "end" ( csy str ) , " " , dolis , to dolis
       "do" ( csy str ) , ":" , : sw
       p 1 - : p p lw + 1 + : q
       ref s sw p q insert s ?
       incr csy
      endif
     )
     (
      "endif" cmp if eq then
       p 1 - : p p lw + 1 + : q
       iflis read word ":" , ref s swap p q insert s ?
       iflis " " 1 pos 1 + iflis swap 1000 mid to iflis
      endif
     )
     (
      "end" cmp if eq then
       p 1 - : p p lw + 1 + : q
       dolis read word ":" , ref s swap p q insert s ?
       dolis " " 1 pos 1 + dolis swap 1000 mid to dolis
      endif
     )
     (
      "exit" cmp if eq then
       p 1 - : p p lw + 1 + : q
       dolis read word ref s swap p q insert s ?
      endif
     )
     (
      "repeat" cmp if eq then
       p 1 - : p p lw + 1 + : q
       "do" ( dolis read word 4 16 mid ) ,
       ref s swap p q insert s ?
      endif
     )
end | block_strucs



| MAIN LOOP FOR EACH LINE
"" ?
(
  fi eof if true then exit endif
  fi in : s
  ( incr linsc )

  | STATISTICS TALLY
  (
   read word : w
   ( "function" cmp if eq then incr funsc endif )
   ( "sub" cmp if eq then incr subsc endif )
   ( "loop" cmp if eq then incr lopsc endif )
   (
    "end" cmp if eq then
     word : w1
     (
      "asm" cmp if eq then incr asmsc endif
     )
    endif
   )
   s
   (
    | for block structured assembler
    '`' 1 pos
    if is then
     block_strucs
    endif
   )
   (
    "dim shared " 1 pos
    if true then incr dimsc endif
   )
  )
  "\" 1 pos
  if null then
   s crlf , fo out repeat
  endif


  | INFO PATCH-INS
  s "\preprocessor" 1 pos : p
  if is then
   ref s "Preprocessed with R$ self coding manager" p p 13 + insert s ?
  endif
  s "\date" 1 pos : p
   p if is then
  date : da
  1 2 mid : d
   0 : i
   " 01 Jan 02 Feb 03 Mar 04 Apr 05 May 06 Jun 07 Jul 08 Aug 09 Sep 10 Oct 11 Nov 12 Dec "
   ( d 1 pos 3 + to i )
   i 3 mid : e
   "" ( da 4 2 mid ) , " " , e , " " , ( da 7 4 mid ) , to e
   d
   ref s e p p 5 + insert s ?
  endif
  s "\time" 1 pos : p
  p if is then
   time 1 5 mid : t
   ref s t p p 5 + insert s ?
  endif
  s crlf , fo out
  repeat
)
fi close
fo close
"" ?
" STATS:" ?
" Global Variables: " ( dimsc str ) , ?
" Functions:      : " ( funsc str ) , ?
" subs:           : " ( subsc str ) , ?
" assembler:      : " ( asmsc str ) , ?
" loops:          : " ( lopsc str ) , ?
" lines:          : " ( linsc str ) , ?
|"fbc r$.bas" shell
"done" ?

)

end | main
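
The info patch-ins near the end of the listing replace the markers \preprocessor, \date and \time in a source line. A minimal Python sketch of that step follows. The function name patch_info and the explicit now argument are assumptions of this example, and the output formats (day, month name, year and hh:mm) are read off the R$ code rather than taken from any specification.

```python
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def patch_info(line, now):
    """Replace the \\preprocessor, \\date and \\time markers in one line."""
    date_text = f"{now.day:02d} {MONTHS[now.month - 1]} {now.year}"
    time_text = f"{now.hour:02d}:{now.minute:02d}"
    stamps = {
        "\\preprocessor": "Preprocessed with R$ self coding manager",
        "\\date": date_text,
        "\\time": time_text,
    }
    for marker, text in stamps.items():
        line = line.replace(marker, text)   # lines without a marker pass through
    return line
```

Passing the timestamp in, rather than reading the clock inside the function, keeps the sketch repeatable.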


Charles Pegge


Having tried this system of structured assembler for the past day or so, I can say it has made a dramatic improvement in coding speed and reliability, possibly a fourfold increase in productivity, which compares favourably with a high-level language.

Structured code is used almost universally across the programming community and in many languages, and we take it for granted without appreciating what a difference it makes compared with using line numbers or labels.

For me this has substantially bridged the gap between assembler and BASIC, so my plan is to assemblerise most of the R$ code. Like hand-blending mayonnaise, this is best done gradually, so that my project won't curdle :)


Kent Sarikaya

Reading your work, Charles, I feel like a radio operator picking up signals from an advanced civilization. You and some of the other gurus on these forums are in another league. It is fun trying to understand what I can from you guys' interesting posts!

Charles Pegge

Well, I am not too far away from planet Earth :) Some of my ideas on this might be a little strange, but I always try to do some preliminary research to see if they are feasible.

One thing I would like to try now is to explore the idea of a MetaBasic, to coin a word: extending some of the work I have done so far to see whether some of the items on the PowerBasic wish list can be satisfied using a preprocessor script.

I would start with the easy ones, such as collecting all the structures and prototypes and creating a header file to declare them. This will automate an otherwise tedious chore.
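
As a rough illustration of the prototype half of that chore, the Python sketch below turns each FUNCTION or SUB header line into a DECLARE line. Everything here is an assumption made for the example: the name collect_declarations, the one-header-per-line layout, and the decision to ignore structures, CALLBACK functions and headers continued over several lines.

```python
import re

# A header line starts with FUNCTION or SUB followed by a name.
# "END FUNCTION" and "FUNCTION = value" do not match this pattern.
PROC_HEADER = re.compile(r"^\s*(FUNCTION|SUB)\s+\w+", re.IGNORECASE)

def collect_declarations(lines):
    """Return one DECLARE line for every FUNCTION or SUB header found."""
    return ["DECLARE " + line.strip() for line in lines if PROC_HEADER.match(line)]
```

The resulting lines could then be written to a header file that the main program includes.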

Another task that can be automated arises when combining two programs that both contain GLOBAL variables: extending / mangling / decorating the names to avoid a clash, effectively creating NameSpaces. But the programs may want to share some of their GLOBALS, so do we call these SOLAR variables?
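
To make the name-decoration idea concrete, here is a minimal Python sketch. The function name mangle_globals, the namespace_name pattern and the shared list (standing in for the proposed SOLAR variables) are all hypothetical. It also assumes one variable per GLOBAL statement and exact-case matching, and it makes no attempt to skip string literals or comments.

```python
import re

def mangle_globals(source, namespace, shared=()):
    """Prefix every GLOBAL variable name with a namespace, except shared ones."""
    declared = set(
        m.group(1)
        for m in re.finditer(r"^\s*GLOBAL\s+(\w+)", source,
                             flags=re.IGNORECASE | re.MULTILINE)
    )
    private = declared - set(shared)     # shared names keep their original spelling
    if not private:
        return source
    names = "|".join(re.escape(name) for name in sorted(private))
    pattern = re.compile(r"\b(" + names + r")\b")
    return pattern.sub(lambda m: f"{namespace}_{m.group(1)}", source)
```

Running it over each program with a different namespace would keep the private globals apart while leaving the shared ones visible to both.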



Kent Sarikaya

First your MetaBasic sounds like a great idea!!!

About globals: so there would still be globals, but using the program name would create a namespace for the globals in that program.

Now, you would also like to have globals that are wider in scope than namespace globals. Solar is fine, but Galactic starts with a G and sounds cool: Galactic Globals. So if a namespace global were to be universally available to all other apps, would it use a hidden naming scheme like galactic_namespace_VariableName?

Charles Pegge


Well, I thought we could go up one denomination. Compared with SOLAR, GALACTIC is awfully big, and might be needed for higher levels of integration. FreeBasic supports COMMON SHARED variables, which allow similar integration of common GLOBAL variables, but this is for tying programs together at compile time, sharing variables of the same name between object modules. Since my idea operates at the source code level, I think it is worth making that distinction.

Operating systems, of course, don't like to share their variables directly with anything, so that erratic applications cannot trash the system and invoke the black screen of death.

Kent Sarikaya

Solar is nice in that it perhaps captures the idea better. The Solar Variable is the center (focal point) of the programs (planets) that use it. Its values (solar rays) are available to all the planets.