Sort It with Inline Assembler

Started by Charles Pegge, July 07, 2007, 10:32:56 AM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

Charles Pegge

This program shows how to efficiently sort any type of data, using CallBack techniques in conjunction with a Merge Sort function.

You wont have to touch the Sort function to customise the sorting procedure.  All you have to do is customise the callback function CHOOSEWHICH.

This code is suitable for all scales of sorting. It becomes efficient after 16-32 elements and can handle millions of elements if required, with the minimum necessary comparisons:

log2 (n) * n / 2
where n is the number of elements

FREEBASIC 0.20



'=======
' SORTM
'######

'
' Indexer using a MergeSort and Callbacks
'
' First   7 July 2007 (FreeBasic 0.16b)
' Revised 7 May  2009 (FreeBasic 0.20 )
'
' Charles E V Pegge

' COMPILED WITH FREEBASIC ver 0.20




' The sortm function is written in assembler, and based on a MergeSort
' which is one of the most efficient sort algorithms available, requiring
' n/2 * log2(n) comparisons.
'
' For example, A database of 1 meg data elements would require  10 meg comparisons
' whereas a simple pick-one sort would take n*n/2 = 500 gig comparisons.
'
' The sortm function requires twice the workspace of the final index, since it
' shuttles the indices from one buffer to the other during the merge sort process.
'
' This function uses a callback Chooser function to make each comparison, so
' it can be applied to any type of data on any criterion. The output is
' an array of 4 byte integers indexing the array of data elements.

' The index array must first be initialised with a set of indices, one for each
' data element. The merge sort function then rearranges the order of these indices
' in the array.



declare function sortm (byval p as long ptr,byval sz as long, byval cbk as any ptr) as long



'-------------------------
'FOR INDEX AND DATA ARRAYS
'=========================


dim shared ri(2000) as long ' must be twice the number of data elements
dim shared rs(1000) as string ' test sample of random string data
dim as any ptr cb,p
dim sz as long

'---------------
'FOR SAMPLE DATA
'===============

dim as long i,j
dim e as long
dim r as long
dim rst as string

e=531 ' number of data elements

p=varptr(ri(0)) ' pointer to index base

'---------------------------------
'GENERATE RANDOM STRINGS AND INDEX
'=================================

for i=0 to e-1
ri(i)=i ' this is our initial index for each element in the string array.
rst=string$(16," ")
for j= 1 to 16 ' generate a string of 16 random uppercase characters
  r=rnd(1)*25+65
  mid$(rst,j)=chr$(r)
next
rs(i)=rst ' store a random string in each data element in the string array
'print rst
next

sz=e*4 ' size of index block in bytes


'-------------------------------------
' CALLBACK FUNCTION TO MAKE THE CHOICE
'=====================================

' choose between first and second or abort the sorting process
' this is called by the bisort function
'
' Parameters:
' first   index number for data element
' second  index number for data element
'
' Return:
' 1 for first choice 2 for second choice 0 to abort the bisort function
'
'-----------------------------------------------------------------------
function ChooseWhich (byval first as long, byval second as long) as long
'=======================================================================
if rs(first) > rs(second) then function = 2 else function=1
'function=0 ' to abandon the sort
end function



'---------------------------------
'GET POINTER FOR CALLBACK FUNCTION
'=================================

'dim cb as sort_callback
cb=@ChooseWhich




'-------------------
'CALL THE MERGE SORT
'===================


sortm(p,sz,cb)



'--------------------------
' DISPLAY SAMPLES OF RESULT
'==========================

open "t.txt" for output as 1
for i=0 to e-1
print #1, ri(i), rs(ri(i))
next
close 1
print "done"
end






'###########
' MERGE SORT
'###########


'
' requires sz*2 work space
'
' parameters
' 1 p    pointer to index array
' 2 sz   size of index array in bytes ( = number data elements *4 )
' 3 cbk  address of Choosing function for callback

' Return:
' 0
'
'----------------------------------------------------------------------------------
function sortm (byval p as long ptr,byval sz as long, byval cbk as any ptr) as long
'==================================================================================

dim as long blk, lmt, q
dim as long ans,first,second

asm
'========================='
'      MERGE SORT         '
'========================='
' p sz                    ' inputs: data base pointer
' blk lmt bdry            ' local vars: block_size source limit dest limit
                          '
'-------------------------'
  mov dword ptr [blk],4   ' stating block size as long word
  mov ebx,[p]             ' base pointer to data
  add ebx,[sz]            ' calc base of transfer buffer
                          '
'========================='
new_pass:                   ' for each block size
'========================='
                          '
  mov esi,[p]             ' source pointer
  mov eax,esi             ' copy to esi
  add eax,[sz]            ' add entire data length
  mov [lmt],eax           ' save as source boundary
'-------------------------'
  mov [q],ebx          '
'-------------------------'
  mov edi,esi             ' copy 1st pointer to second pointer
  add edi,[blk]           ' add block to offset second pointer
                          '
'========================='
set_limits:               '
'========================='
  mov edx,[blk]           ' load block size
  mov ecx,edx             ' ecx and edx to be used as kimit checks
  add ecx,esi             ' add offset block1
  add edx,edi             ' add offset block2
  mov eax,[lmt]           ' load source boundary
  cmp ecx,eax             ' compare esi block limit with source boundary
  jle okecx               ' skip if okay
  mov ecx,eax             ' clip ecx to source boundary
okecx:                    '
  cmp edx,eax             ' compare edi block limit with source boundary
  jle okedx               ' skip if okay
  mov edx,eax             ' clip edx to source boundary
okedx:                    '
'========================='
block_merging:            ' loop
'========================='
  cmp esi,ecx             ' check limit for esi
  jl ok1                  ' okay procede to check edx
'-------------------------'
tran2:                    ' otherwise copy the over remainder of edx block
  cmp edi,edx             ' any left?
  jge next_block_pair     ' if not then next block pair to compare
  mov eax,[edi]           ' load second data for transfer
  mov [ebx],eax           ' store indexer word
  add edi,4               ' add stride o source
  add ebx,4               ' add stride to dest
  jmp tran2               ' repeat if any left to transfer
'-------------------------'
ok1:                      '
'-------------------------'
  cmp edi,edx             '
  jl ok2                  ' then proced to compare
tran1:                    ' otherwise transfer remainder of 1st
  cmp esi,ecx             '
  jge next_block_pair     '
  mov eax,[esi]           ' load second data for transfer
  mov [ebx],eax           ' store indexer word
  add esi,4               ' add stride o source
  add ebx,4               ' add stride to dest
  jmp tran1               ' repeat
'-------------------------'
ok2:                      ' ready to do comparison
'*************************'
  'mov eax,[esi]          ' load first data
  'cmp eax,[edi]          ' compare with second data
'-------------------------'
                          ' The Callback method:
  mov eax,[esi]           ' get index in [esi]
  mov [first],eax         ' save in First
  mov eax,[edi]           ' get index in [edi]
  mov [second],eax        ' save in Second
  push ecx                ' save limit reg ecx
  push edx                ' save limit reg edx
'end asm                   ' other registers will be preserved
'cbk.cp(first,second)      ' make the callback
'asm                       ' reenter assembler
  push [second]
  push [first]
  call [cbk]
' mov [reax],eax          ' diagnostic
  pop edx                 ' recover edx
  pop ecx                 ' recover ecx
                          ' other registers were preserved except eax
                          ' result expected in eax
  cmp eax,0               ' is it zero ?
  jz xit                  ' then terminate immediately
  cmp eax,1               ' is it the first choice?
  jz chosen1              ' first is chosen so skip
'*************************'
  mov eax,[edi]           ' load second data for transfer
  mov [ebx],eax           ' store indexer word
  add edi,4               ' add stride o source
  add ebx,4               ' add stride to dest
  jmp block_merging       ' continue sort
'-------------------------'
chosen1:                   '
'-------------------------'
  mov eax,[esi]           ' load second data for transfer
  mov [ebx],eax           ' store indexer word
  add esi,4               ' add stride to source
  add ebx,4               ' add stride to dest
  jmp block_merging       ' continue sort
                          '
'========================='
                          '
next_block_pair:          '
'-------------------------'
  mov eax,[blk]
  add esi,eax             ' add for next block comparison
  add edi,eax             '
  cmp esi,[lmt]           ' check against data boundary
  jl set_limits           ' continue if less
'========================='
next_pass:                '
'-------------------------'
  shl dword ptr [blk],1   ' double block size
  mov ebx,[q]             ' restore ebx base value
  xchg [p],ebx            ' swap source and dest pointers
  mov eax,[blk]           '
  cmp eax,[sz]            ' check block size against size of data
  jl new_pass             ' repeat with larger blocks
'========================='
buffer_tran:              '
                          ' data is now held at p (due to xchg)
                          ' move data back to base
  mov edx,[p]             '
  cmp edx,ebx             ' is p less than ebx?
  jl xit                  ' then no need to transfer
'-------------------------'
  mov ecx,[sz]            ' use ecx as a down counter
'-------------------------'
bt_loop:                  ' loop
'-------------------------'
  mov eax,[edx]           ' get source
  mov [ebx],eax           ' move to dest
  add edx,4               ' inc source pointer
  add ebx,4               ' inc dest pointer
  sub ecx,4               ' decrement down counter
  jg bt_loop              ' continue loop till zero
'========================='
  mov eax,[sz]            ' get data size
  sub [p],eax             ' set p to its original value
  mov eax,0               ' return 0 for okay
  jmp xit                 ' finish
'========================='
serrors:                  '
                          '
'========================='
xit:                      '
mov [function],eax      '
'========================='
end asm
end function



Charles Pegge



POWERBASIC

This bisort function required:
#REGISTER NONE
to avoid conflicts between the assembler code and the BASIC compilation.



#COMPILE EXE
#DIM ALL


' SORTIT
' Indexer using a MergeSort and Callbacks
' 7 July 2007
' Charles E V Pegge

' The bisort function is written in assembler, and based on a MergeSort
' which is one of the most efficient sort algorithms available, requiring
' n/2 * log2(n) comparisons.
'
' For example, A database of 1 meg data elements would require  10 meg comparisons
' whereas a simple pick-one sort would take n*n/2 = 500 gig comparisons.
'
' The bisort function requires twice the workspace of the final index, since it
' shuttles the indices from on buffer to the other during the merge sort process.
'
' This function uses a callback Chooser function to make each comparison, so
' it can be applied to any type of data on any criterion. The output is
' an array of 4 byte integers indexing the array of data elements.

' The index array must first be initialised with a set of indices, one for each
' data element. The bisort function then rearranges the order of these indice
' in the array.

' USING POWERBASIC ver 8.x



' test bed for bisort

'FOR CALLBACK CHOOSER FUNCTION
'declare function chooser_callback(byval a as long, byval b as long ) as long

'FOR INDEX AND DATA ARRAYS


' CALLBACK TO MAKE THE CHOICE
' choose between first and second or abort the sorting process
' this is called by the bisort function
'
' Parameters:
' first   index number for data element
' second  index number for data element
'
' Return:
' 1 for first choice 2 for second choice 0 to abort the bisort function
'

GLOBAL rs() AS STRING
GLOBAL ri() AS LONG

FUNCTION ChooseWhich (BYVAL first AS LONG, BYVAL second AS LONG) AS LONG
'
IF rs(first) > rs(second) THEN FUNCTION = 2 ELSE FUNCTION=1
'function=0 ' to abandon the sort
END FUNCTION

' MERGE SORT
' requires sz*2 work space
'
' parameters
' 1 p    pointer to index array
' 2 sz   size of index array in bytes ( = data elements *4 )
' 3 cbk  address of Choosing function for callback

' Return:
' 0
'
FUNCTION bisort (BYVAL p AS LONG PTR,BYVAL sz AS LONG, BYVAL cbk AS LONG) AS LONG

#REGISTER NONE

DIM blk AS LONG, lmt AS LONG, q AS LONG
DIM ans AS LONG, first AS LONG, second AS LONG

'asm
'========================='
'      BINARY SORT        '
'========================='
' p sz                    ' inputs: data base pointer
' blk lmt                 ' local vars: block_size source limit dest limit
                          '
'-------------------------'
! mov dword ptr blk,4     ' stating block size as long word
! mov ebx,p               ' base pointer to data
! add ebx,sz              ' calc base of transfer buffer
                          '
'========================='
new_pass:                   ' for each block size
'========================='
                          '
! mov esi,p               ' source pointer
! mov eax,esi             ' copy to esi
! add eax,sz              ' add entire data length
! mov lmt,eax             ' save as source boundary
'-------------------------'
! mov q,ebx               '
'-------------------------'
! mov edi,esi             ' copy 1st pointer to second pointer
! add edi,blk             ' add block to offset second pointer
                          '
'========================='
set_limits:               '
'========================='
! mov edx,blk             ' load block size
! mov ecx,edx             ' ecx and edx to be used as kimit checks
! add ecx,esi             ' add offset block1
! add edx,edi             ' add offset block2
! mov eax,lmt             ' load source boundary
! cmp ecx,eax             ' compare esi block limit with source boundary
! jle okecx               ' skip if okay
! mov ecx,eax             ' clip ecx to source boundary
okecx:                    '
! cmp edx,eax             ' compare edi block limit with source boundary
! jle okedx               ' skip if okay
! mov edx,eax             ' clip edx to source boundary
okedx:                    '
'========================='
block_merging:            ' loop
'========================='
! cmp esi,ecx             ' check limit for esi
! jl ok1                  ' okay procede to check edx
'-------------------------'
tran2:                    ' otherwise copy the over remainder of edx block
! cmp edi,edx             ' any left?
! jge next_block_pair     ' if not then next block pair to compare
! mov eax,[edi]           ' load second data for transfer
! mov [ebx],eax           ' store indexer word
! add edi,4               ' add stride o source
! add ebx,4               ' add stride to dest
! jmp tran2               ' repeat if any left to transfer
'-------------------------'
ok1:                      '
'-------------------------'
! cmp edi,edx             '
! jl ok2                  ' then proced to compare
tran1:                    ' otherwise transfer remainder of 1st
! cmp esi,ecx             '
! jge next_block_pair     '
! mov eax,[esi]           ' load second data for transfer
! mov [ebx],eax           ' store indexer word
! add esi,4               ' add stride o source
! add ebx,4               ' add stride to dest
! jmp tran1               ' repeat
'-------------------------'
ok2:                      ' ready to do comparison
'*************************'
  'mov eax,[esi]          ' load first data
  'cmp eax,[edi]          ' compare with second data
'-------------------------'
                          ' The Callback method:
! mov eax,[esi]           ' get index in [esi]
! mov first,eax           ' save in First
! mov eax,[edi]           ' get index in [edi]
! mov second,eax          ' save in Second
! push ecx                ' save limit reg ecx
! push edx                ' save limit reg edx
'end asm                  ' other registers will be preserved
CALL DWORD cbk USING ChooseWhich(first,second)      ' make the callback
'asm                      ' reenter assembler
'! mov reax,eax           ' diagnostic
! pop edx                 ' recover edx
! pop ecx                 ' recover ecx
                          ' other registers were preserved except eax
                          ' result expected in eax
! cmp eax,0               ' is it zero ?
! jz xit                  ' then terminate immediately
! cmp eax,1               ' is it the first choice?
! jz chosen1              ' first is chosen so skip
'*************************'
! mov eax,[edi]           ' load second data for transfer
! mov [ebx],eax           ' store indexer word
! add edi,4               ' add stride o source
! add ebx,4               ' add stride to dest
! jmp block_merging       ' continue sort
'-------------------------'
chosen1:                   '
'-------------------------'
! mov eax,[esi]           ' load second data for transfer
! mov [ebx],eax           ' store indexer word
! add esi,4               ' add stride to source
! add ebx,4               ' add stride to dest
! jmp block_merging       ' continue sort
                          '
'========================='
                          '
next_block_pair:          '
'-------------------------'
! mov eax,blk
! add esi,eax             ' add for next block comparison
! add edi,eax             '
! cmp esi,lmt             ' check against data boundary
! jl set_limits           ' continue if less
'========================='
next_pass:                '
'-------------------------'
! shl dword ptr blk,1     ' double block size
! mov ebx,q               ' restore ebx base value
! xchg p,ebx              ' swap source and dest pointers
! mov eax,blk             '
! cmp eax,sz              ' check block size against size of data
! jl new_pass             ' repeat with larger blocks
'========================='
buffer_tran:              '
                          ' data is now held at p (due to xchg)
                          ' move data back to base
! mov edx,p               '
! cmp edx,ebx             ' is p less than ebx?
! jl xit                  ' then no need to transfer
'-------------------------'
! mov ecx,sz              ' use ecx as a down counter
'-------------------------'
bt_loop:                  ' loop
'-------------------------'
! mov eax,[edx]           ' get source
! mov [ebx],eax           ' move to dest
! add edx,4               ' inc source pointer
! add ebx,4               ' inc dest pointer
! sub ecx,4               ' decrement down counter
! jg bt_loop              ' continue loop till zero
'========================='
! mov eax,sz              ' get data size
! sub p,eax               ' set p to its original value
! mov eax,0               ' return 0 for okay
! jmp xit                 ' finish
'========================='
serrors:                  '
                          '
'========================='
xit:                      '
! mov function,eax        '
'========================='
'end asm

END FUNCTION


FUNCTION PBMAIN () AS LONG

DIM ri(2000) AS GLOBAL LONG ' must be twice the number of data elements
DIM rs(1000) AS GLOBAL STRING ' test sample of random string data


DIM p AS LONG PTR
DIM sz AS LONG

'GET POINTER FOR CHOOSER FUNCTION

DIM cb AS LONG
cb=CODEPTR(ChooseWhich)



'FOR EXAMPLE DATA
DIM i AS LONG, j AS LONG
DIM e AS LONG
DIM r AS LONG
DIM rst AS STRING

'FOR DIAGNOSTICS'
GLOBAL reax AS LONG, rebx AS LONG, recx AS LONG, redx AS LONG, resp AS LONG, rebp AS LONG, resi AS LONG, redi AS LONG

e=532 ' number of data elements

p=VARPTR(ri(0))

'GENERATE RANDOM STRINGS AND INDEX
FOR i=0 TO e-1
ri(i)=i ' this is our initial index for each element in the string array.
rst=STRING$(16," ")
FOR j= 1 TO 16 ' generate a string of 16 random uppercase characters
  r=RND(1)*25+65
  MID$(rst,j)=CHR$(r)
NEXT
rs(i)=rst ' store a random string in each data element in the string array
'print rst
NEXT

sz=e*4 ' sizr of index block in bytes


bisort(p,sz,cb)

DIM s AS STRING
DIM k AS LONG
s=SPACE$(2000): k=1

' DISPLAY SAMPLES OF RESULT
FOR i=0 TO 15
'print ri(i)
'print ri(i)
MID$(s,k)=rs(ri(i))+$CR:k=k+18
NEXT
IF e>31 THEN
MID$(s,k)="----------------"+$CR:k=k+18
FOR i=e-16 TO e-1
  'print ri(i)
  'print rs(ri(i))
  MID$(s,k)=rs(ri(i))+$CR:k=k+18
NEXT
END IF

'DIAGNOSTICS
'print "eax  ";hex$(reax)
'print "ecx  ";hex$(recx)
'print "edx  ";hex$(redx)
'print "ebx  ";hex$(rebx)
'print "esp  ";hex$(resp)
'print "ebp  ";hex$(rebp)
'print "esi  ";hex$(resi)
'print "edi  ";hex$(redi)

MSGBOX LEFT$(s,k-1)
   

END FUNCTION