O2 rocks! :) and feature request

Ed Davis

First, let me just say that I am so impressed with Oxygen Basic.  I just finished porting a simple interpreter I wrote a long time ago to O2.  And I kept finding more and more things I liked about O2, and cool things that make a programmer's life easier.

And you did this all by yourself?  I am very jealous! As one who has written many simple (e.g., nowhere near the scope of O2) byte code and pure interpreters, and has done quite a bit of assembly programming, I can appreciate the complexity of a full-blown compiler that generates machine code, and does it in the blink of an eye to boot!  Again, you have done an amazing job!  Definitely not something that most of us could do no matter how many years we tried!
It is just a shame that O2 isn't better known, but hopefully it will continue to gain users.

Any plans to allow select to be used on strings?

Also, both SQL and Just Basic have an interesting select variation that would be cool to have:

select case
  case expression: statement
  case expression: statement
end select

So the select does not have an expression.  It is really just an if-elseif-else-endif chain, but in some cases it is convenient.
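As a rough illustration of that equivalence (in Python rather than O2, and with a made-up `classify` function), an expression-less select behaves like a chain in which the first true condition wins:

```python
def classify(n):
    """Expression-less select: the first case whose condition is true wins."""
    if n < 0:        # case n < 0
        return "negative"
    elif n == 0:     # case n = 0
        return "zero"
    elif n < 10:     # case n < 10
        return "small"
    else:            # case else
        return "large"

print(classify(5))   # prints "small"
```

Each case carries its own full condition, so the cases can test unrelated variables, which a select on a single expression cannot do.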

Thanks again for creating OxygenBasic, and for continuing to enhance and support it!

Theo Gottwald

#1
Let me add that PowerBASIC has SELECT CASE AS LONG and SELECT CASE AS CONST$ to speed things up.
It should be easy to check how it's done internally and do something similar for O2.
I often use this in my scripts.

Mistral AI thinks this is easy to do.

[Attached screenshots: 2024-02-28 14_07_35-Le Chat Mistral - Brave.png, 2024-02-28 14_09_10-Le Chat Mistral - Brave.png]


GPT-4 would code it similarly:
; Assume EAX holds the value to be compared
; Constants are hypothetical and should be replaced with actual values

; Compare EAX with the first constant
cmp eax, CONST1
je .CaseConst1 ; Jump if EAX is equal to CONST1
cmp eax, CONST2
je .CaseConst2 ; Jump if EAX is equal to CONST2
cmp eax, CONST3
je .CaseConst3 ; Jump if EAX is equal to CONST3
; More comparisons as needed
jmp .DefaultCase ; If none of the above, jump to the default case

.CaseConst1:
; Code for the first case
jmp .EndSelect ; Jump to the end to avoid falling through

.CaseConst2:
; Code for the second case
jmp .EndSelect ; Jump to the end to avoid falling through

.CaseConst3:
; Code for the third case
jmp .EndSelect ; Jump to the end to avoid falling through
; More case labels as needed

.DefaultCase:
; Code for the default case

.EndSelect:
; Continuation of the program after the select case

Quote:
Translating a Select Case As Const statement involving strings from PowerBASIC to x86 assembly is significantly more complex due to the need for string comparison rather than simple integer comparison. In assembly, strings are compared byte-by-byte using loops, and there's no direct equivalent to high-level constructs like Select Case for strings. However, you can implement a series of string comparison routines and conditional jumps to achieve a similar decision-making process.

Here's a conceptual approach to handling string comparison in x86 assembly, which involves defining the strings, comparing them to the input, and branching based on the comparison results. This example assumes use of the NASM assembler syntax and that the strings are null-terminated.

Step 1: Define Strings
First, define the constant strings you want to compare against, and ensure your input string is accessible:

section .data
    inputString db 'YourInputString', 0 ; Null-terminated input string
    constString1 db 'Option1', 0
    constString2 db 'Option2', 0
    ; Add more strings as needed

section .text
    global _start

Step 2: Write a String Comparison Function
Next, implement a basic string comparison function. This function compares two null-terminated strings and sets CF (the carry flag) if they are equal, clearing it otherwise:

; Assumes pointers to the two strings are in ESI (string1) and EDI (string2)
strcmp:
    .nextChar:
        mov al, [esi]      ; Load the next byte of string1 into AL
        mov ah, [edi]      ; Load the next byte of string2 into AH
        inc esi            ; Move to the next byte of string1
        inc edi            ; Move to the next byte of string2
        cmp al, ah         ; Compare the bytes
        jne .notEqual      ; Jump if not equal
        test al, al        ; Check if we've reached the end of the string (null terminator)
        jz .equal          ; If zero flag is set, strings are equal
        jmp .nextChar      ; Otherwise, continue comparing
    .notEqual:
        clc                ; Clear carry flag to indicate strings are not equal
        ret
    .equal:
        stc                ; Set carry flag to indicate strings are equal
        ret

Step 3: Compare Input String Against Constants and Branch
Finally, use the comparison function to compare your input string against each constant string and branch accordingly:

_start:
    ; Compare inputString with constString1
    lea esi, [inputString]
    lea edi, [constString1]
    call strcmp
    jc .handleOption1 ; If strings are equal, jump to handleOption1

    ; Compare inputString with constString2
    lea esi, [inputString]
    lea edi, [constString2]
    call strcmp
    jc .handleOption2 ; If strings are equal, jump to handleOption2

    ; Add more comparisons as needed

    ; If no match found, handle default case
    jmp .handleDefault

.handleOption1:
    ; Code to handle option 1
    jmp .endSelect

.handleOption2:
    ; Code to handle option 2
    jmp .endSelect

.handleDefault:
    ; Default case code

.endSelect:
    ; Continue execution

This assembly code demonstrates a simplified approach to implementing a string-based Select Case decision structure in x86 assembly. Real-world usage would require careful consideration of string storage, memory management, and calling conventions, especially when interfacing with high-level languages or operating system APIs.
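
The same logic can be sketched at a higher level. The Python below is only an illustration of the approach in the quoted assembly (a byte-by-byte compare of null-terminated strings, tried against each case in turn); the names `c_strings_equal` and `select_string` are made up for this sketch, and it assumes every input ends in a zero byte.

```python
def c_strings_equal(a, b):
    """Compare two null-terminated byte strings one byte at a time."""
    i = 0
    while True:
        if a[i] != b[i]:
            return False      # mismatch, including one string ending early
        if a[i] == 0:
            return True       # both reached the terminator together
        i += 1

def select_string(value, cases):
    """Return the index of the first matching case, or -1 for the default."""
    for index, candidate in enumerate(cases):
        if c_strings_equal(value, candidate):
            return index
    return -1

cases = [b"Option1\0", b"Option2\0"]
print(select_string(b"Option2\0", cases))   # prints 1
```

The cost grows with the number of cases, since each one needs its own comparison; that is the main difference from an integer select, which can index a jump table directly.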

For x64 ASM, GPT-4 is significantly slower, but again the code looks more usable.

Quote:
Translating the concept of string comparison and branching based on those comparisons to x86-64 (also known as AMD64) assembly involves considerations similar to those in x86, with adjustments for the 64-bit environment. These adjustments include using 64-bit registers and potentially different calling conventions depending on the operating system.

Here's a conceptual approach to handling string comparison in x86-64 assembly, again assuming the use of NASM syntax. The main differences from x86 to x86-64 in this context are the register sizes (e.g., using RSI instead of ESI) and the fact that we might adhere to a specific calling convention (like the System V AMD64 ABI used on Unix-like systems, including Linux and macOS, which affects how parameters are passed to functions).

### Step 1: Define Strings
```assembly
section .data
    inputString db 'YourInputString', 0  ; Null-terminated input string
    constString1 db 'Option1', 0
    constString2 db 'Option2', 0
    ; Add more strings as needed
```

### Step 2: Write a String Comparison Function

```assembly
; String comparison function (strcmp), comparing RSI (string1) and RDI (string2)
strcmp:
    .nextChar:
        mov al, [rsi]    ; Load the next byte of string1 into AL
        mov ah, [rdi]    ; Load the next byte of string2 into AH
        inc rsi          ; Move to the next byte of string1
        inc rdi          ; Move to the next byte of string2
        cmp al, ah       ; Compare the bytes
        jne .notEqual    ; Jump if not equal
        test al, al      ; Check if we've reached the end of the string (null terminator)
        jz .equal        ; If zero flag is set, strings are equal
        jmp .nextChar    ; Otherwise, continue comparing
    .notEqual:
        mov rax, 0       ; Return 0 in RAX to indicate strings are not equal
        ret
    .equal:
        mov rax, 1       ; Return 1 in RAX to indicate strings are equal
        ret
```

### Step 3: Compare Input String Against Constants and Branch

```assembly
section .text
    global _start

_start:
    ; Compare inputString with constString1
    lea rsi, [inputString]
    lea rdi, [constString1]
    call strcmp
    cmp rax, 1
    je .handleOption1 ; If strings are equal, jump to handleOption1

    ; Compare inputString with constString2
    lea rsi, [inputString]
    lea rdi, [constString2]
    call strcmp
    cmp rax, 1
    je .handleOption2 ; If strings are equal, jump to handleOption2

    ; Add more comparisons as needed

    ; If no match found, handle default case
    jmp .handleDefault

.handleOption1:
    ; Code to handle option 1
    jmp .endSelect

.handleOption2:
    ; Code to handle option 2
    jmp .endSelect

.handleDefault:
    ; Default case code

.endSelect:
    ; Exit or continue execution...
    ; On Linux, for example, to exit cleanly:
    mov rax, 60       ; syscall: exit
    xor rdi, rdi      ; status: 0
    syscall
```

This example follows a similar structure to the x86 version but uses 64-bit registers and the calling conventions appropriate for x86-64. Note that for Windows, the calling convention and system call mechanism are different, so adjustments would be needed for Windows-specific assembly programming.

Remember, assembly language programming requires a good understanding of the specific architecture and operating system you're targeting, as conventions can vary significantly.

Now something funny. I say this:
Quote:
Ask 5 virtual Assembler optimization experts and let each of them suggest optimizations of that code, then rewrite the code using the best suggestions

Quote:
Creating an optimized version of the x86-64 assembly code for string comparison and branching based on those comparisons involves considering various aspects such as efficiency, readability, and adherence to conventions. Let's imagine consulting with five virtual assembler optimization experts and compile their top suggestions into an improved version of the code.

### Expert 1: Minimize Branching

- **Suggestion**: Reduce the number of jump instructions by using more efficient comparison logic, which can decrease the instruction path length and improve branch prediction on modern CPUs.

### Expert 2: Loop Unrolling

- **Suggestion**: Apply loop unrolling in the string comparison to minimize the loop overhead for strings of certain lengths, enhancing the comparison speed for short strings.

### Expert 3: Use SIMD Instructions

- **Suggestion**: For longer strings, consider using SIMD (Single Instruction, Multiple Data) instructions to compare multiple bytes at once, significantly speeding up the comparison process.

### Expert 4: Function Inlining

- **Suggestion**: Inline the string comparison logic for cases where high performance is critical, eliminating the function call overhead. This is more applicable when the comparison logic is used frequently with short strings.

### Expert 5: Register Usage Optimization

- **Suggestion**: Optimize register usage to avoid unnecessary data movements, ensuring that the most frequently used data is kept in registers throughout the comparison loop.

### Optimized Code Incorporating Suggestions

Given the complexity and varying applicability of these suggestions (e.g., SIMD might be overkill for very short strings and loop unrolling needs to be done carefully to actually provide benefits), a balanced approach that incorporates minimizing branching and optimizing register usage would be most universally beneficial. Here's an attempt to incorporate these improvements:

```assembly
section .data
    inputString db 'YourInputString', 0
    constString1 db 'Option1', 0
    constString2 db 'Option2', 0

section .text
    global _start

; Simplified strcmp to minimize branching
; Assumes RSI = string1, RDI = string2; clobbers RAX and RCX
strcmp_minimal_branching:
    .compareLoop:
        movzx eax, byte [rsi] ; Load next byte of string1
        movzx ecx, byte [rdi] ; Load next byte of string2
        cmp al, cl            ; Compare the two bytes first
        jne .notEqual         ; Any mismatch (including differing lengths) ends the loop
        inc rsi
        inc rdi
        test al, al           ; A matching null terminator means both strings ended
        jnz .compareLoop      ; Otherwise keep comparing
    .equal:
        mov eax, 1            ; Return 1 if strings are equal
        ret
    .notEqual:
        xor eax, eax          ; Return 0 if strings are not equal
        ret

_start:
    ; Inline comparison logic for short, critical paths could be considered here
    ; Compare inputString with constString1
    lea rsi, [inputString]
    lea rdi, [constString1]
    call strcmp_minimal_branching
    test rax, rax
    jnz .handleOption1    ; Nonzero result means the strings matched

    ; Other comparisons follow similarly...

    jmp .handleDefault    ; No match: go to the default case

.handleOption1:
    ; Handle case 1
    jmp .endSelect

.handleOption2:
    ; Handle case 2
    jmp .endSelect

.handleDefault:
    ; Default case code

.endSelect:
    ; Exit sequence...
```

This version aims to reduce branching within the string comparison loop and makes a minor optimization to register usage. It's a compromise that seeks to improve performance without the complexity of more advanced techniques like SIMD or loop unrolling, which are highly context-dependent and may not always yield performance gains for small or variable-length strings.


Charles Pegge

#2
Many thanks Ed, for your Basic interpreter. May I include it alongside your 'Toy Interpreter'?

I am looking into the feasibility of the select syntax you suggest. The logic is very dense around cases, but it should be a fairly simple redirect to if-expression processing by means of a nestable flag.

Thanks Theo, Mistral gives very detailed advice! But we can already do jump tables quite easily, without requiring the select/case structure. They are very useful for the fast interpretation of tokens. The address of any label (including subroutines) can be expressed as a value with the '@' prefix. It goes something like this:


dim sys JumpTable[0x100]
dim int i

JumpTable[11]={ @labelA, @labelB, @labelC }

...
goto JumpTable[i]

labelA: '11
...
labelB: '12
...
labelC: '13
...

You can use gosub as well as goto.
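
For readers who don't know O2, the idea can be sketched in Python, with function references standing in for label addresses. This is only an analogy, not O2 semantics: the handler names and `dispatch` are invented here, and it assumes, as the '11, '12, '13 comments suggest, that the three entries are stored from index 11 upward.

```python
def label_a():
    return "A"   # stands in for the code at labelA

def label_b():
    return "B"   # stands in for the code at labelB

def label_c():
    return "C"   # stands in for the code at labelC

# 0x100 slots, like `dim sys JumpTable[0x100]`; unused slots stay empty.
jump_table = [None] * 0x100
jump_table[11:14] = [label_a, label_b, label_c]

def dispatch(i):
    """Look up the handler for token i and call it (the `goto JumpTable[i]` step)."""
    handler = jump_table[i]
    if handler is None:
        raise ValueError("no handler for token %d" % i)
    return handler()

print(dispatch(12))   # prints "B"
```

The lookup is a single indexed read whatever the number of cases, which is why this suits token dispatch in an interpreter.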

Ed Davis

Quote from: Charles Pegge on February 28, 2024, 05:34:17 PM
Many thanks Ed, for your Basic interpreter. May I include it alongside your 'Toy Interpreter'?

Of course!  And thank you!  Note: I've updated the source in the original message, so be sure and get that one.

Quote:
I am looking into the feasibility of the select syntax you suggest. The logic is very dense around cases, but it should be a fairly simple redirect to if-expression processing by means of a nestable flag.

What about select on strings?  Or is the jump table below for that?  If so, I can't figure out how to use it with strings :(

Quote:
Thanks Theo, Mistral gives very detailed advice!  But we can already do jump tables quite easily, but not requiring the select/case structure. Very useful for the fast interpretation of tokens. The address of any label (including subroutines) can be expressed as a value with the '@' prefix: It goes something like this:


dim sys JumpTable[0x100]
dim int i

JumpTable[11]={ @labelA, @labelB, @labelC }

...
goto JumpTable[i]

labelA: '11
...
labelB: '12
...
labelC: '13
...

You can also use gosub as well as goto

Zlatko Vid

Hey Ed

you should try this method:

'string sele
bstring s = "tesT123"
'int p As bstring ptr

int *p = strptr s
'byte p  at strptr(s) 'pointer select
'sys *p = strptr (s)

      Select p
         Case "tesT"
           print "wrong?"
         Case "tesT123"
           print "right!"
      End Select

Zlatko Vid

oh no
this one should be better

'selection with INT
'string casting -> select/filtering

string s="SELECT_STRING_WITH INTEGER" ' more than 4 chars
'(i AS string pointer)
INT i at strptr s  ' integer pointer cast string s

select i
    case "SELECT_STRING_WITH INTEGER"
        print "OK..this WORK!"

    case "PRINT"
       print "OK..just to compare"

    case else
       print " Error : Unknown selection!-> "

end select

print "OK!"

Charles Pegge

Hi Aurel, This technique works for short strings up to 4 bytes in length. Anything longer will require a full string expression: case Xstr=Ystr
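
A small Python sketch of why the limit is 4 bytes. It assumes the integer overlay reads the first four bytes of the string as one 32-bit little-endian value and that shorter strings are padded with zero bytes; both are assumptions for illustration, not a description of O2 internals, and `as_int32` is a made-up helper.

```python
def as_int32(s):
    """Pack the first four bytes of s into a 32-bit little-endian integer."""
    return int.from_bytes(s[:4].ljust(4, b"\0"), "little")

# Strings that share their first four bytes collapse to the same integer,
# so an integer select cannot tell them apart.
print(as_int32(b"tesT") == as_int32(b"tesT123"))   # prints True

# Strings of up to four bytes remain distinguishable.
print(as_int32(b"abc") == as_int32(b"abd"))        # prints False
```

Under those assumptions, anything after the fourth byte never reaches the comparison, hence the need for a full string expression on longer cases.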

I have the extended syntax working; I just need to test it for any interference.

Thanks Ed, I will include TinyBasic as demos\Interpreters\TinyBasic\TB.o2bas

Zlatko Vid

I think you said some time ago that it was 4 chars in length,
but OK, let's say it could be good for scanning a string char by char?

Ed Davis

Quote from: Charles Pegge on February 28, 2024, 05:34:17 PM
Many thanks Ed, for your Basic interpreter. May I include it alongside your 'Toy Interpreter'?

(hopefully) final update - updated original message.
This one takes advantage of O2 idioms/features.
Thanks again!