Usage of Fixed Length Strings

Started by Jeff Marshall, March 09, 2024, 08:15:06 PM

Jeff Marshall

Hello, all

I have recently been exploring freebasic's handling of all string types, in preparation for another go at improving a number of fbc string internals and what is offered to users.

Most briefly, we are working on a change for STRING*N:
- occupies exactly N bytes
- no implicit null terminator
- padded with spaces on initialization and assignment
- can have embedded chr(0)'s
- for the most part is like QB's handling of STRING*N

However, this change will almost certainly break user code that employs STRING*N (which currently is equivalent to ZSTRING*(N+1) )
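
A minimal sketch of the difference, modelling both layouts with plain byte buffers so that it does not depend on which compiler version is installed. The helper assign_fixed and the names fbuf, N and ZSIZE are hypothetical, chosen only for illustration; they are not part of fbc or its runtime.

```freebasic
'' Model only: a ubyte array stands in for the proposed STRING*N layout.
'' assign_fixed is a hypothetical helper, not an fbc runtime function.

const N = 6
const ZSIZE = N + 1

'' Proposed STRING*N assignment: fill exactly N bytes, pad with spaces,
'' truncate anything longer, and write no null terminator.
sub assign_fixed( buf() as ubyte, byref src as const string )
    for i as integer = 0 to ubound( buf )
        if i < len( src ) then
            buf(i) = src[i]        '' copy one source byte
        else
            buf(i) = asc( " " )    '' pad the remainder with spaces
        end if
    next
end sub

dim fbuf(0 to N - 1) as ubyte      '' proposed layout: N bytes
dim z as zstring * ZSIZE           '' current equivalent: N + 1 bytes

assign_fixed( fbuf(), "abc" )
z = "abc"

print ubound( fbuf ) + 1           '' 6 bytes in the fixed buffer
print sizeof( z )                  '' 7 bytes in the zstring
print fbuf(3)                      '' 32: position 3 holds a space
print z[3]                         '' 0: position 3 holds the terminator
```

Code that relies on the byte after the text being zero (for example when handing the buffer to a C function) is the kind of usage that the model shows would break.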

It seems that usage of STRING*N is rare but not never.  My presumption is that over the years ZSTRING*N has been much more favoured over STRING*N since they behave nearly identically but ZSTRING*N tends to describe the behaviour better.

My concern is that there may be an intentional usage of STRING*N that I have missed.  If there is, perhaps it can be addressed.

I have been trying to go through some of the larger code bases looking to find where STRING*N has been used and what impact it will have on users and developers.

Additionally, I am exploring whether to allow passing ZSTRING*N and WSTRING*N arguments to procedures taking BYREF AS STRING parameters, so that the procedure can modify them.

Is there any major usage issue I may be missing?

Thanks in advance for feedback

Charles Pegge

Hello Jeff,

Does your proposed string include a length field like an OLE-string?

Frank Brübach

Hey Jeff,

I am not a professional programmer but I know freebasic and have some further ideas ...

1) Dynamic resizing: Allow the STRING*N to dynamically resize to accommodate longer strings

2) Error handling: Implement robust error handling

3) Compatibility with other languages

4) Implement memory management techniques to optimize memory usage

5) Unicode support: Consider adding support for Unicode characters to handle a wider range of text input

6) Documentation and examples: Provide comprehensive documentation and usage examples to help developers understand and utilize the STRING*N procedure effectively.

7) Offer more flexibility in padding options, such as allowing users to specify padding characters other than spaces

8) Community support with feedback

kind regards, frank

Jeff Marshall

#3
Quote from: Charles Pegge on March 10, 2024, 12:14:36 PM
Does your proposed string include a length field like an OLE-string?

That would be a BSTR / BSTRING, correct?  With the length hidden just before the data?

The freebasic STRING*N would be a fixed length string where the length is known at compile time.
It should be compatible with STRING*N declarations in QB, QB64 and (I think) PowerBASIC.

So no length field is stored with the data.  STRING*N occupies N bytes.

Jeff Marshall

Quote from: Frank Brübach on March 10, 2024, 01:29:18 PM
1) Dynamic resizing: Allow the STRING*N to dynamically resize to accommodate longer strings
That's possible now with variable length STRING.  Or maybe I misunderstand your meaning.

Quote: 2) Error handling: Implement robust error handling
Yes, freebasic is lacking on the error handling front.  Not related to STRING*N, but otherwise an understandable suggestion.

Quote: 3) Compatibility with other languages
The changes to STRING*N do that to an extent.

Quote: 4) Implement memory management techniques to optimize memory usage

5) Unicode support: Consider adding support for Unicode characters to handle a wider range of text input
Yes, these are works in progress, at least for the string development coming up.

Quote: 6) Documentation and examples: Provide comprehensive documentation and usage examples to help developers understand and utilize the STRING*N procedure effectively.
Noted, thanks.

Quote: 7) Offer more flexibility in padding options, such as allowing users to specify padding characters other than spaces
Not sure what is meant by 'options'.  The goal is to implement a default behaviour: padded with spaces.  There are some alternatives, but they depend on context and usage.

For example, for initialization:
union U1
    z as zstring * 10    '' first member is the zstring
    f as string * 10
end union

union U2
    f as string * 10     '' first member is the fixed length string
    z as zstring * 10
end union

dim x as U1        '' zero initialized
dim y as U2        '' space initialized
dim z as U2 = any  '' no initialization

Quote: 8) Community support with feedback
hullo!  ;)
Thank you, I appreciate you taking the time to offer feedback and the chance to discuss.

I see that WinFBX uses some STRING*N but not very many.  Possibly ZSTRING*N can work just as well, or there was a reason for selecting STRING*N instead (even though old  STRING*N and ZSTRING*N behaviour are very similar).  I hope I can address concerns.

Charles Pegge

#5
Quote from: Jeff Marshall on March 10, 2024, 02:25:17 PM
Quote from: Charles Pegge on March 10, 2024, 12:14:36 PM
Does your proposed string include a length field like an OLE-string?

That would be a BSTR / BSTRING, correct?  With the length hidden just before the data?

The freebasic STRING*N would be a fixed length string where the length is known at compile time.
It should be compatible with STRING*N declarations in QB, QB64 and (I think) PowerBASIC.

So no length field is stored with the data.  STRING*N occupies N bytes.

Thanks Jeff,

In OxygenBasic zstrings, I retain len as a variable function, which is determined by the nearest null char -1, as in C chars. But instead of len(myStr), bytesof(myStr) or countof(myStr) will give the allocated length of myStr.

Would it be useful to have such functions in FreeBasic?

Frank Brübach

#6
Thanks Jeff for your feedback and explanations. I have a little question about freebasic...

Definition

Is BSTR a WCHAR PTR or a WSTRING PTR?

And I am curious, do you program freebasic with freebasic?  :)

Good Luck for your new programming ideas and realisations of freebasic

Frank 

Jeff Marshall

#7
Quote from: Frank Brübach on March 10, 2024, 04:22:26 PM
Is BSTR a WCHAR PTR or a WSTRING PTR?
No, because BSTR needs a length field hidden just before where the data pointer points.
Yes, but only sort of, because the data part is an array of WCHARs, which in freebasic is mapped to the WSTRING pointer type. I have no idea if there is an ASCII equivalent of BSTR.

José would be the expert on this subject, and there is CBSTR class in the WinFBX framework that deals with all the gritty details.
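
As a rough illustration of the "length hidden just before the data" layout, here is a sketch that builds the same shape by hand. It is a model only and does not call the real OLE allocator; make_model, model_len, free_model and PREFIX_BYTES are hypothetical names, and the 4-byte byte-count prefix with 2-byte code units is assumed from the usual Windows BSTR layout.

```freebasic
'' Model only: mimics a BSTR-like layout by hand. Real BSTRs are created
'' by the OLE string API, which this sketch does not call.

const PREFIX_BYTES = 4                       '' assumed 32-bit byte count

function make_model( byref txt as const string ) as ushort ptr
    dim nbytes as ulong = len( txt ) * 2     '' 2 bytes per code unit
    '' callocate zero-fills, which also supplies the trailing null
    dim raw as ubyte ptr = callocate( PREFIX_BYTES + nbytes + 2 )
    *cast( ulong ptr, raw ) = nbytes         '' hidden length field
    dim p as ushort ptr = cast( ushort ptr, raw + PREFIX_BYTES )
    for i as integer = 0 to len( txt ) - 1
        p[i] = txt[i]                        '' ASCII only in this sketch
    next
    return p                                 '' points past the prefix
end function

function model_len( byval p as ushort ptr ) as ulong
    '' step back over the prefix to read the stored byte count
    return *cast( ulong ptr, cast( ubyte ptr, p ) - PREFIX_BYTES )
end function

sub free_model( byval p as ushort ptr )
    '' release the block from its true start, not from the data pointer
    deallocate( cast( ubyte ptr, p ) - PREFIX_BYTES )
end sub

dim b as ushort ptr = make_model( "abc" )
print model_len( b )     '' 6 bytes, i.e. 3 code units
print chr( b[0] )        '' a
free_model( b )
```

The pointer handed around looks like a plain wide-character pointer, which is the "yes, but only sort of" part; the extra field in front of it is the "no" part.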


Quote: And I am curious, do you program freebasic with freebasic?  :)
The compiler is written in freebasic.  The runtime library is written in C and assembler.  There is a port of the runtime library to freebasic source that is regularly updated and tested.  So technically, freebasic the compiler could be built from freebasic only source.  The graphics library is currently only written in C and assembler.

Jeff Marshall

#8
Quote from: Charles Pegge on March 10, 2024, 04:14:16 PM
In OxygenBasic zstrings, I retain len as a variable function, which is determined by the nearest null char -1, as in C chars. But instead of len(myStr), bytesof(myStr) or countof(myStr) will give the allocated length of myStr.

Would it be useful to have such functions in FreeBasic?

Not sure.  Does countof() figure out number of characters in a multibyte encoding?  (I downloaded your CHM to see if I can find the answer myself, but I didn't see the definition there).

freebasic has LEN() and SIZEOF()
SIZEOF() is always a compile time constant and reports the allocation size known at compile time.
LEN() could be a compile time constant or could be evaluated at runtime.  It depends on the expression.  If the expression is constant then LEN(expression) is also constant and can be optimized at compile time.  If the expression is variable (like a string-kind-of-thing with a variable length or a null terminator) then the length is evaluated at runtime.

So, probably? Or at least something like it if it's going to be necessary to determine allocated size versus used size versus decoded "length" either in bytes or in elements.

For the fixed length STRING*N, the nature of the change is that len(string*N) == sizeof(string*N) == N
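
To make the LEN() versus SIZEOF() distinction concrete, here is a small sketch using only types whose behaviour is not part of the change (ZSTRING*N, variable length STRING and a string literal). The variable names are arbitrary. STRING*N is deliberately left out of the executable part because its result depends on whether the compiler build includes the change.

```freebasic
dim z as zstring * 10 = "abc"
dim s as string = "hello"

print sizeof( z )     '' 10: allocated bytes, known at compile time
print len( z )        '' 3: found at runtime by scanning to the null terminator
print len( s )        '' 5: current length of the variable length string
print len( "abc" )    '' 3: constant expression, so the result is constant too

'' With the change described above, a declaration such as
''     dim f as string * 10
'' would give len( f ) = sizeof( f ) = 10.
```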

Charles, as mentioned, I am looking to see how this change might affect larger code bases.  I downloaded the source for OxygenBasic, though I think it may be an older version?  Anyway, a quick scan of the files seems to indicate there isn't any use of fixed length strings, so I conclude that the change won't affect your project.

Charles Pegge

#9
I'll fix that omission in the manual. Thanks Jeff! spanof() is the same as countof(). bytesof() gives the whole space allocation.

OxygenBasic's sizeof() refers to the size of the element rather than the whole array, hence the need for these other functions.

PS:
Oxygen was compiled using FreeBasic until 2017, when it became self-compiling. FreeBasic performed very well but I wanted O2 to be entirely toolchain independent.