FreeBASIC CWstr

Started by Juergen Kuehlwein, April 09, 2018, 11:39:00 PM


Jeff Marshall

Hi José,

I've thought quite a bit about the WSTRING meaning, and in hindsight it would have made much more sense to name it ZWSTRING for symmetry within fbc's string types.  I have a plan that would rename wstring to zwstring over 2 releases, but it breaks everything to do with wstring, so I'm pretty sure I would get every user angry at me.  Not impossible to implement, but it seems impossible to justify, so I think we must live with it.  Maybe in fbc 2.0 I can break everything. :)

I wasn't specifically thinking of embedded nulls in Z|WSTRING when looking at length parameters; only that embedded nulls may be a potential (desired) side effect when combined with a UDT that extends Z|WSTRING.  i.e. the rtlib STRING handling functions are the same for ZSTRINGS and STRINGS, the difference being where it gets the "length" of the string, either from data passed (as in a string descriptor), or from always performing a strlen() call.  I was thinking mostly in terms of speed/performance, that if the length is known, like in a UDT that extends Z|WSTRING and stores length data, that it would be preferable to use the stored length rather than always calling w|strlen(), especially for large strings.

But that's all just implementation in rtlib, not really what the user sees.  I agree, from the user's point of view WSTRING should always be considered a zero-terminated string and nothing else, because that's what we are advertising a WSTRING to be.

Jeff Marshall

JK, I added similar logic to ignore a self-assign, and tested, so I would say "it works".  Keep in mind though, it depends on what the actual implementation of "this.clear" does; if an extra null terminator is written (double null terminated) to position [1], or memory is released (free'd) instead of writing a single null at [0], then you would need different logic when self-assigning.
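
A minimal sketch of the case described above, assuming a hypothetical type MiniWStr (not CWSTR or DWSTR) whose clear routine, here called Wipe, releases the memory. The names MiniWStr and Wipe are made up for illustration. If the guard in Let were missing, a self-assignment would free the buffer and then try to copy from it.

```freebasic
' Hypothetical minimal type for illustration; this is not CWSTR or DWSTR.
Type MiniWStr
    Declare Destructor ()
    Declare Sub Wipe ()
    Declare Operator Let (ByVal pwsz As WString Ptr)
    m_pBuffer As WString Ptr
End Type

' Assumption: "clear" releases the memory, one of the cases mentioned above.
Sub MiniWStr.Wipe ()
    If m_pBuffer Then Deallocate(m_pBuffer)
    m_pBuffer = 0
End Sub

Destructor MiniWStr ()
    this.Wipe
End Destructor

Operator MiniWStr.Let (ByVal pwsz As WString Ptr)
    ' Guard first: after Wipe, pwsz would point at freed memory.
    If pwsz = m_pBuffer Then Exit Operator
    this.Wipe
    If pwsz = 0 Then Exit Operator
    ' Room for the characters plus the terminating null.
    m_pBuffer = Allocate((Len(*pwsz) + 1) * SizeOf(WString))
    *m_pBuffer = *pwsz
End Operator

Dim w As WString * 16 = "hello"
Dim s As MiniWStr
s = StrPtr(w)        ' normal assignment: copies the text into a new buffer
s = s.m_pBuffer      ' self-assignment: the guard returns before Wipe runs
Print *(s.m_pBuffer)
```

If the clear routine only wrote a single null at [0] instead, the guard would still be needed, because the source text would be truncated before it could be copied.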

José Roca

Quote
I was thinking mostly in terms of speed/performance, that if the length is known, like in a UDT that extends Z|WSTRING and stores length data, that it would be preferable to use the stored length rather than always calling w|strlen(), especially for large strings.

That is what I do in my CWSTR class:


PRIVATE OPERATOR LEN (BYREF cws AS CWSTR) AS UINT
   OPERATOR = cws.m_BufferLen \ 2
END OPERATOR


Charles Pegge

Hi Jeff,

I've been quietly following this topic :)

String types are hard to nail down, but adopting the 'w' prefix seems to provide the most logical naming scheme:

char wchar
zstring wzstring
string wstring

Juergen Kuehlwein

José,


i just wanted you to be aware of this...

Yesterday Jeff merged the compiler changes for ... EXTENDS WSTRING. Today i made a pull request adding ustring.bi and some USTRING specific tests. Ustring.bi still makes your CWstr the default for USTRINGs.

Do you want this ?

If your answer is yes, you may have to adapt your code in some places to be compatible with Jeff´s code (add "EXTENDS WSTRING", prevent self assignment for WSTRING PTR, remove the "CONST" specifier somewhere). There is no need for "m_pBuffer" to be in first place anymore; this is one of the weaknesses of my code that Jeff fixed.

"**" still works, but there shouldn´t be a need for it anymore.  I have a working USTRING test version for all statements dealing with files and paths (OPEN, DIR, etc.), but it will take me some time to rewrite it a bit to be compatible and restructure it for a pull request. This will be one of the next steps.

For comparison i attach the current version of ustring.bi here


JK

José Roca

Thanks very much. I will download the latest build to test.

The much hated ** workaround is the best solution that I could find after trying many others. It was not the ideal, but it worked very well. I'm very glad that Jeff has taken the trouble of modifying the compiler to allow better integration.

I will keep it because, otherwise, I will break all of the current Paul Squires' code, and also my framework.

Juergen Kuehlwein

Quote
The much hated ** workaround

... was a very clever thing to code! It isn´t needed anymore now, but it still works. So no need to change any code.

The main question is, do you want your CWSTR class to be the default USTRING in Windows? Now is the time for such a decision!

If so, you will have to make it compliant with the new EXTENDS WSTRING feature. The necessary changes shouldn´t break anything. I didn´t try Paul´s code with Jeff´s version yet, but i did try with my compiler version. Removing all "**" from Paul´s code worked, it compiled, and as far as i can tell, WINFBE worked as usual. So i don´t expect major problems here either. But as "**" still works, Paul wouldn´t have to change anything.

Jeff´s new version passed all tests i wrote for my own purposes when developing my version. So there are already two persons digging deeply into that matter, who cannot find any bugs anymore.


JK 

José Roca

> prevent self assignment for WSTRING PTR

Where?

I already have:


' ========================================================================================
PRIVATE OPERATOR CWstr.Let (BYREF cws AS CWSTR)
   CWSTR_DP("CWSTR LET CWSTR - m_pBuffer = " & .WSTR(m_pBuffer) & " - IN buffer = " & .WSTR(cws.m_pBuffer))
   IF m_pBuffer = cws.m_pBuffer THEN EXIT OPERATOR   ' // Ignore cws = cws
   this.Clear
   this.Add(cws)
END OPERATOR
' ========================================================================================



> remove the "CONST" specifier somewhere)

Why?

So far, the only changes that I have needed to do are:


#if __FB_VERSION__ < "1.07.0"
TYPE CWSTR
#else
TYPE CWSTR EXTENDS WSTRING
#endif


And remove a wrong cast in the functions AfxBase64EncodeW and AfxBaseDecodeW.

Juergen Kuehlwein

self assign is possible with WSTRING PTR too:

PRIVATE OPERATOR DWSTR.Let (BYREF pwszStr AS WSTRING PTR)
  IF m_pBuffer = cast(ubyte ptr, pwszStr) THEN EXIT OPERATOR              'ignore self assign
  this.Clear
  IF pwszStr = 0 THEN EXIT OPERATOR
  this.Add(*pwszStr)
END OPERATOR


see here http://www.jose.it-berater.org/smfforum/index.php?topic=5253.msg23916#msg23916



and i removed CONST here:

    DECLARE OPERATOR CAST () BYREF AS WSTRING
PRIVATE OPERATOR DWSTR.CAST () BYREF AS WSTRING       'returns the string data (same as **).
  OPERATOR = *cast(WSTRING PTR, m_pBuffer)
END OPERATOR


see here http://www.jose.it-berater.org/smfforum/index.php?topic=5253.msg23876#msg23876

I kept getting compiler errors, removing the CONST qualifier solved the problem. I´m sure Jeff has mentioned it too, but i cannot find it right now.


JK


José Roca

I'm not getting any error. Even Jeff is using byref as const wstring in https://www.freebasic.net/forum/viewtopic.php?f=17&p=261843#p261830

Juergen Kuehlwein

please try the following code:

dim u as ustring = wchr( 1234 )
      u = wstr(u)
      print u


and see what happens; fixing self-assignment helps.


this one gives me a compiler error (with CONST):

      dim w1 as wstring * 50 = wspace(5) & "asdfghjklmnop"
      dim u1 as ustring = w1
      dim w  as Wstring * 50 = wspace(25)
      dim u  as ustring      = wspace(25)
      lset w, w1
      lset u, u1
      print u
      print w



without CONST it works properly


JK

José Roca

Ok. I have made the changes.

Jeff Marshall

Quote from: Charles Pegge on June 16, 2019, 12:29:54 PM
char wchar
zstring wzstring
string wstring


Agreed, hindsight is 20/20.  Except, I think what we have available in fbc is:

null terminated => zstring & wstring
var-len => string & ??string


It's possible to change current wstring meaning to be named wzstring, as in it is technically possible.  But, it would break so much user source code, I would be afraid that users would find my house and burn it down. ;)  So, I think we need to find a different name for "var-len wstring" to reserve, and live with the asymmetry of the type naming, forever.

Jeff Marshall

Quote from: Juergen Kuehlwein on June 18, 2019, 09:15:32 PM
this one gives me a compiler error (with CONST):

      dim w1 as wstring * 50 = wspace(5) & "asdfghjklmnop"
      dim u1 as ustring = w1
      dim w  as Wstring * 50 = wspace(25)
      dim u  as ustring      = wspace(25)
      lset w, w1
      lset u, u1
      print u
      print w


Yeah, just choose whatever is suitable for your usage.

declare operator cast() byref as CONST wstring
- fbc should throw an error if used with lset, rset, swap, mid statement, or passed to a non-const parameter in a procedure.
- this is useful if the supporting class has its own rules, for example ensuring that the string is double-null-terminated, length member is set, etc.
- because, writing to the raw wstring data bypasses any logic in the class let operators or constructors

declare operator cast() byref as wstring
- this should allow the raw wstring data to be modified without any warning or error, expecting that the user is aware that any special logic in the class operators is bypassed.
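
A short sketch of the CONST variant, assuming a hypothetical type ReadOnlyText and a hypothetical procedure ShowText (neither is part of CWSTR, DWSTR or ustring.bi). It only shows the read-only use that the CONST cast permits; the commented-out line is the kind of statement that, per the description above, fbc should reject.

```freebasic
' Hypothetical type for illustration only; not CWSTR, DWSTR or USTRING.
Type ReadOnlyText
    Declare Constructor (ByRef s As Const WString)
    Declare Destructor ()
    Declare Operator Cast () ByRef As Const WString
    m_pBuffer As WString Ptr
End Type

Constructor ReadOnlyText (ByRef s As Const WString)
    ' Room for the characters plus the terminating null.
    m_pBuffer = Allocate((Len(s) + 1) * SizeOf(WString))
    *m_pBuffer = s
End Constructor

Destructor ReadOnlyText ()
    If m_pBuffer Then Deallocate(m_pBuffer)
End Destructor

' Exposes the raw data read-only, so callers cannot bypass the class logic.
Operator ReadOnlyText.Cast () ByRef As Const WString
    Operator = *m_pBuffer
End Operator

' A CONST parameter accepts the result of the CONST cast.
Sub ShowText (ByRef s As Const WString)
    Print s
End Sub

Dim t As ReadOnlyText = WStr("hello")
ShowText(t)
' Mid(t, 1, 1) = "J"   ' writes through the cast; expected to be rejected with CONST
```

Dropping CONST from the cast declaration would allow the commented-out statement, at the cost of letting callers modify the buffer behind the class's back.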

José Roca

In my string functions in AfxStr.inc, I have added CONST to all the wstring parameters (only to avoid compiler errors if the user passes a constant string/wstring), except one in AfxStrPathName (BYREF wszFileSpec AS WSTRING), because although the function does not modify the passed parameter, it gives me an error when I try to assign it to an instance of CWSTR (cws = wszFileSpec).