Improvements in Programming Languages

Started by Charles Pegge, May 12, 2007, 10:38:30 PM

Charles Pegge


There are over 8000 recorded programming languages, past and present, so inventing them is not that difficult.
With many of these languages, just about any computation is possible, but with varying degrees of difficulty and complexity.

What makes a language easy to use? Will it compile and execute efficiently? Is it easy to read, to trace errors in, or to modify? Can the language be used to develop itself or generate new code?

What are the strengths and weaknesses of modern programming languages like C++, BASIC, Java, Python or XML?

Is it possible to take the best features of all of these languages and meld them into one unified system that is not deficient in any quarter? Would the result be a monster?

These questions, I believe, are answerable when it comes down to specifics, and the possibilities for simplification will become apparent.

This thread is for some dreaming along this theme, and since this site is mostly about PowerBasic, that will be a good starting point.



Charles Pegge

#1

BASIC and C

Developed in 1963 as a computer science teaching aid, BASIC (Beginner's All-purpose Symbolic Instruction Code) was designed to make short, easy to understand code that could be picked up rapidly by people with very limited experience.
Because it was a compact but complete language, it was widely adopted as a standard for home computers in the late seventies, usually on 8 to 16 kbyte of ROM. Turn on the computer, and up came the BASIC editor after about 1 second.

With the rise to dominance of the Microsoft-based PC, C/C++ became the language of professional software developers, and BASIC, the standard language supplied with the PC, was left to languish in its primitive form.

This gave the opportunity for other developers to come in, but resulted in divergent versions of the language.
Turbobasic, later to become Powerbasic, made its appearance in the late 80s as an exceptionally efficient and well specified compiler.

In the 90s Windows and its graphical user interface displaced most MS-DOS applications and drove the development of BASIC so that it could make the complex operating system calls needed to drive Windows and interface with Windows applications. This required the adoption of a number of C-type constructs, e.g. pointers and passing parameters by value.

But while Powerbasic can do most of the things that C does, it has not gone all the way to acquire fully fledged C++ capabilities. Judging by the misuse of C++, resulting in the massive inflation of code size in Windows-based systems and a tangle of complexity, this is quite understandable. Even the C++ stream library, which basically manages strings and files, adds 500k of code to the executable. But C++ has many virtues that do not lead inevitably to monstrous code.

Anyway here is a cursory list comparing the two languages.

The Virtues of modern compiled BASIC

Keywords with obvious meaning.
Block structured programming.
Rich Kernel of Functions.
Built in Strings handling.
Built in Memory management / automatic garbage collection.
Macros
contains most of the functionality of C

The virtues of C++

Scoping of variables in namespaces and blocks.
Initialisation of variables at declaration.
Encapsulation.
Overloading.
Object Oriented Programming: Inheritance, classes and objects.
elemental enough to form basis of higher level languages.
widely used across many platforms.


Some BASIC vices

verbose syntax.
ad hoc syntax eg File operations.
The name itself: BASIC is sophisticated not basic.


Some C++ vices

confusing use of symbol combinations.
multiple meanings of curly braces - difficult to read.
features leading to bloated code: Templates.
no automatic garbage collection.
lack of intrinsic functions - just about everything requires a library / include.

Combining the languages

Powerbasic already has most of the C capability, using different words/symbols, therefore adopting C++ extensions would not mangle the syntax in any way. In fact it would produce a much clearer expression of the logic by dispensing with many of the confusing symbol combinations, overused in C/C++.


Some weakness in both BASIC and C++

missing interpretive or Just-In-Time compilation capability.
cannot support Functional Programming paradigm.

These are weaknesses that are inevitable in any purely compiled language. For the flexibility that many situations demand, an interpretive layer is required, or a Just-In-Time compiler as part of the run-time system. One of the causes of bloating in C++ is trying to make a statically compiled language look flexible: all variations in parameter types have to be catered for prior to execution. The same is true for BASIC, but its generous built-in string handling capabilities make interpretive operations much simpler to accomplish.

...to be continued...

Charles Pegge

#2
SIMPLIFYING PROGRAMMING LANGUAGES

Removing unnecessary syntax and structures makes a language easier to learn and easier to check for errors.

Some radical ideas!

Operator precedence

By observing strict left to right evaluation of an expression, all ambiguities are removed for the minimal cost of a few extra brackets.
Compilation is simplified.
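
To make the idea concrete, here is a minimal Python sketch of strict left to right evaluation. Python and the names tokenize, evaluate and left_to_right are assumptions made for illustration only; they are not part of the scripting language described in this thread. Each operator is applied as soon as it is met, so brackets are the only way to change the grouping.

```python
import operator
import re

# The four binary operators handled by this sketch (an assumed, minimal set).
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}
TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")


def tokenize(text):
    """Split an expression into numbers, operators and brackets."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(tokens):
    """Consume tokens from the front with no precedence table at all."""
    def operand():
        token = tokens.pop(0)
        if token == "(":
            value = evaluate(tokens)
            tokens.pop(0)          # discard the closing ")"
            return value
        return float(token)        # unary minus is not handled in this sketch

    result = operand()
    while tokens and tokens[0] != ")":
        apply_op = OPS[tokens.pop(0)]
        result = apply_op(result, operand())
    return result


def left_to_right(text):
    return evaluate(tokenize(text))
```

With this scheme "2 + 3 * 4" groups as (2 + 3) * 4, and the conventional reading has to be written with brackets as "2 + (3 * 4)".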

Control Structures

IF ELSEIF ELSE CASE FOR WHILE REPEAT REDO EXIT can all be replaced with a unified construct which simplifies logical checking. It goes something like this:

{
if a then exit
if b then
  ...
end if
if c then repeat
if d then goto label
}

..
label:

The curly braces delineate a block.
There is a single line 'if..then' and a multi-line 'if .. then .. end if.'
'exit' forces an early exit to the end of the block.
'repeat' directs the execution back to the beginning of the block
'goto' can be used specifically to jump out of nested blocks or to jump over other blocks.

Thus Occam's razor has been applied and no other control structures are needed.
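
For readers more used to conventional languages, the unified block can be approximated in Python, where "while True" plays the part of the braces, "break" the part of 'exit' and "continue" the part of 'repeat'. This is only a rough analogy under those assumptions; the function first_multiple and its data are invented for the example.

```python
def first_multiple(values, divisor):
    """Return the first value divisible by divisor, or None if there is none."""
    index = 0
    found = None
    while True:                        # "{" opens the block
        if index >= len(values):       # if a then exit
            break
        candidate = values[index]
        index += 1
        if candidate % divisor != 0:   # if c then repeat
            continue
        found = candidate              # ordinary work inside the block
        break                          # "}" falling off the end leaves the block
    return found
```

The same block shape serves as a loop, an early exit and a conditional, which is the point being made about needing only one control structure.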

Not only does this make programming logic cleaner, it also simplifies compilation, which is
essential if Just-in-Time methods are used.

...

Charles Pegge

#3
< :)> SUPPORTING MARKUP </ :)>

Reserving <> Brackets

There are not enough bracket types in the basic ASCII character set to do all the things we need to express. In particular, there is a fundamental conflict between the inequality symbols and brackets used in markup languages, and other uses. Syntactically the simplest way to resolve this is to do away with the inequality symbols and replace them with assembler-like mnemonics thus:

LT <
LE <=
GT >
GE >=
NE <>

also

EQ ==
This does away with the confusion between 'equality' and 'assignment' by reserving '=' exclusively for the latter.


if (a GE 42)and(b EQ 80) then c=4

As you can see, the solution is not verbose and in my view reads better.
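
A sketch of how an interpreter might resolve these mnemonics, written in Python purely for illustration. The table name COMPARE and the helper compare are assumptions, not part of the script language itself.

```python
import operator

# Mnemonic -> comparison function, following the list above.
COMPARE = {
    "LT": operator.lt, "LE": operator.le,
    "GT": operator.gt, "GE": operator.ge,
    "NE": operator.ne, "EQ": operator.eq,
}


def compare(left, mnemonic, right):
    """Look the mnemonic up and apply it to the two operands."""
    return COMPARE[mnemonic](left, right)


# if (a GE 42)and(b EQ 80) then c=4
a, b = 50, 80
c = 4 if compare(a, "GE", 42) and compare(b, "EQ", 80) else 0
```

Because the mnemonics are ordinary words, the tokeniser never has to decide whether "<" starts a tag or a comparison.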

We now have a pair of symbols which are available exclusively for use as brackets, and no longer need to continually switch contexts. And one of the main uses for these brackets is Markup expressions.


Using markup to define objects and complex data structures.

It goes without saying that life on the internet without markup languages is almost inconceivable. Within procedural languages too, they could be used for declaring, building and manipulating objects and data structures.

In an interpretive language, objects can be represented in a string, for example:

screwA="_
<type> screw</>
<material> steel</>
<coating> phosphate</>
<diam>#3.5</>
<length>#38</>
<thread> double</>
<head>
   <shape> bugle</>
  <top> posidrive</>
</head>
"

Its elements are referred to like this:
screwA.length
screwA.head.shape

New objects are created simply by copying the string content:
new screwB=screwA

which in turn may be modified in several ways:

  screwB.head.top="slotted"   changing a property

  screwB+="<cost>#0.02</>"  adding a property

  screwB.head+="<diam>#6.0</>"  inserting a property

  screwB.head=""  removing a group of properties

If necessary code can also be efficiently embedded in a markup field.

What is proposed here is a very simple form of markup language where attributes are not used within tags. The tags only contain names. And the end tag may or may not contain the name. That is just a matter of clarity.
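
As a rough illustration of how little machinery this simplified markup needs, here is a Python sketch of a parser for it. It assumes the object is parsed into nested dictionaries rather than kept as a string, and that a leading "#" marks a number; parse_markup, get and SCREW_A are names invented for the example.

```python
import re

TAG = re.compile(r"<(/?)([^<>]*)>")

SCREW_A = """
<type> screw</>
<length>#38</>
<head>
   <shape> bugle</>
  <top> posidrive</>
</head>
"""


def parse_markup(text):
    """Parse attribute-free markup; '</>' closes the innermost open tag."""
    root = {}
    stack = [("", root)]               # (tag name, dict being filled)
    pos = 0
    for match in TAG.finditer(text):
        content = text[pos:match.start()].strip()
        pos = match.end()
        closing, name = match.group(1), match.group(2).strip()
        if not closing:
            stack.append((name, {}))
            continue
        if len(stack) == 1:
            raise ValueError("closing tag without an opening tag")
        open_name, children = stack.pop()
        if name and name != open_name:
            raise ValueError(f"</{name}> does not close <{open_name}>")
        if children:
            value = children           # a group of properties
        elif content.startswith("#"):
            value = float(content[1:]) # '#' marks a numeric field
        else:
            value = content
        stack[-1][1][open_name] = value
    return root


def get(obj, path):
    """Follow a dotted path such as 'head.shape'."""
    for key in path.split("."):
        obj = obj[key]
    return obj


screw = parse_markup(SCREW_A)
shape = get(screw, "head.shape")
```

The end tag may carry the name or not, as described above; the parser only checks the name when one is given.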





....

Charles Pegge

#4
PASSING AND RETURNING PARAMETERS


Functions with default parameters

function CreateWindowX( x=100, y=100, width=512, height=256, sysmenu=0 )
...
end function

various ways to call the function:

  CreateWindowX( )  use all default values

  CreateWindowX( 200, 200 )    set x and y positions only

  CreateWindowX( , , 250, 250 )    set width and height only

  CreateWindowX( sysmenu=1 )    use all the default values except for sysmenu

  CreateWindowX( width=800, x=50 )  use all default values except for x and width
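
For comparison, Python already offers most of these calling styles, so the behaviour can be sketched directly. The function create_window_x below is a stand-in invented for the example; it only returns the values it received.

```python
def create_window_x(x=100, y=100, width=512, height=256, sysmenu=0):
    """Return the resolved parameters so the effect of each call is visible."""
    return {"x": x, "y": y, "width": width,
            "height": height, "sysmenu": sysmenu}


all_defaults = create_window_x()                    # use all default values
position_only = create_window_x(200, 200)           # set x and y only
# Python has no "( , , 250, 250)" form; skipped positions must be named.
size_only = create_window_x(width=250, height=250)
menu_only = create_window_x(sysmenu=1)
mixed = create_window_x(width=800, x=50)
```

The one form with no Python equivalent is the call with empty leading positions, which is where the script's parameter logic has to do extra work.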

Passing blocks of data by reference

For sets of contiguous data.

new a[100]=-1   Create an array of 100 numbers of value -1
a[10]=1,2,3,4,5,6,7,8,9,10   now put some numbers into the array starting at a[10]
lookat(ref a[12])

...

function lookat(p)
p[0]?
p[1]?
p[10]?
end function

results:
3
4
-1




Returning data by reference

If a block of data is passed by reference, it is possible to write back into any of the elements.
To ensure clarity of intention in the code, any variables that were not created within the function,
should be preceded by set before they are written to.

example:

new v[100]

v[1]=1,2,3
addv(ref v[1],2,4,6)
$ Result: `v[1]` `v[2]` `v[3]`

result:
Result: 3 6 9
...

function addv(v,a,b,c)
set v[0]+=a; v[1]+=b; v[2]+=c
end function
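
A hedged Python sketch of the same write-back behaviour, assuming a memoryview over an array stands in for 'ref': the function receives a window that starts at element 1 and writes straight into the caller's data. The name vector is chosen for the example.

```python
from array import array


def addv(v, a, b, c):
    """Add a, b, c to the first three elements of the block v refers to."""
    v[0] += a
    v[1] += b
    v[2] += c


vector = array("i", [0] * 100)           # new v[100]
vector[1:4] = array("i", [1, 2, 3])      # v[1]=1,2,3
addv(memoryview(vector)[1:], 2, 4, 6)    # addv(ref v[1],2,4,6)
result = list(vector[1:4])
```

Inside addv the indices start again from 0, just as p[0] in the script refers to the element that was passed by reference.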






..to be expanded..


Eros Olmi

Charles,

very very interesting discussion about programming languages.
I will consider some of your suggestion in thinBasic: default values for parameters passed BYVAL and mnemonics for logical operations

I will continue to follow your thoughts on that.
Thanks a lot
Eros
thinBasic Script Interpreter - www.thinbasic.com | www.thinbasic.com/community
Win7Pro 64bit - 8GB Ram - Intel i7 M620 2.67GHz - NVIDIA Quadro FX1800M 1GB

Charles Pegge

Thanks Eros,

I am testing these ideas as I go, on an experimental scripting language. The current version is in FreeBasic, running on Linux and Windows. The logic around default parameters can get quite complex.

If you would like to see a work in progress, it's about 50k source, 100k compiled, being tested and updated on a daily basis.

   http://www.pegge.net/xfers/run.bas
   http://www.pegge.net/xfers/run.exe
   http://www.pegge.net/xfers/main.pro

  sample extension library

   http://www.pegge.net/xfers/module.bas
   http://www.pegge.net/xfers/module.dll

  type run at the console

I hope to do a proper release soon!

Eros Olmi

Thanks a lot Charles.
I will for sure test it this night when back to home.

I'm happy to say I've already implemented default values for all parameters passed BYVAL in thinBasic functions. It is a great addition and works perfect.
You have great ideas.

Ciao
Eros

Charles Pegge

#8
Simple Object Oriented Programming

This is a very simple scheme that does not assume a taxonomic tree of classes but will support one if required.

   All objects are stored in markup strings

   methods or functions are related to their objects by the <type> specified in the object.

  functions may defer to 'parent' functions or any other functions  explicitly

  objects calling functions pass an invisible parameter referring to themselves called this



Taking the previous example of a data structure slightly extended:

new DryWallScrew="_
<type> screw</>
<material> steel</>
<coating> phosphate</>
<diam>#3.5</>
<length>#38</>
<thread> double</>
<head>
   <shape> bugle</>
  <top> posidrive</>
</head>
<supplier>
  <name> Screwfix</>
  <price>#0.01</>
</supplier>
"
new ScrewA=DryWallScrew

ScrewA.buy( quantity=1000 )

..

function screw.buy hardware.buy //linking function to another type or class

...

function hardware.buy(length=0,quantity=1)
new t=length
if not t then t=1
t*=quantity*this.supplier.price
$ Supplier `this.supplier.name`
if length then $ Length `length`
$ Quantity `quantity`
$ Unit price `this.supplier.price`
$ Total price `t`
...
return t
end function
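
The dispatch scheme can be sketched in Python as a table keyed by the object's <type> field. This is an illustration under stated assumptions, not the script's actual implementation: objects are nested dictionaries, and METHODS, method and call are hypothetical helpers.

```python
import copy

METHODS = {}   # (type name, method name) -> function


def method(type_name, name):
    """Register a function as a method of the given type."""
    def register(func):
        METHODS[(type_name, name)] = func
        return func
    return register


def call(obj, name, **kwargs):
    """Find the method through the object's 'type' field and pass it 'this'."""
    return METHODS[(obj["type"], name)](obj, **kwargs)


@method("hardware", "buy")
def hardware_buy(this, length=0, quantity=1):
    t = length or 1                    # new t=length; if not t then t=1
    t *= quantity * this["supplier"]["price"]
    return t


# function screw.buy hardware.buy : link one type's method to another's
METHODS[("screw", "buy")] = METHODS[("hardware", "buy")]

dry_wall_screw = {
    "type": "screw",
    "length": 38,
    "supplier": {"name": "Screwfix", "price": 0.01},
}
screw_a = copy.deepcopy(dry_wall_screw)     # new ScrewA=DryWallScrew
total = call(screw_a, "buy", quantity=1000)
```

No class hierarchy is declared anywhere; the link from screw.buy to hardware.buy is a single table entry, which is what makes a taxonomic tree optional.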


..to be expanded..

Eros Olmi

#9
I'm sorry Charles, but here I start to lose the thread. I did some OOP and know some OOP theory, but never went very deep.
I need some time to study your RUN sources and get the inner data connections.

For the moment I will steal some of your ideas (sorry :D )
I have already added some in thinBasic: http://community.thinbasic.com/index.php?topic=890.0
Will continue to follow your post here. Also interested in FreeBasic. We are making a thinBasic SDK in order to use FreeBasic as development environ for thinBasic modules so ... even more interesting this post.

Thanks a lot
Eros

Charles Pegge


Eros, I'm putting in a few more notes and have cleaned up the example above. It is a very experimental form of OOP, the main innovation being the use of markup strings to define the object, but I think the only way I am going to grasp the subject is to come up with a system I would enjoy using. So I have ignored much of the C++ stuff.

My source code has very few comments, and some of the parsing logic is quite complex. It partially tokenises
and links the script at load time, but some of the linking happens later during run-time, and some of the script
is left untokenised where flexibility is required. So good luck trying to make sense of it!

Adding extra internal functions and extension library functions is pretty easy though.

Eros Olmi

Quote from: Charles Pegge on May 18, 2007, 02:34:32 PM
but I think the only way I am going to grasp the subject is to come up with a system I would enjoy using. So I have ignored much of the c++ stuff.
I like that. I would have never created thinBasic without the pleasure to experiment and follow personal idea.

Quote from: Charles Pegge on May 18, 2007, 02:34:32 PM
My source code has very few comments, and some of the parsing logic is quite complex. It partially tokenises
and links the script at load time, but some of the linking happens later during run-time, and some of the script
is left untokenised where flexibility is required. So good luck trying to make sense of it!
If the links you gave here will remain valid I will follow your improvements with interest. My target is not to understand full details but get the main design. Always interested in parsing, tokenizing, interpreting stuff, really. I'm finding your ideas brilliant, I will follow with interest even if I will not post.

Quote from: Charles Pegge on May 18, 2007, 02:34:32 PM
Adding extra internal functions and extension library functions is pretty easy though.
Well, same here. Once there is a general method it is not so difficult. Problem is to find a general way that is so "open" and general to be valid for future implementations even if right now future implementations are not even in my thoughts.

Ciao
Eros

Charles Pegge

#12
Passing a Function as an Argument

do_many( 3, "greeting()" )
end

function greeting()
$ hello!
end function

function do_many( i, f )
{
exec f
if --i GT 0 then repeat
}
end function


result:
hello!
hello!
hello!
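
In the script the function is passed as a string and run with exec. In a language with first-class functions the same idea needs no string at all, as this Python sketch shows. The output list is an assumption added so that the result can be inspected.

```python
def greeting(output):
    output.append("hello!")


def do_many(count, func, *args):
    """Call func, then repeat while the decremented counter is above zero."""
    while True:                # {
        func(*args)            #   exec f
        count -= 1
        if count > 0:          #   if --i GT 0 then repeat
            continue
        break                  # }


lines = []
do_many(3, greeting, lines)
```

As in the script, the body always runs at least once, because the test comes after the call.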



A Program to Write a Program

new fi=output("hello.pro"); out(fi,"")
$ function greeting()
$ $ ------------
$ $ Hello World.
$ $ ------------
$ end function
close(fi)

$ execute hello
load "hello"; greeting();unload
$ done hello

Result:
execute hello
------------
Hello World.
------------
done hello
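
The write, load, call, unload cycle can be reproduced in Python with the standard importlib machinery. This is a sketch under assumptions: the generated file is a Python module named hello.py in a temporary folder, and write_and_run is a name invented for the example.

```python
import importlib.util
import os
import tempfile

# Source text of the program that this program writes.
SOURCE = '''def greeting():
    return ["------------", "Hello World.", "------------"]
'''


def write_and_run(source):
    """Write the source to a file, load it as a module and call greeting()."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "hello.py")
        with open(path, "w") as handle:
            handle.write(source)
        spec = importlib.util.spec_from_file_location("hello", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)     # the "load" step
        return module.greeting()            # the folder is removed on exit


lines = write_and_run(SOURCE)
```

The generated module is never registered globally, so dropping the reference plays the part of 'unload'.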


Passing a Program as an Argument

do_many(2,"load 'hello'; greeting(); unload")

Result:
------------
Hello World.
------------
------------
Hello World.
------------



...

Theo Gottwald

#13
I generally like new revolutionary ideas in programming.

There are finally two directions:
a) easy to use languages which follow the way humans are thinking and working
b) machine oriented languages

The machine oriented languages have much shorter programs.
As a beginner you may have trouble reading them.
As a pro, you save time compared to the human oriented languages.

I remember the time, when I was programming in FORTH.
I really liked the idea with the stack, RPN (reverse Polish notation) etc.
FORTH is clearly machine oriented.
It is still the language which solves problems with the shortest code.
Besides APL.

Most people don't know APL. APL (for those who know it) is not only a programming language, it is a mathematical way of describing algorithms or problems. It will even improve the way a student thinks. That's why APL is a really good recommendation for science students.
Many commands are just 1 character long, because APL has its own keyboard layout with (mostly mathematical) functions. APL was my favourite besides FORTH.
Because the code was really short.

Typing a mathematical solution in APL or a program in FORTH just saves time.
You don't need to type that much, although it may be harder for someone else to read it later.

Writing in FORTH, the programmer needs to have a sheet of paper near the keyboard.
He needs to write down what is currently on the stack. I doubt this could get popular these days,
although it has advantages.

Some of the APL elements can be found in Euphoria.
Euphoria and APL have an interesting way to remove inner loops.
See http://www.rapideuphoria.com/.

You just give an array to a command. And most commands just work on Arrays or parts of an array.
This means: You just don't need a Loop-Statement.
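
A small Python sketch of the idea, assuming a hypothetical Seq class whose arithmetic applies to every element. It is not Euphoria or APL code; it only imitates the style in which the caller writes no loop.

```python
class Seq:
    """A tiny sequence type whose arithmetic applies to every element."""

    def __init__(self, items):
        self.items = list(items)

    def _apply(self, other, op):
        # Sequence with sequence: pair the elements up.
        if isinstance(other, Seq):
            if len(other.items) != len(self.items):
                raise ValueError("sequences must have the same length")
            return Seq(op(x, y) for x, y in zip(self.items, other.items))
        # Sequence with a single value: use it for every element.
        return Seq(op(x, other) for x in self.items)

    def __add__(self, other):
        return self._apply(other, lambda x, y: x + y)

    def __mul__(self, other):
        return self._apply(other, lambda x, y: x * y)


prices = Seq([1, 2, 3])
scaled = prices * 100                   # every element multiplied, no loop written
summed = prices + Seq([10, 20, 30])     # element by element addition
```

The loop still exists, of course, but it lives once inside the sequence type instead of in every program that uses it.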

From my point of view, just looking at languages like C or Basic is not enough,
because the differences disappear with time. What is left is just slightly different syntax definitions.

To get a better overview what concepts are out there, a closer look on really different concepts is recommended.

Here are some suggestions:
APL (very short code!), LISP, PERL, FORTH, EUPHORIA, SCHEME, D (http://www.digitalmars.com/d/), EIFFEL and of course there are some missing.

Where I absolutely agree with what you write is that compilers should use the power of the new CPUs to become more intelligent.

For example:

If I write:

FQR a=0 TO 10

The IDE CAN find out that this could be a FOR-statement and do what GOOGLE does with wrong input under worse conditions. Ask directly:

Do you mean "FOR a= ...."
Click ok, to correct the mistake.
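
This kind of suggestion is cheap to compute. A minimal sketch using Python's standard difflib module follows; the keyword list and the function suggest are assumptions made for the example.

```python
import difflib

# A small, assumed keyword list; a real IDE would use the full language set.
KEYWORDS = ["FOR", "IF", "WHILE", "NEXT", "PRINT", "GOTO"]


def suggest(word):
    """Return the closest keyword, or None when nothing is similar enough."""
    matches = difflib.get_close_matches(word.upper(), KEYWORDS, n=1, cutoff=0.6)
    return matches[0] if matches else None


suggestion = suggest("FQR")
```

The cutoff decides how aggressive the correction is; set too low, the IDE would start "correcting" valid variable names.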

There is a lot more about this.

Charles Pegge

#14
Thanks Theo, if all these languages can steal each other's best ideas, we will all have a much better programming experience.

I did some experiments once trying reverse polish notation (like Forth) in the script but ran into logistical problems, when it came to using optional or default parameters. The script internals resolve the parameters and function calls into a stack, or in my system a queue before execution.

I know little about APL except that it used a number of specialised symbols, which made it easy to program for the mathematically minded but difficult for others to read afterwards. But I knew some programmers in the 80s who worked for I P Sharp and thought it was the best thing ever invented.

The functional languages you mentioned, such as Lisp and Scheme, are interesting: in their purest form functional languages are mathematically rigorous and will not allow a variable to be assigned a value more than once, and this helps to deliver error free code. They also allow functions to be passed as arguments to other functions, which is quite easy to adopt in a scripting language, but very troublesome for static compilation.

As I see it, computer languages are multilayered, with very fast but rigid processes at the lower levels, and slower but much more flexible processes above. It should be possible however to automatically compile static processes down to machine code to get the best of both speed and flexibility, within a single language.
To do this the run-time module must have a simple compiler at hand ready to perform JIT compilation, when needed.

The "no loops" way of operating on arrays of data is quite easy to implement: put the size of the array into its element 0. My script is half way there. The loop is syntactically minimal.

new a[10] ,i=1
{
  ... ;  if ++i LE a then repeat
}

but this could also be put into a function called iterate to give you a single liner.
Example multiplying each element by 100:

iterate(ref a, "a[i]*=100")

...

function iterate(a,f, i=1)
{
exec f;  if ++i LE a then repeat
}
end function

OOPing the syntax

  a.iterate(' this[i]*=100 ')

...

function .iterate(f, i=1)
{
exec f;  if ++i LE this then repeat
}
end function