Skip to content
10 April 32011 / Robin Wellner

Lith revisited

A while back, I wrote about Lith, a minimalistic language.

Yesterday, I decided to implement the rest of the language, which was surprisingly easy. (Of course, Lith is still a simple language, and I have more experience with this sort of thing now.) It’s up on GitHub now.

In today’s post, I discuss the theory and implementation of the latest version of Lith.

First, I added numbers, such that 5 was recognised as [ T T T T T ]. Next up, strings. "Hi" turns into [ [ T T T … T ] [ T T T … T ] ]. After that, I added operations (head, tail, test, length, concat, and later flatten), with a generic Special class, which takes as argument the next element in the source.

Now, the operations were not processed, only parsed. I wrote an evaluator, similar in structure to the parser, which took its input from the output of the parser.

An overview of the operations:
!, head
Takes the first element of a list. Is the identity function for T and [ ].

! [ T ]             # = T
! [ [ T T ] [ T ] ] # = [ T T ]
! 34                # = 1
! [ ]               # = [ ]
! "Hello"           # = 72

$, tail
Returns everything but the first element of a list. Is the identity function for T and [ ].

$ [ T ]             # = [ ]
$ [ [ T T ] [ T ] ] # = [ T ]
$ 34                # = 33
$ [ ]               # = [ ]
$ "Hello"           # = [ 101 108 108 111 ]

?, test
Tests for truth. T is true, [ ] is false and any non-empty list is true.

? [ ]  # = [ ]
? T    # = T
? 3    # = T
? "Hi" # = T

&, length
Returns the length of a list. This means that for any number, including zero, length is the identity function. The length of T is defined as [ ].

& [ ]                     # = [ ]
& T                       # = [ ]
& 3                       # = 3
& "Hi"                    # = 2
& [ T [ T T [ T T T ] ] ] # = 2

:, concat
A bit of a weird operator, as it is infix instead of prefix. Concat is a bit like cons in Lisp: it prepends an element to a list,

T : [ ]   # = [ T ]
[ ] : [ ] # = [ [ ] ]
[ T ] : 4 # = [ [ T ] T T T T ]
T : 4     # = 5
!"H":"i"  # = "Hi"

_, flatten
Flatten can be understood to return the total number of Ts in its argument.

_ T                   # = [ T ] = 1
_ [ 4 T T [ [ 1 ] ] ] # = 7
_ [ ]                 # = [ ] = 0

When that was done, I wanted to simplify the tree I had to text the original parser could understand (so only T and [ ]). So I did.

Although the original parser ignored whitespace¹, several separator characters and #comments, I decided to skip all that, and just pasted all the elements together. That output is what I call the canonical form of a tree: the simplest form, free from any syntactic sugar or operation.

As of now, Lith doesn’t have any flow control (I still haven’t found a way to do functions), so it is actually more of a data description language than a programming language.


¹ There are two places whitespaces matter: between two numbers: 1 1 is entirely different from 11, and inside strings, for obvious reasons. But the original parser didn’t recognise numbers or strings, so all’s good.

Advertisements

5 Comments

Leave a Comment
  1. Luiji Maryo / Apr 10 2011 18:56

    What if you were to modify the language so that each operation was represented as a number? For example, head would be [T], tail would be [TT], concat would be [TTT], etc. Then the various symbols would simply evaluate to one of these numbers. This would require concat to become a prefix operator instead of an infix operator, unless your parser were to transform []:[] into [TTT] [] [] or something similar. Thus, when creating a function you would simply be able to create a list in which each internal list is a list of operation lists. I.e. head then tail would be like [[[T] “Hi!”] [[TT] “Hello!”]]. Thus, functions could simply be lists that were evaluated with the ~ symbol (for example) followed by one of these lists.

    As for control flow, you could simply do what Tcl and Scheme do and have control flow be controlled by operations such as [TTTTTT] for “if” that would accept a condition followed by two function lists.

    Lastly, this language could probably do with the ability to define variables. I have a few ideas myself on how this could work but all of them seem a bit hackish.

    Good luck and happy coding!

    • Robin Wellner / Apr 10 2011 21:29

      Interesting suggestions.

      Since writing this post, I’ve added new operator @, which substitutes itself with another list, and ( ) notation for lists that are not directly evaluated, only when substituted.

      The problem with making operations simply numbers, is that it could be ambiguous, unless a lot of crufty context would be added.

      I’m thinking about a notation that allows operations to take something from the stack, for example &. for “take the length of the top element on the stack”.

      • Luiji Maryo / Apr 11 2011 0:15

        The substitution mechanism sounds very useful. The @() combination sounds a bit like how C does it’s preprocessing (substitution before evaluation.)

        I don’t think using numbers as instructions would be anbiguous if you were to have the parser substitute i.e. ! with [T T] since it would look the same to the user but simply appear differently in the internals of the language.

        A stack based architecture sounds very useful for this form of language. Personally, though, in my own theoretical virtual machine designs I prefer register-based computing since it feels more natural to program with as a high-level developer.

Trackbacks

  1. Lazy-Lith « gvxDev
  2. Lazy-Lith « gvxDev

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: