In The Beginning Was The Word

This post started as an email to @eikeon.  Our conversations on language design are moving forward at a lumbering pace, and I want to make sure that some of them are set down, for two reasons: so we don’t forget where we are, and so later we can look back and see how far we’ve come.

Term

We have talked about terms as a centerpiece of our language.  One reason for this is to craft a language that is oriented around humans, who generally seem to have an easy time with the concept of a term.  But it is not particularly problematic to think of a term as an object or a function or even a closure if you’re wired that way, e.g. you are a being made partly of wires.

Alphabet

The term has a label, which is just a contiguous sequence of characters.  We haven’t talked a lot about which alphabet we’ll use, but I’m guessing it will either be something Unicode-ish, or we will leave it open to the lexer.  If we do the latter, the alphabet will be a set of letters from which terms may be spelled (e.g. {a, b, c, d, e, f}).

Lexicon

For any given program, there is a corresponding list of entries, which are the lexicon for the program.  The entries are terms which must be spelled with whatever alphabet is used.  No terms may be used in the program that are not in the lexicon.  The list of terms contains a corresponding list of definitions- zero or more per term.

Entries

A an entry is either empty (the term may only be replaced with itself) or contains two parts: its shape and its definition.  The shape of an entry can be thought of as its part of speech, or as a named graph with the term at the center.  The definition, when it exists, consists of one or more sentences.

Part of Speech

The part of speech, or named graph, is part of a grammar.  The grammar of a language describes which parts of speech may be employed in sentences.  A part of speech “connects” to other parts of speech (or named graphs).  This can be thought of as the local structure on an abstract syntax tree for those more inclined to think that way, e.g. robots.

Sentence

A sentence is a collection of terms that are evaluated concurrently.  This is not to say that word order in a sentence is unimportant, as the first term encountered is the first one for which evaluation begins, and subsequent terms in the sentence may rely upon this fact.

Paragraphs, etc.

Sentences may be collected together into groups and called a paragraph.  A paragraph, simply put, is a group of sentences over which side effects may be relied upon to persist.  To the extent that paragraphs are collected together into larger collections (a section, perhaps, or a chapter) the larger collection is a group of paragraphs over which side effects of the paragraph may be relied upon to persist.

Side Effects

That word, “side effects” is also at the center of this language.  Terms do not return a value like a function or a method- instead, they transform into something else, and register this transformation with the other terms.  In this sense, each term can be thought of as a simple callback if you’re inclined that way, e.g. you are Donald Knuth or someone smarter than me, at least.

Concurrency

When we say the word “concurrent” it is as a counterpoint to the word “sequential.”  As such, the statement “A, then B, then C,” is different from the statement “All of A, B, C.”  In the former, perhaps B depends upon A, and C upon B.  In the latter, each depends upon the other (or none depend upon any, if you prefer to think about it that way, e.g. you are way out there).

Fragments

Last, but not least, a sentence fragment may not be parseable.  Think of book titles or recipe ingredients- these are intentionally fragments.  When a fragment is not meant to be parsed, this is a way of registering that it will be used subsequently, and it is not considered part of a paragraph.  E.g. no side-effects persist, but the fragment itself is kept around for use later in the program.  A fragment has an implicit part of speech (shape, but not a named graph, since it has no name).

Lexing

Lexing is the process of looking sequentially through a sentence and chopping it into terms.  This process requires a lexicon.  Note: lexing, lexicon.  The lexer consumes letters until it finds a term.  It then passes this term to a parser instance and branches.  One branch starts on a new term for that first parser instance, and the other continues to consume letters until it can be certain there is no alternative lexing.  If it finds an additional possibility, it passes the term to an alternative parser instance.  As such, lexing is deterministic, but it also may yield more than one parser instance.

Parsing

Each parser instance is given a set of terms by the lexer.  As they are encountered, each term is looked up in the lexicon, where zero ore more definitions are present.  If there are zero definitions present, the term is replaced with itself.  If there is one definition present, the term is replaced with that definition.  If there are two or more definitions present, the parser chooses which to use by examining the part of speech and seeing which will work in the context of the current sentence.  If this cannot be determined, the parser branches again, and both are tried.

Compiling

The process of compiling is essentially going through all the possibilities and following them to the bottom until every single term is replaceable only by itself.  Once this has been completed, the entirety of the program is represented by a graph of terms, where the shape of the graph locally is determined by a part-of-speech.

I Do Not Stand Up for Stallman

There has been an interesting little tempest in a teapot in the Free Software community, which has resulted in some blog responses:

I have a slightly different perspective on this.  I do not stand up for Stallman, because he has no need of my support in this regard.  Stallman is perfectly capable of standing up for himself, but seems to consider it beneath him to spend time on it.  Instead, he spends his efforts standing up for Free Software.

He’s not some kid to feel sorry for, he’s a fucking MacArthur genius.  He invented Emacs.  He wrote the GPL.  He founded the Free Software movement.  He’s taken on huge corporations and tyrants and all other manner of adversaries (and friends) with their slings and arrows affecting him like water affects a duck’s ass.  He dines with presidents and prime ministers.

The origin of all this kerfluffle is some email that’s going around which makes Richard sound particular about his accommodations when he comes to town to give a FREE SPEECH.  Well, let me say this: Richard stayed at my place in DC during some such speeches, and he never once complained about anything.  At the time I lived in a bachelor shit-hole in a neighborhood in DC that was basically an open-air drug market.  Not.  One.  Complaint.

A fucking MacArthur genius, sleeping on my couch, in my shit-hole apartment, and the only thing he ever asked for was fucking wifi.

I don’t stand up for Stallman, I look up to Stallman.

Way up.

If you don’t, it means one of two things: either you don’t know about Stallman, or you’re a tool.  I don’t stand up for Stallman, but I look up to him.  And I run GNU/Linux, and I use Free Software.

Because Stallman said so, that’s why.

Musings on Concurrency

Programming languages have long-since learned to short-circuit sequential operations.  In damn near any language, if I do the following:

thing one && thing two

I only get to thing two if thing one evaluates true.  Even if you’re not a programmer, you probably do this in practice.  Consider:

If it’s dry and warm, we’ll play outside.

Which means, if I get to the first part of the sentence, and see that it’s raining, I can just kind of disregard the rest.  I don’t have to even read to see what we were going to do if it were dry and warm, because it’s not dry.  We do the same sort of thing with “or.”  Consider:

We’ll play outside if it’s wet or if it’s dry.

If it’s wet outside, we don’t even need to read the last part of that sentence.  We short-circuit the final operation, because it’s not necessary for evaluating the truth of the whole construct.  The short-circuiting pattern is important for computers, because in the best-case scenario, it can save a some of computation.

In fact, if you imagine a sentence that includes a list to be evaluated in short-circuit mode, it can save a whole lot of computation:

We’ll play outside if it’s wet or if it’s dry or if it’s cold or if it’s hot or if there’s a hurricane.

Chances are, we’re going to play outside, and if it’s wet, we save the trouble of evaluating all the subsequent possibilities.

Now, for computers, there are some downsides to doing this.  For instance, it’s expected that in order to save time and in order to keep everything consistent, these operations need to happen sequentially.  A && B && C is different than B && C && A, even if the operations don’t have any side effects.  If there are side effects of the computations, the situation is even worse, since in one case maybe you get the side effects of A and B, and in the other you get the side effects of B and C.

If you’re not a programmer, you can skip the part about side effects without any side effects, I think.

What we’re missing is a systematic way of thinking about short-circuit operations that can happen in any order, and can happen either sequentially or out of sequence or concurrently.  It’s not that there isn’t special built-in machinery to do this in particular languages, but for a polyglot like me, there’s very little overlap between one way of thinking about it and another.

The funny thing is, English has a great many ways of designating a list as a short-circuit concurrency operation:

  • any
  • all
  • none
  • one
  • some
  • more
  • fewer

If you’re going to the store and I give you a list, I can gesture at the list with any of these words to let you know when you can quit.  ”Any” of the items means you can stop when you find the first one.  ”All” means you can get them in any order, but you need to get all of them.

We should be able to do the same thing, specifying a kind of concurrency and a kind of short-circuiting in programs with the same kind of shorthand.  Imagine I have a list of operations to perform, but I only care that at least six of them happen.  Or I know that if any of them fail, I can stop all of them.  Or if any of them succeed I can stop all of them.

I think that’s a level at which a programmer (at least the kind of programmer I am, which is to say one with more hubris and laziness than usual) should interact with a language’s underlying concurrency model.

Larry Wall put this idea into Perl 6, though I’m not convinced the semantics are exactly right.  For instance, I’m not sure if I should generally be saying:

someMethod(any(1,2,3))

or

any(someMethod(1), someMethod(2), someMethod(3))

The former looks terse at first glance, but requires a lot of bookkeeping inside of someMethod that might be avoided, particularly in languages that don’t have the notion of a Junction (that’s what he calls this kind of construct) baked in.  The latter looks verbose at first glance, but one can imagine cases where it would be useful to evaluate several different methods and/or put some syntactic sugar around the actual call.

I can also imagine a kind of chaining of these terms together in a way that creates a massively concurrent operation that can return quite quickly in the best-case scenario:

if any(all(none(1,0,0), all(1,1,1), one(1,1,1)), any(1,0,1))

then someMethod()

This could use the same truth-logic as usual boolean algebra, but could also short-circuit efficiently in a way that would be more difficult (or require more sophisticated machinery) if those same calls were the usual ands and ors that we would use in such code.

I don’t think the machinery for this is too complicated.  It would, of course, matter how the underlying concurrency was managed.  In something like Go or Erlang, it would probably be quite fast, and in something that required a heavier process for each unit of concurrency, it would probably only be worthwhile in the case of heavy computations that could benefit from the short-circuiting.

Last, but not least, I can imagine mixing this together with a bit of map-reduce.  So, if I do something like:

all(someMethod(“foo”), someMethod(“bar”))

I can write these methods to also stash a result somewhere (nasty side-effect, that) but only use it if the computations complete successfully according to the all operator (e.g. none of them return false and all of them return).