My Mental Model of a Tree

We’ve planted a lot of trees at Right Field Farm.  Fruit trees, mostly: a few kinds of apples, walnuts, plums, cherries, pecans, figs.  But we’ve also moved around some trees that were already here (cedar, maple, pine) and planted some trees for ornamental purposes and to enhance the view.  Besides all the planting, we’ve taken down a few dozen sick, damaged, and crowded trees, and watched all manner of trees grow from year to year.

Doing all this has challenged my mental model of a tree.

See, I used to think of a tree as an organism, whose purpose was to make more trees.  All the different parts of the organism work together to accomplish this laudable goal.  But the more I’ve watched them, the more I’ve come to think that a tree is more of an ecology of leaves more-or-less in direct competition with one another for energy and nutrients.

Trees, in other words, are benign.  Leaves, however, are mean.

Components of Term Oriented Programming

@eikeon and I have been circling around term-oriented programming (even that term is just a placeholder- that’s how far to the outside of this particular whirlpool we still are).  We had a chat that got me thinking about context, and in particular about what it would mean for the context to have first-order existence.

The core idea of term-oriented programming is the term itself.  A term is basically an entry in a dictionary, which has three attributes:

  1. One or more characters in sequence.
  2. A description of its graph structure.
  3. A definition made up of one or more terms.

As always, if it sounds like something you already know about, that’s probably because it is.  Don’t get hung up on it yet.  A term might look like this:

lentil (noun) Lens culinaris.

I’m borrowing from @eikeon’s favorite example here…  One of the core ideas of term oriented programming is that a term may have multiple entries in any given lexicon.  For instance:

lentil (adjective) Made with lentils.

It’s a noun in the sentence, “Add one cup of red lentils,” and an adjective in the phrase “lentil soup.”  Roughly speaking, one can replace the word with the fitting definition, and the statement will still make sense: “Made with lentils soup” works, but “Add one cup of Made with lentils” does not.  One determines which form to use by looking at the other terms in the statement, and deciding which form would fit into the graph of meaning for that sentence.

That’s still kind of a rough example with some gesturing, but it becomes less so when we start unpacking the part of speech.  Noun and adjective are words used to describe the graph of meaning in another word.  We call something an adjective because it makes sense in the context where an adjective is warranted. We call something a noun because it makes sense where a noun is warranted.  If the same characters in sequence (e.g. “lentil” and “lentil”) can be either a noun or an adjective, we simply say the word has two definitions and use the description of the grammatical graph (e.g. the part of speech) as one way to distinguish between them.
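Here is a minimal sketch of the lentil example in Python, just to make the three attributes concrete.  The names Entry, LEXICON, entries_for and choose are mine, invented for illustration, and a plain part-of-speech string stands in for the much richer “description of its graph structure.”

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    characters: str      # one or more characters in sequence
    part_of_speech: str  # crude stand-in for the graph description
    definition: str      # made up of other terms


# A lexicon may hold several entries for the same characters.
LEXICON = [
    Entry("lentil", "noun", "Lens culinaris"),
    Entry("lentil", "adjective", "Made with lentils"),
]


def entries_for(characters):
    """Return every entry that shares the same characters in sequence."""
    return [e for e in LEXICON if e.characters == characters]


def choose(characters, wanted_part_of_speech):
    """Pick the entry that fits the slot the sentence has open."""
    for entry in entries_for(characters):
        if entry.part_of_speech == wanted_part_of_speech:
            return entry
    raise LookupError(f"no {wanted_part_of_speech} entry for {characters!r}")


# "lentil soup" has an adjective-shaped slot in front of "soup".
print(choose("lentil", "adjective").definition)
```

Deciding which part of speech the sentence actually wants is the hard part, and this sketch simply assumes the caller already knows.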

Of course, it’s also completely reasonable in the real world that there would be more than one noun form, at which point we may need to examine context outside of a single sentence.  More on that later, I hope.

Ad Hoc Meeting of the Outer Banks Cosmological Association

In a recent Ad Hoc Meeting of the Outer Banks Cosmological Association, it was resolved that neither logical thinking (whereby agreed-upon axioms are used to build up more complex statements) nor empirical thinking (whereby one compares a hypothesis against what one has observed and determines whether the two match) shall be deemed sufficient for categorically eliminating further discussion on a topic.

Further, it was resolved that so-called “magical thinking” shall henceforth be referred to by the name “prophetic method.”  Prophetic method can be considered a counterpart of both the scientific method (e.g. empiricism) and the axiomatic method (e.g. logic).  In the prophetic method, one determines a course of action based upon criteria that need not be divulged, and attributes subsequent good or ill fortune as an indication of the success of the original course of action.

The discussion leading up to the resolution centered upon a particular point of debate, which can be summed up thus: if a particular phenomenon is impossible (e.g. it can be logically determined that the phenomenon cannot occur), and that phenomenon subsequently occurs, it would now be deemed simultaneously logically impossible and empirically evident.  Since this was troublesome to some of the attendees, we determined that other methods would be considered, and a motion from the floor was taken that prophetic method be included.  Pragmatic method, discussed in a previous set of minutes, was deemed insufficiently methodical.

Please contribute any amendments in the comments below, and don’t say I never update this blog.

Defense of Marriage

I was married in Severna Park, MD, at my wife’s old family home.  When our third child was born in the same home, she joined the fourth generation of the Bell family to live in the home at one time or another.  We were living with my in-laws, on the way to moving our family to Anne Arundel County, where we now live on a little farm right across the river from Grammy’s house.  Lord willing, our fourth child will be born right on the farm, any day now.

While it is hard for older generations to completely understand this, we couldn’t care less if our kids grow up gay.  It’s just not an issue for us, and I suspect it’s not an issue for most people under the age of forty.  Much as our parents were the first generation to have a lot of interracial friendships and relationships, our generation doesn’t care who’s gay.

So, when I imagine one of my kids growing up, and not being able to get married at the Bell House where I was married, because of falling in love with the wrong person… it breaks my heart.  There are already enough reasons for kids to leave.  Why give them one more?

Go To Considered Awesome

10 PRINT "DAVID IS KEWL."
20 GOTO 10

Yeah, you read it right: David is kewl.  David was a third grader in nineteen-eighty-something, and that’s his first program.

It’s worth noting that while the GOTO instruction in that program is “logically superfluous,” that’s sort of like saying, “Y dnt nd ny vwls t wrt ths sntnc.”  GOTO is considered awesome because it’s how computers work under the hood, and people understand it.  A not-particularly-bright third-grader can guess what that program’s going to do, along with a thirty-year-old computer.

Down with Dijkstra!  Bring back goto!

Tangle and Weave

Some Reflections on Literate Programming

Knuth’s Literate Programming is beautiful and digestible.  It is also surprisingly relevant to our times.  At fifteen pages, it’s worth a read, but here I recreate Knuth’s essential diagram:

Figure 1. Dual usage of a WEB file.

Now, I have read this paper three times, and reflected on it while gathering eggs and mulching garden beds.  I’m no Don Knuth, but I think it’s possible he made a pretty essential mistake here (though I’m not holding my breath for my $2.56 check).

Knuth wrote two programs: “TANGLE” generates code as its output, and “WEAVE” generates human-readable documentation.

To recap, we start with “WEB”; we “TANGLE” it for the computer, and we “WEAVE” it for the person.
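To make the dual usage concrete, here is a toy sketch in Python.  It is emphatically not Knuth’s WEB format: the “@doc” and “@code” markers, and the functions split_chunks, tangle and weave, are assumptions I made up for illustration.  The only thing it shares with the real thing is the shape, one source going in and two different views coming out.

```python
# A made-up literate source: prose chunks and code chunks interleaved.
WEB_SOURCE = """\
@doc
Greet the reader.
@code
print("hello")
@doc
Then say goodbye.
@code
print("goodbye")
"""


def split_chunks(source):
    """Return a list of (kind, text) pairs, where kind is 'doc' or 'code'."""
    chunks = []
    kind, lines = None, []
    for line in source.splitlines():
        if line in ("@doc", "@code"):
            if kind is not None:
                chunks.append((kind, "\n".join(lines)))
            kind, lines = line[1:], []
        else:
            lines.append(line)
    if kind is not None:
        chunks.append((kind, "\n".join(lines)))
    return chunks


def tangle(source):
    """For the computer: keep only the code chunks."""
    return "\n".join(text for kind, text in split_chunks(source) if kind == "code")


def weave(source):
    """For the person: keep the prose, and set the code off by indenting it."""
    parts = []
    for kind, text in split_chunks(source):
        if kind == "code":
            text = "\n".join("    " + line for line in text.splitlines())
        parts.append(text)
    return "\n\n".join(parts)


print(tangle(WEB_SOURCE))
print(weave(WEB_SOURCE))
```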

It’s backward.

Documentation is not a tapestry, and code is not a knot.  Not even ideally.  Ideally, documentation is a knot, and code is a tapestry.

Building on this central disagreement, I wonder if it’s why we haven’t improved much upon Knuth’s ideas from 1984?  I mean, seriously, I was nine.  I’m now thirty-five.  We should have come up with something better than Javadoc and Perldoc and Python docstrings.

For instance, one direction it could have gone is to format code more (e.g. with TeX or HTML or some other kind of markup).  Python’s enforcement of indentation is kind of an example of this, actually- makes it much easier for a person to grok.  But why isn’t there more?

Why, for instance, is this not perfectly valid Go:

<h1>type <b>node</b> interface</h1> { ... }

If it’s an important type signature, let’s make it a big heading, and let’s bold the name to make it easy to pick out.  Of course, syntax highlighting does some automated work for us, but are you really going to tell me that Emacs’s best guess is better than any hint a person could provide?

Hell, with something more TeX-like, it would be possible to organize the program hierarchically into a book.  Same goes with HTML5, I guess, which is still frantically trying to catch up with where TeX was twenty or so years ago.

It’s still a half-formed thought, but I think I’m zeroing in on something here.

More Term Oriented Programming

Term Oriented Programming is what I’m calling structures that help the programmer define a term, possibly with multiple meanings, in a single place.  See the previous post for some examples.  This post is more hypothetical: what might mostly-term-oriented code look like?

On one hand, it’s easy to compare with purely functional code, in which functions are defined in terms of (I like that phrase “in terms of”) other functions.  Terms in a lexicon are defined in terms of other terms in that lexicon.  It’s weirdly recursive, even if you’re just using a dictionary.

Moving on with the dictionary metaphor, there are parts of speech.  For the moment, let’s also define parts of speech in terms of parts of speech.  A term’s part of speech describes which other parts of speech must occur before and/or after that part of speech in order to be valid grammar.  Roughly speaking, a transitive verb is a term that must always be sandwiched between two nouns (even if they’re not directly adjacent).  That’s a part of speech.  In term oriented programming, every term definition would have a part of speech, and terms might have multiple definitions:

  • fast (adj.) firmly fixed in place
  • fast (adj.) moving quickly
  • fast (n.) a period of abstention
  • fast (v.) to abstain

Here we’ve got some overloading, some multi-method dispatch, maybe even some generics.  Poking around the OED, there are a couple other things worth noticing: each definition is followed by examples where the term is used in the sense described, and other forms of the term are listed up front.

I really like this.  As a programmer, if I encounter the term “fast” I can simply look in the lexicon for fast, and once I’m there I can figure out which sense of the word the parser will decide on, and I’ll know what to do.  If one of the terms is defined in terms of another that I don’t know, I repeat the process until I reach “the bottom.”
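That “repeat the process until I reach the bottom” step can be sketched in a few lines of Python.  The lexicon here is hypothetical: I invented the entries for “abstain,” “refrain” and “voluntarily” purely so there is something to walk through, and an empty list marks a term I’m treating as the bottom.

```python
# Hypothetical lexicon: each term maps to the terms its definition uses.
# An empty list means we have reached "the bottom."
LEXICON = {
    "fast": ["abstain"],
    "abstain": ["refrain", "voluntarily"],
    "refrain": [],
    "voluntarily": [],
}


def lookup_chain(term, lexicon, seen=None):
    """List the terms a reader would look up, in order, starting from term.

    Terms already seen are skipped, so a lexicon that defines two terms
    in terms of each other does not send the reader around forever.
    """
    seen = [] if seen is None else seen
    if term in seen:
        return seen
    seen.append(term)
    for part in lexicon.get(term, []):
        lookup_chain(part, lexicon, seen)
    return seen


print(lookup_chain("fast", LEXICON))
```

The guard against repeats matters because a real dictionary is weirdly recursive, and sooner or later two terms end up defined in terms of each other.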

It’s easy enough for the parser, too.  Which is not to say it will always be unambiguous, but it will at least quickly get to the point of confusion ;)  An input stream would be tokenized and looped over, and each token would in turn be dispatched to the proper definition, which in turn requires the terms around it to have been dispatched to the proper definition, which in most cases would probably work.  It’s funny to think that “in most cases it would probably work” might actually be good enough.

“The terms around it” begins to look suspiciously like a sentence.  E.g. for each term the parser might need to have the whole sentence available.  However, if our lexicon for a particular term in a program had only a single definition, dispatch is easy.  Use the single definition.

Doug Hunter also pointed out to me that the whole sentence wouldn’t have to be parsed before starting.  The parser could just as easily opt for one of three tactics: optimistic, where the “best” definitions get used right away but the parser starts over if it’s wrong; pessimistic, where the compiler waits for the syntax tree to be fully built before doing anything; and melioristic, where the parser starts out by executing every sense of the first term but refines the computation as more terms are available.  I think that’s actually a nice description of a tactic just about any compiler could take, really.  Doug and I also talked about how some sentences could have the same meaning or result when the terms are moved around, even though they execute much differently.
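Of the three tactics, the melioristic one is the least familiar to me, so here is a rough Python sketch of it.  Everything in it is an assumption for illustration: the tiny lexicon, the part-of-speech tags, and the ALLOWED table of which part of speech may directly follow which.  It also doesn’t execute anything; it only keeps every sense alive and narrows the field as terms arrive, and it only looks one term back.

```python
# Hypothetical lexicon: each term maps to the parts of speech it can take.
LEXICON = {
    "the": {"det"},
    "boat": {"noun"},
    "is": {"copula"},
    "fast": {"adj", "noun", "verb"},
}

# Hypothetical grammar: pairs of parts of speech allowed to sit side by side.
ALLOWED = {
    ("det", "noun"),
    ("det", "adj"),
    ("adj", "noun"),
    ("noun", "copula"),
    ("noun", "verb"),
    ("copula", "adj"),
}


def melioristic(tokens):
    """Keep every sense of each term, narrowing as more terms arrive."""
    candidates = []
    for token in tokens:
        senses = set(LEXICON[token])
        if candidates:
            previous = candidates[-1]
            # Keep only the senses that can follow some surviving previous sense.
            senses = {s for s in senses if any((p, s) in ALLOWED for p in previous)}
            # Refine the previous term now that we know what follows it.
            candidates[-1] = {
                p for p in previous if any((p, s) in ALLOWED for s in senses)
            }
        candidates.append(senses)
    return candidates


print(melioristic(["the", "fast", "boat"]))
print(melioristic(["the", "boat", "is", "fast"]))
```

In “the fast boat,” the term “fast” starts out as either an adjective or a noun, and only settles on adjective once “boat” shows up.  An optimistic parser would have picked one sense right away and started over if it guessed wrong; a pessimistic one would have waited for the whole sentence.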

Which raises the question, “Can meaning and result differ?”

There’s another post lurking behind this one, I think, then maybe some sample code is in order.

Term Oriented Programming, Definition One

Or: Reflections on Generics, Multi-Method Dispatch, Interfaces, and Parametric Roles

Generics are a cool hack.  If “123” and “Foo” are of two different types (say, an integer and a string) they can still share code for some methods.  For instance, one might want both to have a print() method: 123.print() and “Foo”.print(), and one might want them to share some code.  Using generics accomplishes this and also helps express the “sameness” of these two methods in code.

Multi-method dispatch is related for some use cases.  With multi-method dispatch, I can define multiple print() methods, and the compiler will distinguish which one I want based upon the argument.  Thus, print(123) and print(“Foo”) are two different methods even if they live in the same namespace.  The programmer doesn’t have to remember more than one (e.g. print_int(123) and print_str(“Foo”)).
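For a taste of this in Python, the standard library’s functools.singledispatch gets part of the way there.  A caveat: it picks the implementation at run time, and only from the type of the first argument, so it is a modest cousin of full multi-method dispatch rather than the real thing.  I’ve called the function render (my name, chosen so it doesn’t shadow the built-in print).

```python
from functools import singledispatch


@singledispatch
def render(value):
    """Fallback meaning of render(), used when no registered type matches."""
    raise TypeError(f"no meaning of render() for {type(value).__name__}")


@render.register
def _(value: int):
    # The integer sense of the term.
    return f"integer {value}"


@render.register
def _(value: str):
    # The string sense of the term.
    return f"string {value!r}"


# One term, two code paths; the caller only has to remember "render".
print(render(123))
print(render("Foo"))
```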

Interfaces are another interesting and related concept.  By defining a printer interface, I could specify that print() methods are required, and later any type that supports these methods is said to “implement the interface.”  The code might be scattered around this way, but any type that implements the interface can be used interchangeably through the interface.
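Python’s closest equivalent is typing.Protocol, so here is a small sketch under that assumption.  The Printer protocol and the Number and Text classes are hypothetical names of mine.  Neither class mentions Printer anywhere, yet both count as implementing it simply because they have a print() method.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Printer(Protocol):
    """Anything with a print() method implements this interface."""

    def print(self) -> str: ...


class Number:
    def __init__(self, value: int):
        self.value = value

    def print(self) -> str:
        return str(self.value)


class Text:
    def __init__(self, value: str):
        self.value = value

    def print(self) -> str:
        return self.value


def print_all(items):
    """Use every item interchangeably, through the interface alone."""
    return [item.print() for item in items]


print(print_all([Number(123), Text("Foo")]))
print(isinstance(Number(123), Printer))
```

The two print() methods live in two different classes, which is the “scattered around” part: the interface tells me they belong together, but it doesn’t put them in one place.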

Parametric roles occupy another cool little spot in the ecosystem.  It’s essentially the same as the interface described above, except it can take a parameter and dispatch on the inside.  With a parametric role, one might keep the code a bit more centralized but still implement interfaces in a way that the user of the interface doesn’t need to think about it too much.

Each of these techniques has advantages not described here, but I want to zero in on a single nuance of the behavior that’s shared between all of them: a term (e.g. the word “print”) can be reused and do something different in different circumstances.  Thus, even though each time the term is encountered it executes a different codepath, the same term gets used.

I’m going to call this, collectively, “term oriented programming.”  If there’s a name for it already, someone correct me.  In the meantime, I have something I want to say about term oriented programming:

teh awesome

Seriously.  Mostly it’s awesome for programmers, because we can organize our code somewhat differently, without causing too terrible of havoc for the compiler.

What I want now, is more of it.  I can imagine an argument that I’m looking for a Lisp, or that I just don’t like namespaces and I want to use PHP, and the list goes on…  But what I really want is to collect all the different meanings of the term “print” into a single location (alphabetically, perhaps?) and define how they behave.  Each meaning of the term might give me references to all the places where it is used (sort of like the OED does for first use of particular terms).  It starts to look an awful lot like a lexicon.

A term might have a part of speech, a rigorous definition (defined “in terms of”), some documentation, and some examples.  In fact, the examples might be links to every time the term is used in this particular sense.

Pragmatic Method

To apply pragmatic method, ask the question:

What Difference Does It Make?

It’s a useful question, provided it’s not asked rhetorically: it needs to be answered.  As a rhetorical question, it seems to have the wrong effect.

As the question implies, pragmatic method is about assessing alternatives- alternatives in actions.  As such, it has two obvious limitations: first, I may not be good at imagining alternatives; second, I may not be good at predicting outcomes.  That’s okay, though.  Like any method, practice and improvement go hand-in-hand.  There are a couple tricks that can be helpful when starting out.

If I can’t imagine an alternative, I choose a null hypothesis.  I’m thinking of building a fence, and I’m thinking of putting it over there on the property line.  I would like to apply pragmatic method, but I can’t really imagine anywhere else to build my fence.  So instead I imagine, as my alternative, not building a fence at all.  Then I ask my question, what difference does it make?

It’s much less work to not build a fence.  But if I take that course of action, my dogs will run into the neighbor’s property, and he might be mad about that.  Of course, maybe he doesn’t care.  But perhaps the dogs will run in the road and be killed.  I like the dogs, and that would make me very sad.  And maybe the neighbor really does care: I should go ask him.  Yup, he cares.  And Lina told me to, anyway.  So, I can build the fence, doing work (frankly, work I enjoy), keeping the dogs safe, keeping the neighbors happy, pleasing my wife.  Or I can not-build it, resulting in less exercise, unhappy neighbors, dead dogs, and a mad wife.  I think I’ll build it there on the property line.

Sometimes when I ask the question, “What difference does it make?” I am confronted with the very small difference between a seemingly important choice and its alternative.  But sometimes, it’s not real easy to get there, because I get stuck at proximate consequences.  Another example.

Week before last, I forgot my train pass.  I didn’t have any money.  And I didn’t discover this until I was just getting on a train that was just about to get underway.  I considered two alternatives: stay on the train, and get off the train.  If I stay on the train, I might get in trouble.  Eek!  If I get off the train, I will certainly be late, but I will probably not get in trouble.

This is where I took it a step further: what does “get in trouble” mean?  Well, the conductor may ask me to get off the train.  If she does, I will comply, and I will be no worse off than before.  But she may not ask me to get off the train, and then I will be much better off than before.

I tried to consider the intangibles as well.  How embarrassed will I be?  Will she call the police?  Will I receive some sort of fine or penalty?  I didn’t know the answer to any of these questions, but they made me think of another alternative: what if I just go find her and explain my situation.  In my mind, addressing the uncertainties directly greatly reduced my risk of both embarrassment and penalties.

It worked out great.  I got on the train, and found the conductor (as it turns out, right after the doors closed).  I explained my situation, and she said, “that will be eight dollars.”  I explained I did not have eight dollars, and she rolled her eyes and told me to please go sit down somewhere.  I arrived home on time.

I did two things here: one, I imagined her using the pragmatic method (I was prepared to help her do this if I needed to), and two, I didn’t stop at fear of getting in trouble- I enumerated what “trouble” might mean in terms of outcomes.  When I imagined her using the pragmatic method, I realized it would be a much worse outcome for her to kick me off the train (probably involving lateness and paperwork, two things train conductors seem not to like) than it would be for her to simply overlook my forgotten train pass.  And when I pursued the trouble to its logical end, I realized that in terms of outcomes, “trouble” was probably about equivalent (or at least it could be made equivalent by my action) to my alternative hypothesis: in either case, I would be a little late getting home.  So I chose possible trouble over certain lateness, and in this case I won out.

Now, when I say that I was prepared to help her use pragmatic method, I mean that I was prepared to assure her that my presence on the train would be trouble-free and inconspicuous.  I was also prepared to offer to make amends in the future.  Lastly, I was prepared to remind her that kicking me off the train would probably involve lateness and paperwork, neither of which she presumably desired.  As it happens, I did not need to assist her in this matter.  I find this is often the case: people are pretty smart.

Pragmatic method is not perfect.  It’s limited by my imagined alternatives, it’s limited by my ability to predict outcomes, and it’s limited by how habitually I apply and refine it.  But I must say that in many situations it beats the pants off of trying to imagine what rule applies and simply following that rule.

Never been so much of a rule follower, I guess.

What Is A Word?

Or: That Boat is Fast

Or: Why Natural Language Processing Doesn’t Work

After giving the keynote at the Wolfram Data Summit last week, Stephen Wolfram took questions from the audience, one of which was about natural language processing.  He gave a thorough answer to the question, which I will sum up: “we haven’t gotten very far with it.”

Processing natural language isn’t all that hard.  Two of my three dogs can do it.  Computers, however, are uniquely bad at it, and there’s a reason for that.  A computer doesn’t keep a mental model of its universe handy.  So when I say, “that boat is fast,” and point to a boat that is stuck fast in the sand, the computer parses it as a boat moving quickly, instead of a stuck one.  Two of my three dogs will know exactly what I mean.  The other one will hump my leg.

Words like “fast” are called contranyms.  Words that mean one thing, and the opposite of that thing.  Fast and fast.  Cleave and cleave.  Awful and awful.

And I’ve got news: just about every damn word can mean something, something else, or nothing at all.  Grammar can only define away so much ambiguity- at some point we must recognize that words are way points along a trail, and they only make sense to other folks on the same trail.  Words work only to the extent that the parties exchanging them share some context.