21 February 2011

Circular printing in Klink

Circular printing in Klink

Recap: I've been writing an interpreter called Klink for a Scheme-derived languages called Kernel.

One issue that came up was circular printing. Now, every language that can print all its objects has to deal with circular printing to some degree. Some can afford to be fairly lackadaisical about supporting it. For instance, Elisp (still?) only prints circular objects thru a special library, and didn't read circular objects at all until I added that capability to lread.c around version 20.5.

Not Kernel. Circular objects are fully first class objects in Kernel. I needed circular printing and I needed it early.

Finding recurrences

Before printing, I get an object's recurrences, ie a table of the objects in it that occur more than once.

The interface is:

recurrence-table?
True if the argument is a recurrence table
get-recurrences
Given an object, return a recurrence table.
recurrences-get-object-count
Given a sub-object of the object, return how many times it occured in the object. If it's not even part of object, return 0.

Making a recurrence tracker

Then, still before printing, I make a recur-tracker, which is a table of objects that expects to be "stepped" as other code walks a tree. It keeps track of:

  • Whether a given object will be seen more than once.
  • Whether a given object has been seen before.
  • A numerical index for the current object, if it has been seen before. That index is unique for this walk.

The interface is:

recur-tracker?
True if the argument is a recurrence tracker.
recurrences->tracker1
Given a recurrence table, return a fresh recurrence tracker. If there are no recurring objects in the table, return () instead.

The printing itself

Now when printing, every time we are about to print a sub-object, we check2 the recurrence tracker.

  • If this is the only occurence of the object, we just print it.
  • Otherwise, if this is the first time we have seen the object, we first print #N=, where N is the numerical index. Then we print the object.
  • Otherwise we print #N, where again N is the numerical index.

Those of you with sharp eyes will notice that this is the same as Common Lisp's representation of shared objects. That may not hold true later, because overall I prefer a simpler syntax that I call EMSIP.

History

What I originally wrote didn't return a recurrence table, it returned an ordered list of objects to skip during printing and another ordered list of objects to print specially. To do that, it had to basically walk the object tree twice: Once to find the recurring objects, again to find the objects to print specially in the right order. (It could find the objects to skip from the first walk)

But I noticed some things:

  • First, that wasn't right. Printing needed to see a merged list, not two lists.
  • It was duplicating the work of walking the tree during printing.
  • I did do walking somewhat faster by keeping an ordered array of objects it had seen. But doing it that way made it impossible to use a hashed lookup table.
  • Recurrence counts might be of more general use, so I wanted to make them available.

It was then that I settled on the current design.

Footnotes:

1 As I wrote this, I realized that I need to expose this function.

2 Actually, first we check whether we can find a special name to print as its entire representation. If we find that, circular printing isn't needed no matter how many times it recurs.

No comments:

Post a Comment