10 September 2011

Could backquote be adapted sensibly for Kernel?

Backquote vs Kernel

If you look at Kernel source, you'll quickly see constructions that a Lisper or Schemer would immediately use backquote for, like:

(eval ($if (null?/1 bindings)
           (list* $let () body)
           (list $let
                 (list (car bindings))
                 (list* $let* (cdr bindings) body)))

Quote and backquote are discouraged in Kernel. They tend to detach symbols from the environments they pertain to. They're not needed much, because in Kernel it's usually not necessary to pass bare symbols around unevaluated.

But backquote is a lot more convenient than managing tree structure manually. It's more WYSIWYG. It lets you read and write something like:

(eval ($if (null?/1 bindings)
         ($` $let () . body)
         ($` $let
            (#unquote(car bindings))
            ($let* #unquote(cdr bindings) . body)))

Can we have it both ways? Yes.

Is it possible to have a variation of backquote that only exposes the sort of things manual list construction exposes, forms that don't naturally include symbols?

Yes. For instance, one could construct a form as traditional backquote does and evaluate that form in the current environment, without exposing the quoted form. The result would usually be evaluated a second time.

That gives the right result, but some pieces of the sexp (the low-quoted pieces) get evaluated early and others get evaluated late (though still internally). It works, but it seems dodgy.

With an equivalent result, one could say that backquote internally builds a form that, when evalled, recovers the original structure of the tree, except that low-quoted pieces are copied literally. Then it evals that form. In this formulation, there are no new quoted parts even internally, and all evaluation wrt the current environment happens in the final stage of backquote.
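
As a rough sketch of that second formulation, and not a tested implementation: the names `$bq`, `bq-form`, `make-unquote`, `unquote?` and `unquote-body` below are hypothetical, and the encapsulated marker merely stands in for whatever object the reader would produce for `#unquote`. Since no such reader syntax exists yet, a template containing a marker would have to be built by hand to try this out.

```kernel
;; Hypothetical stand-in for the object the reader would produce for
;; #unquote: an encapsulation wrapping the unquoted expression.
($define! (make-unquote unquote? unquote-body)
  (make-encapsulation-type))

;; Turn a template into a form that rebuilds the template when evalled.
;; The applicative cons is embedded as an object, not as a symbol, so
;; the built form does not depend on how the caller binds "cons".
($define! bq-form
  ($lambda (tmpl)
    ($cond ((unquote? tmpl) (unquote-body tmpl))   ; copied literally
           ((pair? tmpl)
            (list cons (bq-form (car tmpl)) (bq-form (cdr tmpl))))
           (#t tmpl))))                            ; atoms, including ()

;; The operative: build the form, then eval it once in the dynamic
;; environment. All evaluation wrt env happens in this single eval.
($define! $bq
  ($vau tmpl env
    (eval (bq-form tmpl) env)))
```

Under these assumptions, `($bq $let () . body)` builds the form `(cons $let (cons () body))` and evals it, which should yield the same list as `(list* $let () body)`.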

Any new machinery needed? Not much.

The backquote itself can be just an operative. The existing primitives supply enough functionality. The one piece of new language machinery needed is a means to introduce the low quote without causing special cases.

This rules out using a combiner, because backquote doesn't ordinarily evaluate combiners until it has settled the form structure, and low quote must be seen before that.

So low quote needs to be something that backquote understands particularly, which implies a unique object defined with backquote. But we can't convey this object by the usual means of binding it and writing its symbol. The logic is similar to the above: We'd need to use it before we evalled it and found it was special.

So we need a special way to read this object, thus the "#unquote" above. And that's about it.


  1. The error-prone-ness of quasi-quote in Kernel isn't just about ending up with unevaluated symbols lying around.  The reason it's error-prone to have unevaluated symbols lying around is they make it easy to lose track of what you mean to be doing (or even, hard not to lose track) — and quasi-quote notation itself has this losing-track problem.  At the heart of the notational problem is that you're asking to lose track the moment you start mixing explicit-evaluation and implicit-evaluation styles (which my dissertation introduces in section 1.2.3, and explores also in 1.2.4 and 7.3).

  2. John: I read those sections of your thesis. It's quite readable; if I'd known, I'd have read it a long time ago.

    Would it be satisfactory if backquote had phases in the opposite order? I.e., it evaluates all the pieces first, and then builds a tree with those objects in it. They can be equated even when not syntactically identical. You could consider the individual atoms and #unquoted parts to be operands.


    ($` $let
        (#unquote(car bindings))
        ($let* #unquote(cdr bindings) . body))

    ...under the hood could be something like:

    ($let ((t0 $let)
           (t1 (car bindings))
           (t2 $let*)
           (t3 (cdr bindings))
           (t4 body))
      (list t0
            (list t1)
            (list* t2 t3 t4)))

    All without one single $quote.
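
    A generic version of that expansion might look roughly like the sketch below. It is untested, the names `$bq-direct`, `walk`, `make-unquote`, `unquote?` and `unquote-body` are hypothetical, and the encapsulated marker only stands in for whatever the reader would produce for `#unquote`. It also interleaves evaluating and consing rather than strictly evaluating every piece first, and Kernel does not fix the order in which the two arguments to `cons` are evaluated.

    ```kernel
    ;; Hypothetical stand-in for what the reader would produce for
    ;; #unquote: an encapsulated object wrapping the unquoted expression.
    ($define! (make-unquote unquote? unquote-body)
      (make-encapsulation-type))

    ;; Hypothetical name. Every leaf is evaluated in the caller's
    ;; environment and pairs are rebuilt with cons. No intermediate
    ;; form is constructed and $quote is never used.
    ($define! $bq-direct
      ($vau tmpl env
        ($define! walk
          ($lambda (t)
            ($cond ((unquote? t) (eval (unquote-body t) env))
                   ((pair? t)    (cons (walk (car t)) (walk (cdr t))))
                   (#t           (eval t env)))))   ; atoms, including ()
        (walk tmpl)))
    ```

    Under these assumptions, `($bq-direct $let () . body)` should give the same list as `(list* $let () body)`.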

  3. Backquote, or something like it, is only really needed if you're writing fexprs the way you'd write Common Lisp macros, i.e. passing large, template-like blocks of code to eval.

    I've come to the conclusion that good fexpr style is to avoid passing large amounts of code directly to eval, and instead to do as much as possible directly in the body of the fexpr itself (which is, after all, executed at runtime) and only eval the bare minimum.

    I can't say for certain that this can be done for all code, but in the Kernel-like code I've written so far, it's worked quite well.

  4. @Ralith: So if I understand right, you'd write the above example in something like the intermediate form I showed,
    ($let ((t0 (car bindings)) etc...

    It's a thought.

    FWIW, code is not the only thing I use backquote for. For instance, all the formatting in my project Emtest is constructed using backquote. I view it as the WYSIWYG way of constructing a tree.