17 February 2011

Generic functions and Klink

Generic functions and Klink

Recap: I've been writing an interpreter called Klink for a Scheme-derived languages called Kernel.

Now, I prefer generic functions over methods, for a lot of reasons that I won't go into just now. And it's clear that some built-ins, such as `+' or `equal?', must behave rather like generic functions: They must accept more than one type, and act different on the different types, and may need to be extended if new types are defined that are relevant for them. To me, that strongly suggests that I should provide real generic functions.

I looked at existing approaches for handling generics. I like the way MIT Scheme handles them, but their approach was obviously not going to suffice for Kernel.

One of the principles of Kernel is that built-ins are not special. We deliberately don't ever subtype functions based on whether the system provided them or the user composed them. Generics need to be functions so that we can call them. But they need to be extensible, as built-in functions are not.

So I've been unsure how to satisfy both these ideas at once, until I figured it out yesterday.

First idea, which didn't work

My first idea was to make generic functions by simply composing normal functions, as $vau and $lambda do. That's easy. It doesn't even require anything special in the language or the platform. But that wouldn't do the job. It would just make an object that we must pass to other code. When we change a generic, we want to alter something that other code naturally sees, not have to arrange to pass it something.

Generics are mutable functions

Yesterday I got what I think is the key insight: Generics are mutable functions. Adding or removing generic clauses is a mutation operation. The function object is a different thing after the mutation, and other code now sees its new value.

NB, it's not a decomposition operation - that's another thing we deliberately don't allow in Kernel. It doesn't see "parts" of a function except in a certain very narrow way.

This is applicable even to "normal" functions that didn't know they were supposed to be extensible. They are immutable, but (like most immutables) can be copied, and the copy can be mutable.

So I plan to have a copy operation that takes any operative1 and returns a mutable operative. Naturally the mutable operative initially has the same value as the original operative. Other combiners2 mutate it, basically adding or removing guarded clauses from it.

Different ways of composing guarded clauses

Having special status for built-ins is anethema to Kernel. I also don't want or expect to choose a perfect method of composing clauses that never requires extension. So I don't want the generics to have just one way of composing clauses.

Instead I want the copy operation to specify how clauses are to be composed. I considered specifying it later, when clauses are added or removed. But I felt that that would be too wobbly and the implementation would be too intricate.

I also want the clause composers to be normal operatives, available for other purposes. Looking at them, they seem to be reasonable to expose. They also seem fairly useful:

$run-first-match
It's like good ol' cond, except that its guards are predicates called on some given argobject. It just tries the clauses in the order they are given until one is accepted. Returns the value that the chosen clause returns.
$run-only-one
Like $run-first-match, but it raises an error if more than one guard accepts the argobject. Exception: One clause may be a default that runs just if no other clause accepts the argobject. Returns the value that the chosen clause returns.
$run-all-matches
A cross between `map' and `cond', it calls each clause that accepts the argobject and returns a list of their return values. The list may be empty if none match.
$run-till-true
Like $run-first-match except that if a clause returns #f, it tries further clauses. Returns #t if any clause satisfied it, otherwise #f.
$run-till-false
The reverse of $run-till-true

In each case, the original operative is effectively the default clause. It's guarded by a predicate that accepts anything and is considered last (In effect, not neccessarily in implementation)

Clause composers' idiosyncracies

One issue troubled me: These clause composers have different ideas about clause precedence and clause removal. Some care about the order the clauses are considered in, some don't (much). Some allow overlapping clauses, so removing a clause (as if it was never added) isn't the same as blocking every argobject the clause's guard would accept.

My original idea was to "simply" provide operations to add and remove clauses. But this won't do. Clause composers are too idiosyncratic.

In fact, the "clauses" need not even be stored as a list of clauses. Consider a generic that is meant to handle only a single enumerated type. It might be stored as a vector or even a bitvector.

So instead, the copy-operative combiner will take a little more information: how to add and remove clauses. Since we can't determine for all time how they may add and remove clauses, we don't know what their arglists should be. So we will pass an operative but not define its argobject. It will take whatever argobject is passed.

But composers and add/removers tend to covary. That is, when you use a particular composer, there's one specific add/remover that you ought to use too, and vv. Mixing them up is likely to be just a source of bugs.

So let's group them together in a dispatcher type. We'll pass that to copy-operative.

Argobjects are single objects

Above I spoke of type predicates that were called on "argument objects". I didn't mean to say "argument lists" or "each object in the argument list". In Kernel, the argument "list" is a single object. It's usually a list but need not be. Type predicates apply naturally to argobjects.

Types and clause removal

I said that I like the way MIT Scheme handles generics but its approach was obviously not going to suffice for Kernel. Another reason it doesn't is that MIT Scheme's mechanism knows too much about types. It relates a built-in lattice of types to a system of dispatch-tags, and you largely manipulate generic functions via the tags.

The way I have done typechecking for Klink, any unary predicate can define a type. ISTM it has to be this way to follow the non-specialness principle.

That posed some problems. We want to be able to remove clauses from generics, as well as add them. MIT Scheme does that nicely. What I like best is that one doesn't have to keep the original add-clause arguments around in order to know what to remove. Given a set of type tags, it can remove exactly the clauses that correspond to them. It can even, by a slightly indirect maneuver, remove the clauses that would act on a given object.

But for Kernel the situation is not so simple. Say we add a clause with a particular type predicate and later we want to remove that clause. And suppose that that predicate was anonymous and we didn't keep it. How can we remove the clause?

We can't just construct another type predicate. The system has no good way of knowing that it is the same predicate.

But there is another way. As a fallback, instead of removing the clause, we can block it. That is, if we get an argobject that would have satisfied it, do nothing, or call the default. If you think about it, that behavior is what we really want when we remove a clause. It may even be useful in and of itself.

Or as hooks

(Think emacs hooks, not MIT Scheme hooks, which are a very different creature)

Earlier I talked about `$run-till-false' and `$run-till-true'. I shamelessly borrowed that idea from emacs' hooks. Emacs lets you `run-hook-with-args-until-success' and `run-hook-with-args-until-failure'.

I find that pretty useful in emacs lisp, and I'd like to provide the same in Klink. So these mutable operatives provide more than generic functions. They also provide the equivalent of hooks. The clause composers above plus `$run-all-matches' provide full parallel functionality.

Now, this design requires that `$run-till-true' etc be specified early while emacs specifies it late, but I doubt that will be a problem. I have yet to see a hook that wants to be run in more than one way, and in my experience, code that provides a hook always knows how it is intended to be used well in advance.

A first draft

So a first draft of the generic interface looks something like this:

make-dispatcher
  • Takes a clause composer operative, possibly defaulting to `$run-only-one'.
  • Takes an immutable mutator operative.
  • Optionally takes an initializer operative.
  • Returns a dispatcher
copy-operative!
  • Takes an operative
  • Takes a dispatcher
  • Calls the dispatcher's initializer operative, if given. Otherwise it constructs a single clause consisting of the original operative and the type predicate true/1, which accepts anything.
  • Returns a mutable operative which, when called in its current state, behaves the same as the operative argument when called in its current state.
    • If it does not so behave, the dispatcher is buggy.
copy-operative-immutable
  • Takes an operative
  • Returns an immutable operative which, when called, behaves the same as the operative argument when called in its current state.
  • Intended for when a generic function is wanted to be constant.
get-operative-mutator
  • Takes a mutable operative
  • Returns an applicative. When called, this will:
    • Call the dispatcher's mutator operative, passing it:
      • The mutable operative's current representation, usually a list of clauses.
      • The actual call's argument list.
    • Alter the mutable operative accordingly.
    • Return #inert
get-operative-remover
Similar to get-operative-adder.

A clause composer must:

  • Take an argument which is the argobject the actual call got.
  • Take the operative's internal value, usually a list of guarded clauses
  • Return a value.

An mutator operative must:

  • Take the operative's internal value
  • Take a symbol argument, often `add' or `remove'.
  • Take a freeform argument which may:
    • provide a type predicate (almost always wanted)
    • provide an operative to be called (almost always wanted)
    • indicate precedence
    • help to recognize this clause for removal.
  • Return a new internal value

Some typical mutator arguments and behavior, as for $run-only-one:

add
  • Freeform argument consists of:
    • An immutable unary predicate
    • An operative
  • Composes the unary predicate and the operative into a guarded clause
  • Adds the clause to the front of the clause list
  • Returns the new list
remove
  • Freeform argument consists of:
    • An immutable unary predicate
  • If that predicate is guarding any clauses, remove those clauses
  • If all other clauses' predicates can be shown to be disjoint with this, we're done.
  • Otherwise insert a new clause that is consulted earlier and calls the default clause if there is one.
  • Returns the new list
remove-by-object
  • Freeform argument consists of:
    • An arbitrary object
  • Removes all clauses whose predicates accept the object
  • Returns the new list

An example

;;Create the hook
($define! my-hook (copy-operative! (unwrap some-applicative) my-dispatcher))

;;Add a handler that expects one integer
((get-operative-mutator my-hook) 'add
   (arglist integer?)
   #'(lambda (x)
        (do-stuff))
   append)

;;Remove the handler that expects one integer
((get-operative-mutator my-hook) 'remove (arglist integer?))

Footnotes:

1 An "operative" is like a function whose arguments are not evaluated and which is passed the static environment as an object. It's what Kernel has instead of macros.

2 A "combiner" is more-or-less a function in Kernel. The "more" part is that it can also be an operative.

No comments:

Post a Comment