21 February 2011

Circular printing in Klink

Circular printing in Klink

Recap: I've been writing an interpreter called Klink for a Scheme-derived languages called Kernel.

One issue that came up was circular printing. Now, every language that can print all its objects has to deal with circular printing to some degree. Some can afford to be fairly lackadaisical about supporting it. For instance, Elisp (still?) only prints circular objects thru a special library, and didn't read circular objects at all until I added that capability to lread.c around version 20.5.

Not Kernel. Circular objects are fully first class objects in Kernel. I needed circular printing and I needed it early.

Finding recurrences

Before printing, I get an object's recurrences, ie a table of the objects in it that occur more than once.

The interface is:

recurrence-table?
True if the argument is a recurrence table
get-recurrences
Given an object, return a recurrence table.
recurrences-get-object-count
Given a sub-object of the object, return how many times it occured in the object. If it's not even part of object, return 0.

Making a recurrence tracker

Then, still before printing, I make a recur-tracker, which is a table of objects that expects to be "stepped" as other code walks a tree. It keeps track of:

  • Whether a given object will be seen more than once.
  • Whether a given object has been seen before.
  • A numerical index for the current object, if it has been seen before. That index is unique for this walk.

The interface is:

recur-tracker?
True if the argument is a recurrence tracker.
recurrences->tracker1
Given a recurrence table, return a fresh recurrence tracker. If there are no recurring objects in the table, return () instead.

The printing itself

Now when printing, every time we are about to print a sub-object, we check2 the recurrence tracker.

  • If this is the only occurence of the object, we just print it.
  • Otherwise, if this is the first time we have seen the object, we first print #N=, where N is the numerical index. Then we print the object.
  • Otherwise we print #N, where again N is the numerical index.

Those of you with sharp eyes will notice that this is the same as Common Lisp's representation of shared objects. That may not hold true later, because overall I prefer a simpler syntax that I call EMSIP.

History

What I originally wrote didn't return a recurrence table, it returned an ordered list of objects to skip during printing and another ordered list of objects to print specially. To do that, it had to basically walk the object tree twice: Once to find the recurring objects, again to find the objects to print specially in the right order. (It could find the objects to skip from the first walk)

But I noticed some things:

  • First, that wasn't right. Printing needed to see a merged list, not two lists.
  • It was duplicating the work of walking the tree during printing.
  • I did do walking somewhat faster by keeping an ordered array of objects it had seen. But doing it that way made it impossible to use a hashed lookup table.
  • Recurrence counts might be of more general use, so I wanted to make them available.

It was then that I settled on the current design.

Footnotes:

1 As I wrote this, I realized that I need to expose this function.

2 Actually, first we check whether we can find a special name to print as its entire representation. If we find that, circular printing isn't needed no matter how many times it recurs.

Emsip sexps redux

Emsip sexps redux

A few months ago I blogged about Emsip. I proposed that an Emsip interpreter could neatly handle source code by farming the specialized parts off to special {}-interpreters and handling the core {numbers,symbols,punctuation} with just a few classes and states.

It occurs to me that my design might be made even simpler. Instead of punctuation grouping at the beginning of a sequence of weak constitutents, let it group at the end.

That lets the object-making states be almost entirely regular. There are no longer special conditions for what interrupts them, except for the state Uncommitted, which is interrupted by anything other than a weak constitutent.

That means that punctuation can appear directly before symbols,

'a   +:'b 
;;quoted symbol "a", "+:'" applied to symbol "b".

but not directly after, because the punctuation constitutents will be incorporated in:

a'   b+:' 
;;Two symbols, "a'" and "b+:'"

This is reasonable, since it covers the common use case and because the purpose of punctuation is to modify following sexps, not preceding ones.

The new design

We have these states:

StateInterruptable by
No objectAll but whitespace
UncommittedAll but weak cons
CommittedOnly whitespace

The classes of character are the same as before, and now have these transition behaviors:

ClassObject begunNew state
WhitespaceNothingNo object
Weak constituent(Possible symbol)Uncommitted
Strong punctuationXforms the tailNo object
LetterSymbolCommitted
DigitNumberCommitted

We make an object when we enter the No-object state. But that is really a reversed reversal. The real rule is to make a new object when we leave the union of the other two states. So to nail down the details:

  • Beginning to read the sexp doesn't make an object, even though we start reading in No-object state.
  • Leaving the sexp makes an object unless state was No-object. So it's as if the sexp read is surrounded by No-object state.
  • Strong punctuation makes an object even if previous state was No-object, as if we pass thru the Committed state momentarily before re-entering the No-object state.
  • Nested reads (which are handled by EMSIP outside of here) enter No-object before they begin, make one object, and leave state as No-object.

suitable SREs

I'll update the SREs as well. And I'll trivially add `nested-object', which I left out last time. It would be supplied by EMSIP framework.

These SREs assume `nested-object' and the classes of character as primitives. They assume longest-match parsing so `uncommitted' will be part of a larger construct when possible (only the symbol second alternative is at issue)

whitespace
(+ whitespace-constituent)
uncommitted
(+ weak-constituent)
number
(seq (? uncommitted) digit (* (~ whitespace-constituent)))
symbol
(or (seq (? uncommitted) letter (* (~ whitespace-constituent))) uncommitted)
punctuation
(seq (? uncommitted) strong-punctuation)
thing
(or whitespace number symbol punctuation nested-object)
(overall)
(* thing)

Notes on numbers

When I say a "number" object is made, I don't mean that "number" must be a primitive type. On the contrary, I expect distinct types including at least integers, exact reals, and inexact reals.

What I mean is that all numbers are of this form. That substring is parsed by another very specialized regular-expression matcher.

Since we can just farm any misfits off to specialized {}-interpreters, we can afford to be very picky about what we treat here. My thoughts on what to support:

Positive vs negative
Of course.
Decimal point
Of course.
Scientific notation
I think so.
Bases
Maybe just decimal and hex.
The infinities and NaN
No. They are semantically numbers but have no digits. They gain nothing by sharing their syntax with this.
Rationals
Maybe. They are convenient.
Inexact vs exact
Probably. A default to exact integers and inexact reals is reasonable. But one would sometimes want exact reals, and it might be a nuisance to have to switch notations just for that. We can't use prefixes #e and #i since they make us read a symbol instead of a number, and trailing "e" is used by scientific notation. So perhaps trailing "+-" could indicate inexactness.
Uncertainty
Maybe. Having "+-", it's tempting to use "+-" followed by a (positive, exact) number to indicate uncertainty.
Precision
Ie, floats vs doubles vs perhaps bignums. No. That's really about object construction. If that sort of control is wanted, real object construction should be used, not syntax tricks.
Digit group separators
As in "1,000,000". Probably, since it almost falls out of this syntax. Since the comma isn't universally accepted in this role, let's accept the underscore too. But IMO going any further would add a great deal of complexity. So no provisions to treat the comma as decimal point in some locales, etc.

17 February 2011

Generic functions and Klink

Generic functions and Klink

Recap: I've been writing an interpreter called Klink for a Scheme-derived languages called Kernel.

Now, I prefer generic functions over methods, for a lot of reasons that I won't go into just now. And it's clear that some built-ins, such as `+' or `equal?', must behave rather like generic functions: They must accept more than one type, and act different on the different types, and may need to be extended if new types are defined that are relevant for them. To me, that strongly suggests that I should provide real generic functions.

I looked at existing approaches for handling generics. I like the way MIT Scheme handles them, but their approach was obviously not going to suffice for Kernel.

One of the principles of Kernel is that built-ins are not special. We deliberately don't ever subtype functions based on whether the system provided them or the user composed them. Generics need to be functions so that we can call them. But they need to be extensible, as built-in functions are not.

So I've been unsure how to satisfy both these ideas at once, until I figured it out yesterday.

First idea, which didn't work

My first idea was to make generic functions by simply composing normal functions, as $vau and $lambda do. That's easy. It doesn't even require anything special in the language or the platform. But that wouldn't do the job. It would just make an object that we must pass to other code. When we change a generic, we want to alter something that other code naturally sees, not have to arrange to pass it something.

Generics are mutable functions

Yesterday I got what I think is the key insight: Generics are mutable functions. Adding or removing generic clauses is a mutation operation. The function object is a different thing after the mutation, and other code now sees its new value.

NB, it's not a decomposition operation - that's another thing we deliberately don't allow in Kernel. It doesn't see "parts" of a function except in a certain very narrow way.

This is applicable even to "normal" functions that didn't know they were supposed to be extensible. They are immutable, but (like most immutables) can be copied, and the copy can be mutable.

So I plan to have a copy operation that takes any operative1 and returns a mutable operative. Naturally the mutable operative initially has the same value as the original operative. Other combiners2 mutate it, basically adding or removing guarded clauses from it.

Different ways of composing guarded clauses

Having special status for built-ins is anethema to Kernel. I also don't want or expect to choose a perfect method of composing clauses that never requires extension. So I don't want the generics to have just one way of composing clauses.

Instead I want the copy operation to specify how clauses are to be composed. I considered specifying it later, when clauses are added or removed. But I felt that that would be too wobbly and the implementation would be too intricate.

I also want the clause composers to be normal operatives, available for other purposes. Looking at them, they seem to be reasonable to expose. They also seem fairly useful:

$run-first-match
It's like good ol' cond, except that its guards are predicates called on some given argobject. It just tries the clauses in the order they are given until one is accepted. Returns the value that the chosen clause returns.
$run-only-one
Like $run-first-match, but it raises an error if more than one guard accepts the argobject. Exception: One clause may be a default that runs just if no other clause accepts the argobject. Returns the value that the chosen clause returns.
$run-all-matches
A cross between `map' and `cond', it calls each clause that accepts the argobject and returns a list of their return values. The list may be empty if none match.
$run-till-true
Like $run-first-match except that if a clause returns #f, it tries further clauses. Returns #t if any clause satisfied it, otherwise #f.
$run-till-false
The reverse of $run-till-true

In each case, the original operative is effectively the default clause. It's guarded by a predicate that accepts anything and is considered last (In effect, not neccessarily in implementation)

Clause composers' idiosyncracies

One issue troubled me: These clause composers have different ideas about clause precedence and clause removal. Some care about the order the clauses are considered in, some don't (much). Some allow overlapping clauses, so removing a clause (as if it was never added) isn't the same as blocking every argobject the clause's guard would accept.

My original idea was to "simply" provide operations to add and remove clauses. But this won't do. Clause composers are too idiosyncratic.

In fact, the "clauses" need not even be stored as a list of clauses. Consider a generic that is meant to handle only a single enumerated type. It might be stored as a vector or even a bitvector.

So instead, the copy-operative combiner will take a little more information: how to add and remove clauses. Since we can't determine for all time how they may add and remove clauses, we don't know what their arglists should be. So we will pass an operative but not define its argobject. It will take whatever argobject is passed.

But composers and add/removers tend to covary. That is, when you use a particular composer, there's one specific add/remover that you ought to use too, and vv. Mixing them up is likely to be just a source of bugs.

So let's group them together in a dispatcher type. We'll pass that to copy-operative.

Argobjects are single objects

Above I spoke of type predicates that were called on "argument objects". I didn't mean to say "argument lists" or "each object in the argument list". In Kernel, the argument "list" is a single object. It's usually a list but need not be. Type predicates apply naturally to argobjects.

Types and clause removal

I said that I like the way MIT Scheme handles generics but its approach was obviously not going to suffice for Kernel. Another reason it doesn't is that MIT Scheme's mechanism knows too much about types. It relates a built-in lattice of types to a system of dispatch-tags, and you largely manipulate generic functions via the tags.

The way I have done typechecking for Klink, any unary predicate can define a type. ISTM it has to be this way to follow the non-specialness principle.

That posed some problems. We want to be able to remove clauses from generics, as well as add them. MIT Scheme does that nicely. What I like best is that one doesn't have to keep the original add-clause arguments around in order to know what to remove. Given a set of type tags, it can remove exactly the clauses that correspond to them. It can even, by a slightly indirect maneuver, remove the clauses that would act on a given object.

But for Kernel the situation is not so simple. Say we add a clause with a particular type predicate and later we want to remove that clause. And suppose that that predicate was anonymous and we didn't keep it. How can we remove the clause?

We can't just construct another type predicate. The system has no good way of knowing that it is the same predicate.

But there is another way. As a fallback, instead of removing the clause, we can block it. That is, if we get an argobject that would have satisfied it, do nothing, or call the default. If you think about it, that behavior is what we really want when we remove a clause. It may even be useful in and of itself.

Or as hooks

(Think emacs hooks, not MIT Scheme hooks, which are a very different creature)

Earlier I talked about `$run-till-false' and `$run-till-true'. I shamelessly borrowed that idea from emacs' hooks. Emacs lets you `run-hook-with-args-until-success' and `run-hook-with-args-until-failure'.

I find that pretty useful in emacs lisp, and I'd like to provide the same in Klink. So these mutable operatives provide more than generic functions. They also provide the equivalent of hooks. The clause composers above plus `$run-all-matches' provide full parallel functionality.

Now, this design requires that `$run-till-true' etc be specified early while emacs specifies it late, but I doubt that will be a problem. I have yet to see a hook that wants to be run in more than one way, and in my experience, code that provides a hook always knows how it is intended to be used well in advance.

A first draft

So a first draft of the generic interface looks something like this:

make-dispatcher
  • Takes a clause composer operative, possibly defaulting to `$run-only-one'.
  • Takes an immutable mutator operative.
  • Optionally takes an initializer operative.
  • Returns a dispatcher
copy-operative!
  • Takes an operative
  • Takes a dispatcher
  • Calls the dispatcher's initializer operative, if given. Otherwise it constructs a single clause consisting of the original operative and the type predicate true/1, which accepts anything.
  • Returns a mutable operative which, when called in its current state, behaves the same as the operative argument when called in its current state.
    • If it does not so behave, the dispatcher is buggy.
copy-operative-immutable
  • Takes an operative
  • Returns an immutable operative which, when called, behaves the same as the operative argument when called in its current state.
  • Intended for when a generic function is wanted to be constant.
get-operative-mutator
  • Takes a mutable operative
  • Returns an applicative. When called, this will:
    • Call the dispatcher's mutator operative, passing it:
      • The mutable operative's current representation, usually a list of clauses.
      • The actual call's argument list.
    • Alter the mutable operative accordingly.
    • Return #inert
get-operative-remover
Similar to get-operative-adder.

A clause composer must:

  • Take an argument which is the argobject the actual call got.
  • Take the operative's internal value, usually a list of guarded clauses
  • Return a value.

An mutator operative must:

  • Take the operative's internal value
  • Take a symbol argument, often `add' or `remove'.
  • Take a freeform argument which may:
    • provide a type predicate (almost always wanted)
    • provide an operative to be called (almost always wanted)
    • indicate precedence
    • help to recognize this clause for removal.
  • Return a new internal value

Some typical mutator arguments and behavior, as for $run-only-one:

add
  • Freeform argument consists of:
    • An immutable unary predicate
    • An operative
  • Composes the unary predicate and the operative into a guarded clause
  • Adds the clause to the front of the clause list
  • Returns the new list
remove
  • Freeform argument consists of:
    • An immutable unary predicate
  • If that predicate is guarding any clauses, remove those clauses
  • If all other clauses' predicates can be shown to be disjoint with this, we're done.
  • Otherwise insert a new clause that is consulted earlier and calls the default clause if there is one.
  • Returns the new list
remove-by-object
  • Freeform argument consists of:
    • An arbitrary object
  • Removes all clauses whose predicates accept the object
  • Returns the new list

An example

;;Create the hook
($define! my-hook (copy-operative! (unwrap some-applicative) my-dispatcher))

;;Add a handler that expects one integer
((get-operative-mutator my-hook) 'add
   (arglist integer?)
   #'(lambda (x)
        (do-stuff))
   append)

;;Remove the handler that expects one integer
((get-operative-mutator my-hook) 'remove (arglist integer?))

Footnotes:

1 An "operative" is like a function whose arguments are not evaluated and which is passed the static environment as an object. It's what Kernel has instead of macros.

2 A "combiner" is more-or-less a function in Kernel. The "more" part is that it can also be an operative.

03 February 2011

Refuge markets done right

Refuge markets done right

Why

To summarize briefly, idea futures markets have a problem we call "existential risk", that if the world ends, bets can't be paid off, so bets on armageddon cases tell us nothing.

Like, if I were to bet you $1,000,000 to your $1 that the world won't end, how would I pay if I lose? So it'd be just you giving me a dollar, which says nothing about the likelihood the world will end.

This is a particular problem for decision markets. If a blind spot blocks them from seeing armageddon, what they falsely see in its place might look like the best way to go.

Refuge markets try to sell some chance of post-catastrophe survival in order to measure the likelihood that a catastrophe will happen, to see past that blind spot.

Prior ideas

Twice Robin Hanson has blogged about refuge markets. The first time around, he said that people would buy into a sort of lottery, and if catastrophe struck, they'd survive in a sort of bomb shelter. Some things he got right:

  • Tickets would be selected well before catastrophe struck. The ticket holders would live at the shelter for some predetermined time period.
  • There'd be different types of refuges appropriate to different types of catastrophe.

In the comments, and skipping over the majority that just don't see the point, some of us told him why it wouldn't work:

  • Season variation, fads, etc would swamp the real signal.
  • There was no way to guarantee that shelter guardians would follow the rules. They might save themselves and their families. Even if maybe they wouldn't, the perception that they might could alter the price a great deal. The real signal would be swamped by fluctuations of that perception.

So the second time around, Robin proposed that refuges be paired with downscale resorts. Some ticket-holders would go to the refuges, some to the resorts - they're supposed to be the same but of course they wouldn't be. When disaster struck, the refuge would be sealed.

His second approach solves nothing and just exacerbates the problems of the first approach. I give him due credit for trying to solve an important problem, and he does have a glimmer of a real idea, but he has not remotely got it right. I have diagnosed and fixed his broken ideas before, so here goes again.

How to measure likelihood

One important principle in measuring is to remove common modes beforehand. Measure the thing you're really trying to measure and make the rest cancel out.

Have two types of ticket holders, which we will treat identically until catastrophe:

Real ticket holders
as above.
Decoys
Other refuge dwellers who are ejected when catastrophe is declared. Possibly moved to one of the downscale resorts Robin imagines.

The real ticket holders presumably pay a premium to have exactly the same experience as the decoy ticket holders plus a very good likelihood of surviving a catastrophe. That price difference essentially measures the likelihood of catastrophe.1

Yes, the ejection of decoy ticket holder is a bit cruel, but compared to losing a major city it's minor. And the shelter would have only enough provisions for real ticket holder. It has to be so: if it weren't, we'd increase the size of real ticket holder beforehand.

Rest assured that I don't daydream about tossing strangers out the door into "Day After" scenarios. It's a case of thinking the unthinkable and not blinking at the results. Just to forewarn you, the rest of this post contains more "unthinkable" thinking. Proceed at your own risk.

Keeping the measurement clean (and thinking the unthinkable)

Other than likelihood of survival, we don't want there to be any perceptible distinctions between real ticket holder and decoy ticket holder. Here are some threats and how we might address them:

Stolen tickets

Presumably when living at close quarters with armageddon approaching, decoy ticket holders would be tempted to simply steal a life-giving ticket from some unwary real ticket holder. We don't want a decoy ticket to reflect a chance to steal a real ticket.

So it can't be a physical ticket2. It would have to be identity-based. Perhaps biometric.

Social effects

Ticket-holding might be a status item, or a fad gift, or other social signal. Holding a real ticket holder might even mean more status among the ticket holders than holding a decoy ticket. We don't want the ebb and flow of those social things to affect our measurement.

So tickets should be cryptographically blinded. I said that they'd be identity-based, but the two aren't incompatible. That can be implemented something like this: In a database, for each ticket holder there is a digital certificate that connects his identity to either a real ticket or a decoy ticket. And that certificate can only be read with a certain key, and that key is only made available when catastrophe is declared.

This wouldn't prevent real ticket holders from bragging. After all, they know they've got a real ticket even if the refuge guardians haven't checked. When they paid, they presumably got something that proved they had a real ticket and not a decoy ticket, and they might be tempted to reveal it.

My answer: Flood the situation. Make it possible, even easy, for decoy ticket holders to pretend to be real ticket holders. Create fake "proof" of real ticket holder status for everybody - such that it won't hold up under cryptographic check but passes casual inspection. If the physical token is just a printout, offer decoy ticket holders fake real ticket printouts.

Guardians unwilling to eject

What if refuge guardians refuse to eject decoy ticket holders, perhaps on moral grounds or because they like particular individuals? This is largely an issue of physical circumstance, so let's change the physical circumstance. Let's blind the guardians to which fate they're dealing out.

Build several smaller shelters, or just compartmentalize a single shelter. When catastrophe is declared, move both real ticket holders and decoy ticket holders. Each real ticket holder is taken to another shelter, and each decoy ticket holder is taken elsewhere, perhaps to one of the resorts that Robin imagines.

So now checking tickets just reveals "to be sent to destination Codename A" or "destination Codename B". The individual goes with a driver or escort who has sealed instructions. Which sealed instructions he gets corresponds to A or B.

Ticket holders resist being ejected

What if decoy ticket holders try to stay by force?

Again it's an issue of physical circumstance. It almost seems as if the refuge guardians would have to treat decoy ticket holders as prisoners to be forcibly moved. But that would mean treating real ticket holders as prisoners too. Some people might feel that a real ticket holder was hardly worth it under those circumstances, or that being a decoy ticket holder was a particularly horrible fate. I can't persuade myself that this would not distort the measurement.

Fortunately that's not the only possible approach. We could try to make the ticket holders unware of which fate they are receiving. That's much harder, since they obviously know which type of ticket they bought. But it might be doable. Approaches:

  • Create incentives to buy either type of ticket for another individual, while telling all recipients that they got a real ticket. It's a different measurement than the one we've been talking about but seems just as revealing.
  • Probabilistically save some decoy ticket holders, with a predetermined probability. Let everyone know that this will happen, but not who is to be saved. Of course this affects the relative price of decoy tickets to real tickets, but we'd just multiply to get the original answer. This also lets us measure the effect of probability on the price.

We'd also want to remove any cues that one group or another "must all be decoy ticket holders" or "real ticket holders" as deduced from the presence of particular individuals. So we'd want individuals to be directed one by one, not as a group.

Incentive for guardians to follow the rules

For this to work, would-be buyers of real tickets have to be sure that refuge guardians have to actually do this as planned. But what incentive do guardians have to follow yesterday's rules on doomsday?

Dry runs

Of course there should be dry runs. Any number of aspects of it need to be frequently rehearsed in ways that make it as similar to the real thing as possible. We might even consider sometimes cutting off communication to the outside and falsely declaring catastrophe.

Loved ones

As mentioned above, one threat was that in the face of catastrophe, shelter guardians would of course rescue their own loved ones, ejecting the ticket-holding strangers.

We can turn that incentive into an advantage. Let each guardian's loved ones stay at a different refuge (up to some number). When catastrophe strikes, let the loved ones be moved as well. Normally, the loved ones are moved to another shelter just like real ticket holders. But if a guardian failed his duties, his loved ones are ejected and other people, presumably some of the decoy ticket holders, are saved instead. Don't be too shocked; I told you beforehand that this was thinking the unthinkable.

So now we have two operations of moving people between refuges. I'll call this one the "loved ones move" and the other the "ticket-holder move".

Schedule the ticket-holder move first. After it's done, check that everybody has got to the place they were supposed to go and the rules were followed. If not, assign responsibility. Then do the loved ones move.

The assignment of responsibility needs to require very little judgement and to be heavily cross-checked.

Who watches the watchers?

Other watchers, of course. Part of the protocol should be that guardians, by prearrangement, check on other guardians' correct performance, including their correct performance of checks on other guardians.

But then what are their refuge guardians' incentives to do the last loved ones' move by the rules? Maybe, impelled by a sense of camaraderie, they would save their fellow guardians loved ones regardless of their performance, instead of saving decoy ticket holders who are not beloved by their fellow guardians.

More generally, what incentive does the last-to-check guardian have to follow the rules? And if he needn't, the next-to-last-to-check guardian's performance is not assured, and so forth.

The performance guarantee is stronger with the chain of guardians. None but the last-to-check can be confident that everyone after them will let them off the hook.

But let's not entirely depend on that. Instead, let's keep it uncertain where the sequence ends. Let no guardian know for certain that nobody will check after they act. That means pre-arranging checks and keeping them secret, and having penalties for breaking the secrecy prematurely. Also, it means doing each loved ones' move at a random time and keeping it secret from the other refuges, so that no loved-one-mover may be confident that they are the last to act.

Hardware

Computers are not swayed by the prospect of armageddon, so when trying to ensure that certain rules will be followed on doomsday, it's natural to use them heavily.

But I can imagine catastrophes where computers don't work, perhaps if the disaster includes EMPs. Fortunately, this does not entirely prevent us from using cryptographic signatures and blinding and so forth. There are fallbacks such as doing the algorithm with pencil and paper or on a mechanical calculating machine. We'd need to make sure these fallbacks were available.

Afterword

Again I want to point out that this post is about predicting (and this maybe preventing) catastrophes, and as such needed to "think the unthinkable". Rest assured that none of this nasty business warms my heart.

Footnotes:

1 Nitpick: It also measures the will to survive.

2 But presumably we'd give some sort of physical token, even if it's just a printout.

completing-read wishlist

Completing-read wishlist

Two features I'd like to see in emacs' completing-read:

  • A way to make it return the value or the whole object instead of the string.
  • A way to make it possibly loop using a new collection generated from the selection. That is useful for:
    • Showing the user the likely options first, but allowing him to get the full list of options by choosing something like "\*More\*"
    • Presenting a set of options that mostly behave alike, plus a few exceptional ones. Like "pick one of {A,B,C,etc} or \*Create a new entry\* and use it", or "Pick a function name or \*Rescan\* or pick from \*Variables\* instead". My `hbmk' does this, and so does `imenu'. Some packages such as filesets.el would probably have used this if it were available and would be more convenient.
    • Chaining, where one selection leads to a new set of options that elaborates the choice. Sometimes your code wants to immediately do work on the first selection in which case this isn't useful. But more often your code doesn't commit to action until the user has committed to a choice, and sometimes there's nothing to do between successive selections.

Wrote them, but that didn't suffice

I actually have written both of these. #1 I wrote back in the stone age; I believe I called it tehom-4.el. #2 I more-or-less wrote when I rewrote pcomplete to support pcmpl-elisp; it's not a separate thing and only does what pcomplete wanted, but it could be factored out and added to. Other people have had a hand at this too.

But being written doesn't make them available! If I write code that uses tehom-4, that gives it an additional dependency that's not bundled with emacs. AFAICT, nobody uses tehom-4 for another project, they just roll their own. And if I were to factor the #2 code out of pcomplete, it would just be the same story again.

That's an aspect of a problem I've bumped into a lot lately with emacs: Too little re-use of code. You can't really require non-bundled libraries unless you make them part of your package.

How

I implemented (2) something like this: after completing-read finished, if the value was a list it would check the car of it. If it was:

`complete-further'
completing-read would run again, using the cdr of the value as a collection. (I used the symbol `further')
nil
Done, treat the cdr as the value.

Other possibilities

  • Call the cdr as a function and use its return value as a collection.
  • Show something (Help message, context, anything) but not be finished.

Minor wishlist items:

  • A slightly more refined sense of the "require-match" param, including:
    nil
    Do not require match at all
    always-interact
    Require match. Always interact, even if there are zero completions
    yn
    Require match.
    • Abort if there are no completions.
    • If there's only one completion, interact to give user a chance to abort.
    t
    Require match.
    • Abort if there are no completions.
    • If there's only one completion, just use it.

01 February 2011

fileset-whole works well

fileset-whole

Saturday I blogged about fileset-whole. It's extensions to emacs filesets, which I wrote because I was frustrated with all project managers and because emacs filesets was tantalizingly close to what I needed. I promised to blog about it again after I tried it out.

Works great

Well, I tried it out well the past few days, and it works great! For a newborn package, it's pretty robust. I only had to squash a few tiny bugs.

I'm using right now (literally a minute ago) to develop with. I created a default setting that supports the commands I commonly use, and a default keybinding that launches any filesets or fileset-whole command.

I also added support for filesets in Emtest, so now I'm running test suites from anywhere with just a couple of keystrokes,

C-c p t TAB RET

Where it is

It now lives in git://repo.or.cz/fileset-whole.git, see here I haven't made a package for it; just clone it from the repo. There is a mob branch if you want to contribute.

Running it (A tentative README)

I haven't written a README or other documentation for it, other than docstrings. That step always feels like talking into a vaccuum. I can never tell what information is actually wanted out there, or whether I'm just wasting time doc'ing something nobody uses.

But here goes:

  • Enable filesets. They are bundled with recent emacs. Just put in your emacs load sequence,
    (filesets-init)
    

    As usual, I recommend Era's my-site-start.el for managing the emacs load sequence neatly.

  • Install it. You'll probably want elinstall at git://repo.or.cz/elinstall.git. Having that, just run do-elinstall.el
  • Customize it. You don't strictly need to, but you will probably want to. You can do this just by
    M-x customize-apropos fileset-whole-commands RET
    
  • Customize group data. At the moment, this has to be done manually. In the future I hope to provide this automatically. Again you can get by without fully doing this, but filesets this isn't done for can't do much that's special. At least mirror each filesets group and add :root-dir to each.
    M-x customize-apropos fileset-whole-alist RET
    
  • To get the autoloads, restart emacs or just (if installed my way):
    M-x load-file 50fileset-whole.el RET
    
  • To launch any fileset-based command,
    C-c p
    

    and you will have a menu of commands.