08 May 2011

Chains in Klink


As I mentioned in the previous post, Klink is a hybrid of C functions and a dispatch loop (_klink_cycle). The dispatch loop finds a "next" C-based combiner and runs it. This can be an applicative or an operative. Under the hood, operatives come in several C types.

About a week ago, I added chains to this set. A chain is really a vector of combiners. Evaling a chain queues all of those combiners to run, then continues to them with the value it was passed.
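To make that concrete, here is a tiny self-contained sketch of the idea: a work stack of C combiners, a dispatch loop that runs them one at a time, and a chain whose evaluation simply schedules its elements. All of the names and types here are invented for illustration; this is not Klink's actual code.

#include <stdio.h>

/* Toy model: a C combiner maps a value to a value. */
typedef int (*combiner)(int);

static combiner todo[64];          /* stand-in for the work stack */
static int ntodo = 0;

static void schedule(combiner c) { todo[ntodo++] = c; }

/* "Evaling" a chain queues all of its combiners. */
static void eval_chain(const combiner *chain, int len)
{
    int i;
    for (i = len - 1; i >= 0; i--)   /* push in reverse so chain[0] runs first */
        schedule(chain[i]);
}

/* The dispatch loop: run the next scheduled combiner on the current value,
   in the spirit of _klink_cycle. */
static int run(int value)
{
    while (ntodo > 0)
        value = todo[--ntodo](value);
    return value;
}

static int add1(int x)   { return x + 1; }
static int triple(int x) { return 3 * x; }

int main(void)
{
    combiner chain[] = { add1, triple };   /* first add1, then triple */
    eval_chain(chain, 2);
    printf("%d\n", run(10));               /* (10 + 1) * 3 = 33 */
    return 0;
}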

Store and load

In order to make this work neatly, I also added two other new types of (pseudo)combiners, T_STORE and T_LOAD. Those are only available "under the hood" for optimization, because they know things about the stack that shouldn't be exposed.

T_STORE destructures the current value and stores on the stack a vector of the resulting objects.

T_LOAD builds a tree from recent values according to a template. "Leaves" of the tree index an element from a stored vector. That's a slight simplification: Leaves can also be constants, which are used unchanged.
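As a rough illustration of that walk (not Klink's real representation; the tagged struct, the stored[] array, and every name below are invented), here is a toy T_LOAD in C: pairs recurse, index leaves fetch an element of a stored vector, and any other leaf is copied through unchanged.

#include <stdio.h>
#include <stdlib.h>

typedef enum { T_CONST, T_INDEX, T_PAIR } tag;

typedef struct obj {
    tag t;
    int value;                 /* T_CONST: a constant leaf, used unchanged    */
    int frame, slot;           /* T_INDEX: which stored vector, which element */
    struct obj *car, *cdr;     /* T_PAIR: an interior node of the template    */
} obj;

/* Stands in for the vectors that T_STORE left behind. */
static int stored[2][2] = { { 10, 11 }, { 20, 21 } };

static obj *mk(tag t) { obj *o = calloc(1, sizeof *o); o->t = t; return o; }

/* Build a fresh tree from the template. */
static obj *load(const obj *tmpl)
{
    obj *o;
    switch (tmpl->t) {
    case T_INDEX:                      /* leaf index: fetch a stored element */
        o = mk(T_CONST);
        o->value = stored[tmpl->frame][tmpl->slot];
        return o;
    case T_PAIR:                       /* interior node: rebuild both sides  */
        o = mk(T_PAIR);
        o->car = load(tmpl->car);
        o->cdr = load(tmpl->cdr);
        return o;
    default:                           /* constant leaf: copy it unchanged   */
        o = mk(T_CONST);
        o->value = tmpl->value;
        return o;
    }
}

static void show(const obj *o)
{
    if (o->t == T_PAIR) {
        printf("(");  show(o->car);  printf(" . ");  show(o->cdr);  printf(")");
    } else {
        printf("%d", o->value);
    }
}

int main(void)
{
    /* Template: (7 . <element 1 of stored vector 0>) */
    obj *tmpl = mk(T_PAIR);
    tmpl->car = mk(T_CONST);  tmpl->car->value = 7;
    tmpl->cdr = mk(T_INDEX);  tmpl->cdr->frame = 0;  tmpl->cdr->slot = 1;

    show(load(tmpl));                  /* prints (7 . 11) */
    printf("\n");
    return 0;
}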

About the stored vectors

These stored vectors collectively are somewhat like an environment in that they retain and retrieve values. But unlike the environment, they are distinct across distinct continuations that have yet to assign to them. In other words, if you use this mechanism to capture a combiner's return value, and then before using that value, you re-enter and capture it again, the new return value will not overwrite the old return value, nor vv. Both will be available in their respective continuations.

These stored vectors expire a certain number of frames down the stack, at the bottom of the chain. That keeps them from confusing other chains and keeps the vectors from hanging around uselessly forever.

Future improvements

At the moment, T_LOAD treats pairs of type (integer . integer) specially, as indexes. I should clean this up by making a dedicated type for indexes.

Together, T_LOAD and T_STORE may supersede T_CURRIED, a type of operative that combines the current value with stored data in various possible ways. In order to do that, T_LOAD will have to be able to access the current value.

Eventually I may switch to a more linear spaghetti stack. Then I will have to change the indexing scheme from (integer . integer) to a single integer.

I said earlier the stored vectors were somewhat like an environment but different. Eventually I'd like to use essentially this mechanism to optimize well-behaved environments, so we can index rather than always looking up a symbol. Those in contrast would have to be capturable by continuations. They would also need special care in cases where the environment is captured at some point - either skip those cases or make this data part of environments.

History

Originally, instead of T_STORE and T_LOAD, I used special instructions that were assigned values when the chain was run. This proved complex and difficult to work with.

Tracing in Klink


This week I added improved tracing to Klink.

The new tracing is intended to complement the C code. Rather than report the eval/apply cycle, it reports each time thru the main loop, which is in _klink_cycle.

Klink code is a mixture of pure C and eval cycles. The pure C can't capture continuations, so it can't call arbitrary code. But it is needed in order to run, and it is faster.

But tracing in C is only good for the pure C parts. For the rest, it just reports that _klink_cycle is being visited over and over. Tracing in Kernel, inherited from Tinyscheme, didn't correspond to much of anything. It both skipped over important parts and redundantly traced things that could have been traced in C.

The new tracing code traces _klink_cycle. That exactly complements the C code.

Other enhancements

My new tracing code does two other nice things:

Reports the stack depth

It reports how deep the current call is in the spaghetti stack.

In languages with continuations, that's necessarily imperfect, since you can jump all around the stack with continuations. Nevertheless, it's quite useful for finding callers. Before, tracing was just a mass of calls, and trying to discern which call invoked a call of interest was nearly hopeless.

It names every C operative

Before, traces used to read like:

 (#<OPERATIVE> arg another-arg)
 (#<OPERATIVE> arg this-arg that-arg)
 (#<OPERATIVE> arg arg another-arg)
 (#<OPERATIVE> arg arg)

That didn't tell me much. So just for tracing, there is a special environment that the printer sees that tells it the name of each native C operative.

The way it's done isn't flexible right now, but I hope to make it a parameter to `print' that tracing will use.
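As a toy illustration of the naming idea (not Klink's actual mechanism, which uses a special environment; the table and names below are invented), a side table consulted only by the trace printer is enough to turn #<OPERATIVE> into something readable:

#include <stdio.h>

typedef int (*operative)(int);

static int op_car(int x)  { return x; }    /* stand-ins for native C operatives */
static int op_eval(int x) { return x; }

/* A side table mapping native operatives to printable names. */
static const struct { operative op; const char *name; } names[] = {
    { op_car,  "C-car"  },
    { op_eval, "C-eval" },
};

static const char *operative_name(operative op)
{
    size_t i;
    for (i = 0; i < sizeof names / sizeof names[0]; i++)
        if (names[i].op == op)
            return names[i].name;
    return "#<OPERATIVE>";                 /* what the old traces printed */
}

int main(void)
{
    printf("%s\n", operative_name(op_eval));   /* C-eval       */
    printf("%s\n", operative_name(NULL));      /* #<OPERATIVE> */
    return 0;
}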

The trace output looks like this:

klink> (new-tracing 1)
0
klink> (new-tracing 1)

Decurry 
10: Eval: (C-kernel_eval_aux #,new-tracing (1) #<ENVIRONMENT>)
Decurry 
11: Eval: (C-kernel_mapeval (1) () #<ENVIRONMENT>)
Decurry 
10: Eval: (,(unwrap #,eval) (,(unwrap #,new-tracing) 1) #<ENVIRONMENT>)
Decurry 
10: Eval: (C-kernel_eval_aux ,(unwrap #,new-tracing) (1) #<ENVIRONMENT>)1
klink> (new-tracing 0)

Decurry 
10: Eval: (C-kernel_eval_aux #,new-tracing (0) #<ENVIRONMENT>)
Decurry 
11: Eval: (C-kernel_mapeval (0) () #<ENVIRONMENT>)
Decurry 
10: Eval: (,(unwrap #,eval) (,(unwrap #,new-tracing) 0) #<ENVIRONMENT>)
Decurry 
10: Eval: (C-kernel_eval_aux ,(unwrap #,new-tracing) (0) #<ENVIRONMENT>)1
klink> 

To use it

To use new tracing:

(new-tracing 1) ;;On
(new-tracing 0) ;;Off

The old tracing inherited from Tinyscheme is still there:

(tracing 1) ;;On
(tracing 0) ;;Off

03 April 2011

Profiling in Klink


Klink is now at the stage where it makes some sense to profile it. The C part can just be profiled as C. That makes sense for the C parts, but tells me nothing about the Kernel parts. (Some of Klink is in a Kernel prolog that is loaded first.) As far as the C profiler knows, that's merely _klink_cycle being called many times.

What I need to know is how much each combiner of interest causes _klink_cycle to be called. ISTM I won't need to track C-to-C calls even when they are "the same" as Kernel code. The C profiler will do a far better job of profiling C-to-C calls.

Profiling in Guile

Profiling Scheme is not like profiling C. There's no single relevant call-stack. A call might apply continuations, force promises, or cause dynamic guards to be called. It might even be returned to multiple times. There's not necessarily much relation between the stack below a call and the amount of evaluation that that call caused.

So I looked at how other Schemes do profiling, specifically Guile. Surprise, Guile's statprof just does it the C way! It just walks up "the" stack. Exactly what I didn't want to do.

What I'm doing instead

The stack in Klink is a spaghetti stack (as usual for Scheme). It's also heterogeneous. While call frames are the "main" thing, and they are what is popped when one combiner is evaluated, the stack includes:

  • call frames
  • guards
  • dynamic variables

So to that list I added profile structures. They contain:

  • Pointer to the combiner in question
  • Count of eval-cycles so far.

That much I've already written. For profiling, each _klink_cycle will push the combiner it is using. That's unless the same combiner is already being profiled in this frame, which happens when a combiner recurses as a tail call.

That's because _klink_cycle stores the profiling node above the original frame (ie, one place further away from the leaves of the stack). It has to, because it pops the frame that made that combiner launch. If it stored tail calls too, recursive functions could accumulate many profiling frames. In particular, the REPL loop would have a ton of them, because it's implemented as a recursive combiner.

I may make profiling count distinct tail calls, but for now I'm just treating tail calls as continuances of the original call.
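A minimal sketch of that bookkeeping, with an ordinary array standing in for the spaghetti stack (the struct, the stack, and every name here are invented for illustration, not Klink's code):

#include <stdio.h>

typedef struct {
    const char *combiner;    /* stand-in for a pointer to the combiner */
    long cycles;             /* count of eval-cycles so far            */
} profile_node;

static profile_node frames[64];
static int top = -1;

/* Called once per trip through the main loop, before running `combiner'.
   A tail call (the same combiner already on top) is treated as a
   continuance of the original call, not a new profiling frame. */
static void profile_enter(const char *combiner)
{
    if (top >= 0 && frames[top].combiner == combiner) {
        frames[top].cycles++;
        return;
    }
    top++;
    frames[top].combiner = combiner;
    frames[top].cycles = 1;
}

int main(void)
{
    const char *repl = "repl", *eval = "eval";
    profile_enter(repl);
    profile_enter(eval);
    profile_enter(eval);     /* tail-recursive call: no new frame */
    printf("%s: %ld cycles\n", frames[top].combiner, frames[top].cycles); /* eval: 2 */
    printf("frames used: %d\n", top + 1);                                 /* 2 */
    return 0;
}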

This design will not show calls that never terminate. It's not clear that they could be meaningfully profiled. This specifically includes calls that never continue normally.

This design also doesn't try to handle re-entered continuations. The second time a continuation is used, the elapsed number of _klink_cycle counts bears no relation to how much evaluation the call caused. Again, it's not clear that this situation can be meaningfully profiled. I will probably just ignore returns other than the first time.

It's also not clear what this should do when a call's continuation is stored (say, in a user environment) and later used for the first time. It would be misleading to count every _klink_cycle between the two times. So this design, too, is far from perfect.

Slightly different topic: Plans for the stack

Make it a list of Kernel objects

At the moment, the stack nodes are a different set of types than Klink objects. But it nags at me that they partly overlap and have an extra layer of indirection that was inherited from Tinyscheme. For instance, a combiner on the stack is very nearly a Kernel combiner object. But it also contains an environment, and it has to have a second tag to type it on the stack.

In the future I am hoping to make the stack just a list of Kernel objects. One tag would disambiguate these objects both type-wise and stack-wise.

That would let a number of "hot" stack-handling functions be simpler. But it would also eject the static environment object from stack frames. However, the static environment is mostly re-used every frame. We only need to restore environment occasionally:

  • When continuing abnormally
  • When popping (continuing normally out of) a vau
  • When popping an eval
  • In certain special contexts such as evaluating guards or promises

In some of those contexts I already have to manage environment explicitly. So I may simply move an environment argument into the continuation object, and treat an environment object on stack as meaning "restore this static environment".

Args on stack

At the moment, arglists in Klink are literally lists. There is no attempt to vectorize them, nor keep track of previous type-checks on them.

If the stack directly stored objects, it would be tempting to also use it to store an argvector when that is applicable, and to store intermediate variables on it, and even vectors of environments that we know won't be captured.

However, such objects must be categorically distinguished from all other stack objects. Combiner objects at least risk being falsely recognized as frames. All that might be required is a type that means "The next N slots are reserved, pop past them when popping frame".
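As a sketch of that last idea (with invented names and a plain array in place of the spaghetti stack), a single "reserved" marker is enough to let frame-popping skip over raw data slots without ever interpreting them:

#include <stdio.h>

typedef enum { K_FRAME, K_RESERVED, K_DATA } slot_kind;

typedef struct {
    slot_kind kind;
    int       n;              /* K_RESERVED: how many slots below it to skip */
} slot;

static slot stk[64];
static int top = -1;

static void push(slot_kind kind, int n) { top++; stk[top].kind = kind; stk[top].n = n; }

/* Pop back to (and including) the nearest frame, skipping past any run of
   reserved slots so they can never be mistaken for frames. */
static void pop_frame(void)
{
    while (top >= 0) {
        if (stk[top].kind == K_RESERVED) {
            top -= 1 + stk[top].n;     /* skip the marker and its reserved slots */
        } else if (stk[top].kind == K_FRAME) {
            top--;                     /* pop the frame itself and stop */
            return;
        } else {
            top--;                     /* any other stack entry */
        }
    }
}

int main(void)
{
    push(K_FRAME, 0);
    push(K_DATA, 0);          /* e.g. argvector entries that must not be  */
    push(K_DATA, 0);          /* recognized as frames                     */
    push(K_RESERVED, 2);      /* "the next 2 slots are reserved"          */
    pop_frame();
    printf("top after pop: %d\n", top);   /* -1: frame and reserved slots gone */
    return 0;
}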

COW spaghetti stacks

Months ago I blogged about a design I call COW spaghetti stacks. Basically, they are spaghetti stacks that try to be vectors for the sub-portions that allow it. The above changes would make it tempting to implement them in Klink.

25 March 2011

On Daniel Dennett's Freedom Evolves

Dennett on free will

I just finished reading Daniel Dennett's Freedom Evolves. It's about free will and freedom. His overall point is that over time, over evolution, we have acquired more, and more powerful, free choices.


"Stop that crow!"

Thruout the book, Dennett seems to worry that his ideas on free will will dispel some useful illusion. He often uses the phrase "Stop that crow!", which he borrows from the Dumbo movie, where the elephant can fly as long as he's holding what he thinks is a magic feather.

Free will

Now, I consider "free will" to be a chimerical concept in its common usage. The term bundles several very different things together:

  • Subjectively making choices
  • Lack of physical determinability. As opposed to observing a person's brain in complete detail and predicting the "free" choice he makes later that day.
  • Status as automaton or not.
  • Moral responsibility, as opposed to "you can't blame me because somebody or something `made' me do it"

Dennett never dissects the strands of meaning in this way. But in chapters 2 and 3, he demonstrates that there is no essential connection between free will and lack of physical determinability. He also refutes the (silly IMO) position that quantum indeterminability somehow makes for free will.

From Physics to Design in Conway's Life World

He motivates the non-connection between choices and physical determinability with the example of Conway's game of Life.

Although Life is "really" a pixelated rectangle with every pixel on equal footing, Life experimenters distinguish and name higher-level patterns - "entities" within the game, one might say. Experimenters also make design choices that often include avoiding harm to their creations, eg by placing blocks where gliders might smash into them. Avoiding is fundamentally a choice. Entities within the game itself could theoretically be designed to react to incoming disruptions by making avoidance choices.

Now, Life is actually Turing complete - given a large space, you can build a Universal Turing Machine in it. And so Life entities' avoidance choices could theoretically reach any specified standard of intelligence.

And of course Life is fully pre-determined. So entities in Life could make real choices in a fully pre-determined world.

I blogged about chapter 3 without knowing it

Chapter 3 (Thinking about Determinism) basically repeats Who's Still Afraid of Determinism? Rethinking Causes and Possibilities, a paper by him and Christopher Taylor on causality. By coincidence, I blogged about it earlier1.

The Evolution of Moral Agency

The chapter I liked best was chapter 7, the Evolution of Moral Agency. In it he advances a theory, new to me but pulled variously from Robert Frank and George Ainslie, that we have moral emotions in order to resist temptation. And why resist temptation? It's actually in our long-term interests (Frank). And why not just directly act on our long-term interests? Because temptation increases hyperbolically as it gets nearer (Ainslie), and we can counter it best with a competing stimulus, that being moral emotions (Frank), and that works best if it's not under conscious control (Dennett himself).

This theory has the ring of truth to it.

The Future of Human Freedom

I found his final chapter weak (as Dennett goes). He's concerned with "Holding the line against creeping exculpation". Ie, as we learn more and more about why people make the choices they do, must that mean we can hold them responsible less and less?

Given Dennett's past writings, I was surprised that he didn't bring his previous ideas to bear more. Instead, he writes a woolly final chapter, littered with poorly chosen real world examples that he distances himself from.

What I would have said

Rather than merely criticize, I will offer my own answer. I would have said, leaning on Dennett's own earlier ideas on design, that we are blameworthy or praiseworthy exactly for occurrences that we designed.

Important points:

  • I did not say "declaratively designed". Don't imagine an inventor with a blueprint and a contraption run amuck, or anything of the like. I mean "design" in the sense that Dennett has used it previously, the sense in which design permeates life. In particular:
    • Lack of declarative faculty is no bar to responsibility. My dog can't describe or reflect on her choices, but she can properly be praised or blamed. A little. To a doggy standard of responsibility, not a human one.
    • Self-deception is no bar to responsibility.
  • An important but subtle distinction: we are responsible for the design that was realized, not for the realized outcome. That means that "wiggle room" between design and realized outcome is no bar to responsibility. In particular, I contemplate:
    • Probabilistic tradeoffs, regardless of which outcome is realized. A drunk driver, even though she happened not to hit anyone, has done a blameworthy thing.
    • Contingent outcomes. So bureaucrats, when one says "I just recommended X" and the other "I just followed the recommendation", don't avoid responsibility. They each partly designed the outcome, contingent on the other's action.

The virtues of this definition:

  • It defeats the creeping exculpation that concerns Dennett.
  • It doesn't bar exculpation where there's really no mens rea.
  • It's fuzzy on just the right things:
    • When it's fuzzy whether an occurrence is the realization of some design.
    • (The flip side of the previous) When it's fuzzy whether a design was realized. (again, distinct from whether an outcome was realized)
  • It allows multiple responsibility for the same occurrence in situations like the example that Dennett gives on page 74.
  • It gives the right answer when a designer is a surprising entity such as a group, an organization, or a system.
  • We don't have to know the responsible person or group's inner workings in order to assign responsibility; we can treat him/her/it as a black box that effectuations of designs come out of. Black-boxing generally makes analysis easier.
  • We can understand "attempts" and "incitement" and similar qualified sins in this framework.
  • We can understand lack of responsibility due to being deceived in this framework.
  • It distinguishes our responsibility from that of non-choosers (fire) and weak choosers (animals) in a principled, non-question-begging way.

Footnotes:

1 Short summary of my position: When trying to define causality, we should really be looking at quality of explanation.

24 March 2011

Kernel Suggestions


Having gotten neck-deep in coding Klink, most recently listloop, I've found a few small areas where IMHO the Kernel specification might be improved.

for-each should cycle like $sequence

John Shutt clearly considered this (6.9.1) and decided the other way: Terminating on circular lists is more important than in-order behavior.

I'm tempted by his rationale. But `for-each' is not like `map'; it is very nearly the `apply' counterpart of `$sequence'. The parallel between them persuades me that they should behave the same way on cycles.

Let finders return (found? . value)

Again, John considered this with `assoc' (6.3.6) and decided the other way. But `assoc's case doesn't generalize. Successful `assoc' results can't be nil because they are pairs, but in general, `nil' is a possible object to find.

Already I find myself writing code that wants to use a (bool . value) return, like:

($define! (found . obj) (some-finder))
($if (and? 
        (not? found)
        (some-check))
   ($define! (found . obj) (another-finder))
   #inert)

It's very convenient. This regular form also makes it easy to wrap finders in control constructs that easily understand whether the result of a finder is available.

I would ask that finders in general return a (bool . value) pair, including:

  • assoc, or a reserved name associated with assoc.
  • assq, or again a reserved name associated with it
  • $binds? + (eval SYM ENV)
    • Provided in Klink as `find-binding'
  • The accessors built by make-keyed-dynamic-variable and make-keyed-static-variable. With this, it is easier to determine whether a known keyed variable is bound or not, but raising an error if it's not requires an extra step. Ie, the reverse of the current definition. Either could be defined in terms of the other.

More on List combiners in Klink

Klink list combiners 2

Since the last post, I've implemented a sort of general listloop facility in Klink. It consists of:

  • Two new T_ types
  • A number of predefined styles (Two for now, easily expanded)
  • A C function to create a listloop
  • A C function to iterate a listloop

What it accomplishes

  • It takes the place of a number of specialized loop facilities.
  • It potentially can turn most map-like operations into streaming operations by changing just a few parameters. I haven't written that part yet, because promises are implemented but are not yet available in the C part of Klink.
  • It can potentially build somewhat more powerful list loop functionality almost directly in C, again by just passing parameters. For instance, assoc, find, and position.
  • It appears possible to reason about it in a fairly general way, for optimization purposes.

Dimensions of loop behavior

Last time, I spoke of two dimensions of looping: Family and Argument Arrangement. And a mea culpa: I was wrong to group $sequence with for-each. They are similar, but $sequence returns its final value unless the list is circular, while `for-each' never does.

On further analysis, I ended up with more dimensions than that. I considered 9 dimensions of loop behavior, and ended up with 6 style parameters and 6 loop instantiation parameters, plus 2 parameters that are always supplied outside.

The dimensions of loop behavior are not independent. There are a few combinations that make no sense.

The dimensions themselves

From my design notes:

  • Termination control
    • Cases:
      • When countdown is zero
      • When arglist is empty
      • When value is #t/is #f
        • But for this case we also need an empty or countdown condition.
        • Could be generalized to any eq? test, possibly useful for find.
      • (Unlikely to need a case for when arglist is empty except for N elements, since neighbors is figured out by count first)
  • The effective operative return value
    • Cases:
      • Map: Cons of value and accumulator
        • Final value is reverse accumulator
      • Boolean/reduce/sequence: Value itself
        • Final value is value
      • Each: No return value
  • Terminator:
    • Cases:
      • Usually none
      • Map: Reverse the value
        • That's just reverse
      • For-each: Give #inert
        • Just curry the K_INERT value
    • All can be done without reference to the loop object itself
  • Treatment of the arglist(s). Cases:
    • Use a single element (car, cdr remains)
    • Use elements from multiple lists (car across, cdr across remains)
    • Use multiple consecutive elements ((car, cadr), cdr remains)
  • Other arguments to the operative
    • Pass the accumulator too (reduce does this)
      • This is really passing the previous value, which is sometimes an accumulator.
    • Pass the index too
    • Pass the "countdown" index too.
    • Make the value the first arg in a list
      • For $sequence, mapeval, etc. Basically those that use a single element.
  • The operative
    • Cases:
      • Freeform (The fully general case)
      • None, value is directly used (Basically for booleans)
        • Could be 0 and be skipped, but early let's use `identity'
      • Eval the arg (Just suitable for 1-list)
        • Could be special, but early let's just use eval.
      • No-K
        • Means we can loop in C.
    • This is just which operative to use, maybe with specializations.
  • The initial value
    • Cases:
      • For reduce this sometimes has to be flexible. Otherwise no.
      • Bools: #t, #f
      • Map: ()
    • Since it passes thru the setup, maybe it could be outside if the setup makes a combiner rather than schedules anything.
  • Checking value's type, error if it's not as expected. This is really a debugging move, but is important in its place.
  • Making a stream or not.

The parameters

Style parameters

  • combiner: Default combiner used in each iteration. It can be NULL for no default.
  • collect_p: A boolean, whether to collect a list of result values. This actually makes a reversed list, and callers generally pass the result to `reverse'.
  • step: How to get successive values from the list being processed. There's no attempt to deal with circularity here. Possibilities:
    • 1-list: The list consists of many unary arguments.
    • many: The list consists of lists, whose heads collectively form a single n-ary argument. Ie, what `map' sees.
    • neighbors: The list again consists of many unary arguments, but we pass the current head and the cadr. Ie, we use successive pairs of list neighbors.
    • many-streams: (Not implemented yet) Like many, but the top elements may be streams instead of lists. This will terminate when any element runs out.
  • mk_val: A C function specifying how to make the value that we pass to the combiner. Can be NULL, meaning to just use the result of the "step" phase. This is needed for `reduce', so that we can pass the accumulator to the binary combiner.
  • destructurer: A destructurer, usually of type T_DESTRUCTURE. Used with arg_select to turn Kernel arguments into something C mklistloop understands.
  • arg_select: A set of selections from the arguments the destructurer finds; used with destructurer.

Listloop parameters

  • style: A listloop style object, as above.
  • list: The actual input list.
  • combiner: The combiner used for each iteration. If not assigned, the default will be used.
  • top_length: For `many', the length of the top element in case it happens to be circular. Listloop does not encycle the value to make it structurally correct; that has to be done in the combiner.
  • countdown: Number of elements left in the list, or a negative number if unused.
  • countup: Number of elements seen so far. Unused as yet.
  • stop_on: Stop if the return value is eq? to this. For boolean list operations like and?, $and?, every?, and the various predicates.
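For concreteness, here is one way the two parameter sets above might be laid out as C structs. The typedefs, the field types, and the `obj' stand-in are guesses made for illustration; Klink's real definitions differ.

typedef struct obj obj;                    /* stand-in for a Kernel object      */
typedef obj *(*mk_val_fn)(obj *stepped, obj *accumulator);

typedef enum { STEP_1_LIST, STEP_MANY, STEP_NEIGHBORS, STEP_MANY_STREAMS } step_style;

typedef struct listloop_style {
    obj        *combiner;      /* default combiner, or NULL for no default         */
    int         collect_p;     /* collect a (reversed) list of results?            */
    step_style  step;          /* how successive values are taken from the list    */
    mk_val_fn   mk_val;        /* builds the value passed to the combiner, or NULL */
    obj        *destructurer;  /* usually of type T_DESTRUCTURE                    */
    obj        *arg_select;    /* selections from what the destructurer finds      */
} listloop_style;

typedef struct listloop {
    const listloop_style *style;
    obj  *list;                /* the actual input list                            */
    obj  *combiner;            /* per-loop combiner; the default is used if NULL   */
    int   top_length;          /* for `many': length of a possibly circular top element */
    int   countdown;           /* elements left, or negative if unused             */
    int   countup;             /* elements seen so far (unused as yet)             */
    obj  *stop_on;             /* stop when the return value is eq? to this        */
} listloop;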

15 March 2011

List combiners in Klink

Regularities in list combiners in Klink

Yesterday I was writing list combiners for Klink, my interpreter for a Scheme-derived language called Kernel. I couldn't help noticing many regularities between them. Not just obvious regularities, such as and? / $and? / or? / $or?, but deeper ones that pervaded the set of list-oriented combiners. They seemed to all fall into one of a small set of families, and one of a small set of argument arrangements.

Here's a table of existing (or planned shortly) combiners that followed this pattern:

Args \ Family    | And      | Or    | Map            | Each
-----------------+----------+-------+----------------+----------
Non-op           | and?     | or?   | copy-list      |
Eval             | $and?    | $or?  | mapeval1       | $sequence
Operate on donut | every?   | some? | map            | for-each
Neighbors        | >? et al |       | list-neighbors |

The families are as follows:

  • and: Value must be boolean, stop early on #f.
  • or: Value must be boolean, stop early on #t.
  • map: Collect values, no specified order.
  • each: Evaluate for side-effects, in order.

And the argument arrangements are:

  • Non-op: Non-operative on given values.
  • Eval: Evaluate the list element as a form.
  • Rectangular: Operate on transposed rectangular args.
    • Takes 2 length-counts.
    • Unsafe itself but used as a worker by safe combiners.
    • Name format: counted-NAME
  • Unary: Operate on args to a unary operative.
    • Takes 1 length count, the other is always 1.
    • Specializes "rectangular".
    • Name format: NAME1
  • Donut: Operate on possibly infinite lists ("donuts").
    • Called thru apply-counted-proc
  • Neighbors: Operate on list-neighbors.
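To make the regularity concrete, here is a toy driver in C showing how the four families can share one loop, differing only in a stop value and whether results are collected. All names and types are invented for illustration; this is a sketch of the regularity, not Klink's code.

#include <stdio.h>

#define NO_STOP (-1)                   /* stand-in for "never stop early" */

typedef int (*combiner)(int);

/* One driver for all four families: `stop_on' gives the and/or behavior,
   `collect' gives the map behavior, and plain iteration gives `each'. */
static int loop(const int *xs, int n, combiner f,
                int stop_on, int collect, int *out)
{
    int i, v = 0;                      /* (empty-list behavior not modeled) */
    for (i = 0; i < n; i++) {
        v = f(xs[i]);
        if (collect)
            out[i] = v;                /* map family: collect every result  */
        if (v == stop_on)
            return v;                  /* and/or families: stop early       */
    }
    return v;                          /* each family: the last value       */
}

static int even_p(int x) { return x % 2 == 0; }
static int twice(int x)  { return 2 * x; }

int main(void)
{
    int xs[] = { 2, 4, 5 }, out[3], i;

    printf("every even? -> %d\n", loop(xs, 3, even_p, 0, 0, NULL));  /* and: 0 */
    printf("some even?  -> %d\n", loop(xs, 3, even_p, 1, 0, NULL));  /* or:  1 */
    loop(xs, 3, twice, NO_STOP, 1, out);                             /* map     */
    for (i = 0; i < 3; i++)
        printf("%d ", out[i]);                                       /* 4 8 10  */
    printf("\n");
    return 0;
}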

Some tempting extensions: argument arrangements

I also envision other argument arrangements that aren't part of the Kernel spec, but which I've found useful over the years:

Ragged

This is the indeterminate-length variant of "Donut" and "Rectangular". It operates on "ragged" transposed args that need not all be the same length, and may in fact be streams.

It produces a stream. Ie, each iteration it returns:

(cons value ($lazy (start the next iteration)))

It terminates when any input stream or list runs out of arguments. It returns:

($lazy (termination-combiner streams))

where `termination-combiner' is a parameter. That is done to reveal the final state, which the caller may be interested in.

It is allowed to never terminate, since a caller need not evaluate the entire stream.

Combiners of this type would have the same name as donut combiners plus "-ragged", eg "map-ragged".

Indexed

Operate on transposed args plus an "index" argument. This is just providing an argument that already exists internally. It doesn't constrain the order of operations.

Blackboard - probably not

Operate on transposed args plus a "blackboard" argument. Intended for providing and collecting results that don't fit neatly as mapped returns. It maintains the overall control structure of visiting each element of a list, but can collect results in more varied ways.

In its favor as an argument arrangement, most of the families combine with it to provide useful behavior: Stop on false, stop on true, or keep going for side effects.

Against it, it's not clear that it can co-operate coherently with the `map' family, which has no specified order of evaluation.

It also seems to require another argument arrangement underneath it.

And finally, it's not clear that it is sufficiently distinct from `reduce' to be worth having. And blackboard is an argument arrangement, while reduce is proposed as a family. That suggests that something is wrong with one or both.

So probably no to "blackboard". Some variant of "reduce" will likely do it instead.

Some tempting extensions: Families

Reduce

In reduce, the value cascades thru each element. Though superficially related to the list-neighbors argument arrangement, it's not the same thing. It considers the elements one by one, not two by two. Its binary combiner gets two arguments only because one is an accumulator.

In its favor as a family, it already comes in counted and donut versions.

Standing against it is that its counted form is abnormally a non-op arrangement instead of a unary or rectangular arrangement. But non-op can be considered a specialization of `unary' with `identity' as the operative.

And it's not clear that an `eval' arrangement could ever make sense. But again, the `eval' arrangement is a specialization of unary, with `eval' as the operative and the single argument as the form.

So I'll wait and see whether reduce can fit in or not. But it looks promising.

Footnotes:

1 mapeval is internal, used by the C code for the `eval' operation. However, I see no problem exposing it and it might be useful.