18 June 2011

Kernel suggestions about modules

Two suggestions about Kernel modules

Foreword

The first part of this was inspired by my desire to bundle testing submodules with Kernel modules. The second part was inspired by my email exchange with John Shutt after I sent the first part to him.

An issue about modules

Klink is at the point where I'm starting to think about built-in test support. (Till now it has relied entirely on an external tester I wrote for emacs.) That brought up an issue about modules.

The dilemma

As I see it, the dilemma (multilemma) is this:

  1. IMO as a tester, tests ought to group with modules, for many reasons.
  2. I doubt it makes sense for modules to have canonical names or identities, so I can't tell the tests about the module; I have to tell the module about the tests.
    • So a test harness needs to be able to look at an environment and find the tests in it, if any. This probably implies they live somewhere in that environment.
  3. Reserving a name to always have a special meaning, such as "tests", seems wrong for many reasons.
  4. make-keyed-static-variable wants to make a fresh environment.
    • That requires always loading the tests first, which is problematic at best.
    • Even if I can load a module before I load the tests for it, I'd still need to maintain a mapping from module to tests.
    • That makes it impossible to define tests incrementally.
  5. I could bind an object that a test-definer would write into. Say, an environment (in fact I will, to name sub-tests). But I'd still have to always place the binder around the entire module.
    • It's an error opportunity, having to always remember to do that.
    • It's structurally noisy.
    • The same would have to be done for every functionality that wants to let individual modules "say something about themselves".
  6. I could fake it with gensyms, but with all the keyed-variable support available, that'd be a shame.

Possible solutions

Keyed setters

Have make-keyed-static-variable also make a setter, something with semantics similar to:

($vau (value) env ($set! env KEY value))

where KEY refers to the shared secret. If the "binder" return is analogous to `$let', this would be analogous to `$set!'. This would not make a fresh environment.

  • Con: Creates uncertainty and error opportunities about what environment is being defined into.
  • Con: Doesn't cooperate with constructs like `$provide'.
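
For concreteness, a rough sketch of how the extended return value might be used. The three-element return and the setter are the proposal itself, not existing Kernel or Klink:

;; Hypothetical: a setter as a third element, alongside the usual
;; binder and accessor.
($define! (binder accessor setter) (make-keyed-static-variable))

(setter (make-environment))   ;; binds the keyed variable in the current
                              ;; environment; no fresh environment is made
(accessor)                    ;; now finds that binding right here
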
Let accessors have defaults

Another possibility is for the accessor to optionally, if it finds no binding, make one and record it. Presumably it'd evaluate something to make a default.

  • Con: Same cons as above.
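
Roughly, the accessor might take an optional default-maker, run it when no binding is found, and record the result. The signature here is the proposal, not existing Kernel:

($define! (binder accessor) (make-keyed-static-variable))

;; Hypothetical optional argument: if the keyed variable is unbound
;; where this is called, run the default-maker, record its result as
;; the binding, and return it.
(accessor ($lambda () (make-environment)))
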
Smarter modules

Taking horn #5 of the multilemma as a first draft, provide a construction that surrounds a module with all the binders it should have.

Issues:

  • Its identity:
    • Is the entry point `get-module' with a larger mandate?
    • Or is it a separate thing? IMHO no.
    • And should this be available on its own? Meaning "bind all the usual things but don't load anything". IMHO yes.
  • So which binders should it use?
    • Interested ones are somehow registered externally.
  • What happens for binders that are defined after a module is loaded?
    • Are they missing forever? That seems unfortunate.
    • Alternatively, they could behave as if their binders had been used in the first place, since nothing can have accessed them yet (any such access would have been an error).
      • Pro: This neatly handles circular dependencies and even self-dependencies.
  • How may they be registered?
    • If any combiner of the proper signature can be registered, stray code could be registered and subtly change the behavior of all sorts of modules. That'd be a serious problem.
    • So ISTM registering should be something only make-keyed-static-variable or similar can do. We know it makes a normal, harmless binder.
  • What specifically registers a binder?
    • make-keyed-static-variable, on some optional argument?
    • A relative of make-keyed-static-variable that always registers the binder?
      • `make-keyed-module-variable'

Coda

I lean towards the 3rd solution, smarter modules.

What do you think?

Late addendum

This implies that `make-keyed-module-variable' takes code to make the initial object, so I better mention that. Probably it's a combiner that's run with no arguments in the dynamic extent of the call to `get-module'.
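
A sketch of how this might read; `make-keyed-module-variable' does not exist, and its return value (just the accessor, since the binder is registered rather than handed back) is an assumption for illustration:

;; Hypothetical: registers its binder with get-module and returns only
;; the accessor.  The $lambda is the initializer, run with no arguments
;; in the dynamic extent of the call to get-module.
($define! find-tests
   (make-keyed-module-variable ($lambda () (make-environment))))

;; Inside any module loaded by get-module, a test definer can then
;; write into the registry that a test harness would look for:
($set! (find-tests) smoke-test ($lambda () (equal? (+ 1 1) 2)))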

Reloading with secrets intact

ISTM surely there are situations where one wants to reload a module but keep its "secrets" (keyed variables, encapsulation types) as they are. The alternatives seem unacceptable:

  • Somehow prove that there's never a situation where one wants to reload a module that has secrets.
    • Fatal con: There surely are such situations, eg minor bugfixes.
  • Require reloading everything to make sure that every instance of every encapsulation type etc is fully recreated.
    • Con: This makes it very painful to use the interpreter interactively. It would be like having to restart emacs every time you fix an elisp bug.
  • Track everything affected and reload just those things.
    • Con: Seems hugely difficult to track it all.
    • Con: Still might require so much reloading that it's effectively a restart.
  • Name keyed variables etc in the usual Scheme way
    • Fatal con: defeats their purpose.

A possible solution sketched in terms of the high-level behavior

Let secret-makers (make-keyed-static-variable, make-keyed-dynamic-variable, make-encapsulation-type) optionally be passed a symbol and a version-number (integer). Call those that aren't passed this argument "anonymous secret-makers".

Let each module retain, between loads in the same session, a mapping from (symbol version-number) to private info about the respective secret-maker. Anonymous secret-makers don't participate in the mapping.

When a secret-maker runs, if its symbol and version-number match an earlier run, then the elements that it returns are to be `eq?' to what that earlier run returned, as if the "two" secret-makers were one and the same.

Anonymous secret-makers never satisfy that test. They behave as if they had a new symbol each time.

It is legal for secret-makers to have the same version-number across source-code changes, but then if changes occur within the same session, proper update behavior is not guaranteed.
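
Schematically, and only as a sketch (the optional name and version arguments are the proposal, and exactly how the symbol reaches the applicative is left open):

;; First load of the module in a session:
($define! (tests-binder tests-accessor)
   (make-keyed-static-variable my-module-tests 1))

;; Reloading the module in the same session passes the same name and
;; version, so the binder and accessor returned are eq? to the first
;; load's.  Bumping the version, say
;;    (make-keyed-static-variable my-module-tests 2)
;; deliberately discards the old secret.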

Rationale: This allows "secret-makers" to usually carry over automatically, yet allows them to be overridden when desirable, eg when their old definition is wrong.

The version-number is separate to avoid making the user create a new name for each version. In principle it could also let an interpreter react intelligently to "going backwards", or warn on missing redefinitions without also giving false warnings for new versions.

We could instead have required the interpreter to treat source-code changes as new versions, but that seems an unreasonable burden: it raises issues of code equivalence and takes control away from the user. Still, an interpreter is allowed to do this, and since it is legal for secret-makers to keep the same version-number across source-code changes, doing so requires nothing special.

Versioning a secret-maker requires changing source code, because the version-number lives in source code. This is less than ideal, because what's really being expressed is a session property, not a source property. But it is generally reasonable.

We require only that the elements that it returns be `eq?'. Requiring that the whole return value be `eq?' seems unnecessary, though it probably falls out.

Enabling mechanism: Cross-load memoization

All of the above can be accomplished by cross-load memoization, which more generally makes it possible to keep all sorts of objects `eq?' across multiple loads.

This requires mostly:

  • That modules retain an object across repeated loads.
  • That that object, relative to the module being loaded, be accessible for this purpose.
  • That the object be an environment, because it will map from symbol to object.
  • Code that (sketched below, after this list):
    • Takes a (name . version) argument
    • Accesses the above environment relative to the current module. It's the only thing that can access it.
    • Checks the version
    • Re-uses the old value if appropriate
    • Otherwise calls code to create a new object
    • Records the current (name . version) and value
  • A recipe for using the secret-makers this way. Maybe simply:
    ($module-preserve (my-unique-name 1) (make-keyed-static-variable)) 
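
A minimal sketch of that combiner, assuming `get-module' binds one retained environment per module through `preserve-binder'; the names and the table layout are illustrative, not existing Klink code:

($define! (preserve-binder preserve-accessor)
   (make-keyed-static-variable))

($define! $module-preserve
   ($vau ((name version) maker) env
      ($let ((store (eval (list preserve-accessor) env)))  ;; retained env
         ($if ($binds? store table)
              #inert
              ($set! store table ()))                      ;; first use: empty table
         ($let ((entry (assoc name ($remote-eval table store))))
            ($if ($and? (pair? entry) (equal? (cadr entry) version))
                 (cddr entry)                    ;; same name+version: reuse
                 ($let ((value (eval maker env)))
                    ($set! store table
                           (cons (list* name version value)
                                 ($remote-eval table store)))
                    value))))))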
    

How should this object be shared?

Code defined in other modules shouldn't normally see this or use it, so these objects are not shared dynamically. They are shared statically. That implies affecting the environment that `get-module' makes, that loading runs in.

Presumably we'd use `make-keyed-static-variable' and share the binder with `get-module' and the accessor with `$module-preserve'.
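
Concretely, and still only as a sketch with hypothetical names, the loading side might look like this; `retained-env' is the per-module object kept between loads, and `preserve-binder' is the binder half from the sketch in the previous section:

($define! load-module-with-secrets
   ($lambda (evaluate-source retained-env)
      ($let ((load-env (preserve-binder retained-env
                                        (make-kernel-standard-environment))))
         ;; The real get-module would read and evaluate the module's
         ;; source here; a combiner is passed in just for illustration.
         (evaluate-source load-env)
         load-env)))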

Sketch of actual requirements on Kernel

  • The standard environment would contain `$module-preserve', defined as above.
  • `get-module' or similar would take an additional argument, the object, which it would make statically available to `$module-preserve'.
  • Any future `require' type mechanism would map module names to these objects.
    • It would create new ones for new modules.

04 June 2011

Review: The Irish Tenors: Live in Belfast

The Irish Tenors: Live in Belfast

I didn't like this CD quite as much as I liked Ellis Island, but nevertheless it grew on me.


It's by The Irish Tenors. It's Finbar Wright's debut with the group. He replaced John McDermott, but McDermott still sings on 2 tracks.

What I liked

The Percy French medley was a lot of fun. It's got 3 of French's sunniest tunes. That includes The Lay Of the West Clare Railway aka Are Ye Right There, Michael?. I recommend following the link for the story of Percy French and that railway. Rarely has a late train resulted in so much embarrassment for a railway.

I hadn't heard Mary From Dungloe before; it's quite lyrical.

I liked McDermott on The Last Rose Of Summer. It is the right song for him.

Some familiar songs done well: The Fields Of Athenry, Red Is The Rose, The Kerry Dancers, Will Ye Go, Lassie, Go?

What I didn't like

One disappointing track was Carrickfergus. I've liked the song since I spotted it in a little book of Irish songs 20 years ago. It stands out as lyrical and melodic even among Irish songs.

But I felt that arranger Frank McNamara, whom I normally like, overdid the arrangement, or rather, overdid half of it. The low sections have a nice open feel to them. I liked those. The high sections are accompanied by big orchestral flourishes. I understand what he was trying to do, but IMO those orchestral tuttis were wrong for this lyrical song.

Also, some of the less familiar tracks don't have much to offer. Maybe they'll grow on me too, who knows?

03 June 2011

Fairchy: Further thoughts

Further thoughts on Fairchy

In my earlier post about Fairchy[1], I said that it fixed the FAI boxing issue: it didn't so much solve it as do an end run around it.

That's true but I didn't explain it very well. The crucial idea is not that we've found and removed all avenues for "escaping the box". It's that we've brought our AI's goal out of the messy real world and into a system that we control. It wants to profit in a futarchy-style market. That's simple.

There's always a question of whether its goal matches our own collective human goals. Always, in any workable design of this nature, there must be selective pressure towards our own collective goals and that pressure must always outrun all selective pressure towards misguided "clippy" goals.

In Fairchy, the selection is not done in the messy real world, nor (fundamentally) in the unfathomable inner workings of an AI's rulebase, but in a marketplace that profits by serving human preferences.

What this implies

I partly covered this before, but I want to make it clearer and explore it more thoroughly.

"It wants to profit in a futarchy-style market". So the basic security threat here is that a speculator AI would lose the spirit of this rule and so its goals would change, probably in a harmful direction.

That threat seems to break down into these subheadings:

  1. A speculator finds a way to reward itself despite losing in the market.
  2. A speculator finds profitable behaviors within the market that go against the spirit of the utility function. Perhaps it enacts proposals which compromise the utility metric or compromise the market, thus creating a profit even though they "should" create a loss.
  3. A speculator perversely engages in unprofitable behaviors within the market, enough to create a real problem before it is stopped.
  4. A speculator finds a more direct way to alter the real world which affects the market to the AI's profit. Possibly it makes deals with outside entities.

Addressing these threats

Rewards itself despite losing in the market

This threat is more mechanical than the others. Breaking it down further, it's either:

  • A breakdown in the AI itself.
  • A breakdown in what communicates market results to the AI.
  • A breakdown in the population selection algorithm.

Breakdown in the AI

A short-circuit, though it would probably occur in software. Here the AI tells itself it's doing great in the market when it isn't. For a while it lives in a fool's paradise making crazy bets.

Already, this is handled by the fact that speculators are individuals in a population-style learning algorithm (think genetic algorithm). Speculators that do not heed the market will eventually go broke and be replaced.

But we'd like to handle it faster, if possible. So the selection process might be augmented to actively detect broken AIs. I'll leave that as a future possibility.

Breakdown in what communicates market results

Here, the population algorithm doesn't help us because this might affect all AIs, and because it might not be the fault of the AI affected.

But it's largely a maintenance and channel-robustness problem. The protocols involved should be robust. Presumably we'd design them with such obvious steps as:

  • periodic pinging - are the connections alive?
  • checking - is what we received the same as what was sent?
  • periodic auditing - does the history add up the way it ought to?

Breakdown in population selection

This area is crucial. Fortunately a population selection module would be much simpler than the "real" AIs, which helps security.

Some security measures are obvious:

  • Keep the population selection processes isolated from pretty much everything.
  • Make them robust.
  • Make their decisions inescapable. Whatever machines host speculator processes need to absolutely respect the population selection's decisions to remove some speculators and add others.

There should also be dynamic and reactive security, and the measured security of this area needs to be part of the utility metric.

Finds behaviors within the market against the spirit

This reminds me that I left out a crucial role earlier: Proposer. The Proposer role is what makes the proposals that the market bets on.

A severe threat is lurking here. As I've repeatedly pointed out wrt Futarchy[2], the proposer and speculator roles can collude in ways that can rob the market or enact arbitrary proposals. I call this the Opacity Problem[3].

So the proposer and speculator roles need to be separate. Yet those two roles are working from largely shared information and models. They benefit a lot from sharing information. So as before, I propose Speculator support modules to deal with this situation; I'd just extend them to support proposers too.

But keeping them separate isn't enough: if there exists any channel by which proposer and speculator can co-ordinate, the Opacity Problem can happen. So while my design keeps these two roles separate, that will only help a little, it won't suffice.

So my design includes the various means I have proposed of dealing with the Opacity Problem:

  • Measuring uncertainty in meaning via the market, and disadvantaging unclear proposals.
    • I proposed separate markets to measure uncertainty, but my proposal was complex. I now favor what Raph Frank proposed on my futarchy mailing list, separate markets that pay off with exponents near 1.0, for instance X^1.1 and X^0.9.
  • Requiring a certain amount of capitalization before enactment, in addition to price comparisons.
  • Controlled language for proposals
  • A hierarchy of policy "power", with more powerful levels having stronger restrictions on clarity.
  • (Etc. I've discussed anti-Opacity Problem measures at more length on my futarchy mailing list)

Perversely engages in unprofitable behaviors

This is the least of the threats. There's constant selective pressure against this. Other AIs can probably profit by detecting or anticipating such behavior.

But there is a subtle threat, one that you might not expect: Other AIs, since they can profit by detecting or anticipating such behavior, find a way to encourage it, and then of course profit by blocking enactment. They may well find that this scam works best when occasionally the rogue proposal gets enacted, the better to prove their wisdom in betting against it.

In other words, they compete and fight dirty and hurt us in the process.

One approach would be to change the reward function so that part of it reflects group success. But that isn't possible. In a population-style learning algorithm, reward is zero-sum.

Approaches that might work:

  • Like the above, but reward wrt smaller groups, and somehow arrange that opportunities for influencing weaker competitors occur mostly or only within these groups.
  • Like the above, but a low overall reward results in different behavior by the population algorithm. Possibly it results in more mutation in the next generation.
  • Provide a more direct way of taking advantage of weaker competitors, one that does not result in actually enacting proposals that the weaker competitors foolishly support. Possibly the bets of newborn competitors do not help a proposal be enacted.
  • Prevent or weaken any mechanism for colluding to allow rogue proposals to pass. Anonymous betting may suffice.

Finds a way to alter the real world

This is the most direct threat. For a human speculator, it would look like (say) shorting a company's stock and then blowing that company's buildings up and making a profit. And then hopefully being caught and punished, deterring other saboteur profiteers.

For an AI, it's less straightforward. An AI speculator lives in virtual space; it wouldn't have any robot arms with which to blow things up. Its communications would be limited and easy to monitor. Its bank account could only be used for the few legitimate purposes it has: betting, buying information, not much more.

Still, there is a real risk that an AI would co-ordinate with outside entities and profit from destruction. This is essentially like human crime, and might be addressed by the same general types of measures:

  • Prevention
    • Remove the temptation
      • Require proposals to be cast in a form such that speculators can't profit from sabotage and extortion.
    • Detect and stop the plans before they are acted on. Since we can monitor speculator AI communications, they are at a disadvantage in this. But they could still use encryption and steganography.
    • Detect AIs at risk for doing this. There's a measure that can be used with AIs, as I proposed before: run the AI with fake temptation and see if it succumbs. That's what the Tempter and Temptation-faller roles are about.
  • Deterrence
    • Investigation and subsequent punishment, presumably removal via the population algorithm.
  • Systemic measures
    • Add to the Security Testers role a sub-role to measure this threat.
    • Add a component to the utility metric that measures risk of such AI-crimes.

Footnotes:

1 And now I wish I'd left more anchors in the first fairchy post, because now I want to link to different parts of it and I can't.

2 And found to my shock and chagrin that Robin Hanson was incapable of understanding the Opacity Problem. I tried eight times and finally gave up explaining it to him when he became insulting.

3 Basically the exploit is that the proposer makes a proposal that only he can decode. Sometimes beneath the opaque exterior it's an honest proposal or a copy of one, sometimes it's "gimme all the money". If others bet against it, he can basically rob them. If not, he can enact it (and so rob everyone).

20 May 2011

Automatic forcing of promises in Klink

Automatic forcing of promises in Klink (addendum)

I had meant to explain also about auto-forcing in `$let' or `$define!', but for some reason I didn't. So I'm adding this now.

Background: In Kernel, combiners like `$let' and `$define!' destructure values. That is, they define not just one thing, but an arbitrarily detailed tree of definiendums.

So when a value doesn't match the tree of definiendums, or only partly matches, but the part that doesn't match is a promise, Klink forces the promise and tries the match again.

Unlike argobject destructuring, this doesn't check type.
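
For example, under the intended Klink behavior (a sketch of the semantics, not standard Kernel):

;; The tree (a b c) does not match the promise itself, so Klink forces
;; the promise, gets (1 2 3), retries the match, and binds a=1, b=2, c=3.
($define! (a b c) ($lazy (list 1 2 3)))

;; Partial match: a matches 1, then (b . rest) fails against the inner
;; promise, which is therefore forced to (2 3), giving b=2 and rest=(3).
($define! (a b . rest) (cons 1 ($lazy (list 2 3))))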

Automatic forcing of promises in Klink

Automatic forcing in Klink

As I was coding EMSIP in Kernel, I realized that I was spending entirely too much time and testing effort managing the forcing of promises.

I needed to use promises. In particular, some sexps had to have the capability of operating on what followed them, if only to quote the next sexp. But I couldn't expect every item to read its tail before operating. That would make me always read an entire list before acting. This isn't just inefficient, it is inconsistent with the design as an object port, from which objects can be extracted one by one.

What I do instead

So what I have now is automatic forcing of promises. This occurs in two destructuring situations. I've coded one, and I'm about to code the other.

Operatives' typespecs

Background: For a while now, Klink has been checking types before it calls any built-in operative. This operation can check an argobject piece by piece against a typespec piece by piece. That destructures it treewise into arguments that fill an array which is exactly the argument list for the C call. It's very satisfactory.

Now when an argobject doesn't match a typespec, but the argobject is a promise, the destructure operation arranges for the promise to be forced. After that comes the tricky part. While the destructuring was all in C, it could just return, having filled the target array. But now it has to also reschedule another version of itself, possibly nested, and reschedule the C operation it was working towards.

All fairly tricky, but by using the chain combiners and their support, and by passing suitable arguments to destructure, I was able to make it work.
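
The visible effect, as a sketch of the intended behavior rather than anything Kernel specifies:

;; A built-in whose typespec wants numbers now accepts a promise of one:
;; the destructurer notices the mismatch, forces the promise, and retries.
(+ 1 ($lazy (* 2 3)))   ;; => 7, instead of a type error on the promise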

Defining

As in `$let' or `$define!'. I'm about to code this part. I expect it to be along similar lines to the above, but simpler (famous last words).

Status

I haven't pushed this branch to the repo yet because I've written only one of the two parts, the destructuring. That part passes the entire test suite.

I haven't yet tried it with EMSIP to see if it solves the problem.

Does it lose anything?

ISTM this does not sacrifice anything, other than the {design, coding, testing, debugging} effort I've spent on it.

Functionality

It subtracts no functionality. `force' is still available for those situations when manual control is needed.

Restraint

The opposite side of functionality. Does this sacrifice the ability to refrain from an action? No. In every circumstance where a promise is auto-forced, the alternative would have been an immediate error, so there never was a capacity to do the same thing while refraining from forcing.

But does it sacrifice the ability to make other code refrain from an action? No, the other code could have just called `force' at the same points.

Exposure

Does this expose whether a promise has been forced? No, not in any way that wasn't already there. Of course one can deduce that a promise has been forced from the fact that an operation has been done that must force that promise. That's always been the case.

Code size

The init.krn code is actually slightly smaller with this. The C code grew, but largely in a way that it would have had to grow anyways.

13 May 2011

FAIrchy diagram

FAIrchy diagram

I wrote yesterday about FAIrchy, my notion that combines FAI and futarchy. Here is an i* diagram that somewhat captures the system and its rationale. It's far from perfect, but captures a lot of what I was talking about.

Many details are left out, especially for peripheral roles.

Link to this diagram

Some technical comments on this diagram

I felt like I needed another type of i* goal-node to represent measurable decision-market goal components, which are goal-like but are unlike both hard and soft i* goals. Similarly I wanted a link-type that links these to measurement tasks. I used the dependency link, which seemed closest to what I want, but it's not precisely right.

There's some line-crossing. Dia's implementation of i* makes that inevitable for a large diagram.

FAIrchy

FAIrchy[1]

In this blog post I'm revisiting a comment I made on overcomingbias[2]. I observed that Eliezer Yudkowsky's Friendly Artificial Intelligence (FAI) and futarchy have something in common: they are both critically dependent on a utility function that has about the same requirements. The requirements are basically:

  • Society-wide
  • Captures the panorama of human interests
  • Future-proof
  • Secure against loophole-finding

Background: The utility function

Though the utility functions for FAI and futarchy have the same requirements, thinking about them has developed very differently. The FAI (Singularity Institute) idea seems to be that earlier AIs would think up the right utility function. But there's no way to test that the AI got it right or even got it reasonable.

In contrast, in talking about futarchy it's been clear that a pre-determined utility function is needed. So much more thought has gone into it from the futarchy side. In all modesty, I have to take a lot of the credit for that myself. However, I credit Robin Hanson with originally proposing using GDP[3]. GDP as such won't work, of course, but it is at least pointed in the right general direction.

My thinking about the utility function is more than can be easily summed up here. But to give you a general flavor of it, the problem isn't defining the utility function itself, it's designing a secure, measurable proxy for it. Now I think it should comprise:

  • Physical metrics (health, death, etc)
  • Economic metrics
  • Satisfaction surveys.
    • To be taken in physical circumstances similar to secret-ballot voting, with similar measures against vote-selling, coercion, and so forth.
    • Ask about overall satisfaction, so nothing falls into the cracks between the categories.
    • Phrase it to compare satisfaction across time intervals, rather than attempting an absolute measure.
    • Compare multiple overlapping intervals, for robustness.
  • Existential metrics
  • Metrics of the security of the other metrics.
  • Citizens' proxy metrics. Citizens could pre-commit part of their measured satisfaction metric according to any specific other metric they chose.
    • This is powerful:
      • It neatly handles personal identity issues such as mind uploading and last wills.
      • It gives access to firmer metrics, instead of the soft metric of reported satisfaction.
      • It lets individuals who favor a different blend of utility components effect that blend in their own case.
      • May provide a level of control when we transition from physical-body-based life to whatever life will be in the distant future.
      • All in all, it puts stronger control in individual hands.
    • But it's also dangerous. There must be no way to compel anyone to proxy in a particular way.
      • Proxied metrics should be silently revocable. Citizens should be encouraged, if they were coerced, to revoke and report.
      • It should be impossible to confirm that a citizen has made a certain proxy.
      • Citizens should not be able to proxy all of their satisfaction metric.
  • (Not directly a utility component) Advisory markets
    • Measure the effectiveness of various possible proxies
    • Intended to help citizens deploy proxies effectively.
    • Parameterized on facets of individual circumstance so individuals may easily adapt them to their situations and tastes.
    • These markets' own utility function is based on satisfaction surveys.

This isn't future-proof, of course. For instance, the part about physical circumstances won't still work in 100 years. It is, however, something that an AI could learn from and learn with.

Background: Clippy and the box problem

One common worry about FAI is that when the FAI gets really good at implementing the goals we give it, the result for us will actually be disastrous due to subtle flaws in the goals. This perverse goal is canonically expressed as Clippy trying to tile the solar system with paper clips, or alternatively with smiley faces.

(Image: Clippy the paper clip)

The intuitive solution is to "put the AI in a box". It would have no direct ability to do anything, but would only give suggestions which we could accept or disregard. So if the FAI told us to tile the solar system with paper clips, we wouldn't do it.

This is considered unsatisfactory by most people. To my mind, that's very obvious. It almost doesn't need supporting argument, but I'll offer this: To be useful, the FAI's output would certainly have to be information-rich, more like software than like conversation. That information-richness could be used to smuggle out actions, or failing that, to smuggle out temptations. Now look how many people fall for phishing attacks even today. And now imagine a genius FAI phishing. A single successful phish could set in motion a chain of events that allows the FAI out of the box.

FAIrchy: The general idea

What I propose is this: The most important AIs, rather than directly doing things or even designing and advising, should be traders in a futarchy-like system. As such, they would in effect govern other AIs that design, advise, and directly do things.

At first, they'd be trading alongside humans (as now). Inevitably with Moore's Law they'd dominate trading, and humans would only use the market to hedge. By then, AIs would have organically evolved to do the right (human-satisfying) thing.

Treat these AI traders as individuals in a population-style search algorithm (think genetic programming). Select for the most profitable ones and erase those that overstepped their roles.

Advantages

  • There's a built-in apprenticeship stage, in that the AIs are basically doing their eventual job even in the early stages, so any striking problems will be apparent while humanity can still correct them.
  • We get the advantage of a reasonable satisfaction metric up front, rather than hoping AIs will design it well.
  • These AIs have no incentive to try to get themselves unboxed. Earlier I talked about subtly perverse utility functions. But with these, we understand the utility function: make a profit in the decision markets. They can't go subtly off the track of human happiness, because that's not even the track they're intended to be on. We do need to make sure that corrupting the utility metric can't pay off, of course, but that's not a new issue.
  • The AIs would learn from people's real satisfaction, not just from theoretical projections.

About the separate AI roles

In general

The healthy performance of each role should be a component of the overall utility function.

Separation of roles: Why

Don't allow mingling of AI roles, and especially not the speculators role and the security-tester role. The threat here is that a speculator AI that also moves in the real world may find a way to short-circuit the system for profit. For instance, it might find a way to distort the satisfaction reports, or destroy things corresponding to issues it had shorted.

Put a different way, we don't want the various roles to co-evolve outside of their proper functions. We never want a situation where one role (say, security) is compromised because, on the whole, it's more profitable to compromise it and profit somewhere else (say, in speculating).

Effectively, this separation creates a sort of distributed system that includes us and our satisfaction metric. This was never a desideratum but it is encouraging.

Separation of roles: How

Of course we'd use the obvious physical and computational security measures. We'd run the trader AIs in a way that lets us physically monitor their traffic and operations. Probably they'd be run in virtual machines.

I'll talk about other measures in the next section, in regard to specific roles.

The AI roles

Speculator

The role described above. Trades in a big futarchy decision market.

Effector

AIs that "do things". This includes probably every role you pictured an AI in before now.

They, like society, would be governed by the FAIrchy system. This probably cannot be improved on because it subsumes every other conceivable proposal for making rules for them or turning them off.

Sub-speculator

Sub-speculators also trade in a big futarchy decision market. Their purpose is to trade against trivial market patterns, for instance the "January slump", so that the speculators can focus only on meaningful patterns.

Sub-speculators would be deliberately limited in information and processing power, so that they couldn't out-think the real speculators, but must simply react faster and more lightly.

Several possible means of effecting this situation:

  • These might be "brokers" thru which the speculators trade.
  • They might simply be allowed quicker access.
  • They might be run in short bursts so they have no time to think deeply but plenty of CPU power to react fast.

Tempter

To keep roles from mingling, we might use defect seeding: mechanically present AIs with apparent temptations to overstep their role, erase the ones that take the bait, and subtract fitness from their relatives.

This would eventually be done by other AIs that constructed the temptations and were rewarded accordingly. Running them in a virtual machine helps: clone the AI without the copy knowing it's a clone.

Temptation-faller, etc

To keep the tempters themselves honest, the temptations would also be applied to AIs that look like speculators but are known to fall to temptation. They might be copies of speculators that fell and were turned off. If they didn't fall, we'd know the temptations weren't effective.

Less crucially, known honest AIs would be used too and expected not to fall, so that temptations are kept fair.

Temptation-faller designers would also need testing; this can regress quite a bit.

Information provider

We'd remove any incentive to get out into the world to look for better information by feeding all the speculators the best-quality information available.

This would also remove any incentive for outside competition that doesn't follow the non-mingling rules. So we are looking for a self-sustaining monopoly on best information-for-AI-speculators. This condition might form part of the utility function.

Information providers would be rewarded according to both:

  • Overall decision market success
  • Individual orders by speculators

Speculator support modules

When two or more AI speculators find certain computations useful, have them farm off that particular code to a support module. This probably shouldn't be optional. This profits the two speculators because they are now spending half as much computing time for the same information.

A speculator support module becomes a separate entity, obliged to make its results available to any speculator. It would have its own fitness function. If its fitness fell below zero, it could be reabsorbed by a speculator, if it was still wanted. Its fitness wouldn't fall below zero as long as two or more speculators were interested enough to run it themselves.

Security testers

These would try to measure the security of the human satisfaction metric and other sensitive parts of the system. They don't try to improve it or safeguard it, just to test it without compromising it.

They must be separate from speculators in the overall market.

Security test testers

Again, the roles regress but dwindle.

Satisfaction extrapolator (Maybe)

This role is inspired by Eliezer's CEV but structured very differently. Like CEV, these AIs would try to predict human satisfaction in particular future conditions. Here, that would be done in a separate futures market.

But they don't replace the satisfaction metric, nor even directly contribute their prediction to the utility function. Rather, when their extrapolated expectation disagrees significantly with the satisfaction that is reported, it triggers an extra-hard look at the security of the human satisfaction metric. If corruption is found, it triggers penalization of responsible parties.

The extra-hard look should include some metrics that were kept secret, or shared only with the security testers.

These AIs are generally rewarded according to the accuracy of their extrapolation. But they are also deliberately insulated from the results of security testing so that they have no incentive to adjust their extrapolation when security is judged to be weak. This might be effected by subtracting the security result metric out so that the net effect of weak security on their estimates is zero.

Physical upkeep

Physical upkeep of the hardware. Sure, people could do that for a long time, but eventually Moore's Law prevails.

Software upkeep

Runs the population algorithm and the various interactions (tempter, etc). Sure, human-written code could do that for a long time, but again Moore's Law prevails. It should prove new versions to be functionally equivalent to old versions, test new versions in sandboxes, etc.

Footnotes:

1 I had originally called this "futurairchy" and then "futairchy", but both seemed clumsy.

2 Which was moved from overcomingbias.com to lesswrong in the great split.

3 He then proposed GDP+, but he just defines that as GDP plus unspecified other components.