24 June 2011

Trade Logic

Background: Prediction markets

Trade Logic has been kicking around in my head since I learned about Robin Hanson's idea of prediction markets1 around 1991.

A prediction market is a way of aggregating good-faith information about an issue. It's basically:

  • define some issue whose answer isn't known yet but will be,
  • take bets on it,
  • later, when the answer is known, pay off just the bets that were right.

In prediction markets, the bets always occur in opposing pairs, which I call yes and no. I'll illustrate it like this:


In these diagrams, the entire square represents one monetary unit. It's as if the diagram represented a way to partition $1. Not into smaller amounts of money, but into parts called yes and no, which together always add up to $1.
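As a minimal sketch of that partition (the function names `split_dollars` and `payoff` are hypothetical, not part of any existing market software): one dollar decomposes into one yes share and one no share, and once the issue is judged, exactly one of the two is worth the dollar.

```python
from typing import Dict, Optional


def split_dollars(amount: float) -> Dict[str, float]:
    """Decompose `amount` dollars into `amount` yes shares and `amount` no shares."""
    return {"yes": amount, "no": amount}


def payoff(holdings: Dict[str, float], outcome: Optional[bool]) -> float:
    """Pay $1 per share on the side judged right; nothing is paid before judgment."""
    if outcome is None:
        raise ValueError("issue has not been judged yet")
    return holdings["yes"] if outcome else holdings["no"]


pair = split_dollars(1.0)
# Whoever holds the matched pair gets the dollar back whichever way the issue goes.
print(payoff(pair, True), payoff(pair, False))
# Someone who sold the yes share and kept only the no share is paid only on a no.
print(payoff({"yes": 0.0, "no": 1.0}, True))
```

Holding both halves is therefore riskless, which is why a matched yes/no pair can always be exchanged for $1 and back.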

Introducing Trade Logic

As I talked about earlier, I want to add logic operations to prediction markets for certain reasons.

I've already hinted at one piece of the puzzle: each issue represents a separate way to decompose $1. Furthermore, pieces that are the same part of the same decomposition are interchangeable.

In Trade Logic, each issue (each decomposition) also corresponds to a formula. So when I write a or b in the following discussion, it refers to an issue in a prediction market, as well as to a formula and to a way of decomposing $1.

Trade Logic works by assembling the conclusion from parts of the premise(s). There may be parts left over. That is, bettors can buy yes or no of one or more issues and then assemble those pieces in a different way to form another issue.

That's how the logic itself works, but we also have to ask about incentives. Why would traders want to do that? They would if the new issue is trading at a too-high price, to make a profit by arbitrage. As a rule, that sort of situation is deterred and if it does arise it is quickly corrected. So as a consequence of the Efficient Market Assumption, Trade Logic reasons about related issues.

The unary operator not

The unary operator not, in a formula, is equivalent to swapping the sides of the issue. It swaps a yes for a no and vice versa. So a yes of a is the same as a no of ~a.

Basic binary operators to combine issues


Beyond this, we can combine two issues into a third issue. That's equivalent to combining their formulas under a binary operator. To do so, we must define a corresponding decomposition.

I'll call the issues being combined a and b.


This is composed of the a diagram above and a b diagram, which is just about the same as the a diagram except that I have drawn it at right angles to a.



The and operator combines a and b into a new issue, a & b. A yes share of a & b pays off exactly when both a and b would pay off yes. A no share of a & b pays off when either a or b is judged no.


Nothing much changes if one of the issues is finalized before the other. A finalized issue just has a fixed value for yes and no payoffs; usually one is worth 1.0 and the other is worth 0.0.

So if, say, b yes was judged true, it's as if the ~b column of the diagram was eliminated and the b column was stretched to cover the entire square.



Similarly, the or operator combines a and b into a new issue, a V b, which pays off yes when either a or b pays off yes.



We're interpreting if as material implication, so

a -> b

means the same as

~a V b
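The operators so far can be sketched as payoff functions. This is only an illustration, restricted to fully judged issues where a yes share is worth either 0.0 or 1.0; the function names are hypothetical. Each function returns the value of a yes share of the combined issue, and the matching no share is always worth one minus that.

```python
from itertools import product


def not_(a: float) -> float:
    """Yes of ~a is the no of a."""
    return 1.0 - a


def and_(a: float, b: float) -> float:
    """Yes of a & b pays only when both a and b pay yes."""
    return a * b


def or_(a: float, b: float) -> float:
    """Yes of a V b pays when either a or b pays yes."""
    return a + b - a * b


def implies(a: float, b: float) -> float:
    """Material implication, defined as ~a V b."""
    return or_(not_(a), b)


# Tabulate every combination of judged outcomes for a and b.
for a, b in product((1.0, 0.0), repeat=2):
    print(a, b, and_(a, b), or_(a, b), implies(a, b))
```

The only row where the yes share of a -> b is worthless is the one where a is judged yes and b is judged no.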

So a -> b is just:


Definitions and variables


To use formulas in a practical way, we'd like to be able to define abbreviations for subformulas that are used frequently. For abbreviations to do the job, they need to be parameterized and the parameters need to be sharable. So we need some sort of definitions and variables.

Variables are instantiatable

We need the issues to be bettable. The motivation is not so much to make them decidable, it is to give appropriate values to issues that turn out to be fuzzily true.

Consider an issue that has always held true (yes trades at nearly $1.0), until a single exception to the rule is found. Should that issue now trade at $0? No. That would make traders afraid to touch any universal rule, which might collapse to $0 at any time. But if it's trading at a high price and is treated as universally quantified, bettors can cheaply "prove" that the exception is false, though it's true. This situation is a money pump.

To avoid that dilemma, the rule is that all variables are ultimately instantiatable by operations that select examples, rather than being universally quantified or existentially quantified. That doesn't mean that issues have specific examples associated with them; they don't. It means that there is a set way for finding an example that doesn't rely on somebody hand-picking one.

A selection operation would generally mean drawing a random element from a given distribution. A distribution can be just a single example; then it's just a literal. There will be rules about how random number generation should be done, and how many examples to select, and how to settle (or partly settle) an issue for a given number of examples. I won't expand on the various rules right now. That's Crypto and Information Economy and Statistics.
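To make the idea of partial settlement concrete, here is a rough sketch. The actual rules for random number generation and sample counts are left open above, so everything here (the function name `settle_by_sampling`, the seeded generator, settling at the plain fraction of examples that hold) is an assumption for illustration only.

```python
import random
from typing import Any, Callable


def settle_by_sampling(
    holds: Callable[[Any], bool],
    draw: Callable[[random.Random], Any],
    n_examples: int,
    seed: int,
) -> float:
    """Return the value of a yes share as the fraction of drawn examples that hold.

    `draw` selects one example from the distribution; a distribution that always
    returns the same example behaves like a literal.
    """
    rng = random.Random(seed)  # seeded so that the settlement can be re-checked
    hits = sum(1 for _ in range(n_examples) if holds(draw(rng)))
    return hits / n_examples


# A rule with a single known exception among 100 equally likely cases.
value = settle_by_sampling(
    holds=lambda x: x != 13,
    draw=lambda rng: rng.randrange(100),
    n_examples=1000,
    seed=7,
)
print(value)
```

Under this kind of rule, one exception moves the settlement value only slightly below $1 instead of collapsing it to $0, and nobody gets to hand-pick the example.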

They are instantiatable transitively

I said variables are "ultimately" instantiatable. That means that a given variable might not directly be instantiatable by a selection operation, but it's connected to something that is, or to something that's connected to something (etc).

The treatment here is largely borrowed from Mercury and Prolog.

Predicates have modes. A mode maps each parameter in an argument list to either in or out. A predicate is applied in one of its modes.

For every legal mode of a formula, there must exist an ordering of predicate applications where:

  • At the beginning, every variable is free. It will transition to bound at some point. (If it doesn't, that just means it's irrelevant)
  • For each predicate application:
    • A legal mode for that predicate is used.
    • For every parameter that has instantiation type in in that mode, the associated variable is bound both before and after that application.
    • For every parameter that has instantiation type out in that mode, the associated variable is free before that application and bound after it.
  • For each not operator (and more generally, in negative context):
    • Every variable mentioned outside the scope of the not has the same instantiation before and after the not.

Which ordering to use is not predetermined, but the modes of predicate applications in it are predetermined in order to avoid ambiguity, so that we can't instantiate in two different ways, which would risk getting two different results.
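A brute-force sketch of the ordering search follows. It assumes the Mercury convention, in which an in parameter must already be bound and an out parameter is free beforehand and bound afterwards. It ignores not and negative contexts, and the predicate names in the example are made up.

```python
from itertools import permutations
from typing import Dict, List, Optional, Tuple

# (predicate name, {variable: "in" | "out"}) with the mode already chosen
Application = Tuple[str, Dict[str, str]]


def find_ordering(apps: List[Application]) -> Optional[List[str]]:
    """Return the first ordering in which every application's mode is legal, or None."""
    for order in permutations(apps):
        bound = set()  # at the beginning, every variable is free
        legal = True
        for _name, modes in order:
            for var, mode in modes.items():
                # "in" needs the variable bound already; "out" needs it still free.
                if (mode == "in") != (var in bound):
                    legal = False
            if not legal:
                break
            bound.update(modes)  # every mentioned variable is bound afterwards
        if legal:
            return [name for name, _modes in order]
    return None


apps = [
    ("adult", {"A": "in"}),
    ("age_of", {"P": "in", "A": "out"}),
    ("select_person", {"P": "out"}),
]
print(find_ordering(apps))
```

Here the selection operation has to run first because it is the only application that can bind P; a real implementation would use a dependency sort instead of trying every permutation.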

This is a fairly blocky way of treating modes and I expect it will be treated in a more fine-grained way later.

What definitions are

In some systems, definitions are really axioms in disguise. In Trade Logic, that would be an intolerable hole; bettors would quickly learn to exploit it and ruin the system. So we won't go in that direction.

Our definitions will expand names to parameterized formulas. That is, each definition will map a unique identifier to:

  • A formula
  • A parameter list: A non-repeating list of variables.
  • A set of modes, all legal for that formula with respect to the parameter list.

A definition is used in a formula by giving its name and an argument list of the correct arity in a position where a (sub)formula can appear.

So a definition:

  • is a predicate, not a function. It "returns" a fuzzy boolean, not an object.
  • is not a clause. Unlearn your Prolog for this. A name has one definition, you don't add more clauses later.
  • is total. Its argument list accepts any type of object.
  • is used in some particular instantiation mode.

I don't think we need to require that definitions be in a Tarski hierarchy. I expect Trade Logic to be exposed to many other sources of unclarity besides self-reference. Undecidable issues won't ruin the system. However, we may need to use a Tarski hierarchy for decision markets, which want controlled language and decidable issues.

Selection operations

Selection operations are also predicates. They are used the same way as definitions.

I presuppose a set of primitive selection operations. But that's beyond the scope of what I'm talking about right now.

For purposes of issue decomposition, any primitive selection operation that cannot be satisfied behaves as though it trades at $0 (ie, its yes trades at $0). For instance, "Select a living dodo bird". A selection operation that is satisfied behaves as though it trades at $1.

What a wff is

Having said all that, now I can recursively define a well-formed formula (wff) in Trade Logic as:

  • One of the built-in operations applied to the proper number of wffs:
    • The unary operation (not) applied to a single wff.
    • One of the binary operations applied to two wffs.
  • A predicate application, consisting of:
    • The name of a predicate
    • A list of variables whose length matches that predicate's arity
    • A mode of that predicate
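One way to picture that recursive definition is as a small data type with a checker. This is a sketch only: the class names, the operator spellings and the `arities` table are hypothetical, and the check that a mode is one of the predicate's declared legal modes is reduced to checking its length and entries.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

BINARY_OPS = {"and", "or", "implies"}


@dataclass(frozen=True)
class Not:
    arg: "Wff"


@dataclass(frozen=True)
class Binary:
    op: str
    left: "Wff"
    right: "Wff"


@dataclass(frozen=True)
class Apply:
    predicate: str
    variables: Tuple[str, ...]
    mode: Tuple[str, ...]  # one "in" or "out" per variable


Wff = Union[Not, Binary, Apply]


def is_wff(f: object, arities: Dict[str, int]) -> bool:
    """Check a candidate formula against the recursive definition."""
    if isinstance(f, Not):
        return is_wff(f.arg, arities)
    if isinstance(f, Binary):
        return (f.op in BINARY_OPS
                and is_wff(f.left, arities)
                and is_wff(f.right, arities))
    if isinstance(f, Apply):
        return (arities.get(f.predicate) == len(f.variables) == len(f.mode)
                and all(m in ("in", "out") for m in f.mode))
    return False


arities = {"select_person": 1, "age_of": 2}
formula = Binary("and",
                 Apply("select_person", ("P",), ("out",)),
                 Not(Apply("age_of", ("P", "A"), ("in", "out"))))
print(is_wff(formula, arities))
```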

Which formulas are the same?

Trade Logic does implication by assembling the conclusion from parts of the premise(s). So we want to know when two formulas are the same.

Two formulas represent the same issue (and decomposition) just if they are structurally the same, except allowing arbitrary permutations under and and or.

In other words, put subformulas under and or or into some canonical order before comparing formulas. Then mostly ignore the variables, except that the same respective variables have to appear in the same respective positions.
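A sketch of such a comparison, under stated assumptions: formulas are nested tuples, "permutation" means swapping the two operands of a binary and or or, and variables are compared by renaming them in order of first appearance. Rather than picking a clever canonical order, this version simply tries every swap and keeps the smallest result, which is exponential but adequate for small formulas. All names are hypothetical.

```python
def variants(f):
    """Yield every formula reachable by swapping operands under and / or."""
    tag = f[0]
    if tag == "pred":
        yield f
    elif tag == "not":
        for a in variants(f[1]):
            yield ("not", a)
    else:  # binary operator
        for a in variants(f[1]):
            for b in variants(f[2]):
                yield (tag, a, b)
                if tag in ("and", "or"):
                    yield (tag, b, a)


def rename(f, names):
    """Replace variables by v0, v1, ... in order of first appearance."""
    if f[0] == "pred":
        renamed = tuple(names.setdefault(v, "v%d" % len(names)) for v in f[2:])
        return ("pred", f[1]) + renamed
    return (f[0],) + tuple(rename(sub, names) for sub in f[1:])


def canonical(f):
    """Smallest renamed variant, used as the formula's canonical form."""
    return min((rename(v, {}) for v in variants(f)), key=repr)


def same_issue(f, g):
    return canonical(f) == canonical(g)


a = ("and", ("pred", "p", "X", "Y"), ("pred", "q", "Y"))
b = ("and", ("pred", "q", "B"), ("pred", "p", "A", "B"))
c = ("and", ("pred", "p", "X", "Y"), ("pred", "q", "X"))
print(same_issue(a, b), same_issue(a, c))
```

Here a and b differ only in operand order and variable names, so they denote the same issue; c shares a different variable between p and q, so it does not.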


1 Back then he called it idea futures.

22 June 2011

More on Foreseeing Existential Risks

Earlier I wrote about refuge markets 1. Basically, they are an attempt to estimate existential risks, which my Fairchy design needs to measure.

But Fairchy requires more from its measure of existential risks than refuge markets alone can deliver. It needs to measure all significant existential risks, not just the ones that I am thinking of now.

Add more metrics later? Not so simple.

It's tempting to answer "We'll add those things later when we think of them". But who counts as "we"2? Once the system starts, there will be all sorts of players in it. It is not likely that they would all simultaneously agree to a redesign.

You might suppose that an existential risk would be so universally appreciated that everyone would agree to measure it well. History suggests otherwise. For example, regardless of where you stand on global warming, you can agree that one side or the other resists real measures of that existential risk.

What sort of mechanism?

Since we will need to add new existential risk metrics but can't expect to just all agree, we need a mechanism for adding them. This mechanism must have these properties:

  1. Vested interests in seeing the risk as large or small must not affect the outcome.
  2. It should measure the risks with reasonable information economy; it should neither starve for information nor spend more than the Expected Value Of Perfect Information measuring them.
  3. It must be flexible enough to "see" new risks; beyond that, it should aggregate understanding about new risks. This suggests a decision market solution.

This implies that there is some sort of overarching perspective on existential risks that this mechanism leans on. But that is a circular situation: if we can't measure the specific existential risks, how can we hope to measure the general risk? We can hardly hope to make "our continued existence" an issue in a prediction market. For similar reasons, we can't make continued existence part of the general utility metric.

Not an answer: The personal utility metric

You might suppose, based on the central role of the individual satisfaction reports in Fairchy, that the answer simply falls out: people, preferring of course to live, would proxy part of their satisfaction report to measures of existential risk. But this does not satisfy any of the three properties above. It's really no wiser than voting.

Not an answer: Last minute awareness

There is one general source of information about existential risks: last-minute awareness.

The idea is that at the last minute, doomed people would know either "we saw it coming" or "we never saw it coming". Too late, of course. But previously, a prediction market would have bet on the outcomes. Using that, we would predict not only whether we will have seen it coming, but whether existential risks metrics contemplated

But even though bets would technically be settled before the end of the world, settling them a few days before the end of the world is not much better.

One might say that, since we contemplate refuge markets, the people in the refuges could spend winnings. But that exactly misses the point. The reason we need multiple measures of existential risk is that refuges would save people in some situations and not in others (think of an underground bomb shelter in a flood). For each refuge, the situations where it would save people are exactly the situations that a refuge market can already measure.

So last minute awareness adds nothing.

So what are we missing?

But we humans are aware of existential risks. Collectively we're aware of a great many, some serious, some not. We know about them right now with no special social mechanism helping us. Of course, sometimes we're way off; see whichever side of Global Warming you disagree with. But in principle, if not in widespread social practice, we can understand many of these existential risks.

If it's so hard to predict existential risks in general, how do we do it now?

The answer is that we use analysis and logic, of course. We (some of us) think rationally about these things.

Of course, it's not as simple as exhorting Fairchy citizens to "be rational". Nearly everybody thinks they already are quite rational, and sensible, reasonable and every other mental virtue.

Nor can we simply require that analysis be "scientific" or presented in the form of a scholarly paper. See Wrong by David Freedman for why experts are frequently just plain wrong in spite of all scientific posturing. For analysis to work, it must not be something that a priesthood does and presents to the rest of us.

So I believe that analysis (and its "molecular building block", logic) must be intrinsic to the decision system. If we have that, we can simply3 add analysis-predicted survival as a component of the utility function.

Adding logic to the picture

Even if you follow the field, you probably haven't heard logic in connection with prediction markets or decision markets before. Analysis is seen as something that bettors do privately before placing their bets. It's not seen as something the system ought to support.

I thought about this topic years ago - starting about 1991 when Robin Hanson first told me his idea of prediction markets. I think I know how to do it. I call the idea "argument markets". In the next few posts I hope to describe this idea fully.


1 My version, fixing Robin Hanson's design of them.

2 A good general rule I use in thinking about Fairchy is to not picture myself in charge of it all. I don't picture my friends and political allies in charge, either. I picture the dumbest, craziest, and evillest people I know pushing their agendas with all the tools available to them, and I picture a soulless AI following its programming to its logical conclusion. I always assume there are fools, maniacs, villains, and automatons in the mix. So it doesn't appeal to me to make it all up as "we" go along.

3 There are a few free parameters, but those are preferences, not predictions, so they can be set via the individual satisfaction metric or similar.

18 June 2011

Kernel suggestions about modules

Two suggestions about Kernel modules


The first part of this was inspired by my desire to bundle testing submodules with Kernel modules. The second part was inspired by my email exchange with John Shutt after I sent the first part to him.

An issue about modules

Klink is at the point where I'm starting to think about built-in test support. (Till now it has relied entirely on an external tester I wrote for emacs.) That brought up an issue about modules.

The dilemma

As I see it, the dilemma (multilemma) is this:

  1. IMO as a tester, tests ought to group with modules, for many reasons.
  2. I doubt it makes sense for modules to have canonical names or identities, so I can't tell the tests about the module, I have to tell the module about the tests.
    • So a test harness needs to be able to look at an environment and find the tests in it, if any. This probably implies they live somewhere in that environment.
  3. Reserving a name to always have a special meaning, such as "tests" or "tests", seems wrong for many reasons.
  4. make-keyed-static-variable wants to make a fresh environment.
    • That requires always loading the tests first, which is problematic at best.
    • Even if I can load a module before I load the tests for it, I'd still need to maintain a mapping from module to tests.
    • That makes it impossible to define tests incrementally.
  5. I could bind an object that a test-definer would write into. Say, an environment (In fact I will, to name sub-tests). But I'd still have to always place the binder around the entire module.
    • It's an error opportunity, having to always remember to do that.
    • It's structurally noisy.
    • The same would have to be done for every functionality that wants to let individual modules "say something about themselves".
  6. I could fake it with gensyms but with all the keyed variable support, it'd be a shame.

Possible solutions

Keyed setters

Have make-keyed-static-variable also make a setter, something with semantics similar to:

($vau (value) env ($set! env KEY value))

where KEY refers to the shared secret. If the "binder" return is analogous to `$let', this would be analogous to `$set!'. This would not make a fresh environment.

  • Con: Creates uncertainty and error opportunities about what environment is being defined into.
  • Con: Doesn't cooperate with constructs like `$provide'

Let accessors have defaults

Another possibility is for the accessor to optionally, if it finds no binding, make one and record it. Presumably it'd evaluate something to make a default.

  • Con: Same cons as above.

Smarter modules

Taking horn #5 of the multilemma as a first draft, provide a construction that surrounds a module with all the binders it should have.


  • Its identity:
    • Is the entry point `get-module' with a larger mandate?
    • Or is it a separate thing? IMHO no.
    • And should this be available on its own? Meaning "bind all the usual things but don't load anything". IMHO yes.
  • So which binders should it use?
    • Interested ones are somehow registered externally.
  • What happens for binders that are defined after a module is loaded?
    • Are they missing forever? That seems unfortunate.
    • Alternatively, they could behave as if their binders had been used in the first place, since nothing can have accessed them yet (which must have made an error).
      • Pro: This neatly handles circular dependencies and even self-dependencies.
  • How may they be registered?
    • If any combiner of the proper signature can be registered, stray code could be registered and subtly change the behavior of all sorts of modules. That'd be a serious problem.
    • So ISTM registering should be something only make-keyed-static-variable or similar can do. We know it makes a normal, harmless binder.
  • What specifically registers a binder?
    • make-keyed-static-variable, on some optional argument?
    • A relative of make-keyed-static-variable that always registers the binder?
      • `make-keyed-module-variable'


I lean towards the 3rd solution, smarter modules.

What do you think?

Late addendum

This implies that `make-keyed-module-variable' takes code to make the initial object, so I better mention that. Probably it's a combiner that's run with no arguments in the dynamic extent of the call to `get-module'.

Reloading with secrets intact

ISTM surely there are situations where one wants to reload a module but keep its "secrets" (keyed variables, encapsulation types) as they are. The alternatives seem unacceptable:

  • Somehow prove that there's never a situation where one wants to reload a module that has secrets.
    • Fatal con: There surely are such situations, eg minor bugfixes.
  • Require reloading everything to make sure that every instance of every encapsulation type etc is fully recreated.
    • Con: This makes it very painful to use the interpreter interactively. It would be like having to restart emacs every time you fix an elisp bug.
  • Track everything affected and reload just those things.
    • Con: Seems hugely difficult to track it all.
    • Con: Still might require so much reloading that it's effectively a restart.
  • Name keyed variables etc in the usual Scheme way
    • Fatal con: defeats their purpose.

A possible solution sketched in terms of the high-level behavior

Let secret-makers (make-keyed-static-variable, make-keyed-dynamic-variable, make-encapsulation-type) optionally be passed a symbol and a version-number (integer). Call those that aren't passed this argument "anonymous secret-makers".

Let each module retain, between loads in the same session, a mapping from (symbol version-number) to private info about the respective secret-maker. Anonymous secret-makers don't participate in the mapping.

When a secret-maker is being created, if its symbol and version-number match an earlier version, then the elements that it returns are to be `eq?' to what the earlier version returned, as if the "two" secret-makers were one and the same.

Anonymous secret-makers never satisfy that test. They behave as if they had a new symbol each time.

It is legal for secret-makers to have the same version-number across source-code changes, but then if changes occur within the same session, proper update behavior is not guaranteed.

Rationale: This allows "secret-makers" to usually carry over automatically, yet allows them to be overridden when desirable, eg when their old definition is wrong.

The version-number is separate to avoid making the user create a new name for each version. In principle it could also let an interpreter react intelligently to "going backwards", or warn on missing redefinitions without also giving false warnings for new versions.

We could have instead required the interpreter to treat source-code changes as new versions, but this seems an unreasonable burden and raises issues of code equivalence, and removes control from the user. But an interpreter is allowed to do this, and since it is legal for secret-makers to keep the same version-number across source-code changes, doing so requires nothing special.

Versioning a secret-maker this way requires changing source code, because the version-number lives in source code. This is less than ideal because it's really a session property that's being expressed, not a source property. But it is generally reasonable.

We require only that the elements that it returns be `eq?'. Requiring that the whole return value be eq? seems unnecessary, though it probably falls out.

Enabling mechanism: Cross-load memoization

The above all can be accomplished by cross-load memoization, which further makes it possible to make all sorts of objects eq? across multiple loads.

This requires mostly:

  • That modules retain an object across repeated loads.
  • That that object, relative to the module being loaded, be accessible for this purpose.
  • That the object be an environment, because it will map from symbol to object.
  • Code that:
    • Takes a (name . version) argument
    • Accesses the above environment relative to current module. It's the only thing that can access it.
    • checks version
    • re-uses old value if appropriate
    • otherwise calls to create new object
    • records current (name . version) and value
  • A recipe for using the secret-makers this way. Maybe simply:
    ($module-preserve (my-unique-name 1) (make-keyed-static-variable)) 
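The memoization steps listed above can be sketched outside Kernel. The following is Python rather than Kernel, purely to show the control flow; `ModuleStore` and `preserve` are hypothetical names, a plain dict stands in for the retained environment, and Python's `is` stands in for `eq?'.

```python
from typing import Any, Callable, Dict, Tuple


class ModuleStore:
    """The object a module retains across repeated loads: name -> (version, value)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[int, Any]] = {}

    def preserve(self, name: str, version: int, make: Callable[[], Any]) -> Any:
        """Re-use the old value if (name, version) matches; otherwise make a new one."""
        old = self._entries.get(name)
        if old is not None and old[0] == version:
            return old[1]  # identical to what the earlier load returned
        value = make()
        self._entries[name] = (version, value)  # record current version and value
        return value


store = ModuleStore()                            # survives between loads
first = store.preserve("my-unique-name", 1, object)    # first load
second = store.preserve("my-unique-name", 1, object)   # reload, same version
bumped = store.preserve("my-unique-name", 2, object)   # reload after a version bump
print(first is second, first is bumped)
```

An anonymous secret-maker corresponds to calling `make()` directly without going through the store, so it gets a fresh object on every load.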

How should this object be shared?

Code defined in other modules shouldn't normally see this or use it, so these objects are not shared dynamically. They are shared statically. That implies affecting the environment that `get-module' makes, that loading runs in.

Presumably we'd use `make-keyed-static-variable' and share the binder with `get-module' and the accessor with `$module-preserve'.

Sketch of actual requirements on Kernel

  • The standard environment would contain `$module-preserve', defined as above.
  • `get-module' or similar would take an additional argument, the object, which it would make statically available to `$module-preserve'.
  • Any future `require' type mechanism would map module names to these objects.
    • It would create new ones for new modules.

04 June 2011

Review: The Irish Tenors: Live in Belfast

The Irish Tenors: Live in Belfast

I didn't like this CD quite as much as I liked Ellis Island, but nevertheless it grew on me.


It's by The Irish Tenors. It's Finbar Wright's debut with the group. He replaced John McDermott, but McDermott still sings on 2 tracks.

What I liked

The Percy French medley was a lot of fun. It's got 3 of French's sunniest tunes. That includes The Lay Of the West Clare Railway aka Are Ye Right There, Michael?. I recommend following the link for the story of Percy French and that railway. Rarely has a late train resulted in so much embarrassment for a railway.

I hadn't heard Mary From Dungloe before; it's quite lyrical.

I liked McDermott on The Last Rose Of Summer. It is the right song for him.

Some familiar songs done well: The Fields Of Athenry, Red Is The Rose, The Kerry Dancers, and Will Ye Go, Lassie, Go?

What I didn't like

One disappointing track was Carrickfergus. I've liked the song since I spotted it in a little book of Irish songs 20 years ago. It stands out as lyrical and melodic even among Irish songs.

But I felt that arranger Frank McNamara, whom I normally like, overdid the arrangement, or rather, overdid half of it. The low sections have a nice open feel to them. I liked those. The high sections are accompanied by big orchestral flourishes. I understand what he was trying to do, but IMO those orchestral tuttis were wrong for this lyrical song.

Also, some of the less familiar tracks don't have much to offer. Maybe they'll grow on me too, who knows?

03 June 2011

Fairchy: Further thoughts

Further thoughts on Fairchy

In my earlier post about Fairchy 1, I said that it fixed the FAI boxing issue; not so much solved it as did an end run around it.

That's true but I didn't explain it very well. The crucial idea is not that we've found and removed all avenues for "escaping the box". It's that we've brought our AI's goal out of the messy real world and into a system that we control. It wants to profit in a futarchy-style market. That's simple.

There's always a question of whether its goal matches our own collective human goals. Always, in any workable design of this nature, there must be selective pressure towards our own collective goals and that pressure must always outrun all selective pressure towards misguided "clippy" goals.

In Fairchy, the selection is not done in the messy real world, nor (fundamentally) in the unfathomable inner workings of an AI's rulebase, but in a marketplace that profits by serving human preferences.

What this implies

I partly covered this before, but I want to make it clearer and explore it more thoroughly.

"It wants to profit in a futarchy-style market". So the basic security threat here is that a speculator AI would lose the spirit of this rule and so its goals would change, probably in a harmful direction.

That threat seems to break down into these subheadings:

  1. A speculator finds a way to reward itself despite losing in the market.
  2. A speculator finds profitable behaviors within the market that go against the spirit of the utility function. Perhaps it enacts proposals which compromise the utility metric or compromise the market, thus creating a profit even though they "should" create a loss.
  3. A speculator perversely engages in unprofitable behaviors within the market, enough to create a real problem before it is stopped.
  4. A speculator finds a more direct way to alter the real world which affects the market to the AI's profit. Possibly it makes deals with outside entities.

Addressing these threats

Rewards itself despite losing in the market

This threat is more mechanical than the others. Breaking it down further, it's either:

  • A breakdown in the AI itself.
  • A breakdown in what communicates market results to the AI.
  • A breakdown in the population selection algorithm.

Breakdown in the AI

A short-circuit, though it would probably occur in software. Here the AI tells itself it's doing great in the market when it isn't. For a while it lives in a fool's paradise making crazy bets.

Already, this is handled by the fact that speculators are individuals in a population-style learning algorithm (think genetic algorithm). Speculators that do not heed the market will eventually go broke and be replaced.
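As a toy illustration of that replacement step (not a description of any particular learning algorithm; the record layout, `spawn` and the `broke_at` threshold are all assumptions):

```python
from typing import Callable, Dict, List

Speculator = Dict[str, float]  # hypothetical record, e.g. {"balance": 12.5}


def next_generation(
    population: List[Speculator],
    spawn: Callable[[], Speculator],
    broke_at: float = 0.0,
) -> List[Speculator]:
    """Keep speculators that still have funds and replace the ones that went broke."""
    survivors = [s for s in population if s["balance"] > broke_at]
    newcomers = [spawn() for _ in range(len(population) - len(survivors))]
    return survivors + newcomers


population = [{"balance": 5.0}, {"balance": 0.0}, {"balance": -1.0}]
population = next_generation(population, spawn=lambda: {"balance": 10.0})
print(population)
```

A speculator that ignores the market is removed only once its balance is gone, which is why the text goes on to ask for something faster.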

But we'd like to handle it faster, if possible. So the selection process might be augmented to actively detect broken AIs. I'll leave that as a future possibility.

Breakdown in what communicates market results

Here, the population algorithm doesn't help us because this might affect all AIs, and because it might not be the fault of the AI affected.

But it's largely a maintenance and channel-robustness problem. The protocols involved should be robust. Presumably we'd design them with such obvious steps as:

  • periodic pinging - are the connections alive?
  • checking - is what we received the same as what was sent?
  • periodic auditing - does the history add up the way it ought to?
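The checking and auditing steps might look something like this. It is a sketch under assumptions: messages are byte strings, balances are integer cents, and the function names are hypothetical. Note that a bare SHA-256 digest only detects accidental corruption; guarding against deliberate tampering would need a keyed scheme such as an HMAC.

```python
import hashlib
from typing import List


def digest(message: bytes) -> str:
    """Fingerprint of a message, sent alongside it."""
    return hashlib.sha256(message).hexdigest()


def received_intact(message: bytes, sent_digest: str) -> bool:
    """Checking: is what we received the same as what was sent?"""
    return digest(message) == sent_digest


def history_adds_up(opening_cents: int, trades_cents: List[int], reported_cents: int) -> bool:
    """Auditing: does the trade history account for the reported balance?"""
    return opening_cents + sum(trades_cents) == reported_cents


sent = b"issue=a;yes_price=0.62"
tag = digest(sent)
print(received_intact(sent, tag))
print(received_intact(b"issue=a;yes_price=0.26", tag))
print(history_adds_up(1000, [-250, 400], 1150))
```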

Breakdown in population selection

This area is crucial. Fortunately a population selection module would be much simpler than the "real" AIs, which helps security.

Some security measures are obvious:

  • Keep the population selection processes isolated from pretty much everything.
  • Make them robust.
  • Make their decisions inescapable. Whatever machines host speculator processes need to absolutely respect the population selection's decisions to remove some speculators and add others.

There should also be dynamic and reactive security, and the measured security of this area needs to be part of the utility metric.

Finds behaviors within the market against the spirit

This reminds me that I left out a crucial role earlier: Proposer. The Proposer role is what makes the proposals that the market bets on.

A severe threat is lurking here. As I've repeatedly pointed out wrt Futarchy 2, the proposer and speculator roles can collude in ways that can rob the market or enact arbitrary proposals. I call this the Opacity Problem3.

So the proposer and speculator roles need to be separate. Yet those two roles are working from largely shared information and models. They benefit a lot from sharing information. So as before, I propose Speculator support modules to deal with this situation; I'd just extend them to support proposers too.

But keeping them separate isn't enough: if there exists any channel by which proposer and speculator can co-ordinate, the Opacity Problem can happen. So while my design keeps these two roles separate, that will only help a little, it won't suffice.

So my design includes the various means I have proposed of dealing with the Opacity Problem:

  • Measuring uncertainty in meaning via the market, and disadvantaging unclear proposals.
    • I proposed separate markets to measure uncertainty, but my proposal was complex. I now favor what Raph Frank proposed on my futarchy mailing list: separate markets that pay off with exponents near 1.0, for instance X^1.1 and X^0.9.
  • Requiring a certain amount of capitalization before enactment, in addition to price comparisons.
  • Controlled language for proposals
  • A hierarchy of policy "power", with more powerful levels having stronger restrictions on clarity.
  • (Etc. I've discussed anti-Opacity Problem measures at more length on my futarchy mailing list)
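
To make the exponent-market idea concrete, here is a minimal sketch of how two such prices might be read as an uncertainty signal. It rests on assumptions that are mine, not part of Raph Frank's proposal: the outcome X is non-negative, each market's price equals the expected payoff (risk-neutral traders), and the function names are hypothetical. The underlying fact is the power-mean inequality: taking the price of the X^1.1 market to the power 1/1.1 gives a value at least as large as the price of the X^0.9 market taken to the power 1/0.9, with equality only when X is certain.

```python
def expected_payoff(outcomes, probs, exponent):
    """Expected payoff of a market that pays X**exponent (X must be >= 0)."""
    return sum(p * x ** exponent for x, p in zip(outcomes, probs))


def implied_level(price, exponent):
    """Map the price of an X**exponent market back onto the scale of X."""
    return price ** (1.0 / exponent)


def uncertainty_signal(price_hi, price_lo, hi=1.1, lo=0.9):
    """Gap between the two implied levels; zero when X is certain."""
    return implied_level(price_hi, hi) - implied_level(price_lo, lo)


def signal_for(outcomes, probs, hi=1.1, lo=0.9):
    """Assumption: market prices equal expected payoffs."""
    price_hi = expected_payoff(outcomes, probs, hi)
    price_lo = expected_payoff(outcomes, probs, lo)
    return uncertainty_signal(price_hi, price_lo, hi, lo)


# A clear proposal: everyone agrees the outcome is 0.5.
clear = signal_for([0.5], [1.0])

# An unclear proposal: the outcome is 0.1 or 0.9 with equal probability.
unclear = signal_for([0.1, 0.9], [0.5, 0.5])
```

Under these assumptions the clear proposal gives a gap of essentially zero and the unclear one a visibly positive gap, so a rule that disadvantages unclear proposals could key off the size of that gap. How large a gap should count against a proposal is left open here.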

Perversely engages in unprofitable behaviors

This is the least of the threats. There's constant selective pressure against this. Other AIs can probably profit by detecting or anticipating such behavior.

But there is a subtle threat, one that you might not expect: Other AIs, since they can profit by detecting or anticipating such behavior, find a way to encourage it, and then of course profit by blocking enactment. They may well find that this scam works best when occasionally the rogue proposal gets enacted, the better to prove their wisdom in betting against it.

In other words, they compete and fight dirty and hurt us in the process.

One approach would be to change the reward function so that part of it reflects group success. But that isn't possible. In a population-style learning algorithm, reward is zero-sum.

Approaches that might work:

  • Like the above, but reward wrt smaller groups, and somehow arrange that opportunities for influencing weaker competitors occur mostly or only within these groups.
  • Like the above, but a low overall reward results in different behavior by the population algorithm. Possibly it results in more mutation in the next generation.
  • Provide a more direct way of taking advantage of weaker competitors, one that does not result in actually enacting proposals that the weaker competitors foolishly support. Possibly the bets of newborn competitors do not help a proposal be enacted.
  • Prevent or weaken any mechanism for colluding to allow rogue proposals to pass. Anonymous betting may suffice.
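
The zero-sum point, and the second approach in the list above, can be illustrated with a small sketch. Everything here is an assumption for illustration: I assume the population algorithm keeps the top-ranked speculators, and the function names, rates and threshold are hypothetical. Under rank-based selection, a group-wide bonus or penalty added to every speculator's reward changes nothing about who survives, which is why a shared reward term has no effect on its own. The population algorithm itself, though, can still look at the overall level and respond, for example by mutating more.

```python
def select_survivors(rewards, keep):
    """Rank-based selection: keep the `keep` speculators with the highest reward."""
    ranked = sorted(rewards, key=lambda name: rewards[name], reverse=True)
    return ranked[:keep]


def mutation_rate(rewards, base=0.05, boosted=0.20, floor=0.0):
    """Hypothetical rule: mutate more when the population as a whole did badly."""
    mean_reward = sum(rewards.values()) / len(rewards)
    return boosted if mean_reward < floor else base


# Individual rewards for four hypothetical speculators.
individual = {"a": 3.0, "b": 1.0, "c": -1.0, "d": -2.0}

# A shared penalty, e.g. because a rogue proposal got enacted and hurt everyone.
group_penalty = -5.0
with_penalty = {name: r + group_penalty for name, r in individual.items()}

# The same speculators survive either way: the shared term cancels out in the ranking.
survivors_plain = select_survivors(individual, keep=2)
survivors_penalized = select_survivors(with_penalty, keep=2)

# But the population algorithm can still react to the overall level.
rate_plain = mutation_rate(individual)
rate_penalized = mutation_rate(with_penalty)
```

In this sketch the survivors are identical with and without the shared penalty, while the mutation rate rises once the population's mean reward drops below the floor. Whether more mutation actually discourages the scam is an open question; the sketch only shows that the lever exists.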

Finds a way to alter the real world

This is the most direct threat. For a human speculator, it would look like (say) shorting a company's stock and then blowing that company's buildings up and making a profit. And then hopefully being caught and punished, deterring other saboteur profiteers.

For an AI, it's less straightforward. An AI speculator lives in virtual space; it wouldn't have any robot arms with which to blow things up. Its communications would be limited and easy to monitor. Its bank account could only be used for the few legitimate purposes it has: betting, buying information, not much more.

Still, there is a real risk that an AI would co-ordinate with outside entities and profit from destruction. This is essentially like human crime, and might be addressed by the same general types of measures:

  • Prevention
    • Remove the temptation
      • Require proposals to be cast in a form such that speculators can't profit from sabotage and extortion.
    • Detect and stop the plans before they are acted on. Since we can monitor speculator AI communications, they are at a disadvantage in this. But they could still use encryption and steganography.
    • Detect AIs at risk for doing this. There's a measure that can be used with AIs, as I proposed before: run the AI with fake temptation and see if it succumbs. That's what the Tempter and Temptation-faller roles are about.
  • Deterrence
    • Investigation and subsequent punishment, presumably removal via the population algorithm.
  • Systemic measures
    • Add to the Security Testers role a sub-role to measure this threat.
    • Add a component to the utility metric that measures risk of such AI-crimes.


1 And now I wish I'd left more anchors in the first fairchy post, because now I want to link to different parts of it and I can't.

2 And found to my shock and chagrin that Robin Hanson was incapable of understanding the Opacity Problem. I tried eight times and finally gave up explaining it to him when he became insulting.

3 Basically the exploit is that the proposer makes a proposal that only he can decode. Sometimes beneath the opaque exterior it's an honest proposal or a copy of one, sometimes it's "gimme all the money". If others bet against it, he can basically rob them. If not, he can enact it (and so rob everyone).