Ultimate secure choice
Previously
I wrote about FAIrchy, an idea drawn from both decision markets and FAI
that I hope offers a way around the Clippy problem and the box problem
that FAI has.
Measuring human satisfaction without human frailties
One critical component of the idea is that (here comes a big mental
chunk) the system predictively optimizes a utility function that's
partly determined by surveying citizens. It's much like voting in an
election, but it measures each citizen's self-reported satisfaction.
But for that, human frailty is a big issue. There are any number of potential ways to manipulate such a poll. A manipulator could (say) spray oxytocin into the air at a polling place, artificially raising the reported satisfaction. And it can only get worse in the future. If elections and polls are shaky now, how meaningless would they be with nearly godlike AIs trying to manipulate the results?
But measuring the right thing is crucial here; otherwise the system won't optimize the right thing.
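To make the survey component concrete, here is a minimal sketch in Python of how self-reported satisfaction might feed into such a utility function. The names, the 0-to-1 scale, and the weighting scheme are my own illustrative assumptions, not part of FAIrchy as described above.

```python
# Illustrative only: the survey-driven part of the utility function is an
# aggregate of each citizen's self-reported satisfaction (here 0.0..1.0),
# combined with other components by weights the full design would have to fix.
from statistics import mean

def survey_component(reported: dict[str, float]) -> float:
    """Aggregate per-citizen self-reported satisfaction."""
    if not reported:
        raise ValueError("no survey responses")
    return mean(reported.values())

def utility(survey: float, other: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum: the survey only partly determines the utility."""
    return weights["survey"] * survey + sum(
        weights[name] * value for name, value in other.items())
```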
Could mind uploading offer a principled solution?
It doesn't help non-uploads
I'll get this out of the way immediately: The following idea will do
nothing to help people who are not uploaded. Which right now is you
and me and everyone else. That's not its point. Its point is to
arrive before super-intelligent AIs do.
This seems like a reasonable expectation. Computer hardware probably has to get fast enough to "do" human-level intelligence before it can do super-human intelligence.
It's not a sure thing, though. It's conceivable that running human-level intelligence via upload-and-emulating, even with shortcuts, could be much slower than running a programmed super-human AI.
First part: Run a verified mind securely
Enough caveats. On to the idea itself.
The first part of the idea is to run uploaded minds securely:
- Verify that the mind data is what was originally uploaded.
- Verify that the simulated environment is a standard environment, one designed not to prejudice the voter. This environment may include a random seed.
- Poll the mind in the secure simulated environment.
- Output the satisfaction metric. (A minimal sketch of this pipeline follows.)
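Here is that pipeline as a minimal Python sketch. Only the hash checks are concrete; `emulate` stands in for the whole emulation-and-polling machinery, which is of course the hard part, and every name here is my own illustrative assumption.

```python
# Sketch of the secure polling pipeline. Only the hash comparisons are real
# code; `emulate` is a placeholder for running the mind in the standard
# simulated environment and polling it.
import hashlib
from typing import Callable

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def secure_poll(mind_data: bytes, original_mind_hash: str,
                env_image: bytes, standard_env_hash: str,
                seed: int,
                emulate: Callable[[bytes, bytes, int], float]) -> float:
    # Verify that the mind data is what was originally uploaded.
    if sha256_hex(mind_data) != original_mind_hash:
        raise ValueError("mind data does not match the original upload")
    # Verify that the simulated environment is the standard, non-prejudicial one.
    if sha256_hex(env_image) != standard_env_hash:
        raise ValueError("environment is not the standard polling environment")
    # Poll the mind in the environment and output the satisfaction metric.
    return emulate(env_image, mind_data, seed)
```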
And how I propose to ensure that this is actually done:
One important aspect of secure computation is that it provides hard-to-forge evidence of compliance. With this in hand, FAIrchy gives us an easy answer: Make this verification a component of the utility function (Further on, I assume this connection is elaborated as needed for various commit logs etc).
This isn't primarily meant to withhold reward from manipulators, but to create incentive to keep the system running and secure. To withhold reward from manipulators, when a failure to verify is seen, the system might escrow a proportionate part of the payoff until the mind in question is rerun and the computation verifies.
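As a sketch of the escrow idea (my own record layout, assuming satisfaction and payoff shares are per-citizen scalars): only verified runs count immediately, and an unverified run's share of the payoff is held back until that mind is rerun and the computation verifies.

```python
# Sketch: withhold the payoff share of any run whose secure computation
# failed to verify, until a verified rerun is produced.
from dataclasses import dataclass

@dataclass
class PollRun:
    citizen_id: str
    satisfaction: float  # self-reported satisfaction, 0.0..1.0
    verified: bool       # did the secure computation verify?

def settle(runs: list[PollRun], payoff_share: float) -> tuple[float, float]:
    """Return (paid, escrowed) totals for a batch of poll runs."""
    paid = sum(payoff_share for r in runs if r.verified)
    escrowed = sum(payoff_share for r in runs if not r.verified)
    return paid, escrowed
```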
Problems
- It's only as strong as strong encryption.
- How does the mind know the state of the world, especially of his personal interests? If we have to teach him the state of the world:
  - It's hard to be reasonably complete wrt his interests.
  - It's very, very hard to do so without creating opportunities for distortion and other adverse presentation.
  - He can't have and use secret personal interests.
- Dilemma:
  - If the mind we poll is the same mind who is "doing the living":
    - We've cut him off from the world to an unconscionable degree.
    - Were he to communicate, privacy is impossible for him.
    - We have to essentially run him all the time, forever, with 100% uptime, making maintenance and upgrading harder and potentially unfair.
    - Presumably everyone runs with the same government-specified computing horsepower, so it's not clear that individuals could buy more; in this it's socialist.
    - Constant running makes verification harder, possibly much harder.
  - If it isn't, his satisfaction can diverge from the version(s) of him that are "doing the living". In particular, it gives no one any incentive to respect those versions' interests, since they are not reflected in the reported satisfaction.
- On failure to verify, how do we retry from a good state?
- It's inefficient. Everything, important or trivial, must be done under secure computation.
- It's rigidly tied to the original state of the upload. Eventually it might come to feel like being governed by our two-year-old former selves.
Strong encryption
The first problem is the easy one. Being only as strong as strong
encryption still puts it on very strong footing.
- Current encryption is secure even under extreme extrapolations of conventional computing power.
- Even though RSA (whose security rests on integer factoring) may fall to Shor's algorithm when quantum computing becomes practical, some encryption functions are not expected to.
- Even if encryption doesn't always win the crypto "arms race" as it's expected to, it gives the forces of legitimacy an advantage.
Second part: Expand the scope of action
ISTM the solution to these problems is to expand the scope of this
mechanism. No longer do we just poll him; we allow him to use this
secure computation as a platform to:
- Exchange information
  - Surf-wise, email-wise, etc. Think ordinary net connection.
  - Intended for:
    - News and tracking the state of the world
    - Learning about offers.
    - Negotiating agreements
    - Communicating and co-ordinating with others, perhaps loved ones or coworkers.
    - Anything. He can just waste time and bandwidth.
- Perform legal actions externally
  - Spend money or other possessions
  - Contract to agreements
  - Delegate his personal utility metric, or some fraction of it. Ie, that fraction of it would then be taken from the given external source; presumably there'd be unforgeable digital signing involved (a signing sketch follows this list). Presumably he'd delegate it to some sort of external successor self or selves.
  - Delegate any other legal powers.
  - (This all only goes thru if the computation running him verifies, but all attempts are logged.)
- Commit to alterations of his environment and even of his self.
  - This includes even committing to an altered self created outside the environment.
  - Safeguards:
    - This too should only go thru if the computation running him verifies, and attempts should be logged.
    - It shouldn't be possible to do this accidentally.
    - He'll have opportunity and advice to stringently verify its correctness first.
    - There may be some "tryout" functionality whereby his earlier self will be run (later or in parallel) to pass judgement on the goodness of the upgrade.
- Verify digital signatures and similar
  - Eg, to check that external actions have been performed as represented.
  - (This function is within the secure computation but external to the mind. Think running GPG at will.)
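As a sketch of what the digital-signing side might look like, here is a toy example using Ed25519 from the Python `cryptography` package. The action format, the delegation fields, and the function names are my own illustrative assumptions; the post only calls for some unforgeable signing scheme.

```python
# Sketch: the citizen signs external actions (spending, contracts, delegating
# a fraction of his utility metric); anyone outside can verify the signature.
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey)

def sign_action(key: Ed25519PrivateKey, action: dict) -> bytes:
    return key.sign(json.dumps(action, sort_keys=True).encode())

def action_is_authentic(pub: Ed25519PublicKey, action: dict, sig: bytes) -> bool:
    try:
        pub.verify(sig, json.dumps(action, sort_keys=True).encode())
        return True
    except InvalidSignature:
        return False

# Example: delegating a quarter of the personal utility metric to a
# hypothetical external successor self.
key = Ed25519PrivateKey.generate()
delegation = {"type": "delegate-utility", "fraction": 0.25,
              "delegate": "external-successor-self"}
sig = sign_action(key, delegation)
assert action_is_authentic(key.public_key(), delegation, sig)
```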
Problems solved
This would immediately solve most of the problems above:
- He can know the state of the world, especially of his personal interests, by surfing for news, contacting friends, basically using a net connection.
- Since he is the same mind who is "doing the living" except as he delegates otherwise, there's no divergence of satisfaction.
- He can avail himself of more efficient computation if he chooses, in any manner and degree that's for sale.
- He's not rigidly tied to the original state of the upload. He can grow, even in ways that we can't conceive of today.
- His inputs and outputs are no longer cut off from the world even before he externalizes.
- Individuals can buy more computing horsepower (and anything else), though they can only use it externally. Even that restriction seems not necessary, but that's a more complex design.
- Restart: Of course he'd restart from the last known good state.
  - Since we block legal actions for unverified runs, a malicious host can't get him into any trouble.
  - We minimize ambiguity about which state is the last known good state, to make it hard to game on that.
    - The verification logs are public or otherwise overseen.
    - (I think there's more that has to be done. Think Bitcoin blockchains as a possible model.)
- Running all the time:
  - Although he initially "lives" there, he has reasonable other options, so ISTM the requirements are less stringent:
    - Uneven downtime, maintenance, and upgrading is less unfair.
    - Downtime is less unconscionable, especially after he has had a chance to establish a presence outside.
  - The use of virtual hosting may make this easier to do and fairer to citizens.
- Privacy of communications:
  - Encrypt his communications.
  - Obscure his communications' destinations. Think Tor or Mixmaster.
- Privacy of self:
  - Encrypt his mind data before it's made available to the host.
  - Encrypt his mind even as it's processed by the host (http://en.wikipedia.org/wiki/Homomorphic_computing). This may not be practical, because it's much slower than normal computing. Remember, we need this to be fast enough to be doable before super-intelligent AIs are.
  - "Secret-share" him to many independent hosts, which combine their results (a sketch follows this list). This may fall out naturally from human brain organization. Even if it doesn't, it seems possible to introduce confusion and diffusion.
  - (This is a tough problem.)
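To illustrate just the data-at-rest half of that secret-sharing idea, here is a minimal n-of-n XOR sharing sketch in Python. It says nothing about the genuinely hard part, computing on the shares, and all names are my own.

```python
# Sketch: split the mind data into n shares, one per independent host, so
# that no single host learns anything and all n are needed to reconstruct.
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_shares(mind_data: bytes, n_hosts: int) -> list[bytes]:
    random_shares = [secrets.token_bytes(len(mind_data)) for _ in range(n_hosts - 1)]
    final_share = reduce(xor_bytes, random_shares, mind_data)
    return random_shares + [final_share]

def reconstruct(shares: list[bytes]) -> bytes:
    return reduce(xor_bytes, shares)

data = b"uploaded mind state (stand-in bytes)"
assert reconstruct(split_shares(data, 3)) == data
```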
Security holes
The broader functionality opens many security holes, largely concerning
how to provide an honest, empowering environment to the mind. I won't
expand on them in this post, but I think they are not hard to close
with creative thinking.
There's just one potential exploit I want to focus on: A host running someone multiple times, either in succession or staggered in parallel. If he interacts with the world, say by reading news, this introduces small variations which may yield different results. Not just different satisfaction results, but different delegations, contracts, etc. A manipulator would then choose the most favorable outcome and report that as the "real" result, silently discarding the others.
One solution is to make a host commit so often that it cannot hold multiple potentially-committable versions very long (a commit-log sketch follows the list below):
- Require a certain pace of computation.
- Use frequent unforgeable digital timestamps so a host must commit frequently.
- Sign and log the citizen's external communications so that any second stream of them becomes publicly obvious. This need not reveal the communications' content.
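Here is a minimal sketch of such a commit log: hash-chained, paced, and timestamped, so that a second, inconsistent stream of commitments becomes obvious. The pacing policy and field names are my own assumptions, and a real design would use an external, unforgeable timestamping service rather than the host's own clock.

```python
# Sketch: an append-only, hash-chained commit log the host must extend at a
# required pace; divergent parallel runs would show up as forked chains.
import hashlib
import json
import time

def commit(log: list[dict], state_hash: str, max_gap_seconds: float = 60.0) -> dict:
    if log and time.time() - log[-1]["timestamp"] > max_gap_seconds:
        raise RuntimeError("host failed to commit at the required pace")
    entry = {
        "prev": log[-1]["entry_hash"] if log else "genesis",
        "timestamp": time.time(),  # stand-in for an unforgeable external timestamp
        "state_hash": state_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    log.append(entry)
    return entry
```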
Checking via redundancy
Unlike the threat of a host running multiple diverging copies of
someone, running multiple non-diverging copies on multiple
independent hosts may be desirable, because:
- It makes the "secret-share" approach above possible
- A citizen's computational substrate is not controlled by any one entity, which follows a general principle in security to guard against exploits that depend on monopolizing access.
- It is likely to detect non-verification much earlier. (A cross-checking sketch follows.)
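A minimal sketch of that cross-check, assuming each independent host reports a hash of its result: the majority result wins and dissenting hosts are flagged for investigation. The majority rule and all names here are my own illustrative assumptions.

```python
# Sketch: compare result hashes reported by independent hosts running the
# same non-diverging computation; flag any host that disagrees.
from collections import Counter

def cross_check(results_by_host: dict[str, str]) -> tuple[str, list[str]]:
    counts = Counter(results_by_host.values())
    majority_hash, _ = counts.most_common(1)[0]
    dissenters = [h for h, r in results_by_host.items() if r != majority_hash]
    return majority_hash, dissenters

majority, suspects = cross_check({"hostA": "abc1", "hostB": "abc1", "hostC": "ffff"})
assert majority == "abc1" and suspects == ["hostC"]
```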