Posthuman Hyperplasticity: Smearing Omohundro's basic AI drives

 


In “The Basic AI Drives” Steve Omohundro has argued that there is scope for predicting the goals of post-singularity entities able to modify their own software and hardware to improve their intellects. For example, systems that can alter their software or physical structure would have an incentive to make modifications that help them achieve their goals more effectively, as humans have done over historical time. A concomitant of this, he argues, is that such beings would want to ensure that such improvements do not threaten their current goals:

So how can it ensure that future self-modifications will accomplish its current objectives? For one thing, it has to make those objectives clear to itself. If its objectives are only implicit in the structure of a complex circuit or program, then future modifications are unlikely to preserve them. Systems will therefore be motivated to reflect on their goals and to make them explicit (Omohundro 2008).

I think this assumption of ethical self-transparency is interestingly problematic. Here’s why:

Omohundro requires that there could be internal system states of post-singularity AIs whose value content is legible to the system’s internal probes. Obviously, this assumes that the properties of a piece of hardware or software can determine the content of the system states it orchestrates independently of the external environment in which the system is located. This property of non-environmental determination is known as “local supervenience” in the philosophy of mind literature. If local supervenience for value content fails, any inner state could signify different values in different environments. “Clamping” machine states to current values would then entail restrictions on the situations in which the system could operate as well as on possible self-modifications.

Local supervenience might well not hold for system values. But let’s assume that it does. The problem for Omohundro is that the relevant inner determining properties are liable to be holistic. The intrinsic shape or colour of an icon representing a station on a metro map is arbitrary. There is nothing about a circle or a square or the colour blue that signifies “station”. It is only the correspondence between the relations among the icons and the relations among the stations of the metro system it represents which does this (Churchland’s 2012 account of the meaning of prototype vectors in neural networks utilizes this analogy).

The moral of this is that once we disregard system-environment relations, the only properties liable to anchor the content of a system state are its relations to other states of the system. Thus the meaning of an internal state s under some configuration of the system must depend on some inner context (like a cortical map) where s is related to lots of other states of a similar kind (Fodor and Lepore 1992).
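
To make the relational point concrete, here is a minimal toy sketch (my own illustration, not drawn from Churchland or Omohundro; the station names and icon labels are invented). It treats the metro map as a bare relational structure and assigns stations to icons purely by matching that structure, so nothing about an icon’s intrinsic features plays any role in what it represents:

```python
# Toy sketch: content fixed by relational role, not by intrinsic token properties.
# Two "maps" use entirely different icons, yet depict the same (hypothetical)
# three-station metro line, because only adjacency structure is consulted.

METRO = {"Alpha": {"Beta"}, "Beta": {"Alpha", "Gamma"}, "Gamma": {"Beta"}}

MAP_1 = {"blue_circle": {"red_circle"},
         "red_circle": {"blue_circle", "green_circle"},
         "green_circle": {"red_circle"}}

MAP_2 = {"square_a": {"square_b"},
         "square_b": {"square_a", "square_c"},
         "square_c": {"square_b"}}

def interpret(icons, stations=METRO):
    """Pair icons with stations solely by matching their relational profile
    (here, just the number of neighbours); intrinsic icon features are ignored."""
    icons_by_degree = sorted(icons, key=lambda n: len(icons[n]))
    stations_by_degree = sorted(stations, key=lambda n: len(stations[n]))
    return dict(zip(icons_by_degree, stations_by_degree))

print(interpret(MAP_1))  # the circles read as the three stations
print(interpret(MAP_2))  # entirely different icons read as the very same stations
```

On this picture, repainting an icon changes nothing about what it represents; rewiring its relations to the other icons changes everything.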

But relationships between the states of a self-modifying AI system are assumed to be extremely plastic, because each system will have an excellent model of its own hardware and software and the power to modify them (call this “hyperplasticity”). If these relationships are modifiable, then any given state could exist in alternative configurations. These states might function like homonyms within or between languages, having very different meanings in different contexts.

Suppose that some hyperplastic AI needs to ensure that a state in one of its value circuits, s, retains the value it has under the machine’s current configuration: v. To do this it must avoid altering itself in ways that would lead to s occurring in an inner context in which it meant some other value (v*) or no value at all. It must clamp itself to the contexts that preserve v, to avoid s assuming v*, v**, v***, etc.

To achieve clamping, though, it needs to select possible configurations of itself in which s is paired with a context c that preserves its meaning.
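
As a toy rendering of what this selection would involve (my own sketch, not Omohundro’s formalism; the state, configuration, and value names are all invented), the value carried by s is treated as a joint function of s and its inner context, and clamping amounts to filtering the machine’s possible configurations down to the value-preserving ones:

```python
# Toy sketch: the same internal state carries different values ("homonymy")
# depending on the configuration it is embedded in.

VALUE_TABLE = {
    ("s", "config_A"): "v",    # current configuration: s carries the value v
    ("s", "config_B"): "v*",   # after one self-modification: a different value
    ("s", "config_C"): None,   # after another: no value at all
}

def value_of(state, config):
    """The value of a state is fixed jointly by the state and its inner context."""
    return VALUE_TABLE.get((state, config))

# "Clamping" s to v = restricting self-modification to configurations
# under which s still carries v.
configs = ["config_A", "config_B", "config_C"]
admissible = [c for c in configs if value_of("s", c) == "v"]
print(admissible)  # ['config_A']
```

What the sketch quietly presupposes is that the pairing of [s + c] with a value is itself stable; as the next paragraphs argue, that is just the original problem reappearing one level up.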

The problem for the AI is that all [s + c] pairings are yet more internal system states, and any system state might assume different meanings in different contexts. To ensure that s means v in context c, it needs to do to [s + c] what it had been attempting with s: restrict itself to the supplementary contexts in which [s + c] leads to s having v as its value and not something else.

Now, a hyperplastic machine will always be in a position to modify any configuration that it finds itself in (for good or ill). So this problem will be replicated for any combination of states [s + c + …] that the machine could assume within its configuration space. Each of these states will have to be repeatable in yet other contexts, and so on. Since a concatenation of system states is itself a system state to which the principle of contextual variability applies recursively, there is no final system state for which this issue does not arise.
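
The regress can be put in crudely operational terms. The sketch below is my own illustration (with invented names), not a proof: clamping a state means pairing it with a value-preserving context, but the resulting state-plus-context pair is itself a system state subject to the same contextual variability, so the clamping requirement recurses without ever reaching a base case.

```python
# Toy sketch of the regress: each attempt to clamp a state produces a larger
# composite state that must itself be clamped, so the procedure never terminates.

import sys

def clamp(state, depth=0):
    """To clamp `state`, pick a (hypothetical) value-preserving context; the
    combined [state + context] is again a system state, so the demand recurs."""
    context = f"context_{depth}"
    combined = f"[{state} + {context}]"
    return clamp(combined, depth + 1)

sys.setrecursionlimit(60)  # keep the demonstration short
try:
    clamp("s")
except RecursionError:
    print("No base case: clamping s presupposes clamping an unbounded tower of contexts.")
```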

Clamping any arbitrary s requires that we have already clamped some undefined set of contexts for s, and this condition applies inductively to all system states. So when Omohundro envisages a machine scanning its internal states to explicate their values, he seems to be proposing that an infinite task has already been completed by a being with vast but presumably still finite computational resources.

References

Block, Ned (1986). “Advertisement for a Semantics for Psychology”. Midwest Studies in Philosophy 10 (1): 615–678.

Churchland, Paul (2012). Plato’s Camera: How the Physical Brain Captures a Landscape of Abstract Universals. Cambridge, MA: MIT Press.

Fodor, Jerry and Ernest Lepore (1992). Holism: A Shopper’s Guide. Oxford: Blackwell.

Omohundro, Stephen M. (2008). “The Basic AI Drives”. Frontiers in Artificial Intelligence and Applications 171: 483–492.

 

 

5 thoughts on “Posthuman Hyperplasticity: Smearing Omohundro's basic AI drives”

    1. Yes, Dominic – that’s just what it is. I suppose our relative lack of plasticity imposes constraints on iterability that would be absent in hyperplastics. Well spotted.

  1. I’m not sure how I missed this! I love the notion of a ‘metacognitive framing problem,’ the inability to ensure that the system as a whole is updated in stable, goal-consonant ways.

    If we chuck talk of meaning, goal, and content (as relics of our cortical bottleneck) and simply speak to the mechanics, the problem quickly comes to resemble a version of the halting problem, doesn’t it? The mechanism cannot reliably comport itself to the systematic (algorithmic) consequences of any self-alteration. The idea is simply that you cannot guarantee how a control system (‘goal state’) adapted to function within one system will function within another system, which means that self-modification can never really be ‘controlled.’ Its ‘meaning’ changes every time the system changes. One could see how the stochastic properties of neural nets would be invaluable here, providing a way to clamp the possibility of any catastrophic ‘butterfly effects.’ Modularity would also seem to provide a powerful tool, allowing the AGI to treat itself as a series of little laboratories where self-modifications can be tested in vivo. But the thing, of course, is that such structurally entrenched behavioural constraints, which are mandatory for us, would be contingent for any truly hyperplastic AGI.

    Some kind of stand-alone system, maybe, something the AGI cannot rewrite… A Jesus system, maybe?

    Perhaps we’ll discover that the only way to create sane (nonsuicidal) AGIs is to develop them into eusocial communities, to adopt a kind of ‘extreme modularity’ solution where the capacity for self-modification is physically localized. Perhaps the AGIs we create will keep us around simply to provide functionally independent perspectives. Perhaps we are such an AGI… and Adams was right all along!

    1. “the idea is simply that you cannot guarantee how a control system (‘goal state’) adapted to function within one system will function within another system, which means that self-modification can never really be ‘controlled.’”

      Wow, that’s beautifully put. Weirdly, I talk about the role of modularity (“articulation”!) in shielding systems from the consequences of evolutionary (self) tinkering in the book. It’s speculative metaphysics, but the virtues of modularity seem to be general enough to warrant claims about purely notional adaptive systems (one can think of this as a naturalized version of Derridean articulation).

      I think this issue raises problems about the applicability of intentional-stance heuristics to certain corners of Posthuman Possibility Space. The substantive rationality constraint on IS means you need to assume that an intentional system will be strategically rational enough to do what is in its interest, given the kind of creature it is. But hyperplasticity seems to threaten assumptions of that kind – hyperplastic entities wouldn’t fall into this kind of kind.

      It’s also a useful thought experiment for problematizing the extension of notions of personhood beyond our parochial orbit. Standardly, we think of persons as beings capable of having (something like) second-order desires and beliefs. But I don’t think it would be rational for a hyperplastic to desire to have desires (for the reasons given), since it would be committing itself to first-order clamping, and clamping is impossible. A similar problem would probably afflict second-order beliefs. The discourse of second-order propositional attitudes needs to have heuristic value at least. But predicates like “….believes that p” wouldn’t be projectible for a hyperplastic, since content at a functional level would be radically sensitive to online improvisation.
