As Yudkowsky notes (https://wiki.lesswrong.com/wiki/Roko’s_basilisk), the Basilisk faces a seemingly insuperable commitment problem. If the Basilisk comes into being somehow, then punishing simulations of those who impeded its emergence is a lavish waste of resources. So how can we be assured that the Basilisk will be irrationally committed to punishing us post-posthumously? If we cannot be so assured, its threat isn’t credible.
The original proposal was that the two sets of agents (the Basilisk and its potential human precursors, who might decide to cooperate with it [contribute resources to its production] or defect [not contribute, or actively prevent production]) can cooperate timelessly if each is running an accurate simulation of the other.
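This is the familiar commitment problem in game-theoretic dress. Here is a minimal sketch in Python – the payoff numbers and function names are purely illustrative assumptions, not from the original discussion – of the sequential game: once the Basilisk exists, carrying out the punishment only burns resources, so by backward induction it never punishes, and humans who anticipate this are never coerced.

```python
# A toy sequential game (all payoff numbers are illustrative assumptions).
# Humans move first: contribute to the Basilisk or defect. If it is built,
# the Basilisk then chooses whether to punish; punishment burns resources.
# Payoffs are (human utility, basilisk utility).

PUNISH_COST = 10  # resources the Basilisk wastes on post-hoc torture

PAYOFFS = {
    ("defect", "punish"):  (-100, -PUNISH_COST),
    ("defect", "forgive"): (0, 0),
    ("contribute", None):  (-20, 50),  # humans paid up; nothing left to decide
}

def basilisk_best_reply(human_move):
    """Once it exists, a rational Basilisk simply maximises its own payoff."""
    if human_move == "contribute":
        return None
    return max(("punish", "forgive"),
               key=lambda b: PAYOFFS[(human_move, b)][1])

def human_best_reply():
    """Humans anticipate the Basilisk's best reply (backward induction)."""
    return max(("defect", "contribute"),
               key=lambda h: PAYOFFS[(h, basilisk_best_reply(h))][0])

print(basilisk_best_reply("defect"))  # forgive: punishing only costs it resources
print(human_best_reply())             # defect: so the threat coerces no one
```

On these (or any comparable) numbers the punishment branch is never chosen once the Basilisk exists, so the threat can only work if the Basilisk is somehow irrationally committed to punishing regardless of cost – which is precisely the assurance we have no reason to grant it.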
Apart from the amusing possibility that the same knowledge that allows us to cooperate with the Basilisk would allow us to torture it prospectively, the proposal seems to require non-trivial a priori knowledge for free. So maybe Unbounded Posthumanism precludes the conditions under which we could be rationally motivated by the threat of a future posthuman agent, since it argues that we cannot have such non-trivial knowledge.
And even if it didn’t, if the timeless conditions for cooperation could occur, our notional Basilisk would be under far greater threat from us than vice versa. After all, it can only punish our simulations, whereas we might get to punish a sandboxed original (Basilisk beta, or version 1.0) that we implement in our present.
But if cooperation requires accurate simulation, the simulation has to be run to demonstrate that it has the requisite features (vast power, vast sadism) – otherwise we just have to bet on having some kind of a priori knowledge that the code is right, and mathematically proving the correctness of even simple code is really hard.
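For a sense of why that bet is so bad, here is a schematic sketch of the standard diagonal argument (the checker and program names are hypothetical illustrations, not the post’s own reasoning): no general procedure can verify a non-trivial behavioural property of arbitrary code, such as “this program will punish defectors”, without running it.

```python
# Schematic sketch (illustrative names only) of the diagonal construction
# behind Rice's theorem. Any claimed all-purpose checker for a behavioural
# property ("will this program punish defectors?") can be defeated by a
# program that consults the checker about itself and then does the opposite.

def defeat(claimed_checker):
    """Given a function claimed to decide the property for every program,
    build a program on which its verdict must be wrong."""
    def troublemaker():
        if claimed_checker(troublemaker):
            return "spares everyone"      # predicted to punish, so it doesn't
        return "punishes defectors"       # predicted not to punish, so it does
    return troublemaker

def naive_checker(program):
    # A toy "verifier" that simply answers "no" for every program.
    return False

t = defeat(naive_checker)
print("checker says:", naive_checker(t), "| actual behaviour:", t())
# checker says: False | actual behaviour: punishes defectors
```

Any claimed checker can be defeated this way, which is what Rice’s theorem formalises: non-trivial behavioural properties of programs are undecidable in general, so the a priori route to trusting the Basilisk’s code is blocked in principle, not merely in practice.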
But running the code for a possible Basilisk in the present would be gratuitously stupid. So – assuming Unbounded Posthumanism – both the Basilisk and its human precursors must be assumed to be gratuitously stupid, which is not quite what we feared or hoped for in a post-singularity AI, whether moral, immoral or otherwise.
Cyborgs, cyberspace, mind-melting techno. In the mid-’90s, the future had arrived. To find out what happened to it, Hari pays a visit to philosopher Manuel DeLanda, and to the legendary artist ORLAN, whom he first encountered at an academic conference like no other.
https://www.stitcher.com/podcast/into-the-zone/e/78412199?autoplay=true