A number of writers in the Speculative Realist blogosphere have cited Ray Brassier’s discussion of Paul Churchland’s attempt to reconcile scientific realism and a Prototype Vector Activation (PVA) theory of content in Chapter 1 of Nihil Unbound (Brassier 2007). Though I am reasonably familiar with the work of Paul and Patricia Churchland, I recall finding the argument in this section tough to disentangle first and second time round. But enough people out there seem convinced by Ray’s position to warrant another look.
This is my first attempt at a reconstruction and evaluation of Ray’s position in Nihil (it does not yet take account of any subsequent changes in his position – I suspect that others will be better placed than me to incorporate these into the discussion). In what follows I’ll briefly summarize the PVA theory in the form familiar to Ray at the time of Nihil’s publication. The second section will then attempt to reconstruct his critique of Churchland’s attempt to reconcile his theory of content with a properly realist epistemology.
1. The Prototype Vector Activation Theory of Content
Firstly, what is the PVA theory of content? As many will already be aware, the term comes from the argot of neural network modeling. Artificial Neural Networks (ANN’s) are a technique for modeling the behaviour of biological nervous systems using software representations of neurons and their interconnections. Like actual neurons, the software neurons in ANN’s respond to summed inputs from other neurons or from the ‘world’ by producing an output. Many ANN’s consist of three layers: an input layer which is given some initial data, a hidden layer that transforms it, and an output layer which presents the network’s ‘response’ to the input.
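To make the three-layer picture concrete, here is a minimal sketch in plain Python. The weights, the biases and the choice of a sigmoid activation function are illustrative assumptions of mine, not values taken from any network Churchland discusses.

```python
import math

def sigmoid(x):
    # Squash a summed input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each row of `weights` holds the incoming connection strengths of one
    # neuron; the neuron sums its weighted inputs and produces an output.
    return [sigmoid(sum(w * a for w, a in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    # Input layer -> hidden layer -> output layer.
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)

# Hypothetical, hand-picked weights: 2 inputs, 3 hidden neurons, 1 output.
hidden_w = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]
hidden_b = [0.0, 0.1, -0.1]
output_w = [[1.0, -1.0, 0.5]]
output_b = [0.0]

output = forward([1.0, 0.0], hidden_w, hidden_b, output_w, output_b)
print(output)  # a single activation value between 0 and 1
```

The list of hidden-layer activations computed inside `forward` is one point in that layer's activation space, a notion that matters a good deal below.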
Learning in Neural Nets usually involves reducing the error between the actual output of the network (initialized randomly) and the desired output, which might well be the allocation of an input pattern to some category like ‘true’ or ‘false’, ‘male’ or ‘female’, ‘combatant’ or ‘non-combatant’ or ‘affiliation unknown’, represented by activation values at the output.
Among the key properties adjusted during the training of ANN’s are the ‘weights’ or connection strengths between neurons, since these determine whether a given input generates random noise (always the case prior to training) or useful output. In ANN’s there are supervised learning algorithms that tweak the NN’s weights until the error between the actual output and that desired by the trainers is minimized. Some ANN’s (for example, Kohonen Self-Organizing Feature Maps) use more biologically plausible unsupervised learning algorithms to generate useful output such as pattern identification, without that pattern having to be pre-identified by a trainer. One example is the “Hebb Rule” which adjusts connection weights according to the timing of neuron activations (neurons that fire together, wire together). So ANN’s don’t have to be spoon-fed. They can latch onto real structure in a data set for themselves.
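The Hebb Rule can be sketched in a few lines. This is a deliberately simplified, rate-based version in which a weight grows in proportion to the product of the two activations it connects; the learning rate and the toy activation values are assumptions made for illustration, and real timing-sensitive variants are more involved.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    # weights[j][i] connects presynaptic neuron i to postsynaptic neuron j.
    # A weight is strengthened only when both neurons are active together.
    return [[w + rate * post_j * pre_i for w, pre_i in zip(row, pre)]
            for row, post_j in zip(weights, post)]

# Two presynaptic and two postsynaptic neurons, all weights starting at zero.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]   # only the first presynaptic neuron fires
post = [1.0, 0.0]  # only the first postsynaptic neuron fires

for _ in range(5):
    weights = hebbian_update(weights, pre, post)

print(weights)  # only the weight between the two co-active neurons has grown
```

No trainer supplies a target output here: the weights change purely as a function of which neurons happen to be active together.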
Learning in ANN’s, then, can be thought of as a matter of rolling down the network’s “error surface” – a curve graphing the relationship of error to weights – to an adequate minimum error. An error surface represents the numerical difference between desired and actual output, against relevant variables like the interneuron weights generating the output.
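A toy case of rolling down an error surface, assuming the simplest possible "network": a single linear unit with one weight, trained by gradient descent on mean squared error. The data, the starting weight and the learning rate are all invented for the example.

```python
def error(w, data):
    # Mean squared difference between actual output (w * x) and desired output y.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def gradient(w, data):
    # Slope of the error surface at weight w.
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

# Desired behaviour: output twice the input, so the error minimum sits at w = 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = -1.0     # arbitrary starting weight (real networks start from random values)
rate = 0.05  # step size down the slope
errors = [error(w, data)]
for _ in range(50):
    w -= rate * gradient(w, data)
    errors.append(error(w, data))

print(round(w, 4))            # close to 2.0
print(errors[0], errors[-1])  # the error has fallen towards zero
```

With one weight the error surface is a simple bowl-shaped curve; with many weights it is a high-dimensional surface, and the descent may settle in a minimum that is merely adequate rather than global.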
Categories acquired through training are represented as prototype regions within the “activation space” (the space of all possible activation values of its neurons) of the network where the activations representing the items falling under a corresponding category are clustered. For Churchland, prototypes represent a structure-preserving mapping or “homomorphism” from uncategorized input onto conceptual neighborhoods within the n-dimensional spaces of neural layers downstream from the input layer (Churchland 2012, viii, 77, 81, 103, 105). In effect, the neural network learns concepts by “squishing” families of points or trajectories in a high-dimensional input space onto points or trajectories clustered in a lower-dimensional similarity space. Two trained up neural nets, then, can be thought of as having acquired similar concepts if the prototypes in the first net form the vertices of a similar geometrical hypersolid to those in the second net. The Euclidean distances between the prototypes do not need to have the same magnitude but they need to be proportionate between corresponding or nearly corresponding points. It’s important for Churchland that the distance-similarity metric is insensitive to dimensionality, for this, he argues, allows conceptual similarity to be measured across networks that have different connectivities and numbers of neurons (Churchland 1998).
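The idea that similarity depends on proportionate distances rather than on magnitude or dimensionality can be illustrated with a rough sketch. The measure below (normalize each net's pairwise prototype distances by their total, then sum the absolute differences) is my own simplification and should not be taken as the exact metric Churchland uses; the prototype coordinates are invented, and the prototypes are assumed to be listed in corresponding order.

```python
import math
from itertools import combinations

def pairwise_distances(prototypes):
    # Euclidean distance between every pair of prototype points.
    return [math.dist(p, q) for p, q in combinations(prototypes, 2)]

def shape_dissimilarity(protos_a, protos_b):
    # Compare the *proportions* of the inter-prototype distances, so that
    # overall scale and the dimensionality of each space drop out.
    da = pairwise_distances(protos_a)
    db = pairwise_distances(protos_b)
    na = [d / sum(da) for d in da]
    nb = [d / sum(db) for d in db]
    return sum(abs(x - y) for x, y in zip(na, nb))

# Net A: three prototypes in a 2-dimensional activation space.
net_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# Net B: the same triangle, three times larger, in a 3-dimensional space.
net_b = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 4.0, 1.0)]
# Net C: three prototypes strung out along a line.
net_c = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]

print(shape_dissimilarity(net_a, net_b))  # approximately 0: same shape
print(shape_dissimilarity(net_a, net_c))  # clearly above 0: different shape
```

Nets A and B count as conceptually alike on this measure even though they differ in size and in number of neurons, which is the property Churchland needs.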
The resultant theory of content is geometrical rather than propositional and, according to Churchland, internalist rather than externalist; it is also holist rather than atomist. It is geometric insofar as conceptual similarity is a matter of structural conformity between sculpted activation spaces. Such representations can capture propositional structure, but need not represent it propositionally. For one thing, the stored information in the inter-neural weights of the network need not exhibit the modularity that we would expect if that information were stored in discrete sentences. In most neural net architectures all the inter-neural weightings of the trained up network are involved in generating its discrepant outputs (Ramsey, Stich and Garon 1991).
Churchland’s internalism is a little more equivocal, arguably, than his anti-sententialism. The account is internalist insofar as it is syntactic, where the relevant syntactic elements are held to reside inside our skulls. Information about the real world features or structures tracked by prototypes plays no role in measures of conceptual similarity at all. Theoretically, conceptually identical prototypes could track entirely disparate environmental features so long as they exhibited the relevant structural conformity. Thus conceptual content for Churchland is a species of narrow content. However, Churchland regards conceptual narrow content as but one component of the “full” semantic content in PVA. The other components are the abstract but physically embodied universals tracked by sculpted activation spaces:
A point in activation space acquires a specific semantic content not as a function of its position relative to the constituting axes of that space, but rather as a function of (1) its spatial position relative to all the other contentful points within that space; and (2) its causal relations to stable and objective macro-features of the external environment (Churchland 1998, 8).
Fans of active-externalist or embodied models of cognition might argue that this syntactic viewpoint on conceptual similarity might need to be subsumed within a wide process externalist conception to allow for cases in ethology and robotics where the online prowess of a neural representation depends on the presence of enabling factors in an organism or robot’s environment (Wheeler 2005). However, I will not consider this possibility further since it is not directly relevant to Brassier’s discussion.
2. Brassier’s Critique
Brassier argues for two important claims. The first, B1, concerns the capacity of Churchland’s naturalism to express the epistemic norms that might distinguish between competing theories – most relevantly, here, different theories of mental content or processing such as PVA, on the one hand, or folk psychology (FP), on the other.
Brassier claims that Churchland’s attempt to express superempirical criteria for theoretical virtue – “ontological simplicity, conceptual coherence and explanatory power” (Brassier 2007, 18) – in neurocomputational terms leaves his account vacillating between competing theories or ontologies. This is because his revisionary account of the superempirical virtues is either 1) essentially pragmatic, concerned only with functional effectiveness of organisms who instantiate these prototype frameworks in their nervous systems, or 2) a metaphysical account whose claims go beyond mere pragmatic efficacy.
The second, B2, is the more programmatic and general. B2 is the claim that naturalism and empiricism are each unable to provide a normative foundation for the scientific destruction of the “manifest image”. B1 supports B2, according to Brassier, because Churchland – who Brassier regards as one of the most brilliant, radical and revisionary of naturalist metaphysicians – is unable to support his vaulting ontological ambitions without sacrificing his pragmatic scruples. Brassier thus sees Churchland’s philosophy as “symptomatic of a wider problem concerning the way in which philosophical naturalism frames its own relation to science”.
Much of Brassier’s argument in section 1.6 of Nihil Unbound – “From the Superempirical to the Metaphysical” – centers on a relatively short text by Churchland on Bas van Fraassen’s constructive empiricism (Churchland 1985). According to Brassier, Churchland uses this text to propose replacing the “normative aegis of truth-as-correspondence” with “‘superempirical’ virtues of ontological simplicity, conceptual coherence, and explanatory power.” (Brassier 2007, 18).
In the context of our familiar folk-distinction between epistemic criteria for belief-selection and semantic relationships between beliefs and things, Brassier’s gloss might seem to confuse epistemology and semantics. Superempirical truth is a putative aim of scientific enquiry, not a criterion by which we may independently estimate its success (albeit an aim that is questioned by both Churchland and van Fraassen). This also seems to be Churchland’s position in the van Fraassen essay. The superempirical virtues are, he writes, “some of the brain’s criteria for recognizing information, for distinguishing information from noise.” (Churchland 1985; Brassier 2007, 23)
Churchland’s claim in context is not that these are better criteria for theory choice than truth but that they are preferable to the goal of empirical adequacy favoured by van Fraassen’s constructive empiricism, since the latter is committed to an ultimately unprincipled distinction between modal claims about observables and unobservables. From this we might infer that the superempirical virtues are not alternatives to truth but ways of estimating either truth or the relevant alternatives to truth that could be adopted by post-sententialist realisms.
Churchland questions the status of scientific truth not (as in van Fraassen) to restrict sentential truth claims to correlations with their “empirical sub-structures” but because truth is a property of sentences or a property of what sentences express (propositions or statements) and he questions whether sentences are the basic elements of cognitive significance in human and non-human cognizers.
If we are to reconsider truth as the aim or product of cognitive activity, I think we must reconsider its applicability across the board, and not just in some arbitrarily or idiosyncratically segregated domain of ‘unobservables.’ That is, if we are to move away from the more naive formulations of scientific realism, we should move in the direction of pragmatism rather than in the direction of positivistic instrumentalism (Churchland 1985, 45).
Churchland’s claim that sentential or linguaformal representations are not basic to animal cognition is supported by two claims: 1) that natural selection favours neural constructions attuned to the dynamical organization of adaptive behaviour and 2) that this role is not best understood in sententialist terms.
When we consider the great variety of cognitively active creatures on this planet – sea slugs and octopi, bats, dolphins and humans – and when we consider the ceaseless reconfiguration in which their brains or central ganglia engage – adjustments in the response potentials of single neurons made in the microsecond range, changes in the response characteristics of large systems of neurons made in the seconds-to-hours range, dendritic growth and new synaptic connections and the selective atrophy of old connections effected in the day-upwards range – then van Fraassen’s term “construction” begins to seem highly appropriate. . . . Natural selection does not care whether a brain has or tends towards true beliefs, so long as the organism reliably exhibits reproductively advantageous behaviour. Plainly there is going to be some connection between the faithfulness of the brain’s ‘world model’ and the propriety of the organism’s behaviour, but just as plainly the connection is not going to be direct.
When we are considering cognitive activity in biological terms and in all branches of the phylogenetic tree, we should note that it is far from obvious that sentences and propositions or anything remotely like them constitute the basic elements of cognition in creatures generally. Indeed . . . it is highly unlikely that the sentential kinematics embraced by folk psychology and orthodox epistemology represents or captures the basic elements of cognition and learning even in humans . . . If we are ever to understand the dynamics of cognitive activity, therefore, we may have to reconceive our basic unit of cognition as something other than the sentence or the proposition, and reconceive its virtue as something other than truth (Churchland 1985, 45-6).
There is nary a mention of concepts derived from theories of neurocomputation in the 1985 text, but it is pretty easy to see that the PVA model is at least a candidate for Churchland’s notional alternative to the semantics, epistemology and psychology of folk. Prototype points or trajectories are cases of dynamical entities called attractors. An attractor is a limit towards which orbits within a region of a phase space tend as some function (an iterative map or differential equation) is applied to them. When a neural network is trained up, orbits whose vectors include a large variety of input states will evolve towards some preferred prototypical point – that is just how the network extracts categories from complex data sets. This allows trained up networks to engage in a process that Churchland calls ‘vector completion’: embodying expectations about the organization and category of the input data set which may tend towards a correct assay even when that data set is degraded somehow (Churchland 2007, 102). Since attractors also reflect a flexible, dynamical response to varying input, they are also potential controllers for an organism’s behaviour – with vector completion offering the benefits of graceful degradation in a noisy, glitch-ridden world.
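A small sketch may help to show how an attractor can complete a degraded vector. I use a Hopfield-style recurrent network here because it is about the simplest system with this behaviour; it is a stand-in chosen for illustration, not the architecture Churchland himself describes, and the two stored patterns are invented.

```python
def train(patterns):
    # Hebbian outer-product storage: each stored pattern becomes an attractor.
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w

def settle(w, state, max_steps=10):
    # Repeatedly update every unit from its summed input until nothing changes.
    state = list(state)
    for _ in range(max_steps):
        new = [1 if sum(wij * s for wij, s in zip(row, state)) >= 0 else -1
               for row in w]
        if new == state:
            break
        state = new
    return state

# Two hypothetical prototype patterns over eight units (+1 = on, -1 = off).
proto_1 = [1, 1, 1, 1, -1, -1, -1, -1]
proto_2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train([proto_1, proto_2])

# A degraded version of proto_1 with its first unit flipped.
degraded = [-1, 1, 1, 1, -1, -1, -1, -1]
completed = settle(w, degraded)
print(completed == proto_1)  # True: the network restores the stored prototype
```

The corrupted input falls within the basin of the first prototype and is pulled back to it, which is the sense in which a trained network embodies expectations about its input.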
This suggests a potential cognitive and cybernetic advantage over sententialist models. Humans and higher nonhuman animals regularly make skillful, and occasionally very fast, abductive inferences about the state of their world. For example,
- Smoke is coming out of the kitchen – the toast is burning!
- There are voices coming from the empty basement – the DVD has come off pause!
- Artificial selection of horses, pigeons, pigs, etc. can produce new varieties of creature – Evolution is natural selection!
But is our capacity for fast and fairly reliable abduction consistent with the claim that beliefs are “sentences in the head” or functionally independent representations of some other kind? Jerry Fodor, for one, concedes that this makes abduction hard to explain because it requires our brains to put a “frame” around the representations relevant to making the abduction – information about the Highway Code or the diameter of the Sun probably won’t be relevant to figuring out that burning toast is causing the smoke in the kitchen. Within the FP framework, however, relevance is a holistic property beliefs have in virtue of their relations to lots of other beliefs. But which ones? How do our brains know where to affix the frame in any given situation without first making a costly, unbounded search through all our beliefs, inspecting each for its relevance to the problem?
Churchland thinks that the PVA model can obviate the need for epistemically unbounded search because the holistic and parallel character of neural representation means that all the information stored in a given network is active in the relaxation to a specific prototype (Churchland 2012, 68-9). It’s possible that Churchland is being massively over-optimistic here. For example, can PVA theory convincingly account for the kind of analogical reasoning that is being employed in the case of Darwin’s inference to the best explanation? Churchland thinks it can. He argues reasonably that prototype frameworks are the kind of capacious cognitive structure that can be routinely redeployed from the narrow domain in which they are acquired, so as to reorganize some new cognitive domain. The details of this account are thin as things stand, but the basic idea seems worth pursuing. Children and adults regularly misapply concepts – e.g. when seeing a dolphin as a fish – with the result that other prototypes (e.g. mammal) end up having to be rectified and adjusted (Churchland 2012, 188-9).
Moreover, according to Churchland, the PVA system provides a semantic substitute for truth in the form of the aforementioned homomorphism or structural conformity between prototype neighborhoods and the structure of some relevant parts of the world.
So the take-home moral of the excursion into the biology of neural adaptation, for Churchland, is that truth is not a necessary condition for the adaptive organization of behaviour and that if we are to understand the relationship between cognitive kinematics and the organization of behaviour we may need to posit units of cognitive significance other than sentential/propositional ones. This new conception of cognitive significance, he thinks, is liable to be constructive because it will make possible a closer understanding of the connection between the morphogenesis of neuronal systems, the dynamics of representation and the dynamical organization of behaviour.
Strangely, Brassier seems to read Churchland as making a quite different claim in the quoted passage: namely that the superempirical criteria of theory choice or prototype-framework are reducible (somehow) to the adaptive value of trained networks in guiding behaviour:
On the one hand, since ‘folk-semantical’ notions as ‘truth’ and ‘reference’ no longer function as guarantors of adequation between ‘representation’ and ‘reality’, as they did in the predominantly folk psychological acceptation of theoretical adequation – which sees the latter as consisting in a set of word-world correspondences – there is an important sense in which all theoretical paradigms are neurocomputationally equal. They are equal insofar as there is nothing in a partitioning of vector space per se which could serve to explain why one theory is ‘better’ than another. All are to be gauged exclusively in terms of what Churchland calls their ‘superempirical’ virtues; viz. according to the greater or lesser degree of efficiency with which they enable the organism to adapt successfully to its environment. (Brassier 2007, 19)
It is implicit in Churchland’s account that the superempirical virtues must be virtues applicable to neural representational strategies – since these are the more basic elements of cognition to which he alludes in his discussion of van Fraassen. However, it does not remotely follow that these virtues should be identified with “the greater or lesser degree of efficiency with which they enable the organism to adapt successfully to its environment” since, as Churchland emphasizes even here, there is only an indirect relation between “the faithfulness of the brain’s ‘world model’” and its organizational efficacy. For example, the functional value of a prototype scheme for an organism is only indirectly related to its representational prowess or accuracy – factors like speed, ease of acquisition and energy consumption would also need to be factored into any ethological assessment of competing schemes’ costs and benefits. As work in artificial intelligence shows, fast and dirty representational schemes which work in reliably present-at-hand environmental contexts, while lacking rich representational or conceptual content, seem to be evolutionarily favoured in many instances (see Wheeler 2005).
In fact, there is nothing in this passage that suggests that Churchland thinks that the superempirical virtues must be reduced to evolutionary-functional terms at all – evolutionary theory just does not play this constitutive role in his theory of content or his epistemology.
Of course, it does not follow that Churchland precludes a neurocomputation-friendly understanding of the superempirical virtues. He claims that they need to be as applicable to the understanding of epistemological systems that do not incorporate cultural or linguistic components as to those that do. He also implies, as we have seen, that these systems should be understood as engaged in a constructive activity evaluable according to criteria that can be generalized well beyond the parochial sphere of propositional attitude psychology. Churchland states as much when he claims that they are the brain’s “criteria” for distinguishing information from noise: simplicity, coherence and explanatory power need to be interpreted in a generalized manner consilient with the PVA theory of content (see also Brassier 2007, 23).
Churchland thus needs a generalized, PVA-friendly account of the superempirical virtues. Brassier agrees but thinks that this requires Churchland to either embrace a neurocomputational version of idealism – which, as a realist, he would not want – or to posit a “pre-constituted physical reality” and thus to “forsake his neurocentric perspective” by adopting a metaphysics which cannot be secured from within a naturalistic framework (Brassier 2007, 20-1).
Well, for sure, no realist worth her salt will want to commit to the claim that reality is constituted by its being a possible representatum of a neurological process. The nearest any contemporary realist comes to this idea is the claim on the part of Ontic Structural Realists that to be is to be a pattern and that a pattern ‘is real’ if the compression algorithm required to encode it requires a smaller number of bits than the ‘bit string’ representation of the entire data set in which the pattern resides (Dennett 1991, 34; Ladyman and Ross 2007, 202). But a) this is a far more general constraint on existence than Brassier’s touted neurological variant and should in no way be confused with a commitment to a kind of transcendental subjectivity; b) there is no reason why Churchland has to embrace anything like it (though he might for all I know). From the claim that the superempirical virtues are ascribable, in some form, to neurocomputational structures it does not follow that every constituent of reality must necessarily be accessible to neural coding strategies.
Now, clearly, in order to frame this thought the cognizer must have a concept of reality and a concept of what it is to represent it (e.g. a partial mapping or homomorphism from abstract prototype structure onto abstract world-structure) and these must be embodied in the thinker’s neural states, somehow. If we are dualists or if we believe that conceptual content is not a property of neural states, then we will deny that this is possible. However, Brassier does not explain why one should reject the claim that conceptual content is a property of neural states in his critical discussion of Churchland. Indeed he specifically disclaims this critical option earlier on when rejecting Lynne Rudder Baker’s criticism that the Churchland-style eliminativist rejection of propositional attitudes involves a self-vitiating performative contradiction (Brassier 2007, 17).
Does Brassier have any other arrows in his quiver? Well, he argues that if the superempirical virtues are “among the brain’s most basic criteria for recognizing information” then all conceptual frameworks that fail to maximize representational adequacy – like FP – would have been eliminated. Thus if simplicity, coherence and explanatory power are constitutive of representational success: “all theories are neurocomputationally equal inasmuch as all display greater or lesser degrees of superempirical distinction” (Brassier 2007, 23). This seems wrong for at least two reasons. Quite obviously, if superempirical distinction is an ordinal concept (as Brassier concedes in this passage) some theories can have more of it than others and will not be neurocomputationally equal. This is a recurrent trope in Churchland’s work: some conceptual frameworks mesh the ontology of natural science with our experience better than others. Learning to discriminate temperatures according to the Kelvin scale, for example, allows us to map our experience more directly onto the regularities expressed in ideal gas laws. Thus Kelvin has greater superempirical distinction than the Fahrenheit and Celsius scales, though, as Churchland amusingly recounts in Plato’s Camera, somewhat less cultural heft in the common rooms of the University of Winnipeg (Churchland 2012, 227).
Of course, it is always possible that the empirical and structural virtues of theories might underdetermine theory choice and thus choice of ontology in certain situations. There could, in principle, be theories with disparate ontologies that are equally good by way of whatever variants of simplicity, coherence and explanatory power are applicable to the PVA model. This seems to be right, but this is not the same as all theories being on equal terms. Nor does this obviously preclude the naturalist framing an ontology that is constrained by these virtues in some way. I conclude that Brassier fails to establish B1. The PVA model does not leave Churchland unable to say why some theories are better than others. And it does not preclude Churchland or the fan of the PVA model from having a naturalistically constrained ontology. But if B1 is not established then B2 – the claim that naturalism is unable to provide a satisfactory account of science – is not established on this reading.
Brassier, Ray (2007), Nihil Unbound: Enlightenment and Extinction, Palgrave-Macmillan.
Churchland, Paul (1985), “The Anti-Realist Epistemology of van Fraassen’s The Scientific Image”, in Images of Science, edited by P. M. Churchland and C. A. Hooker, Chicago: University of Chicago Press.
Churchland, Paul (1998), ‘Conceptual similarity across sensory and neural diversity: The Fodor/Lepore challenge answered’, Journal of Philosophy 95 (1), 5-32.
Churchland, Paul (2007), Neurophilosophy at Work, Cambridge: Cambridge University Press.
Churchland, Paul (2012), Plato’s Camera: How the Physical Brain Captures a Landscape of Abstract Universals, Cambridge Mass: MIT Press.
Dennett, Daniel (1991), ‘Real Patterns’, Journal of Philosophy 88 (1), 27-51.
Ladyman, James and Ross, Don (2007), Every Thing Must Go: Metaphysics Naturalized, Oxford: Oxford University Press.
Ramsey, William; Stich, Stephen P. & Garon, J. (1991), ‘Connectionism, eliminativism, and the future of folk psychology’, in William Ramsey, Stephen P. Stich & D. Rumelhart (eds.), Philosophy and Connectionist Theory, Lawrence Erlbaum.
Wheeler, Michael (2005), Reconstructing the Cognitive World: The Next Step, MIT Press.