Predictive processing, language, and the scaffolded mind

Abstract

Andy Clark presents a theory of predictive processing which can be summarised as: no representation without prediction.[1] This construes conscious mental contents as reflections of a predictive process that aims to optimise its accuracy by constantly checking for error against incoming sensory input and representing the most likely hypothesis. One major question for predictive processing is: what exactly are these hypotheses? This essay will explore the place of language in predictive processing, following Lupyan and Clark, and in particular will consider how this might clarify the otherwise underdetermined mental ontology implied by the predictive model. Lupyan and Clark explicitly extend this model to include words and larger verbal constructions, which create a “second system” of representation that is projected into the world of perception.[2] This system does not alter the basically determinate appearance of phenomenal experience; rather, it reveals aspects and creates objects, concrete and abstract alike. The major benefit of this second system is not only this structuring, which is also evident in the case of pre-linguistic schemas such as tone perception, but also that it entails a flexible set of predictive priors which can be creatively combined and transmitted between individuals. We will further describe how this aspect of predictive processing can be understood as scaffolded via a cultural inheritance from our predecessors. Finally, this notion of an ‘artificial context’ constructed from linguistic priors and scaffolded via a cultural inheritance will be illustrated with reference to Kuhn’s account of scientific paradigms. To conclude, the predictive model will itself be seen as part of the second system, whereby its indeterminate mental ontology will be traced to the corresponding indeterminacy of ‘hidden layers’ in the neural network model from which it is derived.

Introduction: predictive processing according to Clark

It is generally accepted that the brain exhibits hierarchical characteristics in its inter-relation of mind and world. Where predictive processing differs is in reversing the typical relationship, wherein the contents of experience are usually thought to flow from world to mind; instead it holds that the mind is constructed by top-down processes.[3] Sensory data is relevant only insofar as it registers as an error which forces the model to cycle to an alternative hypothesis that has been rendered more likely in light of this new piece of information. This cycling behaviour is seen in the case of binocular rivalry as presented by Clark.[4] This entails presenting a different image to each eye: say, a house to the left eye and a cat to the right. The experience which results is one in which these two images alternate relatively smoothly. The idea, says Clark, is that the brain computes an equal probability for each, but whenever one is presented the other registers as an error signal and is then selected in turn, only to be replaced as a result of the same process. This continues back and forth because the brain recognises that the two cannot be simultaneously present.
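The error-driven switching described above can be loosely sketched as a Bayesian update over two competing hypotheses. This is only an illustration under stated assumptions: the prior and likelihood values are invented for the example and are not drawn from Clark, and the helper functions `posterior` and `best_hypothesis` are hypothetical names introduced here.

```python
def posterior(prior, likelihood):
    """Combine prior beliefs with the likelihood of the incoming signal (Bayes' rule)."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


def best_hypothesis(beliefs):
    """Return the hypothesis currently rated most probable."""
    return max(beliefs, key=beliefs.get)


# Hypothetical starting beliefs: the 'house' hypothesis is currently favoured.
prior = {"house": 0.6, "cat": 0.4}

# Hypothetical likelihoods: the incoming signal is poorly predicted by 'house'
# (a large prediction error) and well predicted by 'cat'.
signal_likelihood = {"house": 0.1, "cat": 0.9}

beliefs = posterior(prior, signal_likelihood)

print(best_hypothesis(prior))    # hypothesis favoured before the signal
print(best_hypothesis(beliefs))  # hypothesis favoured after the signal
```

On these assumed numbers the favoured hypothesis switches once the poorly predicted signal arrives, which is the sense in which sensory data matters here only as error. The sketch does not model the continued alternation of rivalry, only a single switch.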

But this argument from binocular rivalry can be compared with another case outlined by Clark, that of a blurry percept in the distance which could be either a dog or a fox. There the image doesn’t alternate between hypotheses; our experience remains that of a “unitary-coherent whole.”[5] This wholeness is characteristic of immediate perceptual experience under ordinary circumstances, whereas the case of binocular rivalry as outlined above is obviously unusual. Clark also presents an even simpler instance of the same phenomenon as observed in the dog and fox example.[6] This is the image presented below, wherein the central percept can be interpreted as either B (when read from left to right as A, B, C) or 13 (when read from top to bottom as 12, 13, 14). And yet, whichever way we interpret it, the percept itself clearly remains determinate:

[Figure: ambiguous central character readable as either B or 13.][7]

What these two cases seem to indicate is that perception and understanding reflect separable levels of predictive processing. We can accept that in perception there is no representation without prediction, and yet in the meaning attributed to symbols (as in the cases of dog/fox or B/13) there is a line between the unitary percept and the indeterminate understanding we may attribute to it. Even here we find that our understanding is a unitary whole. The underlying percept, however, is not transformed by our respective understandings of it. Instead we can see this as an example of cognitive penetration wherein our understanding is read into the percept. Clark explains this as reflecting the various levels of prediction at work in the human brain.[8]

This can be further supported with reference to Tallon-Baudry and Bertrand’s finding of neural oscillations correlated with object construction.[9] They found that a consistent pattern of activity (specifically, gamma oscillations in the range of 40 Hz as measured by MEG and EEG) is correlated with the perception of objects and their understanding. This activity has also been linked to the association of words with their meaning.[10] Altogether these findings appear to reflect a neurological function which operates in the cross-modal binding of knowledge in perception. This is precisely what the predictive processing theory suggests we should find. We can see in this activity a sign of the construction of a predictive representation exactly as expected. The images below illustrate how this neural activity further reflects not only the recognition of obvious percepts but also the construction of illusory triangles: “There was no statistical difference in the gamma range between the responses to the illusory and real triangles. Induced gamma could thus reflect the spatial binding of the elementary features of the picture into a coherent representation of the triangle.”[11]

[Figure: images illustrating gamma activity in response to real and illusory triangles.][12]

We might here turn for a moment to examine cognitive penetration in particular. One difficulty in this area is what exactly counts as ‘cognitive’ and what aspects of perception are ‘penetrated,’ or even what this means. Here we will be particular in our description, leaving aside cognitive penetration as a much broader concept and focusing on a specific form. While one response may be to say this particular form is not a case of cognitive penetration at all, that is no difficulty for our argument. Either way, it is important to clarify the specific phenomenon here under consideration. What we have been describing thus far can be distinguished from broader accounts of cognitive penetration in two main ways. Firstly, we do not understand this form to entail any alteration of, for instance, the visual percept itself. Instead what we describe is a cognitive structure imposed on the conceptual indeterminacy of perceptual experience. This entails, for instance, possibilities for identification and other action, ranging from something as simple as referring to the aspect in language to purposefully interacting with some particular part of a tool. The above case of ‘seeing’ an illusory triangle also falls under this and illustrates how prior knowledge of forms penetrates our perception without actually altering the percept. Instead we perceive something like the superimposed phantasmic form of, for instance, a triangle or other such abstract entity.

Where this first difference applies to the ‘penetration,’ the second involves what is meant by ‘cognitive.’ Macpherson, for instance, seems to have quite a narrow idea of which cognitive aspects count for the purposes of identifying an instance of cognitive penetration—specifically, “belief, knowledge or concepts.”[13] They suggest, for instance, that the lexical processing of linguistic forms may not count as cognitive penetration insofar as these “do not depend on the semantic processing of the lexical forms.”[14] Their thought seems to be that cognitive penetration requires the specific involvement of the aforementioned cognitive modules: beliefs, knowledge, concepts. Instead, we are concerned with a sub-personal form of cognitive penetration wherein the ‘cognitive’ component refers to structures derived from prior knowledge, whether innate or learned. This is evident in the case of the percept which can be read as either B or 13, wherein learned schemas of letters and numbers ‘penetrate’ perception to the extent that they superimpose a meaningful form on the otherwise indeterminate perceptual experience. We will take cases like these as paradigmatic for the phenomenon which is here under investigation.

Indeed, cognitive penetration may be an inappropriate term for the phenomenon. Something like ‘embedded meaning’ or ‘superimposed structure’ might better capture it. Whatever we call this, it seems certain that there is some predictive construction of expected structures based on perceptual cues. This predictive processing allows the mind to fill in details which are not apparent in the bare perceptual data so as to infer hidden structures in the world, as with the illusory triangle in the image above. As Clark has pointed out, this is not a matter of conscious choice but occurs at an unconscious or sub-personal level.[15] It emerges through the interaction between generative models produced by the cognitive mechanisms of predictive processing and error signals propagated upwards from our sensory apparatus. The mental output reflects the interplay of these elements in a way which we have here characterised as a particular type of ‘cognitive penetration.’[16] There are two main categories in which this specific phenomenon is evident: percepts, bounded structures superimposed on bare perceptual experience, and concepts, wherein a percept is further identified by a communicable name. We will begin by addressing the former before turning to language.

Language and the structure of perceptual experience

Infants in their first months after birth discriminate a wide range of speech contrasts, both native and non-native. A process of perceptual reorganization takes place over the first year, such that discrimination of most non-native speech sounds deteriorates.[17]

While infants younger than ten months, whatever their language exposure, were equally able to discern tone-contrasts common to Hindi, the ability of English-learning infants fell off between ten and twelve months, whereas that of Hindi-learning infants remained stable.[18] This perceptual reorganisation depends on the prior existence of an innate capacity for learning, known as the Baldwin effect.[19] This can be understood, on the predictive processing model, as an innate capacity for tone distinctions which is then formed in line with the speech sounds to which the infant is exposed. We can see this process of acquiring language-specific tonal schemas as learning a flexible set of generative models which constitute the basic aural forms of the language, akin to learning an alphabet from which many words can be made.

Here we might note, moreover, that the perception of a relevant tone distinction in these studies requires an observable indicator of awareness—e.g., turning their head when they hear the relevant sound. Already we can start to see how perceptual discrimination might entail meaning in the form of possible action. A further example can be drawn from individuals who have absolute pitch—in other words, those few who “can name or identify the pitch for a tone without any external reference.”[20] But people with absolute pitch more quickly identify certain pitches than others, for instance, “the pitches C# and F# are more quickly identified than D# and G#.”[21] These variations have been found to align with the relative frequency of these pitches in a large sample of music, wherein “pitches like C# and F# [were found to] occur more frequently than pitches like D# and G#.”[22]

These cases, absolute pitch and infant tone perception, can each be understood as consistent with the predictive processing model. We would expect that more frequently occurring percepts would be more quickly processed than uncommon ones and this is precisely what is found. The case of absolute pitch differs from that of infant tone perception in that it involves the explicit identification by name of a specific aspect of perceptual experience. While the infant tone perception studies all necessarily involve an action which effectively amounts to identification, this is more readily seen as a primitive form of communication rather than language proper. The identification of pitches, in contrast, clearly involves something of a language—albeit of a musical sort. All in all, these cases are in agreement with predictive processing. In tone perception and absolute pitch alike, familiar perceptual forms were internalised as predictive schemas which can then be deployed in our ordinary perception, cognition, and activity.
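The expectation that frequent percepts are processed more readily can be sketched in terms of surprisal, the negative log probability of an event. This is a minimal sketch under loud assumptions: the counts below are invented purely for illustration and are not taken from the cited study, the function name `surprisal_bits` is hypothetical, and treating surprisal as a stand-in for processing cost is an interpretive assumption rather than a claim made in the sources.

```python
import math


def surprisal_bits(counts):
    """Map each item to -log2 of its relative frequency (lower = more expected)."""
    total = sum(counts.values())
    return {item: -math.log2(n / total) for item, n in counts.items()}


# Invented illustrative counts: C# and F# occur more often than D# and G#.
pitch_counts = {"C#": 40, "F#": 40, "D#": 10, "G#": 10}

pitch_surprisal = surprisal_bits(pitch_counts)

for pitch, bits in sorted(pitch_surprisal.items(), key=lambda kv: kv[1]):
    print(f"{pitch}: {bits:.2f} bits")
```

With any counts of this shape, the more frequent pitches receive lower surprisal, which is one way of expressing why a predictive system exposed to such a distribution might identify them more quickly.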

For our purposes, however, the most important aspect here is that these involve naming and identifying the relevant percept. Lupyan and Clark note, for instance, that while immediate percepts are always particular:

… words and larger verbal constructions are special kinds of perceptual input. While perceptual experiences of, for example, vehicles are always experiences of specific vehicles, the word “vehicle” is categorical.[23]

This leads them to suggest that language endows the mind with a flexible set of contexts or hypotheses on which our minds may draw to structure our predictive representation of the world. Language can thus be understood as offering metabolically cheap generative models upon which our ordinary predictive processes can be scaffolded. We can see language, on this view, as involving the coupling of a percept in immediate experience with a coherent concept as identified by a specific name. Take the following image, for instance: how many specific colours can you identify?

[Figure: colour image.]

Some may be able to distinguish more or less specific colours, but this does not entail that they perceive altogether different images. We can describe it as instead involving predictive structuring of perceptual experience. This can be understood, like the prior example of A, B, C or 12, 13, 14, as involving a higher-level generative model which leaves the underlying percept the same while also projecting a meaningful structure upon it. This exercise can be compared to the experience of tasting wine, wherein a subject with some expertise in wine tasting may be able to discern qualities that fade into the background for an inexperienced subject. The same might be found in comparing, for instance, graphic designers or artists to the general populace on the example of colour as above.

But where are the boundaries between the identified colours? It is difficult to say there are any clear lines of demarcation in the above image, though we can nevertheless identify specific colours: red, green, purple, orange, etc. This possibility is precisely what is expected on the predictive processing account of language. To begin, Lupyan and Clark’s line runs counter to the standard view of language—namely, that “although different languages provide their speakers with different ways of talking about things, these differences have nothing to [do] with how we think about or perceive things.”[24] Instead, Lupyan and Clark suggest a theory derived from the cognitive penetrability of perceptual experience made possible by predictive processing:

Language not only functions as a means of communicating our thoughts but plays an active role in shaping them. Rather than passively reflecting the joints of nature, words and larger constructions help to carve joints into nature.[25]

This idea resembles that of the Sapir-Whorf hypothesis, and indeed one of Clark’s own examples can be rejigged to make this point. The following image has been repeated from above, but this time suppose you are only familiar with a base-5 number system: 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, etc.

[Figure: the ambiguous B/13 image, repeated from above.][26]

While the left to right line will read the same as before (A, B, C) the top to bottom line, while superficially identical, now entails an altogether different meaning. For this individual reading in base-5 from top to bottom the meaning is now equivalent, when translated to base-10, to the series: 7, 8, 9. This can be seen to support the Sapir-Whorf hypothesis, insofar as alternative mathematical languages give rise to incompatible understandings of an identical perceptual experience. While we will not here examine Sapir and Whorf’s argument in any detail, it is worth noting the similarity between the two and, moreover, that a predictive processing model seems capable of providing a mechanism that could readily underlie this hypothesis.
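The arithmetic behind this reinterpretation can be checked with a short sketch. The function name `read_numerals` is introduced here purely for illustration; it relies only on Python’s built-in `int`, which accepts a base as its second argument.

```python
def read_numerals(numerals, base):
    """Interpret each written numeral in the given base and return its value."""
    return [int(numeral, base) for numeral in numerals]


# The same written column, read under two different number systems.
column = ["12", "13", "14"]

base_ten_reading = read_numerals(column, 10)
base_five_reading = read_numerals(column, 5)

print(base_ten_reading)   # values for a base-10 reader
print(base_five_reading)  # values for a base-5 reader: [7, 8, 9]
```

The written marks are identical in both calls; only the interpretive scheme passed in as `base` differs, which mirrors the point that the percept is unchanged while its meaning shifts.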

Eskimo, for instance, are said to have a much greater repertoire of words for various shades of snow than can be found in English.[27] Lupyan and Clark suggest this array of specific verbal constructions provides them with the capacity to better distinguish between shades of snow. This makes sense insofar as there is little use for such an extensive array of words unless there is also an underlying perceptual structuring that provides the discriminatory capacity required for their meaningful use.

This example also supports the notion that errors are relative to the environmental context in which embodied activity occurs. Similar to how Clark sees predictive processing as involving organism-dependent priors, we might suggest in line with the Sapir-Whorf hypothesis that environmental differences will also result in differences between linguistic communities within a species. For us it is of little importance whether we can discern between various shades of snow, whereas the Eskimo may have good reason for desiring greater discernment in order to predict and communicate further predictive models for climatological conditions which could bear directly on their survival. We can further supplement this line with findings from evolutionary game theory, specifically on the relation between evolutionary dynamics and veridical percepts:

Natural selection can send perfectly, or partially, true perceptions to extinction when they compete with perceptions that use niche-specific interfaces which hide the truth in order to better represent utility. Fitness and access to truth are logically distinct properties. More truthful perceptions do not entail greater fitness.[28]

This line indicates, in other words, that our set of hypotheses is likely to be specifically human and determined by our particular evolutionary history and environmental context. Some of these organism-dependent schemas have been further reified as concepts and words, which (via the mechanism of cognitive penetration that we have outlined) are embedded in coherent aspects of perceptual experience and thereby coupled with their corresponding percepts.

Taken together, we can see that the language of a group can be expected to vary according to its utility in ordinary activity given their context and circumstances. This is precisely what has been found in comparing the Eskimo words for shades of snow with the relatively blurry structure of those acculturated in environments where snow is rare or even absent.[29] What matters here, as Clark repeatedly emphasises, is the possibility for perception and activity which these structural schemas allow.[30] There are, moreover, two centrally important aspects to the specifically linguistic components of this predictive structuring of experience. First, that our linguistic reality provides a set of flexible priors which allow us to more quickly adapt to change in the material environment. Second, that these linguistic priors are communicable and may thereby save individuals and even whole generations from needing to encounter painful errors.

This leads to a picture in which language is a central mechanism in our adaptive activity in the world, one which aids our capacity to distinguish and respond to signals in the perceptual environment and, most significantly, allows us to communicate this knowledge with others in our linguistic community:

Exposure to language (whether shared or self-produced) thus becomes a potent and fundamentally unified means of exploring and exploiting the full potential of our own acquired knowledge about the world—a kind of artificial “second system” enabling us to take full advantage of our own knowledge as well as the knowledge of others.[31]

There are thus two levels of representation in the predictive processing model, that of the unitary-coherent whole and the aspects that can be grasped as abstract entities—this second level is our understanding, not all of which is conscious. Our discussion is limited to this second system with a focus on its linguistic aspect. This is particularly important, as noted in the above excerpt, insofar as the communicability of linguistic priors allows us to draw on the knowledge of others. Following this line, therefore, we will now turn to focus on the ways in which this predictive model of linguistic reality aligns with an image that extends beyond the bounds typically thought to circumscribe the mind.

Cumulative culture and the scaffolded mind

Clark and Lupyan thus present an account of predictive processing and language in which the internalised schemas of linguistic understanding form a predictive structure which is then coupled with identifiable percepts. We can understand this example as reflecting the way in which cognitive development entails acquiring a set of culturally-determined schemas that superimpose structure upon our experience of the world. On the predictive processing view, these schemas can be understood as higher-level priors which provide an efficient and flexible set of hypotheses for structuring the world. This aligns with Kruschke’s argument for the parsimony of hypothesis space:

Entertaining an infinite space of hypothetical values does not imply the need for an information processor of infinite capacity, for infinite belief distributions can be represented with small sets of values.[32]

And indeed, this possibility is exactly what Lupyan and Clark have also suggested:

Words (and larger verbal constructions) become not simply ways to communicate our preexisting thoughts but highly flexible (and metabolically cheap) sources of priors throughout the neural hierarchy.[33]

Most importantly, this is a metabolically cheap method and further aligns with the notion that predictive processing seeks to minimise the use of energy.[34] We should note, moreover, that the ‘free energy principle’ has been used to explain the error minimisation imperative that is often taken as central to predictive processing accounts of the mind. If this is the case, then we might plausibly expect scenarios in which these two aims could contradict one another.

One way this might be possible is through what Sterelny calls the ‘scaffolded mind.’[35] This view is particularly relevant for our current purposes, insofar as language and other larger constructions deployed by our predictive processing system are an inter-generational enterprise. Language allows not only a set of flexible priors but, as Lupyan and Clark emphasise, these priors are also communicable between individuals and can be transmitted across generations. We do not, for instance, reinvent the meaning of words anew each generation—though they may vary slightly across generations, within a lifetime, or even between locations. Instead these meanings are preserved and passed on, hence the idea of the mind as scaffolded by the language and linguistic constructions we inherit from our predecessors:

… the cognitive competence of generation N+1 individually and collectively depends on cognitive provisioning by generation N. The most critical, mind-and-brain-shaping environmental supports for cognition are these cumulatively built, collectively provided tools for thinking, tools that are provided to many or all of a generation by many or all of the previous generation.[36]

This notion of the ‘scaffolded mind,’ taken together with Lupyan and Clark’s characterisation of language as a set of flexible and communicable priors, provides a convincing account of how the linguistic models by which predictive processing structures much of our representation of the world may constitute a cultural inheritance.

We can see this, moreover, as coherent with something like a ‘free energy principle’ but at the level of a linguistic community or species; it allows the transmission of cultural and linguistic priors between individuals and across generations, which can thus be acquired far more cheaply than would be possible were each individual required to encounter or observe errors on their own. And yet here we may also note the possible tension which was foreshadowed earlier. The ‘scaffolded mind’ hypothesis is similar to what has been described as the ‘ratchet effect’ or ‘cumulative culture.’[37] This is typically seen as a peculiarly human characteristic, one which explains our cultural and technological advancement. All this has been possible because, rather than starting afresh, each generation has been able to build on the efforts of their predecessors. One factor which has been suggested as crucial in this regard is observational learning or, more specifically, imitation. Indeed, some have even described this trait in humans as ‘over-imitation,’ wherein we will copy even unnecessary steps that, for instance, a chimpanzee would not.[38] Imitation can be contrasted with emulation, wherein the aim is to replicate the purpose of behaviour without necessarily copying its precise form. Humans, in contrast, tend to copy the form of behaviour even where certain steps seem plainly superfluous. The idea is that this over-imitation has been vital for ensuring the fidelity with which we transmit knowledge from one generation to the next, and hence our capacity for cumulative culture.[39]

The situated mind and predictive paradigms

Our journey to this point began with the predictive model presented by Clark, then followed the line which Lupyan and Clark draw from this basic model to the linguistic ‘second system.’ The fulcrum of this system is the phenomenon of ‘cognitive penetration’ wherein the structure of our understanding is embedded in or superimposed upon our experience of the world. We have further described this as a ‘scaffolded’ system, whereby the transmissibility of language and culture enables each generation to build upon the knowledge of their predecessors. As a result, our linguistic reality is historically determined, which is another sense in which cognition can be accurately characterised as ‘situated.’ Indeed, that minds are somehow situated is emphasised throughout the predictive model, and in particular within the language extension provided by Lupyan and Clark. While this ‘situated cognition’ is something broader, we will instead enumerate here the various ways in which the mind can be said to be situated on this account. For one, there is the fact that we have yet to find any actual instance of a disembodied mind. Instead everywhere the mind is found not as any disembodied universal but as a particular referent which corresponds to the embodiment of an existing human being. That the mind thus corresponds, whatever the nature of this relation, to a physical body here suggests the second way in which it is situated. This is that minds are always situated at a determinate point in time and space, and though they may move from here to there over time, this history is always carried with them.[40] Thirdly, we carry not only our own history but that of our predecessors insofar as the ‘second system’ is scaffolded by the transmissibility of linguistic priors between individuals.[41] And finally, all these individual particularities give rise to the ‘artificial context’ which conditions the representation and activity of an existing human mind.[42]

All this can be seen, for instance, in the activity of individuals within the scientific enterprise and, more specifically, the progress of scientific paradigms as outlined by Kuhn.[43] The work of science is obviously scaffolded, insofar as generations build on the work of their predecessors. This work takes place within a ‘scientific paradigm’ which can be seen as a large-scale structure within the second system of practising scientists. Kuhn describes, for instance, the Ptolemaic and Copernican models of astronomy as paradigms. These determine the representational structure of the field which is being investigated, defining central concepts, relevant questions, evidential standards, etc. All this provides the basic framework for standard scientific practice and can be understood as predictive hypotheses or larger verbal constructions, in line with the notion of an “artificial context” as in Lupyan and Clark.[44] The predictive aspect of this also explains a crucial fact: that science proceeds first by the positing of hypotheses and only then by the testing for errors. This activity can be seen to reflect the predictive model of mind precisely, insofar as each is an essentially predictive activity entailing specific hypothetical representations which condition activity. Of course, what we are concerned with here is not the specific hypothesis which the scientist understands their experiment as testing. Instead we are concerned with the ‘scientific paradigm’ or ‘second system’ in the mind of this particular scientist, which entails the predictive priors that condition their scientific activity.

Here we must further differentiate, therefore, between the technical knowledge that is produced by scientists and the practical knowledge that structures their activity, including the production of technical knowledge (writing books, giving lectures, discussing problems, etc.). These two aspects are obviously separable only in the abstract, insofar as writing a paper on even the most technical of physical phenomena, for instance, takes as basic the practical knowledge of language necessary to do so. But when we are speaking of scientific paradigms, we do not mean these in their technical or textual form as in a history of ideas. Instead we mean the practical knowledge that underlies scientific activity, as much for communication as experiment, and which is internal to the mind of a scientist situated within that paradigm. This difference can be seen most clearly in what Kuhn describes as a ‘scientific revolution’ or ‘paradigm shift,’ wherein the movement is not only technical but entails a radical reorganisation of the predictive priors which constitute the paradigm upon which scientific practice had thus far been scaffolded:

Examining the record of past research from the vantage of contemporary historiography, the historian of science may be tempted to exclaim that when paradigms change, the world itself changes with them. Led by a new paradigm, scientists adopt new instruments and look in new places. Even more important, during revolutions scientists see new and different things when looking with familiar instruments in places they have looked before.[45]

Conclusion

Predictive processing thus readily accounts for the mechanism by which this second system of linguistic priors is represented by our minds, whereby the predictive system is seen, in other words, as the engine of thought. And yet this seems to tell us little about the actual vehicle of language or how it might work. This may partly stem from the nature of Clark’s theoretical enterprise, which is derived from the neural network model of machine learning. It is an account, in other words, of how the mind (understood as a computer) deals with uncertainty in the world, wherein machine learning is taken as the paradigmatic instance of this computational understanding of mind. Some understanding of neural networks, therefore, may be helpful in properly grasping the predictive model. But here we encounter an unexpected problem:

… [a neural network is] a type of artificial intelligence (AI) that is modelled on the brain, and that promised to be better than standard algorithms at dealing with complex real-world situations. Unfortunately, such networks are also as opaque as the brain. Instead of storing what they have learned in a neat block of digital memory, they diffuse the information in a way that is exceedingly difficult to decipher.[46]

This is what computer scientists working with artificial neural networks have described as the ‘black box’ problem. While we can more or less understand the basic computational structure of such a network, its internal activity is obscured, with only its inputs and outputs being susceptible to any clear representation. This is particularly troubling insofar as the cognitive science from which predictive processing emerges was initially motivated by a desire to overcome another black box, that of behaviourism.

That the predictive processing model shares in the opacity of the neural networks from which it is derived is reflected in the indeterminacy with which hypotheses are characterised by theorists. They are described variously as ‘predictive priors’ or ‘generative models,’ or more broadly as an “entire spectrum” or “infinite space of hypothetical values.”[47] One way to understand this is to see the neural network model as providing the paradigm which underlies predictive processing. Through this metaphorical framing of the mind we map our concrete understanding of computers (in this case, specifically of artificial neural networks) so as to structure our understanding of the more abstract domain of the human mind.[48] This framing highlights those aspects of mind which are coherent with the structure of an artificial neural network, in that the mind also uses prior knowledge to extract relevant information from noisy data. But it also hides those aspects of the underlying domain of mind where the metaphorical structuring maps poorly. One such underdetermined aspect is the predictive ontology of mind, which in the case of artificial networks is limited to obvious inputs and outputs with many ‘hidden layers’ between the two.[49] Hence the indeterminate mental ontology of predictive processing may stem from the analogous indeterminacy that characterises the hidden layers of artificial neural networks.

Nevertheless, this is not an insurmountable problem for predictive processing. The line taken by Lupyan and Clark, for instance, goes some way to determining at least one aspect of this mental ontology. As we have outlined, language can be seen as providing a flexible set of predictive priors. The transmissibility of language, moreover, indicates the source of at least some of the generative models on which predictive processing depends. This solves the problem of whether, in Clark’s top-down predictive model, the entire catalogue of generative models needs to be innate. While some innate aptitude for language is still required, the developmental acquisition of a set of scaffolded priors susceptible to creative recombination is surely more plausible than the almost Platonic notion that we are endowed at birth with an infinite spectrum of hypotheses corresponding to everything in existence. Some such acquisition as Lupyan and Clark allow seems necessary if we are, for instance, to attain the computational metaphor which serves as a basic paradigm for cognitive science. There is reason to suspect, however, that this addition of communicable linguistic priors will require some adjustment to the basic predictive model, insofar as the acquisition of language clashes with Clark’s top-down picture of predictive processing. More broadly, this emphasis on linguistic priors suggests a fruitful line of research in which the cognitive science of predictive processing might be combined with, for instance, the cognitive linguistics of metaphor to yield a deeper understanding of this second system.[50]

References

Castelvecchi, Davide. "Can we open the black box of AI?" Nature News 538, no. 7623 (2016): 20.

Clark, Andy, and David Chalmers. "The extended mind." Analysis 58, no. 1 (1998): 7-19.

Clark, Andy. Surfing uncertainty: Prediction, action, and the embodied mind. Oxford University Press, 2015.

Clark, Andy. "Beyond the 'Bayesian blur': predictive processing and the nature of subjective experience." Journal of Consciousness Studies 25, no. 3-4 (2018): 71-87.

Hurford, James R. "The evolution of the critical period for language acquisition." Cognition 40, no. 3 (1991): 159-201.

Huron, David. Sweet anticipation: Music and the psychology of expectation. MIT press, 2008.

Kay, Paul, and Willett Kempton. "What is the Sapir‐Whorf hypothesis?" American anthropologist 86, no. 1 (1984): 65-79.

Kuhn, Thomas S. The structure of scientific revolutions. University of Chicago press, 2012.

Lakoff, George, and Mark Johnson. Metaphors we live by. University of Chicago press, 2008.

Lakoff, George, and Mark Johnson. Philosophy in the flesh: The embodied mind and its challenge to western thought. Vol. 640. New York: Basic books, 1999.

Lewis, Hannah M., and Kevin N. Laland. "Transmission fidelity is the key to the build-up of cumulative culture." Philosophical Transactions of the Royal Society B: Biological Sciences 367, no. 1599 (2012): 2171-2180.

Lupyan, Gary, and Andy Clark. "Words and the world: Predictive coding and the language-perception-cognition interface." Current Directions in Psychological Science 24, no. 4 (2015): 279-284.

Mark, Justin T., Brian B. Marion, and Donald D. Hoffman. "Natural selection and veridical perceptions." Journal of Theoretical Biology 266, no. 4 (2010): 504-515.

Mattock, Karen, and Denis Burnham. "Chinese and English infants' tone perception: Evidence for perceptual reorganization." Infancy 10, no. 3 (2006): 241-265.

Piaget, Jean. The construction of reality in the child. Vol. 82. Routledge, 2013.

Prinz, Jesse. "Ways of mindmaking." Andy Clark and His Critics (2019): 222.

Sterelny, Kim. "Minds: extended or scaffolded?" Phenomenology and the Cognitive Sciences 9, no. 4 (2010): 465-481.

Tallon-Baudry, Catherine, and Olivier Bertrand. "Oscillatory gamma activity in humans and its role in object representation." Trends in cognitive sciences 3, no. 4 (1999): 151-162.

Tennie, Claudio, Josep Call, and Michael Tomasello. "Ratcheting up the ratchet: on the evolution of cumulative culture." Philosophical Transactions of the Royal Society B: Biological Sciences 364, no. 1528 (2009): 2405-2415.

Tomasello, Michael. "Cultural transmission: A view from chimpanzees and human infants." Journal of cross-cultural psychology 32, no. 2 (2001): 135-146.

Werker, Janet F., and Richard C. Tees. "Cross-language speech perception: Evidence for perceptual reorganization during the first year of life." Infant behavior and development 7, no. 1 (1984): 49-63.

Wiese, Wanja, and Thomas Metzinger. "Vanilla PP for philosophers: A primer on predictive processing." (2017).

[1] Clark, Surfing Uncertainty.

[2] Lupyan & Clark, ‘Words and the world: predictive coding and the language-perception-cognition interface.’

[3] Clark, Surfing Uncertainty.

[4] Clark, ‘Beyond the Bayesian blur,’ p. 78–80.

[5] Ibid, p. 73.

[6] Ibid, p. 82–83.

[7] Image sourced from Lupyan & Clark, ‘Words and the world,’ p. 282.

[8] Lupyan & Clark, ‘Words and the world,’ p. 281.

[9] Tallon-Baudry & Bertrand, ‘Oscillatory gamma activity in humans and its role in object representation.’

[10] Ibid, p. 157.

[11] Ibid, p. 156.

[12] Image sourced from Tallon-Baudry & Bertrand, ‘Oscillatory gamma activity in humans and its role in object representation,’ p. 156.

[13] Macpherson, p. 578.

[14] Macpherson, p. 578.

[15] Clark, Surfing Uncertainty.

[16] Lupyan & Clark, ‘Words and the world,’ p. 281.

[17] Mattock & Burnham, ‘Chinese and English infants’ tone perception,’ p. 241.

[18] Werker & Tees, ‘Cross-language speech perception.’

[19] Huron, p. 62.

[20] Huron, p. 64.

[21] Ibid.

[22] Ibid.

[23] Lupyan & Clark, ‘Words and the world,’ p. 283.

[24] Lupyan & Clark, ‘Words and the world,’ p. 282.

[25] Ibid, p. 282.

[26] Image sourced from Lupyan & Clark, ‘Words and the world,’ p. 282.

[27] Lupyan & Clark, ‘Words and the world,’ p. 282.

[28] Mark, Marion, & Hoffman, ‘Natural selection and veridical perceptions,’ p. 513.

[29] Lupyan & Clark, ‘Words and the world,’ p. 282.

[30] Ibid, p. 283.

[31] Ibid, p. 283.

[32] Kruschke in Clark, ‘Beyond the Bayesian blur,’ p. 73.

[33] Lupyan & Clark, ‘Words and the world,’ p. 283.

[34] Wiese & Metzinger, ‘Vanilla PP for philosophers,’ p. 12.

[35] Sterelny, ‘Minds: extended or scaffolded?’

[36] Ibid, p. 479.

[37] Tennie, Call & Tomasello, ‘Ratcheting up the ratchet: on the evolution of cumulative culture.’

[38] Tomasello, ‘Cultural transmission: A view from chimpanzees and human infants.’

[39] Lewis & Laland, ‘Transmission fidelity is the key to the build-up of cumulative culture.’

[40] This is evident, for instance, in the role of prior knowledge in predictive processing—we are thus situated within our learning history.

[41] Sterelny, ‘Minds: extended or scaffolded?’

[42] We can see this in how predictive priors structure our experience, or more specifically in the linguistic relativity described by Lupyan and Clark—as in the Sapir-Whorf Hypothesis.

[43] Kuhn, The Structure of Scientific Revolutions.

[44] Lupyan & Clark, ‘Words and the world,’ p. 283.

[45] Kuhn, p. 111.

[46] Castelvecchi, ‘Can we open the black box of AI?’

[47] Kruschke in Clark, ‘Beyond the Bayesian blur,’ p. 73.

[48] Lakoff & Johnson, Metaphors We Live By, p.

[49] Castelvecchi, ‘Can we open the black box of AI?’

[50] See Lakoff & Johnson, Philosophy in the Flesh.