Commentary on Hurley (2007)

Abstract: 60 words
Main Text: 1265 words
References: 308 words
Total Text: 1675 words

 

Goals are not Implied by Actions, but Inferred from Actions and Contexts

 

Iris van Rooij, Willem Haselager, & Harold Bekkering

Nijmegen Institute for Cognition and Information

Radboud University Nijmegen

P.O. Box 9104, 6500 HE Nijmegen,

The Netherlands

i.vanrooij@nici.ru.nl

w.haselager@nici.ru.nl

h.bekkering@nici.ru.nl

 

Abstract

People cannot understand intentions behind observed actions by direct simulation, because goal inference is highly context dependent. Context-dependency is a major source of computational intractability in traditional information processing models. An embodied embedded view of cognition may be able to overcome this problem, but then the problem needs recognition and explication within the context of the new, layered cognitive architecture.

 

 

Hurley proposes a layered architecture of cognition to model, among other things, the human capacity for understanding actions generated by self and others. We applaud the effort because we believe cognitive science can benefit from pursuing alternatives to the traditional cognitive-sandwich account, especially when it comes to higher cognition (van Rooij, Bongers, & Haselager, 2002; Haselager, Bongers, & van Rooij, 2003). We do see one potential problem with Hurley's conception of how layers 3 and 4 of the shared-circuits model (SCM) implement our ability to understand the goals that drive other people's actions. We discuss the nature of this problem and give pointers to how it may be addressed while staying within the conceptual framework proposed by Hurley.

 

According to the SCM, people understand why people act by "mirroring" what Hurley calls the "means-ends structure of observed actions". From reading the target article it is less than clear what mechanism underlies the activity of "mirroring" but it seems that Hurley has in mind a rather direct, non-inferential mechanism in which goals and actions are directly coupled or associated; when perceiving the actions, so it seems, the implied goals come along for free in the act of mirroring. According to Hurley, this is made possible by the fact that humans can reverse the direction of the goal-action associations generated by their own goal-directed actions. As a result, Hurley argues, "observing movements generates motor signals in the observer that tend to cause similar movements". This occurs at layer 3 of the SCM. When the motor outputs are inhibited to prevent overt copying, which occurs at layer 4 of the SCM, then the system is able to engage in a form of "mirroring [that] simulates in the observer the causes of observed action".

 

This conception of inferred goals and their relationship to observed actions is not unproblematic. It seems implausible that a simple one-to-one association between action and goal can account for the intelligent ways in which human beings infer goals from observed actions. Research shows that the goals that people infer depend in complex ways on the context in which the actions are observed. For example, the action "pushing a button with one's head" can suggest the goal "that the button be pushed" (e.g., when the person's hands are occupied holding a towel), or the goal "that the button be pushed with the head" (when the hands are free to do the pushing as well). It has been found that even infants are sensitive to such contextual factors: they push the button with their hands after seeing an adult push it with her head while holding a towel in her hands, but push it with their heads when the adult’s hands were free during the action (Gergely, Bekkering, & Király, 2002). These observations underscore the problematic nature of Hurley’s idea that "observing movements generates motor signals in the observer that tend to cause similar movements". From the perspective of motor plans, after all, pushing a button with the hand is very dissimilar from pushing it with the head, yet infants will “copy” observed actions of adults in dissimilar ways if the context suggests a dissimilar movement may better achieve the inferred goal.

 

Two defenses of the SCM could be formulated at this point.

 

First, one could propose that the action-goal associations in the SCM are not necessarily one-to-one, but can be one-to-many or even many-to-many. That is, multiple goals could become associated with one and the same action (e.g., picking up a pen could be associated with writing, pointing, giving, etc.) and multiple actions could become associated with one and the same goal (the goal to go home from work can be associated with walking, biking, driving, etc.). By “mirroring” one could then retrieve multiple (hypothetical) goals for any given observed action. Although it is conceivable that our brains build complexes of action-goal associations, the question remains how they select which of the--potentially very many--possible goals is the most plausible or likely goal in the current context. This selection process seems to involve some form of abductive inference (a.k.a. inference to the best explanation). It is known that the high degree of context sensitivity of human abductive inferences can lead traditional information processing models into the problem of computational intractability, be they logicist (Bylander et al., 1991), connectionist (Thagard, 2000), or Bayesian models (Cooper, 1990). It remains a challenge for the SCM, or other layered architectures, to incorporate abductive inference processes that can circumvent this classical intractability problem (see, e.g., Cuijpers et al., 2006, for a recent attempt).
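To make the selection problem concrete, the following is a minimal sketch of goal selection as inference to the best explanation, using the button-pushing example discussed above. It is not a model proposed by Hurley or by us: the Bayesian formulation is just one of the traditional options mentioned, and all names and probability values (GOALS, PRIOR, LIKELIHOOD, posterior, best_goal) are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: every number below is an assumption, not data.
# Observed action throughout: "pushing the button with the head".

GOALS = ("button_pushed", "button_pushed_with_head")

# Hypothetical prior P(goal): simply getting the button pushed is assumed
# to be the more common goal.
PRIOR = {"button_pushed": 0.9, "button_pushed_with_head": 0.1}

# Hypothetical likelihood P(head push | goal, context).
# If the hands are free, using the head is an unlikely means of merely
# getting the button pushed.
LIKELIHOOD = {
    ("button_pushed", "hands_occupied"): 0.8,
    ("button_pushed", "hands_free"): 0.05,
    ("button_pushed_with_head", "hands_occupied"): 0.9,
    ("button_pushed_with_head", "hands_free"): 0.9,
}


def posterior(context):
    """Return P(goal | head push, context) for each candidate goal."""
    unnormalized = {g: PRIOR[g] * LIKELIHOOD[(g, context)] for g in GOALS}
    total = sum(unnormalized.values())
    return {g: value / total for g, value in unnormalized.items()}


def best_goal(context):
    """Pick the most probable goal, i.e. the 'best explanation' of the action."""
    post = posterior(context)
    return max(post, key=post.get)


for context in ("hands_occupied", "hands_free"):
    print(context, "->", best_goal(context))
```

Under these assumed numbers the same observed action yields different inferred goals in the two contexts. The sketch is tractable only because the goals and the single relevant context feature are enumerated by hand; it says nothing about how such a selection could scale to contexts of real-world complexity.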

 

Second, one could argue that from the perspective of the observer two actions do not constitute one and the same observed action if the context of the actions differs. The argument could go as follows: the notion of "observed action" is to be understood to include relevant parts of the context (in our example above, the hands being occupied or not); then a unique mapping from action-context pairs to goals can possibly be achieved by a mere “mirroring”. Note, however, that such a proposal only serves to move the problem from understanding the role of context in goal inference to the problem of understanding how people decide which aspects of the context are relevant parts of the current action. This is one of the many disguises in which the infamous frame problem shows itself (Haselager, 1997; Ford & Pylyshyn, 1996; Pylyshyn, 1987): figuring out the proper demarcation of what constitutes an ‘action’ is computationally no less challenging than finding the most likely goal in a set of possible goals.
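One naive way to see why the demarcation problem is not computationally trivial is to count the candidate demarcations. The sketch below assumes, purely for illustration, that the context is given as a list of discrete features and that any subset of them could count as part of the observed action; the feature names and the function candidate_framings are hypothetical, and the frame problem is of course not exhausted by this counting argument.

```python
from itertools import combinations


def candidate_framings(features):
    """Yield every subset of context features that could be treated as
    part of the observed 'action'."""
    for size in range(len(features) + 1):
        for subset in combinations(features, size):
            yield frozenset(subset)


# Hypothetical context features for the button-pushing scene.
features = ["hands_occupied", "holding_towel", "room_lighting", "adult_smiling"]

framings = list(candidate_framings(features))
print(len(framings))  # number of subsets of a 4-element set, i.e. 2 ** 4
```

With n context features there are 2 ** n candidate demarcations, so exhaustively checking each one quickly becomes infeasible as the scene grows richer.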

 

To be clear, by claiming that goal understanding involves in part an inferential process we do not mean to suggest that the process is necessarily conscious, controlled, or reasoned in any way. By what mechanism goal inference is achieved in humans is an important open question. The mechanism can be highly automatic, unconscious and even build on associative principles. Its implementation may involve the so-called mirror neuron system (Newman-Norlund et al., 2007), but it may also draw upon different neural systems depending on the nature or complexity of the inferential task (e.g., de Lange et al., submitted). We see it as a challenge for future research to reconcile functional-level, mechanism-level, and neural implementation-level explanations of goal inference in a way that explains how people can effectively and efficiently make plausible inferences about other people’s goals and intentions in contexts of real-world complexity. So far, traditional information processing models have failed in this pursuit, due to the apparently insurmountable problem of computational intractability. Of course this is not the place for a full sketch of our views, but we would like to suggest that the embodied embedded view of cognition may prove useful in addressing this problem. First, Hurley’s layered (rather than “sandwiched”) view of the cognitive architecture may invite an alternative, non-traditional conception of the inferential task posed to the brain (e.g., van Dijk et al., in press). Second, properties of world and body can serve as cognitive resources that may reduce the computational complexity of the inferential task (van Rooij & Wareham, in press).

 

In sum, Hurley’s model is to be welcomed as a detailed model of a non-traditional approach to action and intention understanding, but the exact mechanisms behind layers 3 and 4 of her model need further clarification in view of the computational problems they are supposed to be solving. An embodied embedded view of cognition may help to provide clues for such clarification, although currently this is more a way to formulate the challenge than to answer it.

 

References

 

Bylander, T., Allemang, D., Tanner, M. C., & Josephson, J. R. (1991). The computational complexity of abduction. Artificial Intelligence, 49, 25-60.

 

Cooper, G. F. (1990). The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence, 42(2-3), 393-405.

 

Cuijpers, R. H., van Schie, H. T., Koppen, M., Erlhagen, W., & Bekkering, H. (2006). Goals and means in action observation: A computational approach. Neural Networks, 19, 311-322.

 

de Lange, F. P., Spronk, M., Willems, R. M., Toni, I., & Bekkering, H. (submitted). Complementary systems for understanding action intentions.

 

Ford, K. M. & Pylyshyn, Z. W. (Eds.) (1996). The robot's dilemma revisited: The frame problem in artificial intelligence. Ablex Publishing.

 

Gergely, G., Bekkering, H., & Király, I. (2002). Rational imitation in preverbal infants. Nature, 415, 755.

 

Haselager, W. F. G. (1997). Cognitive science and folk psychology: The right frame of mind. London: Sage.

 

Haselager, W. F. G., Bongers, R. M., & van Rooij, I. (2003). Cognitive science, representations and dynamical systems theory. In W. Tschacher & J-P. Dauwalder (Eds.), The dynamical systems approach to cognition (pp. 229-242). Singapore: World Scientific.

 

Newman-Norlund, R. D., van Schie, H. T., van Zuijlen, A. M. J., & Bekkering, H. (2007). The mirror neuron system is more active during complementary compared with imitative action. Nature Neuroscience, 10(7), 817-818.

 

Pylyshyn, Z. W. (Ed.) (1987). The robot's dilemma: The frame problem in artificial intelligence. Ablex Publishing.

 

Thagard, P. (2000). Coherence in thought and action. Cambridge, MA: MIT Press.

 

van Dijk, J., Kerkhofs, R., van Rooij, I., & Haselager, P. (in press). Can there be such a thing as embodied embedded cognitive neuroscience? Theory & Psychology.

 

van Rooij, I., Bongers, R. M., & Haselager, W. F. G. (2002). A non-representational approach to imagined action. Cognitive Science, 26(3), 345-375.

 

van Rooij, I. & Wareham, T. (in press). Parameterized complexity in cognitive modeling: Foundations, applications and opportunities. Computer Journal.