Commentary on
Hurley (2007)
Abstract: 60 words
Main Text: 1265 words
References: 308 words
Total Text: 1675 words
Goals are not Implied by
Actions, but Inferred from Actions and Contexts
Iris van Rooij, Willem Haselager, & Harold Bekkering
Nijmegen Institute for Cognition and Information
The
Abstract
People cannot understand intentions behind observed actions by
direct simulation, because goal inference is highly context dependent.
Context-dependency is a major source of computational intractability in
traditional information processing models. An embodied embedded view of
cognition may be able to overcome this problem, but then the problem needs
recognition and explication within the context of the new, layered cognitive architecture.
Hurley proposes a layered architecture of cognition to model,
among other things, the human capacity for understanding actions generated by
self and others. We applaud the effort because we believe cognitive science can
benefit from pursuing alternatives to the traditional cognitive-sandwich
account, especially when it comes to higher cognition (van Rooij, Bongers, & Haselager, 2002; Haselager, Bongers, & van Rooij, 2003). We do see one potential
problem with Hurley's conception of how layers 3 and 4 of the shared-circuits
model (SCM) implement our ability to understand the goals that drive other
people's actions. We discuss the nature of this problem and give pointers to
how it may be addressed while staying within the conceptual framework proposed
by Hurley.
According to the SCM, people understand why others act by
"mirroring" what Hurley calls the "means-ends structure of
observed actions". From reading the target article it is less than clear
what mechanism underlies the activity of "mirroring" but it seems
that Hurley has in mind a rather direct, non-inferential mechanism in which
goals and actions are directly coupled or associated; when perceiving the
actions, so it seems, the implied goals come along for free in the act of
mirroring. According to Hurley, this is made possible by the fact that humans
can reverse the direction of the goal-action associations generated by their
own goal-directed actions. As a result, Hurley argues, "observing
movements generates motor signals in the observer that tend to cause similar
movements". This occurs at layer 3 of the SCM. When the motor outputs are
inhibited to prevent overt copying, which occurs at layer 4 of the SCM, then the system is able to engage in a form of
"mirroring [that] simulates in the observer the causes of observed
action".
This conception of inferred goals and their relationship to
observed actions is not unproblematic. It seems implausible that a simple
one-to-one association between action and goal can account for the intelligent ways
in which human beings infer goals from observed actions. Research shows that
the goals that people infer depend in complex ways on the context in which the
actions are observed. For example, the action "pushing a button with one's
head" can suggest the goal "that the button be pushed" (e.g.,
when the person's hands are occupied holding a towel), or the goal "that
the button be pushed with the head" (when the hands are free to do the
pushing as well). Even infants have been found to be sensitive to such
contextual factors: they push the button with their hands after
seeing an adult push it with her head while holding a towel in her hands, but
push the button with their heads when the adult’s hands were free during the
action (Gergely, Bekkering, & Király, 2002). These observations underscore
the problematic nature of Hurley’s idea that "observing movements
generates motor signals in the observer that tend to cause similar
movements". From the perspective of motor plans, after all, pushing a
button with the hand is very dissimilar from pushing it with the head, yet
infants will “copy” observed actions of adults in dissimilar ways if the
context suggests a dissimilar movement may better achieve the inferred goal.
Two defenses of the SCM could be formulated at this point.
First, one could propose that the action-goal associations in the
SCM are not necessarily one-to-one, but can be one-to-many or even
many-to-many. That is, multiple goals could become associated with one and the
same action (e.g., picking up a pen could be associated with writing, pointing,
giving, etc.) and multiple actions could become associated with one and the
same goal (the goal to go home from work can be associated with walking,
biking, driving, etc.). By “mirroring” one could then retrieve multiple
(hypothetical) goals for any given observed action. Although it is conceivable
that our brains build complexes of action-goal associations, the question
remains how they select which of the (potentially very many) possible goals is
the most plausible or likely goal in the current context. This selection
process seems to involve some form of abductive
inference (a.k.a. inference to the best explanation). It is known that the high
degree of context sensitivity of human abductive
inferences can lead traditional information processing models into the problem
of computational intractability, be they logicist (Bylander et al., 1991), connectionist (Thagard,
2000), or Bayesian models (Cooper, 1990). It remains a challenge for the SCM,
or other layered architectures, to incorporate abductive
inference processes that can circumvent this classical intractability problem
(see e.g. Cuijpers et al., 2006, for a recent attempt).
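To make the selection problem concrete, the following minimal sketch treats goal inference for the button-pushing example as a simple Bayesian choice among candidate goals. It is an illustration only, not a description of the SCM or of any of the cited models: the goal labels, context labels, priors, and likelihoods are hypothetical values chosen for the example. Note that it works only because the set of candidate goals and the relevant context are handed to it in advance; the intractability results cited above concern the general case in which they are not.

```python
# Toy sketch of context-dependent goal selection.
# All names and numbers below are illustrative assumptions, not empirical data.

# One-to-many association: one observed action, several candidate goals.
CANDIDATE_GOALS = {
    "push_button_with_head": ["button_pushed", "button_pushed_with_head"],
}

# Hypothetical prior plausibility of each goal, P(goal).
PRIOR = {"button_pushed": 0.8, "button_pushed_with_head": 0.2}

# Hypothetical likelihoods P(action | goal, context): how likely an actor
# with this goal is to use her head in this context.
LIKELIHOOD = {
    ("button_pushed", "hands_occupied"): 0.8,
    ("button_pushed", "hands_free"): 0.1,
    ("button_pushed_with_head", "hands_occupied"): 0.9,
    ("button_pushed_with_head", "hands_free"): 0.9,
}


def infer_goal(action, context):
    """Return the most probable goal and the full posterior over goals."""
    scores = {
        goal: PRIOR[goal] * LIKELIHOOD[(goal, context)]
        for goal in CANDIDATE_GOALS[action]
    }
    total = sum(scores.values())
    posterior = {goal: score / total for goal, score in scores.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


if __name__ == "__main__":
    for context in ("hands_occupied", "hands_free"):
        best, posterior = infer_goal("push_button_with_head", context)
        print(context, "->", best, posterior)
```

Under these assumed numbers the same observed action is assigned different goals in the two contexts, which is the pattern a fixed one-to-one action-goal association cannot produce.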
Second, one could argue that from the perspective of the observer
two actions do not constitute one and the same observed action if the context
of the actions differs. The argument could go as follows: the notion of
"observed action" is to be understood to include relevant parts of
the context (in our example above, the hands being occupied or not); then a
unique mapping from action-context pairs to goals can possibly be achieved by a
mere “mirroring”. Note, however, that such a proposal only serves to move the
problem from understanding the role of context in goal inference to the problem
of understanding how people decide which aspects of the context are relevant
parts of the current action. This is one of the many disguises in which the
infamous frame problem shows itself (Haselager, 1997; Ford & Pylyshyn, 1996; Pylyshyn, 1987):
Figuring out the proper demarcation of what constitutes an ‘action’ is
computationally no less challenging than finding the most likely goal in a set
of possible goals.
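The size of the demarcation problem can be illustrated with a short sketch. It assumes, purely for illustration, that the context is given as a finite set of discrete features and that any subset of them might be counted as part of the observed action; the function names are hypothetical. This is not a proof of intractability, only a picture of how quickly the space of candidate action descriptions grows.

```python
from itertools import combinations


def candidate_action_descriptions(context_features):
    """Yield every subset of context features that could be treated as
    part of the observed action (illustrative assumption: features are
    discrete and each is either included or excluded)."""
    features = sorted(context_features)
    for size in range(len(features) + 1):
        for subset in combinations(features, size):
            yield frozenset(subset)


def count_candidates(n_features):
    """Number of candidate demarcations for n binary-relevance features."""
    return 2 ** n_features


if __name__ == "__main__":
    toy_context = {"hands_occupied", "towel_present", "button_lit"}
    print(len(list(candidate_action_descriptions(toy_context))))
    print(count_candidates(20))
```

With three features there are eight candidate demarcations; with twenty there are already more than a million, so exhaustively checking which parts of the context belong to the action is not a viable strategy in realistic settings.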
To be clear, by claiming that goal understanding involves in part
an inferential process we do not mean to suggest that the process is
necessarily conscious, controlled, or reasoned in any way. By what mechanism
goal inference is achieved in humans is an important open question. The
mechanism can be highly automatic, unconscious and even build on associative
principles. Its implementation may involve the so-called mirror neuron system
(Newman-Norlund et al., 2007), but it may also draw
upon different neural systems depending on the nature or complexity of the
inferential task (e.g., de Lange et al., submitted). We see it as a challenge
for future research to reconcile functional-level, mechanism-level, and neural implementation-level explanations of goal inference in a
way that explains how people can effectively and efficiently make plausible
inferences about other people’s goals and intentions in contexts of real-world
complexity. So far, traditional information processing models have failed in
this pursuit, due to the apparently insurmountable problem of computational
intractability. Of course this is not the place for a full sketch of our views,
but we would like to suggest that the embodied embedded view of cognition may
prove useful in addressing this problem. First of all, Hurley’s layered (rather
than “sandwiched”) view of the cognitive architecture may invite an
alternative, non-traditional conception of the inferential task posed to the
brain (e.g. van Dijk et al., in press). And secondly, properties of world and
body can serve as cognitive resources that may reduce the computational
complexity of the inferential task (van Rooij & Wareham, in press).
In sum, Hurley’s model is to be welcomed as a detailed model of a
non-traditional approach to action and intention understanding, but the exact
mechanisms behind layers 3 and 4 of her model need further clarification in view
of the computational problems they are supposed to be solving. An embodied
embedded view of cognition may help to provide clues for such clarification,
although currently this is more a way to formulate the challenge than to answer
it.
References
Bylander, T., Allemang, D., Tanner, M. C., & Josephson, J. R. (1991). The computational complexity of abduction. Artificial Intelligence, 49, 25-60.
Cooper, G. F. (1990). The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence, 42(2-3), 393-405.
Cuijpers, R. H., van Schie, H. T., Koppen, M., Erlhagen, W., & Bekkering, H. (2006). Goals and means in action observation: A computational approach. Neural Networks, 19, 311-322.
de Lange, F. P., Spronk, M.,
Willems, R. M., Toni, I., & Bekkering, H. (submitted).
Complementary systems for understanding action intentions.
Ford, K. M., & Pylyshyn, Z. W. (Eds.) (1996). The robot's dilemma revisited: The frame problem in artificial intelligence. Ablex Publishing.
Gergely, G., Bekkering, H., & Király,
Haselager, W. F. G. (1997). Cognitive science and folk psychology:
The right frame of mind.
Haselager, W. F. G., Bongers, R. M., & van Rooij,
Newman-Norlund, R. D., van Schie, H. T., van Zuijlen, A. M. J., & Bekkering, H. (2007). The mirror neuron
system is more active during complementary compared with imitative action.
Nature Neuroscience, 10(7), 817-818.
Pylyshyn, Z. W. (Ed.) (1987). The robot's dilemma: The frame problem in artificial intelligence. Ablex Publishing.
Thagard, P. (2000). Coherence in
thought and action.
van Dijk, J., Kerkhofs, R., van Rooij, I., & Haselager,
P. (in press). Can there be such a thing as embodied embedded
cognitive neuroscience? Theory & Psychology.
van Rooij, I., Bongers, R. M.,
& Haselager, W. F. G. (2002). A
non-representational approach to imagined action. Cognitive
Science, 26(3), 345-375.
van Rooij, I. & Wareham, T. (in press). Parameterized complexity
in cognitive modeling: Foundations, applications and opportunities. Computer Journal.