In computer science, one often is faced
with problems which, at
first glance, seem to require an amount of computing time that may well exceed
the time span of this universe.
Smart processing
then
refers
to algorithmic methods that enable a computer to solve such problems in
a tractable amount of time. The human brain is not a computer, but it yet
seems
to
do pretty smart processing. Ideas about smart processing are
therefore relevant
not only to computer science but also to cognitive (neuro)science.
If, at first glance
,
a job seems to require an
exponential number of computing steps, say in the order of 2
N
steps for an input of size
N,
then a smart process would typically be a process that requires
only a polynomial number of steps, say in the order of
N2
steps. Smart processing depends heavily
on the way in which to-be-processed information is represented in a
data structure,
and also this aspect seems relevant to cognitive science. Below, this idea is extended to include
transparallel processing, which means that an exponential
number
of items is processed as if only one item were concerned.
Generic forms of processing
Independently of how data is organized, the following generic forms of processing can be distinghuised:
- subserial processing --- items
are
processed one after the other by many processors;
- serial processing --- items
are
processed one after the other by one processor;
- parallel processing --- items
are processed simultaneously by many processors;
- transparallel processing --- items
are processed simultaneously by one processor.
Many
real-life situations involve some hybrid combination of different forms
of processing. For instance, at the checkout in a supermarket, the
cashiers work in parallel; each cashier processes customer carts seriallly;
and the customers process the carts subserially (i.e., the carts are
presented one after the other, each cart by different customer).
Just as parallel processing, also transparallel
processing
implies
that items are processed simultaneously. However, whereas parallel
processing needs many processors, transparallel processing needs only
one processor. This distinction may
be clarified further in computer terms as follows:
- Parallel processing is primarily a hardware
solution to speed up things: Instead of assigning a job to one
processor, the job is divided into subjobs that can be done
simultaneously by different processors (various variations on
this
theme have been invented). This reduces the amount of time needed
to complete a job but not the amount of work in, say,
man-years.
- Transparallel processing is primarily a software
solution to speed up things: Instead of assigning a job to one
processor or delegating subjobs to different processors,
the job
is reduced beforehand by collapsing subjobs into one subjob (e.g., in
the Pencil
selection
example, by considering one bundle of pencils instead of
many separate pencils). This not only reduces the amount of time
needed
to complete a job but also the amount of work in man-years.
Above, one may have missed the notion of
distributed processing. This
notion sounds like referring to a specific form of processing, but it
is understood best as referring to a specific organization of the data
to be processed.
Distributed processing
The term distributed processing is often used to refer to a process
that, instead of being executed by one processor, is divided over a
number of processors. This does not yield a reduction in the work to be
done, but it may yield a proportional reduction in the time needed if
those processors operate in parallel. For instance, in the search for
extraterrestial intelligence (SETI) project, a central computer divides
the sky into parts, and it assigns each part to a different computer
which analyzes this part and which returns its findings to the central
computer. Thus, each of the computers does only part of the total job,
and the total job is done by the computer network as a whole, which
therefore is said to perform distributed processing. Saving time this
way is of course relevant in practice, but theoretically, most
interesting is the division of the sky into parts, which implies that
the central computer maintains a
distributed representation
of the sky. The human brain -- a network of many
interconnected neurons -- most probably also does a lot of distributed
processing using distributed representations of information.
Hence, in both computer science and cognitive science, the crucial
point of distributed processing is the usage of distributed
representations of the information to be processed. Well-known examples
of distributed representations are the Internet (with pieces of
information stored in many interconnected computers) and road maps
(compact representations of many different routes with shared parts).
In general, a distributed representation can be seen as a network that
regulates the interaction between represented pieces of
information. Distributed processing can therefore be defined more
generally (i.e., independently of the number of processors involved) as
referring to a process that operates on a distributed representation of
the data to be processed. Such a process may be executed by one or more
processors performing (sub)serial or (trans)parallel processing --
which of the four generic forms of processing can be performed
depends on the employed kind of distributed representation, but
the usage of distributed representation may as such already lead
to a substantial reduction in the work to be done (and thereby of
course also in the time needed).
For instance, in the 1980s,
parallel
distributed processing
(PDP)
models became popular in cognitive science. In such a model, a process is simulated by an activation flow that spreads
through the links
in a hardwired network with nodes as parallel processors of incoming
and outgoing activation. In cognitive science, a PDP network is hardly
ever
really built but is usually simulated on a computer that
operates on
a
software version of the network. The computer then of course has to do
serially
what is done in parallel in the hardwired network, so that the
computer actually performs
serial
distributed processing,
but this does not
alter
the crucial processing principle of interacting pieces of information.
That is, in both the hardwired network and its software version, the
crucial point is that the employed distributed representation allows
for a substantial reduction in the work to be done.
In computer science, serial distributed
processing has, already since the 1950s, been quite common in smart algorithms. A classical
example is Dijkstra's (1959) shortest path method (SPM). To select a
shortest path from among
an exponential number of paths, the SPM does not check every possible
path separately, but it uses a distributed representation of
the
paths (as in a road map) in a computer algorithm that selects
a shortest path in a polynomial number of steps. The SPM is also an
example
of a serial distributed processing method that can be translated into a PDP method in
a hardwired network (see
Slimy, Hilly, and Pixy).
Hence, the distinction between serial processing by one processor
and parallel
processing by many processors may be relevant for practical time
purposes, but it neither
affects information-processing principles nor the
amount of work to be done. What does affect these things is the usage
of distributed
representations which rely on the processing principle of interacting
pieces of information to achieve, typically, a reduction from
order 2
N to order
N2
in the number
of subjobs to be done -- be they executed by way or serial distributed
processing or, faster, by way of parallel distributed processing. This
idea of smart processing is taken to a new level by a special kind
of distributed representations, called
hyperstrings.
Transparallel processing by hyperstrings
A
hyperstring is a distributed representation of up to 2
N
strings which can
be searched for regularities as if only one string of length
N
were concerned (see
Hyperstrings). Thereby, hyperstrings allow for transparallel
processing as
defined above, that is, they allow for a form of
processing that transcends the traditional dichotomy between serial
and parallel processing (see
Pencil
selection). More specifically, in the number of subjobs to be done,
serial or parallel distributed processing typically implies a reduction from order 2N to order N2;
transparallel processing by
hyperstrings implies a reduction from order 2N to order 1.
In
SIT,
the idea of transparallel processing by
hyperstrings has been implemented in a coding algorithm that solves the
tractability problem of computing guaranteed simplest codes of strings.
This algorithm is neurally plausible in that it implements the
intertwined subprocesses which, in neuroscience, are believed to take
place in the visual hierarchy in the brain, namely, feedforward feature
encoding, horizontal feature binding, and recurrent feature selection.
In the algorithm, these subprocesses first construct an output
space composed of hyperstrings for only the input at hand and then
select an output from this limited, input-dependent, output space. This
contrast with connectionist PDP models which usually
start from a pre-defined network representing an output space for all
possible inputs, so that the process of activation spreading merely
serves to select, for a given input, an output from this total output
space.
In fact, the transparallel algorithm suggests that hyperstrings can be
seen as formal counterparts of transient neural assemblies
which signal their presence by
synchronous firing of the neurons involved. In neuroscience, such
assemblies are believed to mediate binding of similar features in the
input -- which is also what hyperstrings do. This, in turn, suggests
that neuronal synchronization is a manifestation of transparallel
processing, and that those hyperstring-like neural assemblies
are constituents of flexible cognitive architecture implemented in
the relatively rigid neural architecture of the brain.
For a formal account of transparallel processing by hyperstrings, see
Proceedings
of the National
Academy of Sciences USA 2004.
For a demo on its proposed relation to neuronal synchronization, see
Cognitive architecture
For extensive discussions and references on cognitive architecture, see
Cognitive Processing 2012