Storage of sequences in tournament-based neural networks 


Biological and information considerations

More and more researchers have adopted the idea that memories are stored in the brain using a distributed representation across numerous cortical and subcortical regions [EC01] [Fus09]: each memory is stored across thousands of synapses and neurons, and conversely, each neuron or synapse is involved in thousands of memories.
The cortical minicolumn (also called microcolumn) is likely the most basic and consistent information processing unit in the brain [Jon00] [JL07] [CBP+05]. The neurons within a minicolumn encode similar features, as stated in [CBP+05]: “current data on the microcolumn indicate that the neurons within the microcolumn receive common inputs, have common outputs, are interconnected, and may well constitute a fundamental computational unit of the cerebral cortex”. As a consequence, a node in our network, the unit able to send and receive information towards and from the rest of the network, may correspond to a minicolumn, each of which comprises around 80 neurons in reality. To make this distinction clear, we deliberately call a node a “fanal” rather than a “neuron”.
Following this logic, we propose to liken clusters in the clique-based networks to cortical columns, which are generally composed of 50 to 100 minicolumns. Columns are in turn grouped into macrocolumns, analogously to the way clusters compose the network. A set of clique-based networks that cooperate with each other would likely correspond to the so-called functional areas of the brain.
The general concept of the brain as an information processing machine underlines the importance of information theory in cognition problems. Many research efforts [Mil56] [Bad94] [Cow00] point out that the chunk of information, an abstract but nevertheless measurable unit of information, is more relevant to several human cognitive features, such as the span of immediate memory, than the absolute amount of information, that is, the bit. In the clique-based network, the division of information into clusters is in fact a form of chunking. A sparse message may be composed of a number of chunks that is much smaller than the number of fanals in the whole network. The subsampling of chunks could furthermore be applied to echo the experimentally observed limitations on the number of information items that we are able to receive, process and remember. A hierarchical structure could also be envisaged to form clusters of clusters, as well as chunks of chunks.
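The chunking described above can be sketched in a few lines of Python. This is an illustrative sketch, not the thesis implementation: it assumes a network of chi clusters of l fanals each, where a message is split into chi chunks of log2(l) bits, and each chunk value selects the single active fanal of its cluster.

```python
import math

def message_to_fanals(message_bits, chi, l):
    """Split a binary message into chi chunks of log2(l) bits each;
    each chunk value is the index of the single active fanal in its cluster."""
    kappa = int(math.log2(l))                    # bits carried by one chunk
    assert len(message_bits) == chi * kappa
    fanals = []
    for c in range(chi):
        chunk = message_bits[c * kappa:(c + 1) * kappa]
        idx = int("".join(map(str, chunk)), 2)   # chunk value -> fanal index
        fanals.append((c, idx))                  # (cluster, fanal) pair
    return fanals

# Example: chi = 4 clusters of l = 8 fanals, i.e. 12-bit messages
print(message_to_fanals([1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0], chi=4, l=8))
# -> [(0, 5), (1, 0), (2, 7), (3, 2)]
```

The message thus addresses only chi fanals out of chi × l, which is what makes the representation sparse.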

Advantages and drawbacks

This network model based on neural cliques has several main advantages:
1. Quadratic performance: the quantity of information that this model is able to store and retrieve dramatically outperforms the previous state-of-the-art Hopfield neural networks (HNN), as well as other comparable laws obtained with networks based on weighted connections. In particular, given a density d, the number M of i.i.d. stored messages in such a clique-based network is expressed as:

M = log(1 − d) / log(1 − 1/l²) ≈ l²d.   (2.4)
This approximation is close to the exact value for low values of d. This simple result shows that the performance follows a quadratic law of the number l of fanals per cluster, whereas that of HNN follows a sublinear law. For instance, [Gri11] shows that for the same amount of used memory of 1.8 × 10⁶ bits, the clique-based network with χ = 8 and l = 256 is 250 times superior to HNN in terms of diversity, 20 times superior in terms of capacity, and 20 times superior in terms of efficiency. This performance enhancement is achieved by exploiting the sparsity of the network at several levels.
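Equation (2.4) is easy to check numerically. The following minimal sketch compares the exact expression with its quadratic approximation l²d; the log base cancels in the ratio, so natural logarithms are used, and the chosen values d = 0.1, l = 256 are only an illustration.

```python
import math

def diversity(d, l):
    """Exact number M of i.i.d. stored messages for a target density d and
    l fanals per cluster (Eq. 2.4), alongside the approximation l^2 * d."""
    exact = math.log(1 - d) / math.log(1 - 1 / l**2)
    approx = l**2 * d
    return exact, approx

exact, approx = diversity(d=0.1, l=256)
# For low densities the two values stay within a few percent of each other.
```

The discrepancy grows with d, since log(1 − d) ≈ −d only holds for small densities.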
2. Biological plausibility: this model echoes certain biological realities, such as sparse coding, the winner-take-all rule and the columnar organization of the brain. This has been explained in more detail in Section 2.1.6.
The main interest of this model is that one does not need to find a trade-off between biological plausibility and retrieval performance. Both aspects can be plainly satisfied without any conflict. However, this model still has limitations:
1. The entire network is involved in storing and retrieving every piece of information. This does not correspond directly to the “sparse coding” vision of mental information [F03] [PB04] [OF04]. It contradicts the biological reality that in the neocortex, incoming information addresses only a restricted part of the material, rather than the entire neocortex.
2. By design, the network is only able to store messages of fixed length. This is a consequence of the first point.
3. The network is able to store only static spatial messages, whereas temporality and sequentiality are omnipresent in human cognitive behavior.
4. The connection matrix is symmetric; in other words, synapses are bidirectional, which is not likely to be biologically plausible.
The first two points have been addressed in [Ali13] [ABGJ13] and will be re-explained in Section 2.2 with newly introduced decoding algorithms, whereas the last two points will be addressed in Chapters 4 and 5.


Adaptation to sparse structure

As already pointed out previously, parsimony is a prominent characteristic of the brain's organization and functioning. The amount of data continually conveyed by the nervous system, for instance by the visual cortex, is gigantic. Yet what the brain retains for possible later exploitation requires much less information than the original physical stimuli. These stimuli undergo several filtering operations, from the sensory levels (e.g. points, lines, shapes, colors) to the most cognitive ones (names, concepts, etc.), each level adapting the elements of knowledge to its own mode of representation and storage [Ben09].
Kamary Aliabadi [Ali13] describes how to adapt our network structure to sparse message storage and retrieval, where only a small part of the network is addressed at a time for information acquisition. An illustration is given in Fig. 2-2. Three sparse messages are stored in a network composed of 16 clusters of 8 fanals each. Messages are not necessarily of the same order: two of them are of order 4 (in black and in blue), and the third is of order 5 (in red). The local activation restriction is still respected: at most one fanal can be active in a cluster. And once again, the weight of a connection shared by several cliques remains 1.
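The storage step described above can be sketched as follows. This is a minimal sketch under the stated assumptions, not the implementation of [Ali13]: messages of possibly different orders, at most one fanal per cluster, binary connection weights, and an illustrative (cluster, fanal) encoding of messages; the particular messages below are invented, not those of Fig. 2-2.

```python
def store_messages(messages, l):
    """Store sparse messages as binary cliques in a set of undirected edges.
    Each message is a list of (cluster, fanal) pairs, at most one fanal per
    cluster; an edge shared by several cliques is stored once (weight 1)."""
    edges = set()
    for msg in messages:
        # local activation restriction: at most one fanal per cluster
        assert len({c for c, _ in msg}) == len(msg)
        nodes = sorted(c * l + f for c, f in msg)   # flatten to fanal ids
        for a in range(len(nodes)):
            for b in range(a + 1, len(nodes)):
                edges.add((nodes[a], nodes[b]))     # binary, unweighted
    return edges

# 16 clusters of l = 8 fanals: two messages of order 4, one of order 5
msgs = [[(0, 1), (3, 5), (7, 2), (12, 0)],
        [(1, 4), (3, 5), (9, 6), (15, 7)],
        [(2, 0), (3, 5), (5, 1), (10, 3), (14, 6)]]
edges = store_messages(msgs, l=8)   # 6 + 6 + 10 = 22 distinct edges here
```

Representing the connection matrix as a set of edges makes the symmetry and the binary weights explicit: adding an edge twice has no effect, exactly as in the network.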

Table of contents :

1 Introduction 
1.1 Biological neural networks
1.1.1 Functions of a neuron
1.2 Artificial neural networks
1.2.1 McCulloch-Pitts neuron
1.2.2 Association in neural networks
1.2.3 Sequences in neural networks
1.3 Thesis structure
2 Network of neural cliques 
2.1 Structure
2.1.1 Storage
2.1.2 Network density
2.1.3 Retrieval
2.1.4 Measures of performance
2.1.5 Sparsity
2.1.6 Biological and information considerations
2.1.7 Advantages and drawbacks
2.2 Adaptation to sparse structure
2.2.1 Generalized framework of decoding algorithms
2.2.2 Dynamic rules
Sum-of-Sum rule (SoS)
Normalization rule (NORM)
Sum-of-Max rule (SoM)
2.2.3 Activation rules
Global winner-takes-all
Global winners-take-all
Losers-kicked-out
Performance comparison
2.3 Extensions
2.3.1 Blurred messages
2.3.2 Binary errors
2.3.3 Fractal network
3 Chains of cliques and chains of tournaments 
3.1 Redundancy reduction in a clique
3.2 Towards unidirectionality: chain of tournaments
3.3 Associative memory with chains of tournaments
3.3.1 One erased cluster
3.3.2 Two erased clusters
4 Storage of sequences in tournament-based neural networks 
4.1 Definition of the sequence
4.2 Storing
4.3 Error tolerant decoding
4.4 Sequence retrieval error rate
4.5 Capacity and efficiency
5 Storage of vectorial sequences in generalized chains of tournaments 
5.1 Sparse random patterns
5.2 Storing vectorial sequences in looped chains of tournaments
5.3 Storing vectorial sequences in unlooped chain of tournaments
5.3.1 Global decoding schemes
5.3.2 Cluster activity restriction
5.4 Double layered structure
5.4.1 Error avalanche issue
5.4.2 Structure
5.4.3 Biological consideration
5.5 Interference issue with a limited dictionary
5.5.1 Interference issue
5.5.2 Multiple solutions
Segmentation of network
Association with signature
Cluster enlargement
Subsampling
5.6 Hierarchical structure
5.6.1 Performance analysis
6 Conclusion and open problems 
6.1 Obtained results
6.2 Perspectives and open problems
Bibliography
