Visualizing a Hopfield RNN with Fovea

In this tutorial, we'll dig into the source of Fovea's machine learning backend and demonstrate how to use a Hopfield network for optical character recognition. While our example is deliberately rudimentary in scope, so as to avoid getting bogged down in minutiae, the same network can be applied to a wide variety of real-world scientific problems.

The Structure of Fovea's Machine Learning Utilities

Fovea comes packaged with a full-fledged implementation of a generic Hopfield network, a type of recurrent neural network (RNN). RNNs are distinguished from feedforward neural networks in that they allow feedback between the component computational units, also called neurons. This feedback gives the network a form of associative memory, as we'll demonstrate in this example.

The network implementation code is contained in retina/mlearn/hopfield/hopfield_network.py. The file defines a single class, HopfieldNetwork, which is instantiated with a single call of the form myNet = HopfieldNetwork(num_neurons). Optionally, a custom activation function can be supplied (more on this later). The num_neurons argument specifies the number of neurons to be used internally by the network.
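
For example, a 25-neuron network (the size we'll use later for 5x5 character patterns) can be created like this (a minimal sketch):

from retina.mlearn.hopfield.hopfield_network import HopfieldNetwork

myNet = HopfieldNetwork(25)     # one neuron per pixel of a 5x5 character
print(myNet.weights().shape)    # (25, 25); the weight matrix starts out all zeros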

The file visuals.py contains the front-end code that defines a VisualHopfield network built on top of the HopfieldNetwork backend. The VisualHopfield network also relies on an internal VisualNeuron class which creates drawings of neurons and the connections between them. In its default form, VisualHopfield runs a detailed visual simulation of the Hopfield network training and learning on whatever data it's supplied. That said, each component of the visualization can easily be separated from the action of the others should your needs require a customized approach.
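
In its simplest form, the full visualization can be launched in just a few lines (a minimal sketch using random bipolar patterns as stand-ins for real data):

import numpy as np
from retina.mlearn.hopfield.visuals import VisualHopfield

net = VisualHopfield(25)    # 25 neurons, matching the 5x5 patterns used later
train = [np.random.choice([-1, 1], 25) for _ in range(3)]
recall = [np.random.choice([-1, 1], 25)]
net.run_visualization(train, recall)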

Here we present the full source code for both the Hopfield network and the associated visuals. We discuss the implementation in more detail below.

In [ ]:
# %load ../../retina/mlearn/hopfield/hopfield_network.py
import numpy as np
import random

class HopfieldNetwork(object):
    """
    (C) Daniel McNeela, 2016

    Implements the Hopfield Network, a recurrent neural network developed by John Hopfield
    circa 1982.

    c.f. https://en.wikipedia.org/wiki/Hopfield_Network
    """
    def __init__(self, num_neurons, activation_fn=None):
        """
        Instantiates a Hopfield Network comprised of "num_neurons" neurons.
        
        num_neurons         The number of neurons in the network.
        _weights            The network's weight matrix.
        _trainers           A dictionary containing the methods available for 
                            training the network.
        _vec_activation     A vectorized version of the network's activation function.
        """
        self.num_neurons = num_neurons
        self._weights = np.zeros((self.num_neurons, self.num_neurons), dtype=np.int_)
        self._trainers = {"hebbian": self._hebbian, "storkey": self._storkey}
        self._recall_modes = {"synchronous": self._synchronous, "asynchronous": self._asynchronous}
        self._vec_activation = np.vectorize(self._activation)
        self._train_act = np.vectorize(self._train_activation)

    def weights(self):
        """
        Getter method for the network's weight matrix.
        """
        return self._weights

    def reset(self):
        """
        Resets the network's weight matrix to the matrix which is identically zero.

        Useful for retraining the network from scratch after an initial round
        of training has already been completed.
        """
        self._weights = np.zeros((self.num_neurons, self.num_neurons), dtype=np.int_)

    def train(self, patterns, method="hebbian", threshold=0, inject = lambda x, y: None):
        """
        The wrapper method for the network's various training algorithms stored in
        self._trainers.

        patterns        A list of the patterns on which to train the network. Patterns 
                        are bipolar vectors of the form 

                        [random.choice([-1, 1]) for i in range(self.num_neurons)].

                        Example of properly formatted input for a Hopfield Network
                        containing three neurons:

                            [[-1, 1, 1], [1, -1, 1]]

        method          The training algorithm to be used. Defaults to "hebbian".
                        Look to self._trainers for a list of the available options.
        threshold       The threshold value for the network's activation function.
                        Defaults to 0.
        """
        try:
            return self._trainers[method](patterns, threshold, inject)
        except KeyError:
            print(method + " is not a valid training method.")

    def recall(self, patterns, steps=None, mode="asynchronous", inject = lambda x, y: None):
        """
        Wrapper method for self._synchronous and self._asynchronous.

        To be used after training the network.

        patterns        The input vectors to recall. 

        steps           Number of steps to compute. Defaults to None.

        Given 'patterns', recall(patterns) classifies these patterns based on those
        which the network has already seen.
        """
        try:
            return self._recall_modes[mode](patterns, steps, inject)
        except KeyError:
            print(mode + " is not a valid recall mode.")

    def energy(self, state):
        """
        Returns the energy for any input to the network.
        """
        return -0.5 * np.sum(np.multiply(np.outer(state, state), self._weights))

    def _synchronous(self, patterns, steps=None, inject=lambda x, y: None):
        """
        Updates all network neurons simultaneously during each iteration of the
        recall process.

        Faster than asynchronous updating, but convergence of the recall method
        is not guaranteed.
        """
        if steps:
            for i in range(steps):
                patterns = np.dot(patterns, self._weights)
            return self._vec_activation(patterns)
        else:
            while True:
                post_recall = self._vec_activation(np.dot(patterns, self._weights))
                if np.array_equal(patterns, post_recall):
                    return self._vec_activation(post_recall)
                patterns = post_recall

    def _asynchronous(self, patterns, steps=None, inject=lambda x, y: None):
        """
        Updates a single, randomly selected neuron during each iteration of the recall 
        process.

        Convergence is guaranteed, but recalling is slower than when neurons are updated
        in synchrony.
        """
        patterns = np.array(patterns)
        if steps:
            for i in range(steps):
                index = random.randrange(self.num_neurons)
                patterns[:,index] = np.dot(self._weights[index,:], np.transpose(patterns))
            return self._vec_activation(patterns)
        else:
            post_recall = patterns.copy()
            inject(post_recall, 0)
            indices = set()
            i = 1
            while True:
                index = random.randrange(self.num_neurons)
                indices.add(index)
                post_recall[:,index] = np.dot(self._weights[index,:], np.transpose(patterns))
                post_recall = self._vec_activation(post_recall)
                inject(post_recall, i)
                if np.array_equal(patterns, post_recall) and len(indices) == self.num_neurons:
                    return self._vec_activation(post_recall)
                patterns = post_recall.copy()
                i += 1

    def _activation(self, value, threshold=0):
        """
        The network's activation function.

        Defaults to the sign function.
        """
        if value < threshold:
            return -1
        return 1

    def _train_activation(self, value, threshold=0):
        if value == threshold:
            return value
        elif value < threshold:
            return -1
        return 1

    def _hebbian(self, patterns, threshold=0, inject=lambda x, y: None):
        """
        Implements Hebbian learning.
        """
        i = 1
        for pattern in patterns:
            prev = self._weights.copy()
            self._weights += np.outer(pattern, pattern)
            inject(prev, i)
            i += 1
        np.fill_diagonal(self._weights, 0)
        self._weights = self._weights / len(patterns)

    def _storkey(self, patterns, threshold=0, inject=lambda x, y: None):
        """
        Placeholder for Storkey learning (not yet implemented).
        """
        pass
In [ ]:
# %load ../../retina/mlearn/hopfield/visuals.py
import numpy as np
import time, warnings
from retina.mlearn.hopfield.hopfield_network import *
from retina.core.axes import Fovea3D
from matplotlib.pyplot import *
from matplotlib import gridspec
from matplotlib.widgets import Button
from sklearn.decomposition import PCA
from scipy.interpolate import griddata as gd

neuron_radius = 1

class VisualNeuron(object):
    """
    Class creates a visual representation of a neuron in a generic Hopfield Network.

    The VisualHopfieldNetwork class presents a collection of visual neurons in a circular
    arrangement, and thus the VisualNeuron position is initialized with polar coordinates.
    """
    def __init__(self, theta, r):
        """
        theta   the polar angle of the neuron's position
        r       the polar radius of the neuron's position
        """
        self.theta = theta
        self.r = r
        self.x = r * np.cos(theta)
        self.y = r * np.sin(theta)
        self.connections = {}

    def __repr__(self):
        """
        Defines a string representation for a neuron giving its position in Cartesian coordinates.
        """
        return "Visual Neuron at " + str((self.x, self.y))

    def draw(self, axes):
        """
        Draws a neuron to the provided Matplotlib axes.
        """
        self.body = Circle((self.x, self.y), radius=neuron_radius, fill=False)
        axes.add_patch(self.body)

    def draw_connection(self, neuron, connection_color, axes):
        """
        Draws a connection between two neurons.

        neuron              the terminal neuron of the connection
        connection_color    the color of the connection line to be drawn
        axes                the Matplotlib axes to which the connection should be drawn
        """
        connection = Line2D((self.x, neuron.x), (self.y, neuron.y), color=connection_color)
        self.connections.update({ neuron :  connection })
        neuron.connections.update({ self : connection })
        axes.add_line(self.connections[neuron])

    def delete_connection(self, neuron):
        """
        Delete the connection between self and neuron. The connection will no longer
        be drawn in the network diagram and will be cleared from memory.

        neuron      the terminal neuron of the connection
        """
        network_lines = self.main_network.lines
        del network_lines[network_lines.index(self.connections[neuron])]

class VisualHopfield(HopfieldNetwork):
    def __init__(self, num_neurons):
        """
        Initializes a VisualHopfield network of num_neurons.
        """
        HopfieldNetwork.__init__(self, num_neurons)
        d_theta = (2 * np.pi) / num_neurons
        self.neurons = [VisualNeuron(i * d_theta, num_neurons) for i in range(num_neurons)]
        self.cs_plot = None

    def run_visualization(self, training_data, recall_data=None):
        """
        Runs the Hopfield Network visualization. Trains the network on training_data and
        recalls on recall_data.
        """
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            ion()
            self.training_data = training_data
            self.recall_data = recall_data
            self._setup_display()
            self._draw_network()
            self._plot_state([-1 for i in range(self.num_neurons)])
            self._plot_weights()
            print("Training...")
            self._set_mode("Training")
            self.train(training_data, inject=self._train_inject)
            self._normalize_network()
            self._plot_energy()
            print("Learning...")
            self._set_mode("Learning")
            for state in recall_data:
                self.recall([state], inject=self._recall_inject)
            print("Finished.")
            self._set_mode("Finished")

    def _train_inject(self, prev_weights, iteration, delay=.01):
        """
        Provides drawing capabilities for the superclass train() method.

        prev_weights        The network weight matrix before a round of recalling
                            is undergone
        iteration           The current iteration count
        delay               The time delay between each iteration. Larger delays
                            slow the rate of visualization and vice versa.
        """
        self.cmap.set_data(self._weights)
        self._update_iter(iteration)
        new_weights = self._train_act(self.weights())
        colors = ['green', 'blue', 'red']
        for ((row, column), value) in np.ndenumerate(new_weights):
            if self.neurons[row] is self.neurons[column]:
                continue
            elif new_weights[row,column] != prev_weights[row,column]:
                connection = self.neurons[row].connections[self.neurons[column]]
                setp(connection, linewidth='4')
                setp(connection, color=colors[new_weights[row,column]])
            else:
                connection = self.neurons[row].connections[self.neurons[column]]
                setp(connection, linewidth='1')
        pause(delay)

    def _set_mode(self, mode):
        """
        Sets the current mode of the network to be displayed in the visualization.

        mode        The current mode. Should be one of "Learning" or "Training."
        """
        self.mode.set_text("Current Mode: " + mode)

    def _update_iter(self, num):
        """
        Update the current network iteration.

        num         The current iteration count.
        """
        self.iteration.set_text("Current Iteration: " + str(num))

    def _recall_inject(self, state, iteration, delay=.05):
        """
        Provides drawing capabilities for the superclass recall() method.

        state       The current state of the network. Provided at each step in the
                    recall process.
        iteration   The current iteration count.
        delay       The time delay between successive iterations of recalling.
        """
        state = np.array(state)
        self.state_plot.set_data(state.reshape(5, 5))
        currentenergy = self.energy(state)
        current_state = self.pca.transform(state)
        if self.cs_plot:
            self.cs_plot.remove()
        self.cs_plot = self.energy_diagram.scatter(current_state[:,0], current_state[:,1], currentenergy,
                                                s=80, c='b', marker='o')
        self._update_iter(iteration)
        pause(delay)

    def _setup_display(self):
        """
        Sets up the Matplotlib figures and axes required for the visualization.
        """
        self.network_fig = figure(figsize=(20, 20))
        self.network_fig.canvas.set_window_title("Hopfield Network Visualization")
        gs = gridspec.GridSpec(2, 4)
        self.main_network = subplot(gs[:,:2])
        self.main_network.set_title("Network Diagram")
        self.main_network.get_xaxis().set_ticks([])
        self.main_network.get_yaxis().set_ticks([])
        self.energy_diagram = subplot(gs[0,2], projection='Fovea3D')
        self.energy_diagram.set_title("Energy Function")
        self.contour_diagram = subplot(gs[0,3])
        self.contour_diagram.set_title("Energy Contours")
        self.state_diagram = subplot(gs[1,2])
        self.state_diagram.set_title("Current Network State")
        self.state_diagram.get_xaxis().set_ticks([])
        self.state_diagram.get_yaxis().set_ticks([])
        self.weight_diagram = subplot(gs[1,3])
        self.weight_diagram.set_title("Weight Matrix Diagram")
        self.weight_diagram.get_xaxis().set_ticks([])
        self.weight_diagram.get_yaxis().set_ticks([])
        self.network_fig.suptitle("Hopfield Network Visualization", fontsize=14)
        self.mode = self.network_fig.text(0.4, 0.95, "Current Mode: Initialization",
                                          fontsize=14, horizontalalignment='center')
        self.iteration = self.network_fig.text(0.6, 0.95, "Current Iteration: 0",
                                               fontsize=14, horizontalalignment='center')

        # Widget Functionality
        view_wf = axes([.53, 0.91, 0.08, 0.025])
        self.view_wfbutton = Button(view_wf, 'Wireframe')
        view_attract = axes([.615, 0.91, 0.08, 0.025])
        self.view_attractbutton = Button(view_attract, 'Attractors')

    def _draw_network(self):
        """
        Draws the network diagram to the Matplotlib canvas.
        """
        connections = set()
        colors = ['green', 'blue', 'red']
        for (index1, neuron) in enumerate(self.neurons):
            neuron.draw(self.main_network)
            connections.add(neuron)
            for (index2, neuron_two) in enumerate(self.neurons):
                if neuron_two in connections:
                    continue
                else:
                    connection_color = colors[int(self.weights()[index1, index2])]
                    neuron.draw_connection(neuron_two, connection_color, self.main_network)
            self.main_network.autoscale(tight=False)

    def _plot_energy(self, num_samples=4, path_length=20):
        """
        Plots the energy function of the network.

        num_samples         The number of samples to be used in the computation of the energy function.
                            The greater the number of samples, the higher the accuracy of the resultant plot.
        path_length         The number of steps to compute in calculating each sample's path of convergence
                            toward the network's attractors.
        """
        attractors = self.training_data
        states = [[np.random.choice([-1, 1]) for i in range(self.num_neurons)] for j in range(num_samples)]
        self.pca = PCA(n_components=2)
        self.pca.fit(attractors)
        paths = [attractors]
        for i in range(path_length):
            states = self.recall(states, steps=1)
            paths.append(states)
        x = y = np.linspace(-1, 1, 100)
        X,Y = np.meshgrid(x, y)
        meshpts = np.array([[x, y] for x, y in zip(np.ravel(X), np.ravel(Y))])
        mesh = self.pca.inverse_transform(meshpts)
        grid = np.vstack((mesh, np.vstack(paths)))
        energies = np.array([self.energy(point) for point in grid])
        grid = self.pca.transform(grid)
        gmin, gmax = grid.min(), grid.max()
        xi, yi = np.mgrid[gmin:gmax:100j, gmin:gmax:100j]
        zi = gd(grid, energies, (xi, yi), method='nearest')
        wireframe = self.energy_diagram.add_layer("wireframe")
        mesh_plot = self.energy_diagram.add_layer("mesh_plot")
        attracts = self.energy_diagram.add_layer("attractors")
        wireframe.add_data(xi, yi, zi)
        mesh_plot.add_data(xi, yi, zi)
        self.energy_diagram.build_layer(wireframe.name, plot=self.energy_diagram.plot_wireframe,
                                        colors=(0.5, 0.5, 0.5, 0.5), alpha=0.2)
        self.energy_diagram.build_layer(mesh_plot.name, plot=self.energy_diagram.plot_surface, cmap=cm.coolwarm)
        self.contour_diagram.contour(xi, yi, zi)
        grid = self.pca.transform(attractors)
        z = np.array([self.energy(state) for state in attractors])
        attracts.add_data(grid[:,0], grid[:,1], z)
        self.energy_diagram.build_layer("attractors", plot=self.energy_diagram.scatter, s=80, c='g', marker='o')
        wireframe.hide()
        attracts.hide()

        def wireframe_click(event):
            wireframe.toggle_display()
            mesh_plot.toggle_display()

        def attractor_click(event):
            attracts.toggle_display()

        self.view_wfbutton.on_clicked(wireframe_click)
        self.view_attractbutton.on_clicked(attractor_click)

    def _normalize_network(self):
        """
        Normalizes the line width of each visual connection in the network.

        To be called between the training and recall steps of the visualization.
        """
        for neuron in self.neurons:
            for line in neuron.connections.values():
                if line.get_linewidth() != 1:
                    setp(line, linewidth=1)

    def _plot_state(self, state):
        """
        Plot state to the state_diagram.
        """
        state = np.array(state)
        self.state_plot = self.state_diagram.imshow(state.reshape(5, 5),
                                                    cmap=cm.binary,
                                                    interpolation='nearest')
        self.state_plot.norm.vmin, self.state_plot.norm.vmax = -1, 1

    def _plot_weights(self):
        """
        Draws a heatmap of the network's weight matrix.
        """
        self.cmap = self.weight_diagram.imshow(self._train_act(self.weights()),
                                               vmin=-1, vmax=1, cmap='viridis',
                                               aspect='auto')
        self.cbar_axes = axes([0.91, 0.1, .017, .3625])
        cbar = self.network_fig.colorbar(self.cmap, cax=self.cbar_axes)

Understanding the Hopfield Network

In order to understand the Hopfield network's action, we must first understand its implementation. The network has three defining characteristics that separate it from other RNNs.

  1. As in all neural networks, the connection between any two neurons is assigned some weight. In the case of the Hopfield network all connections are symmetric, i.e. for any two neurons $N_i$, $N_j$, we have $w_{ij} = w_{ji}$ where $w_{ij}$ and $w_{ji}$ are the weights of the connections between neurons $i$ and $j$ and neurons $j$ and $i$, respectively. What's more, each neuron is connected to every other neuron in the network, although not to itself.

  2. The neurons in a Hopfield network are bipolar. This means that they have two possible output states: 1 if the weighted sum of the inputs to the neuron meets or exceeds the threshold of the network's activation function, and -1 otherwise.

  3. Because every Hopfield network is fully connected, we can associate with it a weight matrix whose $ij^{\text{th}}$ entry is the weight of the connection between $N_i$ and $N_j$. As a result of the properties enumerated above, this weight matrix is symmetric, with zeros along the diagonal (since $w_{ii} = 0 \quad \forall i$), as the short check below demonstrates.
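
These properties are easy to verify directly against the implementation above (a small sketch using random bipolar patterns):

import numpy as np
from retina.mlearn.hopfield.hopfield_network import HopfieldNetwork

net = HopfieldNetwork(5)
patterns = [np.random.choice([-1, 1], 5) for _ in range(2)]
net.train(patterns)

W = net.weights()
print(np.array_equal(W, W.T))     # True: the weight matrix is symmetric
print(np.all(np.diag(W) == 0))    # True: no neuron connects to itself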

The natural question arising from this definition of the network is the following: How are the weights of the connections between individual neurons assigned?

Training the Hopfield Network

Assigning weights to the connections of a neural network takes place during a process called "training," in which the network is fed input data with some well-defined set of features that the network would ideally learn. A number of different statistical models of learning have been proposed and applied over the course of the neural network's development. Our Hopfield network implements the Hebbian learning rule and will support Storkey learning in the near future.

The Hebbian learning rule is one of the oldest learning models in existence, having been developed by Donald Hebb in 1949. It is based on a simple premise: that "neurons that fire together, wire together, [and] neurons that fire out of sync, fail to link." Formally speaking, the rule calculates the weight matrix as

$$W = \frac{1}{n} \sum_{i = 1}^{n} \mu_i \otimes \mu_i$$

Thus the network is presented with a set of $n$ training vectors $\{\mu_1, \ldots, \mu_n\}$, sums the outer product of each vector with itself, and divides by $n$ (zeroing the diagonal so that no neuron connects to itself).
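
In NumPy this amounts to just a few lines (a sketch equivalent to the _hebbian method shown earlier, minus the visualization hook):

import numpy as np

def hebbian_weights(patterns):
    """W = (1/n) * sum_i outer(mu_i, mu_i), with the diagonal zeroed."""
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0)
    return W / len(patterns)

print(hebbian_weights([[-1, 1, 1], [1, -1, 1]]))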

How the Network Responds to Input

After training, the network sports a well-defined weight matrix. Subsequently, when presented with an input vector the network generates an output vector based on this weight matrix. In our implementation, this occurs in the function HopfieldNetwork.recall(). The output vector is computed by multiplying the input vector by the weight matrix and passing the result through the activation function. In this way, the network applies the knowledge of the vectors it learned during training (encoded in the weight matrix) to the generation of an output. Broadly speaking, if the network is asked to recall an input vector that it has already seen during training, it should output that very same vector. If the vector was not seen during training, the output should be an idealization of that vector based on the information the network has seen. This desired behavior does not always occur, due to a phenomenon called spurious states, which we will touch on shortly.
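
A single synchronous recall step therefore reduces to a matrix product followed by the sign activation (a sketch, assuming W is a trained weight matrix and state is a bipolar vector):

import numpy as np

def recall_step(W, state):
    """One synchronous update: multiply by the weight matrix, then threshold."""
    return np.where(W @ np.asarray(state) < 0, -1, 1)

# Repeating this step until the state stops changing yields the network's output.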

The Energy Function of a Hopfield Network

For each input (or state) vector $[s_1, \ldots, s_n]$ provided to our network, we can define the energy associated with that vector as $$ E = -\frac{1}{2}\sum_{i,j} w_{ij}s_{i}s_{j}$$ It turns out that the action of the Hopfield network can be completely inferred from the topology of this energy function. When plotted in $n+1$ dimensions, the energy function traces out an energy landscape, and this landscape possesses a series of local minima. It can be proven mathematically that each of the input vectors provided to the network during training becomes one of these attractor states. Unfortunately, it is also the case that local minima exist which are not associated with the training vectors; these are the aforementioned spurious states. During the recall phase, the network always converges to the local minimum closest to the input vector provided. Therefore, if $d(i, s) < d(i, t_j) \quad \forall j$, where $i$ is the input vector, $s$ is some spurious state, each $t_j$ is a training state, and $d$ is a metric defined on the network's state space, then the network will converge to the spurious state rather than to one of the training states.
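
The energy method above implements this sum directly; written as a quadratic form it is just one line of NumPy (a sketch, assuming W is the trained weight matrix and s a bipolar state vector):

import numpy as np

def energy(W, s):
    """E = -1/2 * sum_ij w_ij * s_i * s_j, expressed as a quadratic form."""
    s = np.asarray(s)
    return -0.5 * s @ W @ s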

In the standard Hopfield network, such as the one that our code defines, the number of training vectors or states that the network is capable of remembering reliably is given by the formula

$$M = 0.15N$$

where $M$ is the number of preserved memories and $N$ is the number of neurons in the network. Since, in general, we seek to preserve our training data as the attractors or "memories" of the network, we should take care not to exceed this upper bound on the number of training samples we provide. If the bound is exceeded, attractor minima can merge into single, deeper minima in which spurious states are formed. For our tutorial, we will be training on the alphabet using a network composed of 25 neurons, so we will be able to reliably distinguish between at most three different letters.

The Two Types of Update Processes

When a network is presented with an input vector to recall, it can proceed in one of two ways: via either synchronous or asynchronous updating of the network's internal state.

In asynchronous updating, each component of the input vector (i.e. each neuron) is updated independently of all other components. Since each of these updates occupies a single time step, later updates can be affected by the results of earlier ones. In our network, asynchronous updating is enabled by default, as it guarantees that the network will converge to some attractor state. It is also what we will use throughout this tutorial.

Implementation Detail: When implementing asynchronous updating in code, it is necessary at each time step to select a neuron to update (in a way which does not systematically bias the update order). In our code, this is implemented as a random process which does not terminate until every neuron has been updated and the state has stopped changing. You will notice that, as a result, the recall phase of the network's operation takes roughly 70 to 120 iterations to finish. If you are visualizing this process, there will be noticeable periods of apparent inactivity where the network's energy configuration does not change; this means the network is randomly selecting and updating neurons that have already been updated.

In synchronous updating, all components of the input vector (neurons) are updated in tandem. Naturally, this method has time complexity benefits when compared to asynchronous updating; however, when synchronous updating is used, the network is not guaranteed to converge to a stable state. Our network also supports synchronous updating.
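
The difference between the two modes boils down to how many components change per step (a compressed sketch of the update rules used in _synchronous and _asynchronous):

import random
import numpy as np

def synchronous_step(W, state):
    # Every neuron is updated at once from the current state.
    return np.where(W @ np.asarray(state) < 0, -1, 1)

def asynchronous_step(W, state):
    # A single, randomly selected neuron is updated; the rest are untouched.
    state = np.asarray(state).copy()
    i = random.randrange(len(state))
    state[i] = 1 if W[i] @ state >= 0 else -1
    return state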

Visualizing the Network

Now that you know the basic theory behind the Hopfield network's operation, let's turn to the task of visualizing these networks using Fovea. The visualization code is contained in the retina.mlearn.hopfield module in the file visuals.py. This file defines two classes: VisualNeuron and VisualHopfield.

The VisualNeuron Class

The simpler of the two classes is VisualNeuron. All calls to VisualNeuron are hidden internally within VisualHopfield, and it is unlikely that you will need to alter this class. That said, if you do decide to adjust the visualization of individual neurons to your own specifications, you will need to be familiar with the following class functions.

__init__(self, theta, r): In our construction, a VisualNeuron is instantiated with a pair of polar coordinates $(\theta, r)$. Within the __init__ method, these coordinates are converted to Cartesian form and stored as the instance variables self.x and self.y.

draw(self, axes): This method takes a single argument, namely the Matplotlib axes to which the neuron's visual representation should be drawn. The function creates a neuronal "body" using the Matplotlib Circle patch, centered at the coordinates (self.x, self.y). You can change the patch class used, or provide an artist instance of your own, should you prefer a different visual representation of each node in the network.

draw_connection(self, neuron, connection_color, axes): This function takes another VisualNeuron instance and draws a line of the specified connection_color between the two. Internally, the Matplotlib Line2D class is used, but of course you can make any alterations you see fit.

delete_connection(self, neuron): As in the above, this function takes a VisualNeuron instance and deletes the connection between the two. This deletion is two-fold. It occurs at the visual level on the connection's axes and at the object-state level. Neither the calling neuron, nor the argument neuron, will be able to access a connection's attributes after it has been deleted.
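
Here is a minimal sketch of driving VisualNeuron by hand, outside of VisualHopfield (the polar positions are arbitrary):

import numpy as np
import matplotlib.pyplot as plt
from retina.mlearn.hopfield.visuals import VisualNeuron

fig, ax = plt.subplots()
n1 = VisualNeuron(0.0, 5)          # theta = 0, r = 5
n2 = VisualNeuron(np.pi / 2, 5)    # theta = pi/2, r = 5
n1.draw(ax)
n2.draw(ax)
n1.draw_connection(n2, 'green', ax)
ax.autoscale()
plt.show()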

The VisualHopfield Class

We will now dissect, function by function, the operation of the VisualHopfield class. The core event loop for the default network visualization is contained in the method run_visualization. Let's take a look at its contents.

        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            ion()
            self.training_data = training_data
            self.recall_data = recall_data
            self._setup_display()
            self._draw_network()
            self._plot_state([-1 for i in range(self.num_neurons)])
            self._plot_weights()
            print("Training...")
            self._set_mode("Training")
            self.train(training_data, inject=self._train_inject)
            self._normalize_network()
            self._plot_energy()
            print("Recalling...")
            self._set_mode("Recalling")
            for state in learning_data:
                self.recall([state], inject=self._recall_inject)

We will ignore the code dealing with warnings. These lines just suppress annoying command-line output associated with deprecations involving interactive drawing methods provided by the Matplotlib backends. Therefore, the first relevant line is self._setup_display(). Let's jump to the private _setup_display method and investigate its action.

The _setup_display Function

The task of the _setup_display function is to construct the Matplotlib figure window and arrange its component axes. The figure is assigned to the instance attribute self.network_fig and can be accessed as such. The component axes are the following:

        self.main_network    --  Axes to which VisualNeurons and their connections are drawn.
        self.energy_diagram  --  The 3D axes in which the network's energy function is plotted.
        self.contour_diagram --  Axes to hold the contour plot of the network's energy function.
        self.state_diagram   --  Axes to which a checkerboard representation of the network's binary state is drawn.
        self.weight_diagram  --  Axes to hold a heatmap representation of the network's weight matrix.

The function also creates widgets to toggle layer visibility in the energy diagram and dynamic text labels that change based on the network's current mode (training or recalling) and iteration count within those methods.

The _draw_network Function

This method uses a simple loop to draw the symmetric connections that occur between each of the network's neurons. Probably the most useful information that can be gleaned from the function is the significance of a connection's color. In our setup, connections having the weight 0 are colored green, the weight 1 colored blue, and the weight -1 colored red. You can change these colors to suit your preferences by altering the variable colors.
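
Because the (thresholded) weights take values in {-1, 0, 1}, they can index the colors list directly, with Python's negative indexing sending -1 to the last entry (a small illustration of the lookup used in _draw_network and _train_inject):

colors = ['green', 'blue', 'red']
for w in (0, 1, -1):
    print(w, '->', colors[w])    # 0 -> green, 1 -> blue, -1 -> red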

The _plot_state Function

This method handles all of the plotting necessary for creating the network's state diagram. The network's "state" at any given point in time is the output of its recall function in response to an input vector. Since all network vectors are bipolar, we split its output vector at even intervals and use these vectors as the rows of a matrix. This matrix is converted to a black-and-white patchwork diagram using Matplotlib's imshow function.

IMPORTANT: Each state is a numpy array, and is converted to a matrix using the state.reshape() function. You will need to alter this call should you seek to visualize a network not having exactly 25 neurons as is used for the OCR example.
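
For instance, a hypothetical 36-neuron network would need the call changed to a 6x6 grid (a sketch; state stands in for the network's bipolar state vector):

import numpy as np

state = np.random.choice([-1, 1], 36)   # hypothetical 36-neuron network
grid = state.reshape(6, 6)              # replaces the reshape(5, 5) used for 25 neurons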

The _plot_weights Function

This one is similar to _plot_state in its action and construction. Matplotlib's imshow function is used to create a colormap representation of the network's weight matrix. You can modify this method to change the colormap used.

The _set_mode Function

Updates the network's "mode" text label based on the network's current operation: either training or recalling.

The _train_inject Function

This function contains visualization code that is injected into the HopfieldNetwork base class' training method. Alter with care.

The _recall_inject Function

Same as above, except for the network's recalling method. Alter with care, as well.

The _normalize_network Function

This method serves one purpose: resetting the linewidth of those neuronal connections that were manipulated by the network during its training phase so as not to pollute with thickness the visual space.

The _plot_energy Function

The _plot_energy function is probably the most involved method of the entire visualization, as there is a great deal of legwork that goes into creating a 3-dimensional representation of the energy landscape. For networks having $\geq 3$ neurons, the plot of the energy function will reside in 4 or more dimensions. In order to reduce this $n$-dimensional information to a 3-dimensional representation, we perform a 2D PCA transformation on the network's training data. If we treat these PCA axes as our x and y axes and the energy value as the z-axis, we are able to plot the energy function accordingly.

This function, in fact, contains three plot artists in three different layers. The first layer contains the network's attractors, plotted in a scatter plot as green markers. The second layer contains a wireframe plot of the energy landscape onto which the attractor markers can be overlaid. The third layer contains the energy landscape plotted as a surface.
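
A stripped-down version of the projection step looks roughly like this (a sketch of the PCA bookkeeping in _plot_energy, using random bipolar patterns in place of real training data):

import numpy as np
from sklearn.decomposition import PCA
from retina.mlearn.hopfield.hopfield_network import HopfieldNetwork

net = HopfieldNetwork(25)
training_data = [np.random.choice([-1, 1], 25) for _ in range(3)]
net.train(training_data)

pca = PCA(n_components=2)
pca.fit(training_data)                                # learn a 2D projection of the 25-D states
xy = pca.transform(training_data)                     # x, y coordinates of the attractors
z = np.array([net.energy(s) for s in training_data])  # the energy supplies the z-axis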

Visualizing Optical Character Recognition

Now that we've fully described both the HopfieldNetwork and VisualHopfield classes, let's turn our attention to the OCR example packaged in the demos/hopfield directory.

The network makes the following imports:

from retina.mlearn.hopfield.alphabet import *
import retina.mlearn.hopfield.visuals as visuals
import numpy as np

Hopefully, the only one of these needing explanation is the import involving the file alphabet.py. This file contains a series of variable definitions corresponding to each of the 26 letters of the English alphabet. Each variable holds a string corresponding to a 5x5 binary representation of its character. For example, here is a sample assignment taken from the file:

F = """
XXXXX
X....
XXXXX
X....
X....
"""

The ocr.py file defines a function, to_state, which takes a letter's string representation and converts it to a bipolar vector which the Hopfield network can use as either training or recall data. The string is flattened into a 1x25 array of X's and .'s, and each of these is mapped to the value 1 or -1, respectively. As an example, F as given above would map to:

[1, 1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, 1, -1, -1, -1, -1]

The OCR script converts each of the 26 letters to the above format, and then provides them to the Hopfield Network as training data. It then takes the letter l1 (also defined in alphabet.py) and passes it as input to the network's recall method. This letter looks most similar to the letter A, and the network should (ideally) recognize it as such.

Now, let's take a look at each Axes of the visualization independently and see how we can use these diagrams to inform our understanding of the theoretical properties of the network.

The Network Diagram

In [3]:
%matplotlib notebook
import matplotlib.pyplot as plt
In [4]:
# %load ocr.py
from retina.mlearn.hopfield.alphabet import *
import retina.mlearn.hopfield.visuals as visuals
import numpy as np

def to_state(letter):
    return np.array([1 if char == 'X' else -1 for char in letter.replace('\n','')])

myNet = visuals.VisualHopfield(25)
alphabet = [A, B, C, D, E, F, G, H, I, J, K, L, M, 
        N, O, P, Q, R, S, T, U, V, W, X, Y, Z]
training_data = [to_state(letter) for letter in alphabet]
recall_data = [to_state(l1)]
In [5]:
myNet.network_fig = plt.figure(figsize=(10, 10))
myNet.network_fig.canvas.set_window_title("Hopfield Network Visualization")

myNet.main_network = plt.subplot(111)
myNet.main_network.set_title("Network Diagram")
myNet.main_network.get_xaxis().set_ticks([])
myNet.main_network.get_yaxis().set_ticks([])

myNet.network_fig.suptitle("Hopfield Network Visualization", fontsize=14)

myNet._draw_network()

Above we see the network's 25 neurons, each connected to every other, arranged in a circle. Before training has started the weight of each connection defaults to zero, as is pictured above (green = 0).

Now, let's provide a training vector and watch the network's state connections adapt in response.

myNet.train([to_state(B)])

You should see the connections between neurons change such that the final configuration of the network looks like this:

Network After Training

As in our discussion of the _draw_network function, blue lines represent connections having weight 1 and red lines those having weight -1.

The Weight Matrix Diagram

Now, if you turn your attention to the network's weight matrix diagram, you should see an image that looks like this:

Weight Matrix Diagram

Upon closer inspection, you can see that the color map really highlights the symmetry of the weight matrix. Every column color pattern in this diagram is identical to that of its correspondingly numbered row, an example of which I've bounded in red, below:

Weight Matrix Symmetry

With the viridis colormap used here, yellow blocks of pixels indicate an entry of 1 in that position, and purple blocks of pixels indicate an entry of -1.

The State Diagram

Now, let's train our network on the letters B and I, and then tell it to recall the following letter:

shift_I = """
XXXXX
...X.
...X.
...X.
XXXXX
"""

We would hope that our network recognizes this as the letter I and converges to that state. Let's see what output we get from our state diagram at the end of the recall process:

Network State Diagram

It is indeed the case that our network converges to the expected state, the letter I!

The Energy Function Diagram

Now, let's try to better understand the action of this convergence by looking to our network's energy function diagram. When recall starts, the landscape is plotted as a mesh surface with areas of low energy colored blue and higher energy colored red. It should look more or less like this:

Energy Mesh Surface

As explained in our previous exposition of the network's theoretical properties, the states on which the network trained should be attractors and lie at local minima of this landscape. However, the landscape is varied and complex, and it's not immediately clear at which local minima our attractors reside. Luckily, Fovea does the computation for you. In order to see the plotted attractors, we need to click both the "Wireframe" and "Attractors" buttons. The wireframe button toggles the visibility of both the mesh surface and wireframe layers, switching one off and the other on. In this case, we want to view the wireframe, as it prevents the attractor markers from being hidden by the mesh surface. We also click the attractors button in order to see the energies of our original states marked as green dots. When you do this, you will also see a blue marker in addition to the green ones; this marker indicates the energy value of the network's current state. This is what the diagram looks like when the wireframe and attractors are shown together:

Stage 1

As you can see, our energy landscape has the two attractor states positioned at the lowest energy levels, with the network's current state positioned near a peak of the landscape. You can also see that there is a rift between the two minima in which the global maximum is attained. The reasons for this topology are the following:

  1. The two states upon which the data were trained are highly dissimilar in presentation.
  2. As such, only a great deal of "energy" could take attractor B, move it over the global maximum, and transmute it to attractor I, and vice versa.

As you can see, our recall state initially rests near the top of a peak located close to the "I" attractor. Over the course of the recall process, the blue current state marker will fall from this peak and ultimately converge to the energy value of its final state, I. This progression is shown below.

Stage 2

Stage 3

Stage 4

Stage 5

The Energy Contour Diagram

The last, and in this case probably least, part of the visualization is the contour diagram. This provides a simple, 2D contour plot of the 3-dimensional energy landscape. Shown below is an energy function and its corresponding contour drawing.

Contour Plot

Conclusion

And there you have it: a full-scale visualization of a Hopfield RNN created using Fovea's machine learning package. I hope you found this tutorial enlightening and that it deepened your understanding of the Hopfield network and how it responds to data. Until next time.