Reading: "Why connectionism?", by Elman et al., from Rethinking Innateness: A Connectionist Perspective on Development, MIT Press, 1996.
You should use simple-perceptron.py as your starting point for this assignment.
Create a perceptron called n with 2 inputs, and initialize its dataset to the patterns [0, 0], [0, 1], [1, 0], and [1, 1].
Manually set the perceptron's weights to the values 0.1, 0.2, and -0.3 by calling the setWeights method as shown below:
>>> n.setWeights([0.1, 0.2, -0.3])
What outputs does the perceptron give for the four dataset patterns using the above set of weights? Call n.test() to find out.
What happens if you call the computeError() method at this point? Why?
To use computeError, we need to tell the perceptron what the correct response is for each dataset pattern. We do this by setting n.targets to a list of the desired responses. Suppose our desired outputs are
[0, 0] -> 1
[0, 1] -> 0
[1, 0] -> 1
[1, 1] -> 1
After setting n.targets to the list [1, 0, 1, 1], call n.computeError() to find out the total sum squared error (TSS) value for this set of weights, with respect to the target values. Next, try a few other sets of weights. In particular, find out the TSS error values for the following weights:
weights              TSS error
------------------------------------
 0.1   0.2   0.3
 0.0   0.0   0.0
-0.1  -0.2  -0.3
 1.2  -4.3  -0.2
-0.2   0.1   0.0
 0.2  -0.1   0.1
Set the perceptron's weights to produce maximum error and then call n.test() to verify that the perceptron gets all patterns incorrect. Next, call n.train().
How many training epochs does it take for the perceptron to learn a correct set of weights (assuming a learning rate of 0.1)? What are the resulting weights? Are these the same as the set of weights in the above table that solve the problem correctly?
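The perceptron learning rule itself is simple: after each pattern, every weight is nudged by the learning rate times the error (target minus output) times that weight's input, with the bias treated as a weight on a constant input of 1. The following is a self-contained sketch under the same assumed output convention as before, not the actual train method:

```python
# Sketch of the perceptron learning rule: w_i += lr * (target - output) * x_i
# The bias is treated as a weight whose input is always 1.

def output(weights, pattern):
    net = weights[0] * pattern[0] + weights[1] * pattern[1] + weights[2]
    return 1 if net > 0 else 0

def train(weights, patterns, targets, lr=0.1, max_epochs=1000):
    for epoch in range(1, max_epochs + 1):
        correct = 0
        for p, t in zip(patterns, targets):
            o = output(weights, p)
            if o == t:
                correct += 1
            err = t - o
            weights[0] += lr * err * p[0]
            weights[1] += lr * err * p[1]
            weights[2] += lr * err * 1  # bias input is constant 1
        if correct == len(patterns):
            return epoch, weights      # all patterns learned
    return None, weights               # did not converge in max_epochs

patterns = [[0, 0], [0, 1], [1, 0], [1, 1]]
epochs, w = train([0.0, 0.0, 0.0], patterns, [1, 0, 1, 1])
print(epochs, w)
```

Exact epoch counts depend on details such as the initial weights and whether an epoch is counted as learned during the pass or on a separate verification pass, so your numbers from n.train() may differ from this sketch.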
Change each target to its opposite value by setting n.targets to [0, 1, 0, 0], and retrain the perceptron. How many epochs does it take to learn these new responses correctly?
Our perceptron contains exactly three weights, whose values together specify a single point in 3-dimensional weight space (where each axis corresponds to one of the weight parameters). Changing the weight values during learning thus traces a pathway through this 3-D space, in search of the region that produces the smallest error value.
It is easy to create a simple visualization of this pathway. First, we need to modify train so that it writes the current set of weight values to a file on each training cycle. Then we can view the resulting file as a 3-D graph using the Gnuplot program (which is installed on the robot lab Linux machines). Add the following code to the definition of train:
def train(self, cycles=1000):
    ...
    self.learningRate = 0.001
    file = open('pathway.dat', mode='w')  # write mode
    for c in self.allConnections:
        file.write('%12g ' % c.weight)
    file.write('\n')
    for t in range(1, cycles+1):
        ...
        for c in self.allConnections:
            file.write('%12g ' % c.weight)
        file.write('\n')
        if correct == total:
            print 'All patterns learned'
            break
    file.close()
Now when we call n.train(), a file called pathway.dat will automatically get created that contains each intermediate set of weights (one set per line). Notice also that we make the learning rate small in order to create a larger number of points along the pathway, for better visualization. Test this out by restarting Python with the new code and training your perceptron on the AND task, starting with random weights, as follows:
>>> n.initialize()
>>> n.targets = [0, 0, 0, 1]
>>> n.train()
This should create the pathway.dat file. Next, open a separate terminal window and start Gnuplot at the Linux prompt by typing gnuplot (type Control-D to exit back to Linux). Use the following Gnuplot commands to view the data as a 3-D plot (click and drag the graph with the mouse to change the viewing angle):
gnuplot> set style data linespoints
gnuplot> splot 'pathway.dat' pt 7
Now, in the other window, retrain the perceptron on the OR task by setting n.targets to [0, 1, 1, 1], reinitializing the weights, and calling n.train(). To view the resulting pathway, execute the splot command again in Gnuplot (just hit the up-arrow key). How does the training pathway for OR compare to the previous pathway for AND? What does the pathway look like if you train on the XOR task with targets [0, 1, 1, 0]? Try each of these a few times starting with different random weights.
Create two perceptrons called carry and sum, with 3 inputs each, to learn the 3-bit binary addition problem, which can be expressed by the following truth table:
inputs         carry  sum
0 + 0 + 0  =     0     0
0 + 0 + 1  =     0     1
0 + 1 + 0  =     0     1
0 + 1 + 1  =     1     0
1 + 0 + 0  =     0     1
1 + 0 + 1  =     1     0
1 + 1 + 0  =     1     0
1 + 1 + 1  =     1     1
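For reference, the eight input patterns and the two target lists can be read straight off the truth table. The variable names below are just for illustration; in the assignment you would assign these lists to the targets attribute of your carry and sum perceptrons. A sketch:

```python
# The eight 3-bit input patterns and their carry/sum targets,
# taken directly from the truth table above.
patterns = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]

carry_targets = [0, 0, 0, 1, 0, 1, 1, 1]  # 1 when at least two inputs are 1
sum_targets   = [0, 1, 1, 0, 1, 0, 0, 1]  # parity of the three inputs

# Sanity check against integer arithmetic: a + b + c == 2*carry + sum
for p, cy, s in zip(patterns, carry_targets, sum_targets):
    assert sum(p) == 2 * cy + s
```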
Can each perceptron learn a set of weights that enables it to solve its respective task? If so, how many training epochs are required, and what are the resulting sets of weights? If not, why not? Explain your answer clearly, using a geometric argument. Hint: consider the above 8 input patterns to be the vertices of a 3-dimensional cube, where each vertex is labeled by the desired output value for that pattern.
Can you create a visualization of the learning pathway for this problem like in Part 1? If so, do it. If not, why not? Explain clearly.
Write up your answers to the above questions on paper (clearly and legibly!) and turn this in during class. You do not need to turn in any code. If you have questions about anything, just ask!