Documentation for the Java version of the neural network code

Documentation for the original C version of this assignment is also available (in PostScript format).

Although you will be able to use most of the code without modification, you will need to know a little bit about the routines and data structures, so that you can easily implement new output encodings for your networks.  The following sections describe each of the packages in a little more detail, but you should look at BPNN.java, facetrain.java, and PGMimage.java to see how the pieces all fit together.

In fact, it is probably a good idea to look over facetrain.java first, to see how the training process works.  You will notice that the set_target method in BPNN.java is called to set up the target vector for training.  You will also notice the routines show_performance_on_dataset, show_misclassified_images, and correct_response, which evaluate performance and compute error statistics.
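To give a feel for what set_target does, here is a minimal sketch of a one-of-N target encoding; the class and method names are illustrative assumptions, not the actual BPNN code, which you should consult directly:

```java
import java.util.Arrays;

// Hypothetical sketch of a one-of-N target encoding, in the spirit of
// BPNN's set_target; names and signature here are assumptions.
public class TargetSketch {
    // Fill the target vector with 0.1 everywhere except the unit for the
    // true class, which gets 0.9.  Using 0.1/0.9 instead of 0.0/1.0 keeps
    // the sigmoid outputs away from its asymptotes, where learning is slow.
    public static double[] setTarget(int trueClass, int numOutputs) {
        double[] target = new double[numOutputs];
        for (int i = 0; i < numOutputs; i++) {
            target[i] = (i == trueClass) ? 0.9 : 0.1;
        }
        return target;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(setTarget(2, 4)));
        // prints "[0.1, 0.1, 0.9, 0.1]"
    }
}
```

When you implement new output encodings, the essential change is exactly this mapping from an example's label to a target vector.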

You will almost certainly not need to use all of the information in the following sections, so don't feel like you need to know everything the packages do.  You should view these sections as reference guides for the packages, should you need information on data structures and routines.

Another fun thing to do is to use the output2pgm and hidden2pgm utilities to view the weights on connections in graphical form.  See the comments in the source code for information on how to use these programs.

Finally, the point of this assignment is for you to obtain first-hand experience in working with neural networks; it is not intended as an exercise in Java or C hacking.  An effort has been made to keep things as simple as possible.  If you need clarification on how the code works, please don't hesitate to ask.

Main Classes

The Java code for this assignment consists of several classes.  See the comments in the source code for more detailed information.

PGMimage.java - Supports reading and writing of PGM image files and pixel access/assignment.  You will not need to modify any of the code in this class to complete the assignment.

BPNN.java - The neural network class.  Supports three-layer fully-connected feedforward networks, using the backpropagation algorithm for weight tuning.  Provides high-level routines for creating, training, and using networks.  The methods set_target and correct_response are also in this class; you will need to modify these for your face and pose recognizers.
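For a one-of-N encoding, correct_response typically declares the network right when its most active output unit matches the most active target unit. The sketch below shows that argmax comparison; the names and signature are assumptions, so check the real method in BPNN.java:

```java
// Hypothetical sketch of a one-of-N correctness test, in the spirit of
// BPNN's correct_response; the actual method's signature may differ.
public class ResponseSketch {
    // Return the index of the largest element of v.
    static int argmax(double[] v) {
        int best = 0;
        for (int i = 1; i < v.length; i++) {
            if (v[i] > v[best]) best = i;
        }
        return best;
    }

    // The response counts as correct when the most active output unit
    // coincides with the most active target unit.
    public static boolean correctResponse(double[] output, double[] target) {
        return argmax(output) == argmax(target);
    }

    public static void main(String[] args) {
        double[] out = {0.2, 0.7, 0.1};
        double[] tgt = {0.1, 0.9, 0.1};
        System.out.println(correctResponse(out, tgt)); // prints "true"
    }
}
```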

facetrain.java - The top-level program which uses all of the classes above to implement a "glickman" recognizer.  You will need to modify this code to change network sizes and learning parameters, both of which are trivial changes.

hidden2pgm.java, output2pgm.java - Utilities for visualizing network weights.  It's not necessary to modify anything here, although it may be interesting to explore some of the many possible alternate visualization schemes.

facetrain

facetrain has several options which can be specified on the command line.  This section briefly describes how each option works.  A very short summary of this information can be obtained by running facetrain with no arguments.
-n <network file>
This option either loads an existing network file, or creates a new one with the given name.  At the end of training, the neural network will be saved to this file.
-e <number of epochs>
This option specifies the number of training epochs which will be run.  If this option is not specified, the default is 100.
-s <seed>
An integer that will be used as the seed for the random number generator.  The default seed is 123456.  Using the same seed reproduces the same sequence of random numbers, so experiments can be repeated exactly; changing the seed lets you try a different sequence.
-S <number of epochs between saves>
This option specifies the number of epochs between saves.  The default is 100, which means that if you train for 100 epochs (also the default), the network is only saved when training is completed.
-t <training image list>
This option specifies a text file that contains a list of image pathnames, one per line, that will be used for training.  If this option is not specified, it is assumed that no training will take place (epochs = 0), and the network will simply be run on the test sets.  In this case, the statistics for the training set will be shown as dashes.
-1 <testing set 1 list>
This option specifies a text file which contains a list of image pathnames, one per line, that will be used as a test set.  If this option is not specified, the statistics for test set 1 will be shown as dashes.
-2 <testing set 2 list>
Same as above, but for test set 2.  The idea behind having two test sets is that one can be used as part of the training/testing paradigm, in which training is stopped when performance on the test set begins to degrade.  The other can then be used as a "real" test of the resulting network.
-T
Runs in test-only mode (no training).  Performance is reported on each of the datasets specified, and the misclassified images are listed along with the corresponding output unit levels.
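Putting these options together, a typical training run might look like the following; the network and list file names are placeholders, not files provided with the assignment:

```shell
# Train (or create) the network in face.net for 100 epochs on the images
# listed in train.list, evaluating on two held-out lists after each epoch.
java facetrain -n face.net -t train.list -1 test1.list -2 test2.list -e 100 -s 123456
```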

Interpreting the Output of facetrain

When you run facetrain, it first reads in all the data files and then begins training.  From that point on, the network's performance on the training and test sets is reported in one line per epoch.  For each epoch, the following performance measures are output:

<epoch> <train%> <trainerr> <t1%> <t1err> <t2%> <t2err>

These values have the following meanings:

<epoch> is the number of the epoch just completed; a value of 0 means that no training has yet been performed.

<train%> is the percentage of examples in the training set that were correctly classified.

<trainerr> is the average, over all training examples, of the error function E = (1/2) * sum_i (t_i - o_i)^2, where t_i is the target value for output unit i and o_i is the actual output value for that unit.
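The per-example error can be computed as in this sketch, assuming the standard squared-error form E = (1/2) * sum_i (t_i - o_i)^2; the method name is an assumption, not the actual BPNN routine:

```java
// Hypothetical sketch of the per-example error computation, assuming
// E = (1/2) * sum_i (t_i - o_i)^2 over the output units.
public class ErrorSketch {
    public static double outputError(double[] target, double[] output) {
        double sum = 0.0;
        for (int i = 0; i < target.length; i++) {
            double d = target[i] - output[i];
            sum += d * d;
        }
        return 0.5 * sum;
    }

    public static void main(String[] args) {
        double[] t = {0.9, 0.1};
        double[] o = {0.5, 0.3};
        // 0.5 * (0.4^2 + 0.2^2), which is approximately 0.1
        System.out.println(outputError(t, o));
    }
}
```

The reported <trainerr> is then just this quantity averaged over every example in the set.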

<t1%> is the percentage of examples in test set 1 that were correctly classified.

<t1err> is the average, over all examples in test set 1, of the error function described above.

<t2%> is the percentage of examples in test set 2 that were correctly classified.

<t2err> is the average, over all examples in test set 2, of the error function described above.