Bio-Inspired Artificial Intelligence — Spring 2023

Lab: Perceptrons

  1. Manually run the Python commands below to create a 2-input perceptron called p with weight values 0.1 and 0.2 and a bias of -0.3:

    >>> p = Perceptron()
    >>> p.set_weights(0.1, 0.2, -0.3)
    

    These weights cause the perceptron to compute the AND function: only the input [1, 1] yields a non-negative net input (0.1 + 0.2 - 0.3 = 0.0), so it alone produces an output of 1. You can verify this by testing the perceptron on the four 2-bit input patterns in the patterns list against the four target values in the ANDtargets list:

    >>> p.test(patterns, ANDtargets)
    

    How well does this set of weights perform on the other logical functions OR, NAND, NOR, and XOR? Try calling p.test with the target values for these other functions to see how many of the input patterns are classified correctly in each case. You can also use the total_error method to find the total sum squared (TSS) error produced by the weights with respect to a given set of target values.
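
    For example, to check the OR function (this assumes the lab file defines the other target lists with names analogous to ANDtargets, such as ORtargets and XORtargets, and that total_error takes the same arguments as test):

    >>> p.test(patterns, ORtargets)           # how many patterns match OR?
    >>> p.total_error(patterns, ORtargets)    # TSS error with respect to OR
    >>> p.total_error(patterns, XORtargets)   # TSS error with respect to XOR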

  2. Consider the following logical function:

    [0, 0] → 1
    [0, 1] → 0
    [1, 0] → 1
    [1, 1] → 1
    

    For each set of weights in the table below, fill in the corresponding TSS error produced by the weights with respect to the above target values (a snippet for checking each row follows the table):

       weights     bias    TSS error
    -------------------------------------
      0.1   0.2     0.3    _________
      0.0   0.0     0.0    _________
     -0.1  -0.2    -0.3    _________
      1.2  -2.3    -0.2    _________
     -0.3   0.3    -0.3    _________
      0.2  -0.1     0.1    _________
  3. Set the perceptron's weights and bias to the values above that produced the maximum TSS error, and then train the perceptron on the target values 1, 0, 1, 1. With a learning rate of 0.1, how many weight updates does it take for the perceptron to learn a correct set of weights? What are the resulting weight and bias values?
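
    The exact training interface depends on the provided class; assuming it has a train method that takes the input patterns, the target values, and a learning rate (a hypothetical signature, so check your lab file), the run might look like:

    >>> p.set_weights(w1, w2, bias)       # placeholders: the maximum-error row from Exercise 2
    >>> p.train(patterns, [1, 0, 1, 1], 0.1)
    >>> p.show_weights()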

  4. Now retrain the perceptron on the opposite target values 0, 1, 0, 0, starting from the final weight values obtained at the end of the previous exercise. How many weight updates does it take to learn these new responses correctly? What are the new resulting weight values? Can the perceptron be retrained on the other logical functions as well?

  5. Modify the code for the Perceptron class to handle 3 inputs instead of 2. Then create two perceptrons called carry and sum, with 3 inputs each, and train them on the task of adding three binary digits, shown below. The CARRY and SUM columns together form the 2-bit binary representation of the sum of the three input values. The eight 3-bit input patterns are already provided for you in the list patterns3.

      inputs         CARRY   SUM
    0 + 0 + 0    =     0      0
    0 + 0 + 1    =     0      1
    0 + 1 + 0    =     0      1
    0 + 1 + 1    =     1      0
    1 + 0 + 0    =     0      1
    1 + 0 + 1    =     1      0
    1 + 1 + 0    =     1      0
    1 + 1 + 1    =     1      1
    

    Can each perceptron learn an appropriate set of weights that enables it to solve its respective task? If so, how many weight updates are required, and what are the resulting sets of weights? If not, why not? Hint: consider the 8 input patterns above to be the vertices of a 3-dimensional cube, where each vertex is labeled by the desired output value (0 or 1) for that input pattern.
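
    For reference, reading the CARRY and SUM columns off the table gives the two target lists (this assumes patterns3 lists the eight input patterns in the same order as the table, and reuses the hypothetical train interface from Exercise 3):

    carry_targets = [0, 0, 0, 1, 0, 1, 1, 1]
    sum_targets   = [0, 1, 1, 0, 1, 0, 0, 1]

    carry.train(patterns3, carry_targets, 0.1)
    sum.train(patterns3, sum_targets, 0.1)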

  6. Rewrite your Perceptron class so that it can handle any number of inputs. To do this, add a new parameter called num_inputs to the __init__ constructor that specifies how many inputs the perceptron should have. Instead of storing the weights as separate variables, store them as a single list containing num_inputs weight values. Modify the set_weights method to take a weight_list and a bias value as input parameters. You will also need to modify the methods show_weights, propagate, and adjust_weights appropriately. Make sure your new generalized Perceptron class still works correctly on the previous 2- and 3-input training tasks.
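
    A possible shape for the generalized class (a sketch under the assumptions above, not the lab's provided code; your file may use a different weight initialization or threshold convention):

    import random

    class Perceptron:
        def __init__(self, num_inputs):
            self.num_inputs = num_inputs
            # start from small random weights and a random bias
            self.weights = [random.uniform(-0.5, 0.5) for _ in range(num_inputs)]
            self.bias = random.uniform(-0.5, 0.5)

        def set_weights(self, weight_list, bias):
            self.weights = list(weight_list)
            self.bias = bias

        def show_weights(self):
            print("weights:", self.weights, "bias:", self.bias)

        def propagate(self, pattern):
            # hard-threshold unit: output 1 when the net input is at least 0
            net = sum(w * x for w, x in zip(self.weights, pattern)) + self.bias
            return 1 if net >= 0 else 0

        def adjust_weights(self, pattern, target, rate):
            # perceptron learning rule: w_i += rate * (target - output) * x_i
            error = target - self.propagate(pattern)
            for i in range(self.num_inputs):
                self.weights[i] += rate * error * pattern[i]
            self.bias += rate * error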

  7. For a more extensive test on some real-world data, download the two data files cancerPatterns.txt and cancerTargets.txt (on a Mac, right-click each link and choose Save Link As).

    This data is based on the Wisconsin Diagnostic Breast Cancer dataset. Each input pattern consists of 30 values describing physical characteristics of cell nuclei taken from diagnostic tissue samples. Each target value specifies whether the pattern represents a benign (0) or malignant (1) instance of breast cancer. The dataset contains a total of 550 patterns. The read_dataset function (already defined for you) can be used to easily read in the data patterns from the files, as follows:

    samples, diagnosis = read_dataset("cancerPatterns.txt", "cancerTargets.txt")
    

    This will create two lists: samples will be a list of 550 input patterns representing tissue samples, each of which is itself a list of 30 floating-point numbers; and diagnosis will be a list of target values (0 or 1) indicating the correct diagnostic classification of the corresponding tissue sample. For example, diagnosis[3] is 1, indicating that samples[3] is malignant.
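
    A quick sanity check at the Python prompt:

    >>> len(samples), len(diagnosis)
    (550, 550)
    >>> len(samples[0])    # 30 measurements per tissue sample
    30
    >>> diagnosis[3]       # samples[3] is malignant
    1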

    To test the perceptron's ability to correctly diagnose new samples that it has never been trained on, divide the dataset into a training set and a testing set, each consisting of 275 patterns (one half of the dataset):

    training_samples, training_targets = samples[0:275], diagnosis[0:275]
    testing_samples, testing_targets = samples[275:550], diagnosis[275:550]
    

    Next, create a perceptron with 30 input units and train it on the first half of the cancer dataset (the first 275 patterns). How many weight updates does it take to learn the training set? Then test the performance of the trained perceptron on the remaining 275 testing patterns. What percentage of these novel patterns does the perceptron classify correctly? Perform this experiment five times, each time starting from newly randomized weight and bias values, and average your results.
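
    One run of the experiment, using the num_inputs constructor from Exercise 6 and the same hypothetical train interface as before (test is assumed to report how many patterns are classified correctly):

    p = Perceptron(30)    # 30 input units, randomized initial weights
    p.train(training_samples, training_targets, 0.1)
    p.test(testing_samples, testing_targets)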

  8. Testing on the last 275 patterns in the dataset always seems to leave 5 patterns misclassified. Would this happen if the dataset were randomly shuffled before being divided into training and testing sets? Try it, taking care to keep the correct target values associated with the shuffled input patterns. You can use the reshuffle_dataset function (already defined for you) to reshuffle the input and target patterns while keeping them in sync.
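
    For reference, one way to shuffle the two lists while keeping them in sync (a manual alternative to the provided reshuffle_dataset function):

    import random

    pairs = list(zip(samples, diagnosis))    # pair each sample with its target
    random.shuffle(pairs)                    # shuffle the pairs together
    samples, diagnosis = (list(seq) for seq in zip(*pairs))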