CPSC 352 -- Artificial Intelligence
Notes: Algorithms & Approaches: Neural Networks

Introduction

In this lecture we consider the basics of the neural network approach to AI. It too involves some form of knowledge representation and search. The neural network, or connectionist, approach is well suited for tasks such as pattern recognition, classification, and learning from noisy or incomplete data.

An Artificial Neuron

[Figure: An artificial neuron, with weighted inputs feeding a summed activation and a threshold output]

McCulloch-Pitts Neurons

A McCulloch-Pitts neuron computes its activation as a weighted sum of its inputs plus a bias. If the activation is less than 0, the neuron returns -1; otherwise it returns +1.

[Figure: A McCulloch-Pitts neuron for the AND function, with input weights 1 and 1 and bias -2]

Truth Table for the McCulloch-Pitts AND Function
X   Y   X + Y - 2   Output: X AND Y
1   1       0              1
1   0      -1             -1
0   1      -1             -1
0   0      -2             -1
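A minimal sketch of this neuron in Python (the function names are illustrative, not from the original notes):

    def mcculloch_pitts(inputs, weights, bias):
        # McCulloch-Pitts neuron: activation is the weighted sum of the
        # inputs plus a bias; output +1 if activation >= 0, else -1
        activation = sum(w * x for w, x in zip(weights, inputs)) + bias
        return 1 if activation >= 0 else -1

    def mp_and(x, y):
        # AND uses weights 1, 1 and bias -2, as in the truth table above
        return mcculloch_pitts([x, y], [1, 1], -2)

    for x in (1, 0):
        for y in (1, 0):
            print(x, y, "->", mp_and(x, y))   # +1 only for (1, 1)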

Connectionist Learning

Hebbian Learning (1949):

Repeated simultaneous activation of two or more neurons strengthens the connection weights among those neurons. One problem with this model is that it has no way to model inhibition between neurons: weights can only be strengthened, never weakened.
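As a rough sketch, the basic Hebbian update can be written as delta_w = c * x * y, where c is a small learning-rate constant and x and y are the activations of the two connected neurons (the constant and names here are illustrative):

    def hebbian_update(w, x, y, c=0.1):
        # Hebbian rule: strengthen the weight between two neurons in
        # proportion to the product of their activations; with
        # non-negative activations the weight can only grow, which is
        # why the basic rule cannot model inhibition
        return w + c * x * y

    w = 0.0
    for _ in range(5):                  # repeated co-activation
        w = hebbian_update(w, 1.0, 1.0)
    print(w)                            # 0.5: the connection strengthened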

Perceptron Learning (1958)

A perceptron is a single-layer network that calculates a linear combination of its inputs and outputs a 1 if the result is greater than some threshold and a -1 if it is not:

Perceptron

Perceptron learning involves choosing values for its weights. It uses a form of supervised learning. After attempting to solve a problem instance, the perceptron is given the correct result. The perceptron then changes its weights to reduce the error. By doing this for an entire set of training data, the perceptron will minimize the average error over the entire set.

Learning Algorithm

Minsky and Papert (1969) showed that if there is a set of weights that gives the correct output for an entire training set, the perceptron learning algorithm will find such a set (the perceptron convergence theorem).

Example: Perceptrons can learn models for the following primitive boolean functions: AND, OR, NOT, NAND, NOR. Here's an example for AND:

[Figure: A perceptron for the AND function]
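The standard perceptron learning rule adjusts each weight by delta_w_i = c * (t - o) * x_i, where t is the teacher's answer, o is the perceptron's output, and c is a learning rate. A minimal sketch in Python, trained on the AND function (the learning rate, epoch count, and use of a bias weight are illustrative choices):

    def predict(weights, inputs):
        # inputs begins with a constant 1, so weights[0] acts as the bias
        activation = sum(w * x for w, x in zip(weights, inputs))
        return 1 if activation >= 0 else -1

    def train_perceptron(examples, c=0.2, epochs=20):
        weights = [0.0] * len(examples[0][0])
        for _ in range(epochs):
            for inputs, target in examples:
                error = target - predict(weights, inputs)   # 0 if correct
                weights = [w + c * error * x
                           for w, x in zip(weights, inputs)]
        return weights

    # AND examples; each input vector starts with the constant bias input 1
    and_examples = [([1, 1, 1],  1),
                    ([1, 1, 0], -1),
                    ([1, 0, 1], -1),
                    ([1, 0, 0], -1)]

    weights = train_perceptron(and_examples)
    for inputs, target in and_examples:
        print(inputs[1:], "->", predict(weights, inputs), "expected", target)

Because AND is linearly separable, the convergence theorem guarantees this loop eventually finds a correct set of weights.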

Limitations of Perceptrons

Minsky and Papert (1969) showed that perceptrons cannot model the exclusive-or (XOR) function, because its outputs are not linearly separable. Two classes of outputs are linearly separable if and only if a straight line in two dimensions (more generally, a hyperplane in n dimensions) can separate one classification from the other.

[Figure: The four XOR inputs plotted in two dimensions; no single straight line separates the true outputs from the false ones]
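To see why no single perceptron can compute XOR, suppose one existed with weights w1, w2 and threshold t, outputting +1 exactly on the true cases (0,1) and (1,0). Those cases require w2 >= t and w1 >= t, so w1 + w2 >= 2t. The false cases (0,0) and (1,1) require 0 < t and w1 + w2 < t. Combining these gives w1 + w2 >= 2t > t > w1 + w2, a contradiction; no separating line exists.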

Multi-layer Perceptron

It was known at the time (1969, by some researchers at least) that a multi-layer neural network could represent the XOR function.

The key here is the hidden unit, which makes it possible for the network to learn functions where the outputs are not linearly separable.
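One common construction (illustrative; the network in the original figure may differ) uses a hidden unit that detects AND(X, Y) and lets it inhibit an OR-like output unit, shown here with 0/1 threshold units for simplicity:

    def threshold(activation):
        # binary threshold unit: 1 if activation >= 0, else 0
        return 1 if activation >= 0 else 0

    def xor_net(x, y):
        h = threshold(x + y - 2)            # hidden unit: AND(x, y)
        # output unit fires when at least one input is on, unless the
        # hidden unit inhibits it through its weight of -2
        return threshold(x + y - 2 * h - 1)

    for x in (0, 1):
        for y in (0, 1):
            print(x, y, "->", xor_net(x, y))   # 1 exactly when x != y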

Backpropagation Learning

In general, functions whose outputs are not linearly separable can be represented by multi-layer networks.

Backpropagation is a supervised learning algorithm for multi-layer networks.

Initialize the weights in the network (often randomly)
Do
    For each example e in the training set
        O = neural-net-output(network, e)                ; forward pass
        T = teacher output for e
        Calculate the error (T - O) at the output units
        Compute delta_wh for all weights from hidden layer to output layer   ; backward pass
        Compute delta_wi for all weights from input layer to hidden layer    ; backward pass, continued
        Update the weights in the network
Until all examples are classified correctly or a stopping criterion is satisfied
Return the network
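A minimal runnable sketch of this loop in Python, for a 2-input, 2-hidden-unit, 1-output network with sigmoid activations, trained on XOR (the architecture, learning rate, epoch count, and random seed are illustrative choices; a different seed may need more epochs):

    import math
    import random

    def sigmoid(a):
        return 1.0 / (1.0 + math.exp(-a))

    random.seed(1)
    # w_hidden[i][j]: weight from input i (index 2 is the bias) to hidden unit j
    w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
    # w_output[j]: weight from hidden unit j (index 2 is the bias) to the output
    w_output = [random.uniform(-1, 1) for _ in range(3)]

    examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    c = 0.5                                           # learning rate

    def forward(inputs):
        x = inputs + [1]                              # append bias input
        h = [sigmoid(sum(w_hidden[i][j] * x[i] for i in range(3)))
             for j in range(2)]
        hb = h + [1]                                  # hidden outputs + bias
        o = sigmoid(sum(w_output[j] * hb[j] for j in range(3)))
        return x, h, hb, o

    for epoch in range(10000):
        for inputs, target in examples:
            x, h, hb, o = forward(inputs)             # forward pass
            # backward pass: error terms for output and hidden units
            delta_o = (target - o) * o * (1 - o)
            delta_h = [delta_o * w_output[j] * h[j] * (1 - h[j])
                       for j in range(2)]
            # update the weights in the network
            for j in range(3):
                w_output[j] += c * delta_o * hb[j]
            for i in range(3):
                for j in range(2):
                    w_hidden[i][j] += c * delta_h[j] * x[i]

    for inputs, target in examples:
        print(inputs, "->", round(forward(inputs)[3], 3), "target", target)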

Connectionist Knowledge Representation and Search

Neural networks do not represent knowledge in terms of discrete symbols. Instead, knowledge, such as the AND function, is represented as a network of neuron-like elements and their connection weights. However, this does not eliminate knowledge representation issues. It merely changes them: how many input elements are needed? How many output elements? What should their weights be? How many hidden elements?

Similarly, neural networks do not perform search in the traditional sense. Nevertheless, search is involved in their learning algorithms, which can be viewed as searching for appropriate weights among the set of all possible weight assignments. The success of the search depends on the learning algorithm and its parameters.

Homework Exercises

  1. Using the above conventions, design a McCulloch-Pitts neuron for the NOR function with 2 inputs -- i.e., neither X nor Y is true.

  2. Draw a truth table for the above XOR network to show that it indeed represents the XOR function.