CPSC 352 -- Artificial Intelligence
Notes: Machine Learning: Neural Networks

Introduction

In this lecture we consider the basics of machine learning in neural networks.

An Artificial Neuron

[Figure: an artificial neuron]

Connectionist Learning

Hebbian Learning (1949):

Repeated stimulation between two or more neurons strengthens the connection weights among those neurons. One problem with this model is that it provides no way to model inhibition between neurons.

Perceptron Learning (1958):

A perceptron is a single-layer network that calculates a linear combination of its inputs and outputs a 1 if the result is greater than some threshold and a -1 if it is not:

[Figure: a perceptron]

Supervised Perceptron Learning

On each training example, every weight is adjusted in proportion to the error between the desired and actual output: Delta(w_i) = c(d - O) x_i, where c is the learning rate, d is the desired output, O is the actual output, and x_i is the ith input. By repeatedly adjusting weights in this fashion over an entire set of training data, the perceptron minimizes the average error over the entire set.

Minsky and Papert (1969) showed that if there is a set of weights that gives the correct output for an entire training set, this procedure will learn it (the perceptron convergence theorem).

Example: Perceptrons can learn models for the following primitive boolean functions: AND, OR, NOT, NAND, NOR. Here's an example for AND:

[Figure: a perceptron that computes the AND function]
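To make the training procedure concrete, here is a small, self-contained Java sketch of a perceptron learning the AND function. This is my own illustration rather than part of the original notes: the class and variable names are arbitrary, a bias weight is folded in as an extra input fixed at 1, and the rule Delta(w_i) = c(d - O)x_i from above is applied until every training example is classified correctly.

    // PerceptronAnd.java -- a perceptron trained on the boolean AND function.
    public class PerceptronAnd {

        // Hard-limiting threshold activation: +1 if net > 0, otherwise -1.
        static int activate(double net) {
            return net > 0 ? 1 : -1;
        }

        public static void main(String[] args) {
            // Training data for AND.  x[0] is a constant bias input; targets are +1/-1.
            double[][] inputs = {
                {1, 0, 0},
                {1, 0, 1},
                {1, 1, 0},
                {1, 1, 1}
            };
            int[] desired = {-1, -1, -1, 1};

            double[] w = {0.0, 0.0, 0.0};   // weights (w[0] is the bias weight)
            double c = 0.2;                 // learning rate

            // Repeatedly apply the perceptron training rule: w_i += c * (d - O) * x_i
            for (int epoch = 0; epoch < 20; epoch++) {
                int errors = 0;
                for (int p = 0; p < inputs.length; p++) {
                    double net = 0.0;
                    for (int i = 0; i < w.length; i++) net += w[i] * inputs[p][i];
                    int out = activate(net);
                    if (out != desired[p]) {
                        errors++;
                        for (int i = 0; i < w.length; i++) {
                            w[i] += c * (desired[p] - out) * inputs[p][i];
                        }
                    }
                }
                if (errors == 0) {          // converged: all training examples correct
                    System.out.println("Converged after " + (epoch + 1) + " epochs");
                    break;
                }
            }
            System.out.printf("Weights: bias=%.2f w1=%.2f w2=%.2f%n", w[0], w[1], w[2]);
        }
    }

Because AND is linearly separable, the convergence theorem guarantees this loop finds a correct set of weights; with the learning rate shown it converges in a handful of epochs.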

Limitations of Perceptrons

Minsky and Papert (1969) showed that perceptrons cannot model the exclusive-or function, because its outputs are not linearly separable. Two classes of outputs are linearly separable if and only if they can be separated by a straight line in two dimensions (or, more generally, by a hyperplane in n dimensions).

[Figure: the exclusive-or (XOR) function]

The Delta Rule (Rumelhart, 1986)

[Figure: the sigmoid (logistic) activation function]

The perceptron activation function is a hard-limiting threshold function. A more general neural network uses a continuous activation function. One popular choice is a sigmoidal (S-shaped) function such as the logistic function:

f(net) = 1 / (1 + e^(-lambda * net))


where lambda is a parameter that controls how sharply the function is "squashed," and net is the weighted sum of the node's inputs, net = Sum_i (w_i * x_i).
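As a quick illustration (my own sketch, not part of the original notes), the logistic function and its derivative can be written directly from the formula above. The derivative can be expressed in terms of the output itself, f'(net) = lambda * f(net) * (1 - f(net)), which is what makes this function convenient for the delta rule and back propagation; note that the derivative is largest where the output is near 0.5, i.e., where the node is most "uncommitted."

    // Logistic (sigmoid) activation with steepness parameter lambda.
    public class Logistic {

        // f(net) = 1 / (1 + e^(-lambda * net))
        static double f(double net, double lambda) {
            return 1.0 / (1.0 + Math.exp(-lambda * net));
        }

        // Derivative expressed in terms of the output: f'(net) = lambda * f(net) * (1 - f(net))
        static double fPrime(double net, double lambda) {
            double out = f(net, lambda);
            return lambda * out * (1.0 - out);
        }

        public static void main(String[] args) {
            for (double net = -4.0; net <= 4.0; net += 2.0) {
                System.out.printf("net=%5.1f  f=%.4f  f'=%.4f%n", net, f(net, 1.0), fPrime(net, 1.0));
            }
        }
    }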

The delta rule is a learning rule for a network with a continuous (and therefore differentiable) activation function. It attempts to minimize the cumulative error over a data set as a function of the weights in the network:

Delta(w_ji) = c (d_i - O_i) f'(net_i) x_j


where c is the learning rate, d_i and O_i are the desired and actual outputs of the ith node, f'(net_i) is the derivative of the activation function evaluated at the ith node's net input, and x_j is the jth input to that node.

Key Point: The delta rule performs gradient descent: it follows the slope of the cumulative error downhill in the particular region of the network's output function where the current weights lie. This makes it susceptible to local minima.
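Here is a small sketch (mine, not from the notes) of a single delta-rule weight update for one node with a logistic activation, following Delta(w_ji) = c (d_i - O_i) f'(net_i) x_j with lambda = 1, so that f'(net) = O(1 - O). The input values, weights, and learning rate are arbitrary illustrative numbers.

    // One delta-rule update for a single node with a logistic activation (lambda = 1).
    public class DeltaRuleStep {

        public static void main(String[] args) {
            double[] x = {1.0, 0.5, -0.3};   // inputs to the node (x[0] can act as a bias input)
            double[] w = {0.1, -0.2, 0.4};   // current weights
            double d = 1.0;                  // desired output for this training example
            double c = 0.5;                  // learning rate

            // Forward pass: net = Sum_j w_j * x_j, O = f(net)
            double net = 0.0;
            for (int j = 0; j < w.length; j++) net += w[j] * x[j];
            double O = 1.0 / (1.0 + Math.exp(-net));

            // Delta rule: Delta(w_j) = c * (d - O) * f'(net) * x_j, with f'(net) = O * (1 - O)
            for (int j = 0; j < w.length; j++) {
                w[j] += c * (d - O) * O * (1.0 - O) * x[j];
            }

            System.out.printf("output O = %.4f, updated weights = [%.4f, %.4f, %.4f]%n",
                              O, w[0], w[1], w[2]);
        }
    }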

Back propagation Learning for Multilayer Networks

[Figure: back propagation in a multilayer network]

Back propagation starts at the output layer and propagates the error backwards through the network. The learning rule is often called the generalized delta rule.

The activation function typically used in back propagation is the logistic function:

f(net) = 1 / (1 + e^(-lambda * net))

The logistic function is useful for assigning error to the hidden layers in a multi-layer network because it is continuous and differentiable everywhere, and because its derivative has a particularly simple form, f'(net) = f(net)(1 - f(net)) = O(1 - O) (taking lambda = 1), which is exactly the factor that appears in the weight-adjustment formulas below.

The formulas for computing the adjustment of the kth weight of the ith node are:

Delta(w_ik) = -c (d_i - O_i) O_i (1 - O_i) x_ik
for nodes on the output layer

Delta(w_ik) = -c O_i (1 - O_i) Sum_j (-delta_j * w_ij) x_ik
for nodes on the hidden layers,

where the sum ranges over the nodes j in the next layer toward the output, and delta_j is the error term already computed for node j in that layer.
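To show how the output-layer and hidden-layer rules fit together, here is a compact Java sketch (my own illustration, not NETtalk or Encog code) of back propagation on a 2-2-1 network learning XOR. It uses the common textbook convention of adding the change Delta(w) = c (d - O) O (1 - O) x at the output and propagating the delta terms back to the hidden layer; this is the same computation as the formulas above, written with the opposite sign.

    // Back propagation on a tiny 2-2-1 network learning XOR (logistic activations, lambda = 1).
    import java.util.Random;

    public class BackpropXor {

        static double sigmoid(double net) { return 1.0 / (1.0 + Math.exp(-net)); }

        public static void main(String[] args) {
            double[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
            double[] targets = {0, 1, 1, 0};

            Random rnd = new Random(42);
            // wHidden[i][j]: weight from input j to hidden node i; index 2 is the bias weight.
            double[][] wHidden = new double[2][3];
            // wOut[i]: weight from hidden node i to the output node; index 2 is the bias weight.
            double[] wOut = new double[3];
            for (int i = 0; i < 2; i++)
                for (int j = 0; j < 3; j++) wHidden[i][j] = rnd.nextDouble() - 0.5;
            for (int i = 0; i < 3; i++) wOut[i] = rnd.nextDouble() - 0.5;

            double c = 0.5;   // learning rate

            for (int epoch = 0; epoch < 10000; epoch++) {
                double sumSquaredError = 0.0;
                for (int p = 0; p < inputs.length; p++) {
                    double x0 = inputs[p][0], x1 = inputs[p][1], d = targets[p];

                    // Forward pass
                    double[] h = new double[2];
                    for (int i = 0; i < 2; i++)
                        h[i] = sigmoid(wHidden[i][0] * x0 + wHidden[i][1] * x1 + wHidden[i][2]);
                    double O = sigmoid(wOut[0] * h[0] + wOut[1] * h[1] + wOut[2]);
                    sumSquaredError += (d - O) * (d - O);

                    // Output-layer error term: deltaOut = (d - O) * O * (1 - O)
                    double deltaOut = (d - O) * O * (1 - O);

                    // Hidden-layer error terms: deltaH[i] = h_i (1 - h_i) * deltaOut * wOut[i]
                    double[] deltaH = new double[2];
                    for (int i = 0; i < 2; i++)
                        deltaH[i] = h[i] * (1 - h[i]) * deltaOut * wOut[i];

                    // Output-layer weight updates: Delta(w_i) = c * deltaOut * h_i (bias input is 1)
                    wOut[0] += c * deltaOut * h[0];
                    wOut[1] += c * deltaOut * h[1];
                    wOut[2] += c * deltaOut;

                    // Hidden-layer weight updates: Delta(w_ij) = c * deltaH[i] * x_j (bias input is 1)
                    for (int i = 0; i < 2; i++) {
                        wHidden[i][0] += c * deltaH[i] * x0;
                        wHidden[i][1] += c * deltaH[i] * x1;
                        wHidden[i][2] += c * deltaH[i];
                    }
                }
                if (epoch % 2000 == 0)
                    System.out.printf("epoch %5d  sum squared error = %.4f%n", epoch, sumSquaredError);
            }

            // Show the learned mapping
            for (double[] in : inputs) {
                double h0 = sigmoid(wHidden[0][0] * in[0] + wHidden[0][1] * in[1] + wHidden[0][2]);
                double h1 = sigmoid(wHidden[1][0] * in[0] + wHidden[1][1] * in[1] + wHidden[1][2]);
                double O = sigmoid(wOut[0] * h0 + wOut[1] * h1 + wOut[2]);
                System.out.printf("%.0f XOR %.0f -> %.3f%n", in[0], in[1], O);
            }
        }
    }

With some random seeds the network settles into a local minimum (all outputs near 0.5) rather than learning XOR, which illustrates the key point made about the delta rule above; re-running with a different seed or learning rate usually succeeds.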

NETtalk System (Sejnowski and Rosenberg, 1987)

NETtalk is a neural network, developed in 1987, that learns to pronounce English text: it learns to associate strings of text with phonemes.

[Figure: the NETtalk network]

Properties of NETtalk

NETtalk Comparison with ID3 (Shavlik, 1991)

Using Encog Java Neural Network Framework

Homework Exercise: Using the links below, download the Encog Framework into a directory on your Linux account. Then perform the exercises.

Downloads

Download and unzip each of the following Encog packages from the Encog Download Site:

Exercises

  1. Take a look at the Getting Started Documentation.

  2. Command Line Exercise: Do the Encog Java XORHelloWorld example. Try working through the ANT version. On my system, this is the Java command to run from within the .../encog-examples-3.0.1/lib directory (a rough sketch of what the program does appears after this list):
    java -cp encog-core-3.0.1-SNAPSHOT.jar:examples.jar org.encog.examples.neural.xor.XORHelloWorld
    

  3. GUI Exercise: Do the Workbench Classification Example.
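For orientation, the XORHelloWorld program you will run in exercise 2 does roughly the following. This sketch is reconstructed from the published Encog 3 XOR example; exact package names and details can differ between Encog releases, so treat the source in examples.jar as authoritative.

    import org.encog.Encog;
    import org.encog.engine.network.activation.ActivationSigmoid;
    import org.encog.ml.data.MLData;
    import org.encog.ml.data.MLDataPair;
    import org.encog.ml.data.MLDataSet;
    import org.encog.ml.data.basic.BasicMLDataSet;
    import org.encog.neural.networks.BasicNetwork;
    import org.encog.neural.networks.layers.BasicLayer;
    import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

    public class XORHelloWorld {

        public static double XOR_INPUT[][] = {{0, 0}, {1, 0}, {0, 1}, {1, 1}};
        public static double XOR_IDEAL[][] = {{0}, {1}, {1}, {0}};

        public static void main(String[] args) {
            // Build a 2-3-1 feedforward network with sigmoid activations.
            BasicNetwork network = new BasicNetwork();
            network.addLayer(new BasicLayer(null, true, 2));
            network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 3));
            network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
            network.getStructure().finalizeStructure();
            network.reset();

            // Train with resilient propagation (a variant of back propagation) until the error is small.
            MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);
            ResilientPropagation train = new ResilientPropagation(network, trainingSet);
            int epoch = 1;
            do {
                train.iteration();
                System.out.println("Epoch #" + epoch + " Error: " + train.getError());
                epoch++;
            } while (train.getError() > 0.01);
            train.finishTraining();

            // Show what the trained network computes for each input pattern.
            for (MLDataPair pair : trainingSet) {
                MLData output = network.compute(pair.getInput());
                System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1)
                        + " -> " + output.getData(0) + " (ideal: " + pair.getIdeal().getData(0) + ")");
            }
            Encog.getInstance().shutdown();
        }
    }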