CPSC 352 -- Artificial Intelligence
Notes: AI Algorithms & Approaches: Genetic Algorithm

Introduction

In this lecture we consider the genetic algorithm, which also involves some form of knowledge representation and search:
What limit can we put to this power, acting during long ages and rigidly scrutinizing the whole constitution, structure and habits of each creature -- favoring the good and rejecting the bad? I can see no limit to this power in slowly and beautifully adapting each form to the most complex relations of life.

-- Charles Darwin, On the Origin of Species

Genetic algorithms are based on the evolutionary model: a population of individuals is shaped through survival of its fittest members.

The Genetic Algorithm

The genetic algorithm can expressed as follows:
    procedure genetic_algorithm {
       set time t = 0;
       initialize the population, P(t);
       while the termination condition is not met do {
           evaluate the fitness of each member of P(t);
           select most fit members from P(t);
           produce the offspring of pairs of members using genetic operator;
           replace the weakest candidates of P(t) with these offspring;
           set time t = t + 1;
       }
    }

Example: The Traveling Salesman Problem

Find the shortest path between N cities, starting from one city, visiting all cities exactly once, and returning to the home city.

How should we represent the path?

A binary representation, where the cities are represented by binary numbers, 0001, 0010, and so on, would be difficult, because if we modify one bit (mutation) of the representation, we no longer will have a possible solution path.

Instead, let's adopt a decimal representation, where the path is represented by numeric names for each of the 9 cities: (1 9 2 4 6 5 7 8 3), for N = 9. Thus all possible paths through the 9 cities is represented by an ordering of the nine digits, 1 to 9.

Suppose our initial population consists of a random set of possible paths.

Genetic Operators

One type of genetic operator is called crossover, in which some portion of two parents are exchanged. For example, if you have:
parent1 = (1 9 2 | 4 6 5 7 | 8 3)

parent2 = (4 5 9 | 1 8 7 6 | 2 3)

You could define two children using the following crossover rule. First start by initializing each child with the middle segment of the parents:

child1 = (x x x | 4 6 5 7 | x x)

child2 = (x x x | 1 8 7 6 | x x)

Then, starting from the second cut point of one parent, copy the cities from the other parent in the same order as they occur in the first parent. For example, the sequence of cities from parent2 is: 2 3 4 5 9 1 8 7 6. Then, remove cities that occur in the first parent. Thus when copying from parent2 to parent1 we have to remove 4 6 5 7. This will give: 2 3 9 1 8, which is then copied into the new child:

child1 = (2 3 9 | 4 6 5 7 | 1 8)

child2 = (8 3 9 | 1 8 7 6 | 2 4)

This crossover rule guarantees that the children are possible solution paths.

Another type of genetic operator is called inversion, in which some the middle segment of the solution path is inverted. This is like a genetic mutation.

child1 = (2 3 9 | 7 5 6 4 | 1 8)

child2 = (8 3 9 | 6 7 8 1 | 2 4)

Evaluating the Individuals

A suitable fitness function for the travelling salesman problem would be to calculate the cost of the path that results. Obviously, those paths that have the lowest cost are most fit to go on. Those paths with the highest cost are least fit.

Selecting the Next Generation

In building the next generation, you want to select a higher percentage of those individuals who were deemed "most fit." But you also want to select a small percentage of individuals that were "least fit." By including some of the least fit, you leave open the possibility that some of them may give rise to very well suited descendants.

Some Links Worth Exploring

Java GA Toolkit (Examples)

Tutorial with Demos

JGAP: Java Package for GAs

Knowledge Representation and Search in Genetic Algorithms

As this example shows, knowledge representation is a major part of designing genetic algorithms. Similarly, designing appropriate fitness functions is very close to what we do in designing heuristic functions for recursive symbolic search.