CPSC 352 -- Artificial Intelligence
Notes: AI Algorithms & Approaches: Genetic Algorithm
Introduction
In this lecture we consider the genetic algorithm, which also involves
some form of knowledge representation and search:
What limit can we put to this power, acting during long ages and rigidly
scrutinizing the whole constitution, structure and habits of each creature --
favoring the good and rejecting the bad? I can see no limit to this power
in slowly and beautifully adapting each form to the most complex relations
of life.
-- Charles Darwin, On the Origin of Species
Genetic algorithms are based on the evolutionary model: a population of individuals
is shaped through survival of its fittest members.
Related Evolutionary Model: Conway's Game of Life
John Conway's Game of Life is
an implementation of a cellular automaton that simulates the birth,
death, etc., of organisms based on certain rules.
The Genetic Algorithm
The genetic algorithm can expressed as follows:
procedure genetic_algorithm {
set time t = 0;
initialize the population, P(t);
while the termination condition is not met do {
evaluate the fitness of each member of P(t);
select most fit members from P(t);
produce the offspring of pairs of members using genetic operator;
replace the weakest candidates of P(t) with these offspring;
set time t = t + 1;
}
}
Example: The Traveling Salesman Problem
Find the shortest path between N cities, starting from one city, visiting
all cities exactly once, and returning to the home city.
How should we represent the path?
A binary representation, where the cities are represented by binary
numbers, 0001, 0010, and so on, would be difficult, because if we
modify one bit (mutation) of the representation, we no longer
will have a possible solution path.
Instead, let's adopt a decimal representation, where the path is
represented by numeric names for each of the 9 cities: (1 9 2 4 6 5 7 8 3),
for N = 9. Thus all possible paths through the 9 cities is represented
by an ordering of the nine digits, 1 to 9.
Suppose our initial population consists of a random set of possible paths.
Genetic Operators
One type of genetic operator is called crossover, in which some
portion of two parents are exchanged. For example, if you have:
parent1 = (1 9 2 | 4 6 5 7 | 8 3)
parent2 = (4 5 9 | 1 8 7 6 | 2 3)
You could define two children using the following crossover rule. First
start by initializing each child with the middle segment of the parents:
child1 = (x x x | 4 6 5 7 | x x)
child2 = (x x x | 1 8 7 6 | x x)
Then, starting from the second cut point of one parent, copy the cities
from the other parent in the same order as they occur in the first parent.
For example, the sequence of cities from parent2 is: 2 3 4 5 9 1 8 7 6. Then,
remove cities that occur in the first parent. Thus when copying from parent2
to parent1 we have to remove 4 6 5 7. This will give: 2 3 9 1 8, which is
then copied into the new child:
child1 = (2 3 9 | 4 6 5 7 | 1 8)
child2 = (8 3 9 | 1 8 7 6 | 2 4)
This crossover rule guarantees that the children are possible solution paths.
Another type of genetic operator is called inversion, in which some
the middle segment of the solution path is inverted. This is like a genetic
mutation.
child1 = (2 3 9 | 7 5 6 4 | 1 8)
child2 = (8 3 9 | 6 7 8 1 | 2 4)
Evaluating the Individuals
A suitable fitness function for the travelling salesman
problem would be to calculate the cost of the path that results. Obviously,
those paths that have the lowest cost are most fit to go on. Those paths
with the highest cost are least fit.
Selecting the Next Generation
In building the next generation, you want to select a higher percentage of those
individuals who were deemed "most fit." But you also want to select a small
percentage of individuals that were "least fit." By including some of the least
fit, you leave open the possibility that some of them may give rise to
very well suited descendants.
Some Links Worth Exploring
Java GA Toolkit (Examples)
Tutorial with Demos
JGAP: Java Package for GAs
Knowledge Representation and Search in Genetic Algorithms
As this example shows, knowledge representation is a major part of
designing genetic algorithms. Similarly, designing appropriate fitness
functions is very close to what we do in designing heuristic functions
for recursive symbolic search.
Homework Exercises
- Suppose you were designing a GA for breaking simple substitution
ciphers. A simple subsitution cipher uses an alphabet that
is a permutation of the normal 26 letter alphabet. For example,
a shifted alphabet might be "DEFG...XYZABC". In this case the
letter 'a' would be encrypted to the letter 'd' and 'b' would be
encrypted to 'e' and 'z' would be encrypted to 'c'. How would
you represent this problem for a GA? What would the crossover
mechanism look like??