Genetic Algorithms with Multiple Crossover Revisited
KIM HEAD
School of Engineering
University of Portland
Portland, OR97203
U.S.A.
BART RYLANDER
School of Engineering
University of Portland
Portland, OR97203
U.S.A.
Abstract: We investigate the efficiency of using multiple crossover points in the canonical genetic algorithm (GA). We conduct our experiments with varying instance sizes of three dissimilar problems. Previous works in this area have concentrated solely on specific instances of specific problems. Contrary to previous work, we show that in all test cases the multiple crossovers slow the GA's convergence and rarely offer a better solution than with single-point crossover.
Key Words: Genetic Algorithm, Multiple Crossovers, NP Problem
1Introduction
The genetic algorithm is an optimization technique based roughly on the theory of biological evolution.[3] It uses a population of chromosomes, each of which can be mapped to potential solutions of the proposed optimization problem. Each one of the chromosomes is rated based on how well it meets the criteria of optimization. The best-rated chromosomes are selected to produce the "offspring" chromosomes of the next generation or population. As an example, two n-length chromosomes are selected as parents. A random number is generated between 1 and n. If, for example, the random number is n/2 then the first part of offspring 1 comes from parent 1 and the second part of offspring 1 comes from parent 2. So if parent 1’s genes are 00000000 and parent 2’s genes are 11111111, the new chromosome’s genes would be 00001111. This is what happens when there is one crossover point. If there were two crossover points, for example, at bit positions three and six, the resulting chromosome would be 00011100.
There has been much previous research in this area. Unfortunately, each previous experiment has been conducted on a unique problem instance. This begs the question of whether the results of these
experiments are really universal. At this point we
must be careful to distinguish between a problem and a problem instance. For example, a problem instance of sort might be to sort a specific set of twenty names. The problem sort is the union of all sort instances. That is, sorting any number of names, sorting any number of numbers, etc. Our research conducts experiments on several different instance sizes of three dissimilar problems.
We have chosen three different problems based on their level of computational complexity.[2][5] In particular, we have chosen a problem that is trivial for a GA to solve, a problem that is in the class P (deterministic polynomial time), and a problem that is NP-complete (non-deterministic polynomial time complete).
2Experiment Parameters
The first problem used was the classic Max One’s problem. In Max One’s, the fitness of a solution is based off of how many ones are in the chromosome. This problem is used because it is easy to determine the accuracy of the result since the best solution will have ‘n’ number of ones in a chromosome with length ‘n’. The second problem examined was the sorting problem. The sorting problem takes an unsorted array and then sorts the array elements in ascending order. Of course, using a GA for this problem is inefficient since the GA must sort its chromosomes within each generation, but the problem demonstrates the accuracy of the GA’s result in a non-trivial problem. The third problem used was three processor scheduling. In this problem, three processors are assigned n tasks to find the shortest time necessary to finish all of the tasks. In this problem, the solution’s fitness is based on how long it takes for the processors to finish the assigned tasks.
In Max One’s and Three Processor Scheduling, we ran the GA’s using string lengths/task lists of 16, 128, and 256. In the sorting algorithm, array sizes of 10, 50, and 100 were used. Population sizes of 10, 100, and 200 chromosomes were used in each test case for all three problems, and cases with one, five, and ten crossover points were tested on every population size/chromosome length combination.
3Experiment Parameters
The results of our experiments were fairly universal. In virtually all cases, using multiple point crossover increased the time to convergence without improving the quality of the solution. Below are the individual test results compiled for each problem examined.
3.1Performance on Max One’s
We ran the Max One’s GA on different string lengths and differing numbers of crossovers with ten chromosomes. The mutation rate was picked based on what seemed to converge within 2 seconds. We let the GA run until 50% convergence. The average solution and number of generations to converge were logged so we could compare the speed and efficiency of the GA on the different crossovers and lengths. We found that for in all cases for this number of chromosomes, changing the number of crossovers consistently increased the number of generations needed to converge, while there was little change in the final solution.
# Genes/ Mutation / Crossovers / Av. Gens to Conv. / Av. Fitness16/5 / 1 / 4 / 16
16/5 / 5 / 8 / 16
16/5 / 10 / 12 / 15
128/3 / 1 / 31 / 110
128/3 / 5 / 30 / 113
128/3 / 10 / 42 / 114
256/1 / 1 / 58 / 200
256/1 / 5 / 70 / 201
256/1 / 10 / 85 / 212
Table 1: Running Max 1’s with 10 chromosomes
The data shows that running multiple crossovers on a small population of chromosomes does not improve the efficiency of the GA. In fact, it has the opposite effect. The high crossover wastes valuable time to calculate, and causes the GA to take longer to converge on the same (or worse) solution as a single crossover. This cost is not offset by an increase in quality of the final solution, so using high crossover rates is not helpful with small populations. This trend was continued in the higher population sizes.
# Genes/ Mutation / Crossovers / Av. Gens to Conv. / Av. Fitness16/5 / 1 / 87 / 16
16/5 / 5 / 93 / 16
16/5 / 10 / 98 / 15
128/3 / 1 / 110 / 127
128/3 / 5 / 122 / 127
128/3 / 10 / 130 / 127
256/1 / 1 / 145 / 249
256/1 / 5 / 165 / 251
256/1 / 10 / 181 / 251
Table 2: Running Max 1’s with 100 chromosomes
# Genes/ Mutation / Crossovers / Av. Gens to Conv. / Av. Fitness16/5 / 1 / 185 / 16
16/5 / 5 / 191 / 16
16/5 / 10 / 196 / 16
128/3 / 1 / 209 / 127
128/3 / 5 / 222 / 127
128/3 / 10 / 240 / 127
256/1 / 1 / 247 / 255
256/1 / 5 / 261 / 255
256/1 / 10 / 285 / 255
Table 3: Running Max 1’s with 200 chromosomes
In fact, as the string length increased with the larger population size, the jump in the number of generations to converge grew as well. There is a small increase in accuracy of the answer found with more crossovers, but this is most likely due to the larger population size, not the higher crossover rate. Even with this higher accuracy, the multiple crossover test cases still fail to provide a better solution than the traditional single point crossover.
3.2 Performance on Sorting Problem
In our sorting GA, we randomly filled our original array with numbers +/- half of the array’s length. Duplicate elements were allowed to further complicate the problem. When evaluating the fitness of a chromosome, each element was checked against every other element in the array. Every time an element was found to be less than an element before it, or greater than an element after it, the fitness was incremented. The GA ran until the most fit chromosome had sorted 95% of the array. Some tests on small arrays were first done to check the accuracy of the sorted array. All solutions were either sorted completely or had a single element out of place. We then ran the GA on our test cases. Since this problem is not quite what a GA is designed to solve, the number of generations needed to finish is much higher than in the other two problems.
On a small population size, increasing the number of crossovers dramatically increased the number of generations needed to finish, even more so than in the Max One’s problem. Each time the array size increased the number of generations to finish increased almost by a factor of ten. Since all cases of the problem are running until 95% of the array is sorted, there is no increase in accuracy between cases. The extra time used in the multiple crossover test cases is wasted.
Array size/ Mutation / Crossovers / Av. Gens to Finish.10/3 / 1 / 180
10/3 / 5 / 307
10/3 / 10 / 443
50/2 / 1 / 9,524
50/2 / 5 / 24,213
50/2 / 10 / 34,954
100/1 / 1 / 90,178
100/1 / 5 / 122,965
100/1 / 10 / 162,285
Table 4: Running the Sorting GA with 10 chromosomes
On larger population sizes, the large increase in finishing time is not seen. The multiple crossover cases still take longer to converge, but there is relatively little increase compared to the factor of ten jumps seen in the tests with a population of ten.
Array size/ Mutation / Crossovers / Av. Gens to Finish.10/3 / 1 / 162,346
10/3 / 5 / 162,405
10/3 / 10 / 162,471
50/2 / 1 / 165,076
50/2 / 5 / 165,390
50/2 / 10 / 165,972
100/1 / 1 / 179,389
100/1 / 5 / 179,668
100/1 / 10 / 179,845
Table 5: Running the Sorting GA with 100 chromosomes
Array size/ Mutation / Crossovers / Av. Gens to Finish.10/3 / 1 / 186,888
10/3 / 5 / 186,940
10/3 / 10 / 186,988
50/2 / 1 / 188,250
50/2 / 5 / 188,336
50/2 / 10 / 188,460
100/1 / 1 / 198,077
100/1 / 5 / 198,281
100/1 / 10 / 198,562
Table 6: Running the Sorting GA with 200 chromosomes
Although the single crossover rate universally works the fastest compared to the other crossover rates, on larger population sizes the multiple crossover rates run in almost the same number of generations. We kept this trend in mind when examining the data for the 3 processor scheduling GA.
3.3 Performance on Three Processor Scheduling
We then ran the GA for Three Processor Scheduling on the same population sizes and crossovers as the Max One’s problem. We first tested the GA on small population sizes and task lists so we could verify the accuracy of the answers. All answers calculated were either the best solution or one task off (i.e. a task was assigned to one processor when it should have been assigned to one that was running a shorter amount of time). We then used the string lengths from the Max One’s problem as the number of tasks to be assigned so each chromosome would be the same length as the Max One’s problem. The mutation rates remained the same as well since the population size and chromosome lengths were the same. The length of time each task would take each processor was read in from a file and stored in an array separate from the chromosomes created. The file used the following four tasks repeatedly until the total number of tasks was reached:
For 4 tasks:
P1P2P3
Task 1 495
Task 2 859
Task 3 623
Task 4 731
So for 8 tasks, the list would be:
P1P2P3
Task 1 495
Task 2 859
Task 3 623
Task 4 731
Task 5 495
Task 6 859
Task 7 623
Task 8 731
This way, the results were consistent for each run so we could still compare the accuracy of the results.
The chromosomes were set up so each position represented the task number, and the element in the position represented which processor was assigned that task. The fitness was calculated as the time it took for the processor that ran the longest to finish. As with the Max One’s GA, we let the GA run until 50% of the population had converged and logged how many generations it took to do so. Once again, the higher crossover rates took longer than the single crossover case and still failed to provide a bettersolution.
16/5 / 1 / 5 / 24
16/5 / 5 / 9 / 28
16/5 / 10 / 14 / 29
128/3 / 1 / 19 / 218
128/3 / 5 / 23 / 221
128/3 / 10 / 30 / 222
256/1 / 1 / 48 / 447
256/1 / 5 / 54 / 449
256/1 / 10 / 62 / 450
Table 7: Running 3 Proc. Scheduling with 10 chromosomes
Unlike in the sorting problem’s data for a population of ten, the three processor scheduling GA already started with a small amount of increase of generations to converge for each larger test case and crossover rate increase. This trend becomes even more pronounced in the larger population sizes. However, it should be noted that the accuracy of the multiple crossover test cases still do not produce a better solution than the single point crossover.
16/5 / 1 / 84 / 21
16/5 / 5 / 92 / 22
16/5 / 10 / 103 / 23
128/3 / 1 / 137 / 200
128/3 / 5 / 148 / 204
128/3 / 10 / 158 / 202
256/1 / 1 / 178 / 418
256/1 / 5 / 189 / 427
256/1 / 10 / 196 / 426
Table 8: Running 3 Proc. Scheduling with 100 chromosomes
# Tasks/ Mutation / Crossovers / Av. Gens to Conv. / Av. Fitness16/5 / 1 / 219 / 20
16/5 / 5 / 225 / 21
16/5 / 10 / 230 / 22
128/3 / 1 / 311 / 191
128/3 / 5 / 317 / 195
128/3 / 10 / 325 / 198
256/1 / 1 / 446 / 410
256/1 / 5 / 449 / 416
256/1 / 10 / 454 / 411
Table 9: Running 3 Proc. Scheduling with 200 chromosomes
Although the single crossover rate universally works the fastest compared to the other crossover rates, on larger population sizes the multiple crossover rates run in almost the same number of generations. More research could be done on other NP problems to see if this finding still holds true and if the accuracy of the multiple crossover solutions found improves. However, our data from the Max One’s, sorting, and Three Processor Scheduling problems suggest that there will not be any increase in accuracy.
4 ConclusionsWe used multiple crossover points and different population sizes when running a Max One’s GA to see if different crossover rates could make the GA converge faster and/or create a better solution than the traditional single point crossover GA. On all cases, the multiple crossover points only slowed convergence, only to get the same (or worse) solution compared to the single crossover test case.
We then used a GA for sorting an unsorted array to test how multiple crossover points affected the running time of the GA. Since the GA was run until the best solution had sorted 95% of the array, there was no accuracy increase between the test cases. Runtimes were compared and multiple crossover points still took longer to finish sorting. A trend was noticed in the larger population sizes that the difference in running times between the single and multiple crossover cases was shortened as the array size grew.
Finally, we then used the same test cases as used in the Max One’s GA in the Three-Processor scheduling GA. The same trend of decreasing differences in number of generations needed to converge was found, but this time it was shown in all population sizes, not just the large ones.
We suggest that further research could be done on different NP-complete problems to see if the trend still holds true and if the solutions found from multiple crossovers are any more accurate than the single point crossover. If multiple crossover points are found to create better answers, they may yet be useful, but current data suggests that the extra convergence and calculation time needed make them less efficient and no more accurate than the traditional single point crossover.
References:
[1] Rylander, B., Foster, J., GA-hard Problems, Proc.On Genetic and Evolutionary Computation Conference, 2000.
[2] Bovet, D., Crescenzi, P. Computational Complexity, Prentice Hall.1994.
[3] Holland, J., Adaptation in Natural and ArtificialSystems, Ann Arbor, MI, University of Michigan Press, 1975.
[4]Ankenbrandt, C., An Extension To the Theory of Convergence and A Proof of the Time Complexity of Genetic Algorithms, Foundations of Genetic Algorithms, Morgan Kaufman 1991, pp. 53-68
[5]Balcàzar, J., Díaz, J., and Gabarró, J. Structural Complexity Theory I, Springer-Verlag, 1988.