The use of optimization methods has been a standard in mechanical engineering design for some time. Optimization methods can be used to perform several tasks. One of the most common tasks is to minimize the cost of a product. The engineer wants a product that functions efficiently and is also economically efficient. Another kind of optimization task might be to build a model to simulate some function. An example of this is a four-bar path generation problem. The four-bar model should simulate the desired path as close as possible.
Many researchers have used optimization methods in this sense. There is an unlimited number of problems in which optimization methods can be applied. The objective of all optimization problems is to find a minimum or maximum objective function value. Considering minimum values, usually a problem may have more than one minimum objective function value. It is the engineer's goal to find the values of the design variables which will yield the lowest objective function value of all the minimums. The design variables are those factors that influence the objective function. Some times the engineer is not satisfied with a local minimum.
There are many traditional deterministic algorithms available to solve optimization problems for a local minimum. Some of these methods include the reduced gradient (Wolfe, 1963), the generalized reduced gradient, GRG (Abadie and Carpentier 1969), and the gradient projection (Rosen 1960 & 1961) methods. There are many other algorithms available in addition to these few. The above methods all require the evaluation of gradient information in order to solve the problem. Gradient evaluations can become difficult and time consuming when complex objective functions are present. These methods always look for the closest optima weather it is a local or global one. The algorithms find a valley and converge to the bottom of it.
The set of methods discussed in this paper are known as probabilistic global methods. By probabilistic it is meant that the algorithms will sometimes accept a bad result according to a given probability. Thus these methods are also referred to as "hill climbing" methods in the sense that they can climb out of a local minimum valley. The aforementioned deterministic methods will never accept a worse off solution and thus can not climb out of a valley.
Simulated Annealing (Kirkpatrick et. al., 1983) and Genetic Algorithms (Holland, 1975) are used in this paper. Both of these are probabilistic global optimization methods which do not require gradient information. A third global optimization technique proposed by Kota and Chiou (1992) combine the use of Orthogonal Arrays with the GRG algorithm. These three global optimization techniques are illustrated and compared in this paper using a simple model problem.
A model problem is utilized to demonstrate how each method solves or attempts to solve for a global minimum. A relatively simple problem was chosen to illustrate the optimization algorithms. A simple problem was beneficial since the accuracy of each method was being examined. It was desired that the actual global minimum be known. This way one could determine if the algorithms were in fact operating correctly and solving for the global minimum. A multimodal problem was desired to demonstrate that the algorithms could escape local minimums. That is a problem with one global minimum and a couple of local minimums. The problem chosen, known as Himmelblau's Function, is:

where x1 and x2 are the design
variables. Thus, this problem contains two design variables with
an objective function value of f(x1,x2).
This is an unconstrained problem in which the objective function
is to be minimized. Figure 1 below is a contour plot of Himmelblau's
function. In this figure four distinct minima can easily be seen. Solving the function analytically, three local minima and one global minimum are found. The true global minimum is located at f(3,2)=0. Of course, for complex problems an analytical solution is near impossible. The analytical solution will be used to demonstrate how closely the different global optimization techniques converge to the true global minimum at f(3,2).
Simulated annealing and genetic algorithms were used to optimize
Himmelblau's function. Orthogonal arrays with GRG were a third
technique implemented. Since an unconstrained problem is assumed,
there is no need to consider such factors as penalty functions.
Simulated Annealing
Simulated annealing receives its name because of its resemblance to the natural process of the annealing of solids. When a solid material is annealed, it is heated to its melting point. When this point is reached, the solid is then allowed to cool slowly enough so that thermal equilibrium is maintained. By cooling at this slow rate, the atoms in the solid material arrange themselves as to attain a minimum energy state. This phenomenon of nature was first applied to numerical calculations in 1953 by Metropolis and his coworkers. Kirkpatrick (1983) applied this technique to engineering design. He used simulated annealing methods on the optimization of VLSI circuit designs.
In order to use simulated annealing, an initial design is chosen. Since this is a global optimization method, the starting point should not influence the result of the algorithm. A new design is then generated from the neighborhood of the initial points. If the new design is better, in terms of the objective function, than the old one, it is accepted. If it is worse, it may or may not be accepted. This decision is made according to the Boltzmann Probability function:
where E is the change in the objective function and T is the "temperature." If a randomly generated number between 0 and 1 is less than the Boltzmann Probability, the worse design is accepted. The "temperature" is a control parameter and sets the Boltzmann Probability. A low temperature is similar to quenching a solid, in which the lowest energy state is not achieved due to a fast cooling rate. The simulated annealing algorithm would then solve for the closest minimum to the starting point. However, with a high starting temperature, the algorithm looks over the whole objective function for a global minimum.
Several parameters were chosen for the model problem presented. The cooling schedule used was adapted from Vanderbilt and Louie (1984) and is shown below. In this
equation T0 is the current temperature, TNEW is the new temperature, and xt is the temperature reduction factor. In order to reach "thermal equilibrium," the temperature must first be held constant for a number of algorithm iterations. Vanderbilt and Louie (1984) suggest performing [15 * number of design variables] iterations at the starting temperature. Thus, the parameters for the simulated annealing algorithm are shown below.
The temperature was varied experimentally between the values shown
above. The simulated annealing algorithm was implemented using
the FORTRAN subroutine ambesa (Press et.al., 1994). A FORTRAN
program was written to drive this routine.
Genetic Algorithms
Genetic algorithms (Holland, 1975) are different from traditional methods in that they are a probabilistic method that uses a population of designs rather than a single design at a time. Holland's genetic algorithms are based on the natural selection process in the evolution of living organisms. In nature, the fittest members in a population have the greatest probability of surviving. This is customarily known as "survival of the fittest." Since starting designs are irrelevant, initial population members are chosen randomly.
The FORTRAN code written for this problem used binary strings to represent the population members. Included in these binary strings are the "genes" or characteristics of a design variable on the objective function. To evaluate each member, a fitness function is implemented. This fitness function is based on the problem objective function, equation 1. In order to prevent one member from dominated the population, a linear normalized fitness function was used. Members for the next generation are chosen based on these fitness values. A member with a greater fitness value will have a higher probability of being chosen for the next generation than one with a lower fitness value.
The genetic algorithm used in this project used crossover and mutation techniques. Crossover is the exchange of bits of string between parents to create children. Mutation occurs when a bit in a string changes randomly from 0 to 1 or 1 to 0. Both operators are given a certain probability which will cause them to occur. This is shown below in Fig. 2.
| Population = 50 Members | |
| Mutation Rate = 10 % | |
| Crossover: | |
Figure 2: Genetic Algorithm Parameters
Orthogonal Arrays with GRG
This global optimization technique used orthogonal arrays to find a suitable starting point for the generalized reduced gradient method. The technique was applied by Kota and Chiou (1992) to a four-bar path generation problem. The first step here is to choose an orthogonal array. The array chosen was a L9 (Taguchi, 1989) inner array. This array requires nine different evaluations of the objective function as shown below in Table 1. In the array, columns 1 and 2 contain three levels of the design variables. No
Table 1: L9 Orthogonal Array
single combination is repeated in the table. Columns 3 and 4 are blank since only two design variables are present. F is the objective function.
To determine which level is the best for each design variable, the level effects are considered. For example, consider x2. The effect of level 1 on x2 is F1+F4+F7, level 2 is F2+F5+F8, and level 3 is F3+F6+F9. Level effects on x1 are calculated in the same manner. Lets assume level 1 returns the smallest F. The new levels would then be:
(level 1)new = (level 1)old
(level 2)new = 0.5*((level 1)old + (level 2)old) (5a,b,c)
(level 3)new = (level 2)old
If level 2 is the smallest:
(level 1)new = (level 2)old - 0.5*((level 2)old - (level 1)old)
(level 2)new = (level 2)old (6a,b,c)
(level 3)new = (level 2)old + 0.5*((level
3)old - (level 2)old)
If level 3 is the smallest:
(level 1)new = (level 2)old
(level 2)new = 0.5*((level 2)old + (level 3)old) (7a,b,c)
(level 3)new = (level 3)old
The above procedure is continued until either the objective function
is less than 0.01 or the maximum difference in effects is less
than 0.005 for each design variable. To implement this method
a FORTRAN program was written. The results from the orthogonal
array were used as starting points for the GRG algorithm. The
optimization package OPTDES (Parkinson et.al.,1992) was used to
implement the GRG algorithm.
Each of the above global optimization methods were used on the sample problem. The results for each method are now discussed followed by a comparison.
Simulated Annealing
Equation 4 of the previous section showed the various starting conditions which were used for the simulated annealing algorithm. Table 2 contains the results for several trials of the simulated annealing algorithm. Being a probabilistic method, simulated
| Starting Point | |||||
Table 2: Simulated Annealing Results
annealing will sometimes converge to a local minimum instead of the global minimum. All three conditions, starting point, starting temperature, and constant iterations at the initial condition were varied. For each set of conditions in Table 2, twenty tests were run. For every test run one of the four minima was reached. The column labeled "Percent Time Optimum Reached" is the percentage in which the global optimum was reached for the respective conditions. The simulated annealing algorithm was not very successful in solving for the global minimum. The percentage of success for a realistic condition was only about 50%. For a real life engineering problem, it would not be wise to choose a low starting temperature since it is not desirable to cool fast and quench. These two plots use two different starting points and compare the "percent time global optimum reached" with temperature.
Genetic Algorithms
The genetic algorithm was implemented on Himmelblau's Function, Equation 1, as per the conditions discussed previously and summarized in Figure 2. A population of 50
members was chosen randomly and the algorithm implemented. A summary of the results is shown on the following page in Table 3 which shows the five best members in each generation. For the initial 50 members, Generation #0, Himmelblau's function, f(x1,x2), had relatively large values. Optimally, this value should be as close to zero as possible and x1 and x2 should be equal to 3 and 2 respectively. The initial generation contains one local minimum and the rest of the members are not near any minimum. The fact that one of the members is near a local minimum is just coincidence since these values were chosen randomly. As the generations evolved, the members approached the global minimum. Since an elitist parameter was not used in this algorithm, the initial local minimum is not
Table 3: Best Five Members for each Genetic Algorithm Generations
carried to future generations. After 10 generations, no change was seen in the majority of the members.
Orthogonal Arrays with GRG
For this method, orthogonal arrays were used to find a suitable starting point for the GRG algorithm of OPTDES. As previously mentioned, the initial levels of both x1 and x2 were -6, 0, and +6. These three levels are also referred to as levels 1, 2, and 3, respectively. Using equation 7 and the L9 array shown in Table 1, the results of the orthogonal arrays are shown below in Table 4. As per the termination criteria previously
Table 4: Orthogonal Array Results
mentioned, the above values took twelve iterations to obtain. It is obvious that the values of x1 and x2 (2.97,2.06) are very close to the global minimum of (3, 2).
These values were now used as the starting points for the GRG algorithm. Table
5 contains these results. Six tests, or sets of starting points, are shown in the table. Test 1
| Starting Points | Optimum Points | Optimum Points | ||||||||||
Table 5: Results of GRG Algorithm
uses the starting points from the orthogonal arrays. Implementing GRG with these points yield the global minimum. This makes sense since the starting points are in the vicinity of the global minimum. The remaining tests use various starting points from the various initial levels (-6, 0, +6). Of these tests, test 3 and 6 also yield the global minimum, however, they cause GRG to take more time to complete. For example, in test 1, only 18 analysis calls and 5 designs are needed. For test 6, the GRG algorithm requires 48 analysis calls and 8 designs. Thus, the use of orthogonal arrays is more efficient in two aspects: they guarantee convergence on a global minimum, and second, they require less CPU time for the GRG algorithm to terminate.
Comparison
Now that the results of all three methods have been presented, it is time to compare them with each other. Table 6, located on the following page, contains each method and its results. Also included in this table is the solution of orthogonal arrays alone. Both orthogonal arrays with GRG and simulated annealing yield the global minimum, however, as previously discussed, simulated annealing does not return the global minimum 100 percent of the time. Therefore, for this example the best method
* Only reached a percentage of the time
Table 6: Comparison of Methods
to use is orthogonal arrays as a starting point for the GRG algorithm.
The model problem presented in this paper illustrated three different
global optimization methods. Each method solved the simple problem
in a different manner. Using orthogonal arrays as a starting point
for the GRG algorithm did yield a global minimum. In fact, any
conventional optimization routine such as SQP or SLP could have
been used after the orthogonal arrays to produce a global minimum.
Also, since genetic algorithms did not yield an exact answer but
rather one in the neighborhood of the global minimum, conventional
methods can be used to descend to the global minimum from this
point. For this problem simulated annealing was not a reliable
method. It is not yet determined if there is an error in the FORTRAN
subroutine for simulated annealing. When solving an engineering
problem. It is not desirable to know that an answer has only a
70 percent chance of being correct. The results obtained in this
paper can be only stated confidently about the problem examined.
For different problems, different methods may work better. That
is why there are so many optimization techniques available. Future
work may include applying some of the techniques discussed to
solve real life engineering problems in which the answer is not
yet known.
Abadie, J. and Carpentier, J., "Generalization of the Wolfe
Reduced Gradient
Method to the case of Nonlinear Constraints," in Optimization, Gletcher, R.
(Ed.), Academic Press, New York, 1969.
Holland, J. H., "Adaptation in Natural and Artificial Systems," University of
Michigan Press, Ann Arbor, Michigan, 1975.
Kirkpatrick, S., Gelatt, C.D. and Vecchi, M.P., "Optimization by Simulated
Annealing," Science, Vol. 220, pp. 671-680, 1983.
Kota, S. and Chiou, S., "Use of Orthogonal Arrays in Mechanism Synthesis,"
Mechanical Machine Theory, Vol. 28, pp. 777-794, 1993.
Parkinson, A., Balling, R, and Free, J, "OPTDES," Brigham Young
University, 1992.
Press, W, Teukolsky, S, Vetterling, W, and Flannery, B, "Numerical Recipes
in FORTRAN," Cambride Press, Boston, 1994.
Recklaitis, "Engineering Optimization," John Wiley
and Sons, New York.
Rosen, J. B., "The Gradient Projection Method for Nonlinear Programming Part I,
Linear Constraints, " SIAM Journal of Applied Mathematics, Vol. 8, pp.
181-217, 1960.
Rosen, J. B., "The Gradient Projection Method for Nonlinear Programming Part
II, Nonlinear Constraints, " SIAM Journal of Applied Mathematics, Vol. 9,
pp. 514-532, 1961.
Taguchi, G., "Introduction to Quality Engineering," Asian Productivity Center
1989.
Vanderbilt, D. and Louie, S. G., "A Monte Carlo Simulated Annealing Approach
to Optimization over Continuous Variables," Journal of Computational Physics,
Vol. 56, pp. 259-271, 1984.
Wolfe, P., "Methods of Nonlinear Programming," on Recent Advances in
Mathematical Programming, Graves, R. L., and Wolfe, P. (Eds.), McGaw-Hill
New York, pp. 67-86, 1963.