Comparison of Different Neighbourhood Sizes in Simulated Annealing

Xin Yao
Department of Computer Science
University College, University of New South Wales
Australian Defence Force Academy
Canberra, ACT, Australia 2600

Published in Proc. of Fourth Australian Conf. on Neural Networks, ed. P. Leong and M. Jabri, pp. 216-219, 1993, Melbourne, Australia.

Abstract

Neighbourhood structure and size are important parameters in local search algorithms. This is also true for generalised local search algorithms like simulated annealing. It has been shown that the performance of simulated annealing can be improved by adopting a suitable neighbourhood size. However, previous studies usually assumed that the neighbourhood size was fixed during search. This paper presents a simulated annealing algorithm with a dynamic neighbourhood size which depends on the current temperature value during search. A method of dynamically deciding the neighbourhood size by approximating a continuous probability distribution is given. Four continuous probability distributions are used in our experiments to generate neighbourhood sizes dynamically, and the results are compared.

1 Introduction

Simulated Annealing (SA) algorithms can find very good near-optimal solutions to a wide range of hard problems, but at a high computational cost. Various methods have been proposed to speed up its convergence, which can roughly be divided into three categories: (1) optimising functions and parameters in SA [1]; (2) combining SA with other search algorithms [2, 3]; and (3) parallelising SA [4]. This paper falls into the first category.

Section 2 of this paper describes a general SA algorithm [5, 6] which unifies different variants of the classical one [7]. Section 3 presents SA with a dynamic neighbourhood size and its application in combinatorial optimisation. A method of generating dynamic neighbourhood sizes by approximating continuous probability distributions is given in this section. Section 4 compares the experimental results of using different continuous probability distributions to generate dynamic neighbourhood sizes. Finally, Section 5 concludes with some remarks and directions of future research.

2 General Simulated Annealing

Although SA can be used in both continuous and discrete cases, this paper only considers combinatorial optimisation by SA unless otherwise indicated explicitly. A combinatorial optimisation problem can be informally described as finding an optimal configuration $X$ from a finite or countably infinite configuration space $S$. Each configuration $X \in S$ can be represented by its $n$ ($> 0$) components, i.e., $X = (x_1, x_2, \ldots, x_n)$, where $x_i \in X_i$, $i = 1, 2, \ldots, n$. An excellent discussion of combinatorial optimisation and its complexity can be found in Garey and Johnson's book [8].

A general model of SA, which is applicable to both continuous and discrete problems, can be described by Figure 1, where function generate($X$, $T_n$) is decided by the generation probability $g_{XY}(T_n)$, which is the probability of generating configuration $Y$ from configuration $X$ at temperature $T_n$, function accept($X$, $Y$, $T_n$) is decided by the acceptance probability $a_{XY}(T_n)$, which is the probability of accepting configuration $Y$ after it has been generated at temperature $T_n$, and function update($T_n$) decides the rate of the temperature decrease. These three functions determine the convergence of general SA [5, 6, 9], but parameters in general SA, such as the initial temperature, initial configuration, inner-loop stop criterion, and outer-loop stop criterion, can have significant impact on its finite-time behaviour.

    generate initial configuration X at random;
    generate initial temperature T_0;
    REPEAT
        REPEAT
            Y = generate(X, T_n);
            IF accept(X, Y, T_n) THEN X = Y;
        UNTIL inner-loop stop criterion satisfied;
        T_{n+1} = update(T_n);
        n = n + 1;
    UNTIL outer-loop stop criterion satisfied

    Figure 1: General simulated annealing.
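The loop of Figure 1 can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in the paper: the two stop criteria are assumed to be fixed iteration counts, the best configuration seen so far is tracked as a convenience that Figure 1 does not include, and the names `general_sa`, `metropolis` and `flip_one` are hypothetical. The three functions generate, accept and update are passed in as callables so that each can be varied independently.

```python
import math
import random


def general_sa(initial, cost, generate, accept, update, t0,
               inner_steps=100, outer_steps=50, rng=None):
    """Generic SA loop following Figure 1.

    generate(x, t, rng)            -> candidate configuration Y
    accept(cost_x, cost_y, t, rng) -> True if Y replaces X
    update(t, n)                   -> next temperature T_{n+1}
    Both stop criteria are fixed iteration counts (an assumption).
    """
    rng = rng or random.Random(0)
    x, t = initial, t0
    best, best_cost = x, cost(x)            # extra bookkeeping, not in Figure 1
    for n in range(outer_steps):            # outer loop over temperature levels
        for _ in range(inner_steps):        # inner loop at a fixed temperature
            y = generate(x, t, rng)
            if accept(cost(x), cost(y), t, rng):
                x = y
                if cost(x) < best_cost:
                    best, best_cost = x, cost(x)
        t = update(t, n)
    return best, best_cost


def metropolis(cost_x, cost_y, t, rng):
    """Accept improvements always, deteriorations with probability exp(-delta/t)."""
    if cost_y <= cost_x:
        return True
    return rng.random() < math.exp(-(cost_y - cost_x) / t)


def flip_one(x, t, rng):
    """Toy fixed-size neighbourhood: flip exactly one bit of a 0/1 tuple."""
    i = rng.randrange(len(x))
    return x[:i] + (1 - x[i],) + x[i + 1:]


# Toy usage: minimise the number of ones in a bit string,
# with geometric cooling chosen purely for illustration.
start = (1,) * 12
best, best_cost = general_sa(start, sum, flip_one, metropolis,
                             lambda t, n: 0.95 * t, t0=2.0)
```

Keeping generate separate from accept and update makes it straightforward to swap a fixed-size neighbourhood such as `flip_one` for a temperature-dependent one without touching the rest of the loop.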

That is, the computation time in practice depends on the three functions as well as these parameters. Most research on SA has concentrated on the update and accept functions and various algorithmic parameters; only limited attention has been paid to the generate function. However, the generate function decides an important part, the neighbourhood structure and size, of a local search algorithm regardless of whether it is a deterministic one or a stochastic one like SA. The neighbourhood $N_X$ of a configuration $X$ is defined by

$$N_X = \{\, Y \mid Y \in S,\ g_{XY}(T_n) > 0 \,\} \tag{1}$$

where $X \notin N_X$, and $X \in N_Y$ iff $Y \in N_X$.

Previous research on SA normally assumed that $g_{XY}(T_n) = 1/|N_X|$, where $|N_X|$ is the size of $N_X$, i.e., the number of configurations in $N_X$, and is the same for all $X$ in $S$. Moreover, $|N_X|$ is fixed during search once defined for a problem. Goldstein and Waterman [10] and Cheh et al. [11] carried out some experiments on comparing SA with different neighbourhood sizes, but the sizes are still fixed once decided.

A limitation of SA with a fixed neighbourhood size is its inability to perform search at different scales in different stages of search. As indicated in our previous study [5], SA can be viewed as an attempt to combine exploration of a space and exploitation of a sub-space into the same algorithm, i.e., coarse-grained search in the high temperature stages explores the configuration space and tries to locate promising regions, while fine-grained search in the low temperature stages exploits the promising regions and tries to find a good near-optimal configuration. The fixed-size neighbourhood clearly does not conform with the basic search strategy behind SA. It is appealing to have a neighbourhood size which can adjust itself in the different search stages. Fast SA [12] can be regarded as an example of SA with a dynamic neighbourhood size, but it is only used in the continuous case. The application of dynamic neighbourhood size in combinatorial optimisation, to our best knowledge, has not been well-studied.

3 Dynamic Neighbourhood Size in Simulated Annealing

This section gives a method of dynamically deciding the neighbourhood size in SA according to the temperature parameter [5, 6]. In the high temperature stages, SA algorithms have high acceptance probability for both good and bad moves, i.e., exploration plays a major role in search, and thus a large neighbourhood size is used to enhance such exploration. In the low temperature stages, exploitation plays a major role in search, and thus a smaller neighbourhood is more suitable.

In the following discussion, we say that the Hamming distance between two configurations $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$ is $\delta$ if there are exactly $\delta$ different elements between them. Let $f(x)$ be the continuous density function which is used to generate the Hamming distance between the current configuration and the next one. Denote the set of configurations which are $\delta$ distant from the current configuration $X$ as $S_X(\delta)$,

$$S_X(\delta) = \{\, Y \mid Y \in S,\ d_{XY} = \delta \,\} \tag{2}$$

The probability of generating configuration $Y$, which is $d_{XY}$ distant from configuration $X$, is defined as

$$g_{XY}(T_n) = \frac{1}{|S_X(d_{XY})|}\,\mathrm{Prob}\!\left(d_{XY} - \tfrac{1}{2} < x \le d_{XY} + \tfrac{1}{2}\right) = \frac{1}{|S_X(d_{XY})|}\int_{d_{XY}-\frac{1}{2}}^{d_{XY}+\frac{1}{2}} f(x)\,dx \approx \frac{f(d_{XY})}{|S_X(d_{XY})|} \tag{3}$$

Suppose the maximum Hamming distance allowed for one move is $d_{\max}$ (it is set to $n$, the number of elements in a configuration, in our experiments), then the normalised generation function is

$$g_{XY}(T_n) = \frac{f(d_{XY}) / |S_X(d_{XY})|}{F_X(T_n)} \tag{4}$$

where

$$F_X(T_n) = \sum_{d_{XZ}=1}^{d_{\max}} \ \sum_{Z \in S_X(d_{XZ})} \frac{f(d_{XZ})}{|S_X(d_{XZ})|} \tag{5}$$

Theorem 3.1 ([5]) Suppose the acceptance function in an SA algorithm is

$$a_{XY}(T_n) = \min\left\{1,\ \exp\!\left(-\frac{c_Y - c_X}{T_n}\right)\right\} \tag{6}$$

and the generation function is (4), where $f(x)$ in (4) can be any one of the following,

(a) the Normal function $N(0, T_n)$, i.e.,
$$f(d_{XY}) = \frac{1}{\sqrt{2\pi T_n}} \exp\!\left(-\frac{d_{XY}^2}{2 T_n}\right)$$

(b) the exponential function $E(T_n)$, i.e.,
$$f(d_{XY}) = \frac{1}{T_n} \exp\!\left(-\frac{d_{XY}}{T_n}\right)$$

(c) the Cauchy function $C(T_n)$, i.e.,
$$f(d_{XY}) = \frac{1}{\pi}\,\frac{T_n}{d_{XY}^2 + T_n^2}$$

(d) the stable function with index $1/2$ [13], i.e.,
$$f(d_{XY}) = \frac{1}{\sqrt{2\pi d_{XY}^3}} \exp\!\left(-\frac{1}{2 d_{XY}}\right)$$

Then the SA algorithm converges to global minima if the cooling rate is

$$T_n = \frac{\lambda}{\ln(n + n_0)}, \quad n = 1, 2, \ldots \tag{7}$$

where $\lambda$ and $n_0$ are positive constants.

4 Experimental Results

We adopt the Traveling Salesman Problem (TSP) as a benchmark to evaluate our SA algorithms because of its clear mathematical definition and high computational complexity. Goldstein and Waterman [10] and Cheh et al. [11] have experimented with TSPs using different but fixed neighbourhood sizes and found that a small neighbourhood size is better than a large neighbourhood size. That is, the SA algorithm performs the best when $d_{XY} = 1$. TSPs with 40 cities are used in our experiment and are generated at random. The same initial configuration, inner-loop stop criterion, outer-loop stop criterion, and temperature decreasing rate are used in our experiments in order to evaluate the impact of the neighbourhood size on the performance of SA algorithms. Our experiments, albeit preliminary, have demonstrated that SA with a dynamic neighbourhood size outperforms SA with a fixed neighbourhood size. Table 1 gives the results of four typical runs of two kinds of SA algorithms. Table 2 gives the results of using different distributions to generate neighbourhood sizes.

    problem instance    1        2        3        4
    initial value       15080    12260    13760    15820
    NorSA               2540     2140     2560     2300
    CSA                 3120     2520     2880     2460

    Table 1: Comparison of SA with a fixed neighbourhood size (CSA) and SA with a dynamic neighbourhood size (NorSA). Normal distribution is used to generate the neighbourhood size.

    problem instance    initial value    CauSA    NorSA    ExpSA    StableSA
    1                   17800            2480     2540     2640     3760
    2                   15500            3000     3340     3180     4420
    3                   16600            3300     2920     3460     4500
    4                   14780            3000     2980     3280     3760

    Table 2: SA with a dynamic neighbourhood size which is generated by the Cauchy function (CauSA), Normal function (NorSA), Exponential function (ExpSA), and Stable function with index 1/2 (StableSA).

5 Concluding Remarks

Neighbourhood size is an important parameter in local search algorithms, but only a fixed size was adopted in previous application of SA to combinatorial optimisation problems. This paper proposes a method of using a dynamic neighbourhood size in SA based on our analysis of SA search. Preliminary experiments have demonstrated the advantage of a dynamic neighbourhood size in SA.

The idea of a dynamic neighbourhood size could also be introduced into other local search algorithms. It is, in fact, related to a more profound research issue in search theory, i.e., the issue of exploration versus exploitation or global search versus local search. Although local search based on some heuristics can be quite efficient under many circumstances, the problem of local optima is very hard to deal with. Some kind of global search has to be used if a global optimum or near optimum is required. However, the computational cost of global search is often prohibitively high for most real-world applications due to the vast search space. It is beneficial to combine global and local search together. An open question here is how to decide when global or local search should be performed. It is also difficult to draw the line strictly between local and global search in practice. Dynamic neighbourhood size offers a way to deal with the problem by transferring from global search to local search smoothly based on a control parameter, temperature in SA. However, more work has to be done on deciding which kind of generation function is most suitable for an application, i.e., what is the optimal rate of reducing the neighbourhood size.

As indicated before, Fast SA [12] offers a big improvement over classical SA [7] due to the adoption of the Cauchy distribution. An interesting topic is to investigate whether the discrete version of Fast SA can offer similar improvement over classical SA. Our preliminary experiments seem to give a negative answer.

Acknowledgement. The author is grateful to Drs. B. Marksjo and R. Sharpe for their support of his work while he was with CSIRO Division of Building, Construction and Engineering.

References

[1] P. J. M. van Laarhoven and E. H. L. Aarts, Simulated Annealing: Theory and Applications, D. Reidel Publishing Co., 1987.
[2] D. H. Ackley, A Connectionist Machine for Genetic Hillclimbing, Kluwer Academic Publishers, Boston, 1987.
[3] X. Yao, Optimization by genetic annealing, In M. Jabri, editor, Proc. of ACNN'91, pages 94-97, Sydney, 1991.
[4] D. R. Greening, Parallel simulated annealing techniques, Physica D, 42:293-306, 1990.
[5] X. Yao, Simulated annealing with extended neighbourhood, International J. of Computer Math., 40:169-189, 1991.
[6] X. Yao and G.-J. Li, General simulated annealing, J. of Computer Sci. & Tech., 6:329-338, 1991.
[7] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Optimization by simulated annealing, Science, 220:671-680, 1983.
[8] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman Co., San Francisco, 1979.
[9] S. Anily and A. Federgruen, Ergodicity in parametric nonstationary Markov chains: an application to annealing methods, Oper. Res., 35:867-874, 1987.
[10] L. Goldstein and M. Waterman, Neighborhood size in the simulated annealing algorithm, Amer. J. of Math. and Management Sci., 8:409-423, 1988.
[11] K. M. Cheh, J. B. Goldberg, and R. G. Askin, A note on the effect of neighborhood structure in simulated annealing algorithm, Computers and Oper. Res., 18:537-547, 1991.
[12] H. H. Szu and R. L. Hartley, Nonconvex optimization by fast simulated annealing, Proc. of IEEE, 75:1538-1540, 1987.
[13] W. Feller, An Introduction to Probability Theory and Its Applications, volume 2, John Wiley & Sons, Inc., 2nd edition, 1971.
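As a supplementary illustration of the generation step in Section 3, the sketch below discretises the four density functions of Theorem 3.1 into probabilities over Hamming distances 1 to d_max and then draws a new configuration at the chosen distance. It rests on several assumptions that are not taken from the paper: configurations are tuples over equal-sized finite component sets (so that changing d randomly chosen components is uniform over the configurations at distance d), all function and parameter names are hypothetical, `lam` stands for the positive constant in the cooling rate (7), and a fallback to distance 1 is added for the case where every weight underflows to zero at very low temperature. Summing (4) over all configurations at distance d gives a probability proportional to f(d), which is what `distance_probabilities` computes.

```python
import math
import random


def normal_density(d, t):
    # Form (a): Normal density with variance t
    return math.exp(-d * d / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def exponential_density(d, t):
    # Form (b): exponential density with mean t
    return math.exp(-d / t) / t


def cauchy_density(d, t):
    # Form (c): Cauchy density with scale t
    return t / (math.pi * (d * d + t * t))


def stable_half_density(d, t):
    # Form (d): stable density with index 1/2; t is unused in this form
    return math.exp(-1.0 / (2.0 * d)) / math.sqrt(2.0 * math.pi * d ** 3)


def distance_probabilities(f, t, d_max):
    """Probability of each Hamming distance d = 1..d_max: f(d) / sum_k f(k)."""
    weights = [f(d, t) for d in range(1, d_max + 1)]
    total = sum(weights)
    if total == 0.0:
        # Assumed fallback: all weights underflowed, so use the smallest move.
        return [1.0] + [0.0] * (d_max - 1)
    return [w / total for w in weights]


def generate(x, t, domains, f, rng):
    """Draw a distance d, then change exactly d components of x."""
    n = len(x)
    probs = distance_probabilities(f, t, n)          # d_max = n
    d = rng.choices(range(1, n + 1), weights=probs)[0]
    y = list(x)
    for i in rng.sample(range(n), d):                # d distinct positions
        y[i] = rng.choice([v for v in domains[i] if v != x[i]])
    return tuple(y)


def log_cooling(n, lam, n0):
    """Logarithmic cooling rate of (7) for n = 1, 2, ..."""
    return lam / math.log(n + n0)
```

With the Normal density, for example, a high temperature spreads the probability mass over many distances while a low temperature concentrates it on distance 1, which is the smooth transition from large to small neighbourhoods described in Section 3.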
