In this paper, a new method is proposed to overcome the problem of local optima traps in a class of evolutionary algorithms, called estimation of distribution algorithms (EDAs), in real-valued function optimization. The Duple-EDA framework is proposed, in which not only the current best solutions but also the search history are modeled, so that long-term feedback can be taken into account. Within this framework, Sample Density Balancing (SDB) is proposed to alleviate the drift phenomenon in EDAs. A selection scheme based on Pareto ranking over both fitness and historical sample density is adopted; it prevents the algorithm from repeatedly sampling in a small region and directs it toward potentially optimal regions, thereby helping it avoid becoming trapped in local optima. An MBOA (mixed Bayesian optimization algorithm) version of the framework is implemented and tested on several benchmark problems. Experimental results show that the proposed method outperforms a standard niching method on these benchmark problems.
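The selection idea described above, ranking candidates by Pareto dominance over fitness and historical sample density, can be illustrated with a minimal sketch. This is not the Duple-EDA or MBOA implementation: the function names (`historical_density`, `pareto_ranks`, `select`), the radius-count density proxy, and the assumption that both objectives are minimized are all hypothetical choices made for illustration.

```python
import math


def historical_density(point, history, radius=1.0):
    """Crude density proxy (assumption): count archived samples within `radius`."""
    return sum(1 for h in history if math.dist(point, h) <= radius)


def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_ranks(objectives):
    """Assign rank 0 to the non-dominated front, rank 1 to the next front, and so on."""
    ranks = [None] * len(objectives)
    remaining = set(range(len(objectives)))
    rank = 0
    while remaining:
        # A point belongs to the current front if nothing still remaining dominates it.
        front = {
            i for i in remaining
            if not any(dominates(objectives[j], objectives[i])
                       for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


def select(population, fitness, history, k, radius=1.0):
    """Keep the k candidates with the best Pareto rank; ties are broken by fitness."""
    objs = [(fitness(p), historical_density(p, history, radius)) for p in population]
    ranks = pareto_ranks(objs)
    order = sorted(range(len(population)), key=lambda i: (ranks[i], objs[i][0]))
    return [population[i] for i in order[:k]]


def sphere(p):
    """Toy fitness function to minimize (assumption, not one of the paper's benchmarks)."""
    return sum(x * x for x in p)


population = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0)]
history = [(0.0, 0.0), (0.05, 0.0), (0.1, 0.1)]  # region already sampled heavily
selected = select(population, sphere, history, k=2, radius=0.5)
```

In this toy run the candidate at `(3.0, 3.0)` has poor fitness but lies in an unvisited region, so it shares the first front with `(0.0, 0.0)` and is selected ahead of the near-duplicate `(0.1, 0.0)`. Fitness-only truncation selection would have kept the two clustered points instead.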
This paper presents an empirical study of unsupervised sentiment classification of Chinese reviews. The focus is on exploring ways to improve the performance of unsupervised sentiment classification using the limited sentiment resources that exist for Chinese. On the one hand, all available Chinese sentiment lexicons, individual and combined, are evaluated under our proposed framework. On the other hand, domain-dependent sentiment noise words are identified and removed using unlabeled data to improve classification performance. To the best of our knowledge, this is the first such attempt. Experiments conducted on three open datasets in two domains show that the proposed algorithm for removing sentiment noise words improves classification performance significantly.
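To make the idea of lexicon-based classification with noise-word removal concrete, here is a minimal sketch. The abstract does not state how noise words are identified, so the document-frequency threshold used below is a stand-in assumption, not the paper's algorithm. The function names, the toy English lexicon, and the `max_doc_ratio` parameter are likewise hypothetical.

```python
from collections import Counter


def find_noise_words(docs, lexicon, max_doc_ratio=0.5):
    """Flag lexicon words occurring in more than `max_doc_ratio` of unlabeled docs.

    ASSUMPTION: a sentiment word that appears in most reviews of a domain,
    regardless of polarity, carries little sentiment signal in that domain.
    """
    doc_freq = Counter()
    for tokens in docs:
        # Count each lexicon word at most once per document.
        doc_freq.update({t for t in tokens if t in lexicon})
    return {w for w, c in doc_freq.items() if c / len(docs) > max_doc_ratio}


def classify(tokens, lexicon, noise=frozenset()):
    """Sum lexicon polarities over tokens, skipping any flagged noise words."""
    score = sum(lexicon.get(t, 0) for t in tokens if t not in noise)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy lexicon and pre-tokenized unlabeled reviews (hypothetical data).
lexicon = {"good": 1, "clean": 1, "bad": -1, "cheap": -1}
unlabeled = [
    ["cheap", "clean", "good"],
    ["cheap", "bad"],
    ["cheap", "room"],
    ["good", "room"],
]

noise = find_noise_words(unlabeled, lexicon)
review = ["cheap", "good", "room"]
before = classify(review, lexicon)
after = classify(review, lexicon, noise)
```

In this toy budget-hotel setting, "cheap" appears in three of the four unlabeled reviews and is flagged as noise. Without removal, its negative lexicon polarity cancels "good" and the review scores as neutral; with removal, the review is classified as positive. Real Chinese reviews would first need word segmentation, which is outside the scope of this sketch.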