An Analysis of Particle Swarm Optimization with Data Clustering Technique for Optimization in Data Mining

Anusha Chaudhary, IMSEC

Particle Swarm Optimization (PSO), Fuzzy C-Means Clustering (FCM), Data Mining, Data Clustering

Data clustering is an approach for automatically finding classes, concepts, or groups of patterns. It also aims to represent large datasets by a small number of prototypes or clusters. It simplifies data modelling and plays an important role in knowledge discovery and data mining. Data mining tasks require fast and accurate partitioning of huge datasets, which may come with a variety of attributes or features, and this imposes heavy computational requirements on clustering techniques. Swarm Intelligence (SI) has emerged as a paradigm that meets these requirements and has been applied successfully to a number of real-world clustering problems. This paper looks into the use of Particle Swarm Optimization (PSO) for cluster analysis. Fuzzy C-Means (FCM) clustering contributes enhanced performance, maintains more diversity in the swarm, and makes the particles robust enough to track a changing environment. Identifying structure in large-scale data has become a very important problem in data mining. Cluster analysis, which identifies groups of similar data items in large datasets, is among the recent beneficiaries of this approach. The increasing complexity and volume of data have seen data clustering emerge as a popular focus for optimization-based techniques, and different optimization techniques have been applied to search for optimal solutions to clustering problems. This paper also proposes two new approaches using PSO to cluster data, and shows how PSO can be used to find the centroids of a user-specified number of clusters.
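The idea of using PSO to find the centroids of a user-specified number of clusters can be illustrated with a minimal sketch. It is not the paper's own implementation; several choices here are assumptions made for illustration only. Each particle encodes k candidate centroids, fitness is the mean distance from each point to its nearest centroid (a quantization error), and the inertia and acceleration values (w = 0.72, c1 = c2 = 1.49) are commonly used defaults rather than values taken from the paper. The function names `quantization_error` and `pso_cluster` are hypothetical.

```python
import random


def quantization_error(centroids, data):
    """Mean Euclidean distance from each point to its nearest centroid."""
    total = 0.0
    for point in data:
        total += min(
            sum((p - c) ** 2 for p, c in zip(point, centroid)) ** 0.5
            for centroid in centroids
        )
    return total / len(data)


def pso_cluster(data, k, n_particles=20, n_iters=100,
                w=0.72, c1=1.49, c2=1.49, seed=0):
    """Global-best PSO; each particle is a list of k candidate centroids.

    Assumed (not from the paper): fitness function, parameter values,
    and initialisation from randomly sampled data points.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Start every particle at k randomly chosen data points, with zero velocity.
    positions = [[list(p) for p in rng.sample(data, k)]
                 for _ in range(n_particles)]
    velocities = [[[0.0] * dim for _ in range(k)]
                  for _ in range(n_particles)]
    # Personal bests and the swarm-wide (global) best.
    pbest = [[c[:] for c in pos] for pos in positions]
    pbest_fit = [quantization_error(pos, data) for pos in positions]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + cognitive + social terms.
                    velocities[i][j][d] = (
                        w * velocities[i][j][d]
                        + c1 * r1 * (pbest[i][j][d] - positions[i][j][d])
                        + c2 * r2 * (gbest[j][d] - positions[i][j][d])
                    )
                    positions[i][j][d] += velocities[i][j][d]
            fit = quantization_error(positions[i], data)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in positions[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in positions[i]], fit
    return gbest, gbest_fit


# Usage on synthetic data: two well-separated 2-D blobs (made up for this example).
data_rng = random.Random(1)
data = ([(data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)) for _ in range(30)]
        + [(data_rng.gauss(5, 0.3), data_rng.gauss(5, 0.3)) for _ in range(30)])
centroids, err = pso_cluster(data, k=2)
print(sorted(centroids), err)
```

Because the global best is only replaced when a particle improves on it, the returned error never exceeds that of the best initial particle. A fuzzy variant would replace the hard nearest-centroid assignment in the fitness function with FCM membership weights; that extension is not shown here.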
Paper ID: GRDJEV02I050109
Published in: Volume : 2, Issue : 5
Publication Date: 2017-05-01
Page(s): 141 - 144