Optimized Centroid-Based Clustering of Dense Nearly-square Point Clouds by the Hexagonal Pattern

Authors

V. Romanuke, S. Merinova, H. Yehoshyna

DOI:

https://doi.org/10.2478/ecce-2023-0005

Keywords:

Centroid-based clustering, hexagonal pattern, initialization, square cloud

Abstract

We suggest an approach to optimizing centroid-based clustering of flat (planar) objects, which is practically important for efficiently solving metric facility location problems. In such problems, the task is to find warehouse locations that optimally service a given set of consumers. An example is assigning mobile devices to base stations of a wireless communication network. The approach uses a hexagonal pattern to partition flat nodes into clusters faster than the k-means algorithm and its modifications do. First, a hexagonal cell lattice is applied to the nodes to approximately determine the cluster centroids. These centroids then serve as the initial centroids for the k-means algorithm. The method is efficient for centroid-based clustering of dense nearly-square point clouds of 100,000 points or more, provided that at least 6 lattice cells are used along an axis. Compared to k-means, the method is at least 10 % faster and about 0.01 to 0.07 % more accurate in regular Euclidean distances. In squared Euclidean distances, the accuracy gain is 0.14 to 0.21 %. Applying a hexagonal cell lattice determines an upper bound on the clustering quality gap.
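
The two-step idea in the abstract (lattice-based initial centroids, then k-means) can be illustrated with a minimal sketch. It is not the authors' implementation: the lattice geometry (offset rows spaced by √3/2 of the in-row spacing), the function names `hex_lattice_centers`, `nearest`, `lloyd`, and `cost`, the iteration cap, and the small uniform test cloud are all assumptions made for illustration. The point cloud is far smaller than the 100,000 points the abstract refers to, so the sketch shows the mechanics only and says nothing about the reported speed or accuracy gains.

```python
import math
import random


def hex_lattice_centers(xmin, xmax, ymin, ymax, cells_per_axis):
    """Cell centers of an offset-row hexagonal lattice over a bounding box.

    Assumption: this geometry is illustrative and may differ from the paper's.
    """
    dx = (xmax - xmin) / cells_per_axis  # spacing between centers within a row
    dy = dx * math.sqrt(3) / 2           # spacing between rows of a hexagonal lattice
    centers = []
    row = 0
    while ymin + dy / 2 + row * dy < ymax:
        y = ymin + dy / 2 + row * dy
        # Odd rows are shifted by dx / 2 relative to even rows.
        shift = dx / 4 if row % 2 == 0 else 3 * dx / 4
        for k in range(cells_per_axis):
            centers.append((xmin + shift + k * dx, y))
        row += 1
    return centers


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    px, py = point
    return min(
        range(len(centroids)),
        key=lambda i: (px - centroids[i][0]) ** 2 + (py - centroids[i][1]) ** 2,
    )


def lloyd(points, centroids, max_iter=25):
    """Plain k-means (Lloyd) iterations started from the given centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        sums = [[0.0, 0.0, 0] for _ in centroids]
        for p in points:
            s = sums[nearest(p, centroids)]
            s[0] += p[0]
            s[1] += p[1]
            s[2] += 1
        # An empty cluster keeps its previous centroid.
        updated = [
            (s[0] / s[2], s[1] / s[2]) if s[2] else c
            for s, c in zip(sums, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def cost(points, centroids, squared=False):
    """Sum of (optionally squared) Euclidean distances to the nearest centroid."""
    total = 0.0
    for p in points:
        c = centroids[nearest(p, centroids)]
        d2 = (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2
        total += d2 if squared else math.sqrt(d2)
    return total


# Hypothetical demo data: a small uniform cloud in the unit square.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(600)]
init = hex_lattice_centers(0.0, 1.0, 0.0, 1.0, cells_per_axis=6)
final = lloyd(points, init)
print(len(init), cost(points, init, squared=True), cost(points, final, squared=True))
```

The `squared` flag mirrors the two quality measures the abstract distinguishes (regular and squared Euclidean distances). For clouds of the size the abstract discusses, a vectorized k-means implementation that accepts user-supplied initial centroids would be the practical choice; the pure-Python loops here are kept only for readability.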

Published

01.06.2023

How to Cite

Romanuke, V., Merinova, S., & Yehoshyna, H. (2023). Optimized Centroid-Based Clustering of Dense Nearly-square Point Clouds by the Hexagonal Pattern. Electrical, Control and Communication Engineering, 19(1), 29–39. https://doi.org/10.2478/ecce-2023-0005