Clustering partitions a dataset whose groups are unknown so that similar data points fall into the same group. It is especially useful for managing large datasets that have no predetermined classification, which makes it a crucial tool for data analysis. For a given clustering algorithm, the results of clustering, or object segmentation, depend on the combination of distance measure and linkage method used. It is therefore important to investigate which combinations of distance measure and linkage method outperform the others. A simulation study with 5000 iterations is conducted to evaluate the statistical performance of several distance measures and linkage methods. To measure the quality of the clustering solution, the average cophenetic correlation coefficient is computed in MATLAB for two distributions: the multivariate normal distribution and the gamma distribution. The simulated data are generated under p = 3 and p = 5, with sample sizes n = 30, 50, and 100. Overall, the results can serve as a guideline for researchers choosing appropriate distance and linkage methods.
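The simulation design described above can be sketched in a few lines. The sketch below uses Python with NumPy and SciPy rather than the MATLAB code of the study, and it is an illustration under stated assumptions, not the authors' implementation. The abstract does not list the distance measures, linkage methods, or distribution parameters, so the ones used here (Euclidean, city-block, and Chebyshev distances; single, complete, and average linkage; identity covariance; gamma shape 2 and scale 1) are placeholders, and p is assumed to be the number of variables. The function and sampler names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist


def mean_cophenetic(sampler, n, p, metric, method, n_iter=5000, seed=0):
    """Average cophenetic correlation coefficient over n_iter simulated datasets."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_iter):
        X = sampler(rng, n, p)         # one simulated n-by-p dataset
        d = pdist(X, metric=metric)    # condensed pairwise distance vector
        Z = linkage(d, method=method)  # agglomerative clustering on those distances
        c, _ = cophenet(Z, d)          # correlation between d and the cophenetic distances
        total += float(c)
    return total / n_iter


def normal_sampler(rng, n, p):
    # Assumption: zero mean and identity covariance (not given in the source).
    return rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)


def gamma_sampler(rng, n, p):
    # Assumption: independent gamma margins, shape 2 and scale 1 (not given in the source).
    return rng.gamma(shape=2.0, scale=1.0, size=(n, p))


if __name__ == "__main__":
    samplers = {"normal": normal_sampler, "gamma": gamma_sampler}
    metrics = ["euclidean", "cityblock", "chebyshev"]  # placeholder distance measures
    methods = ["single", "complete", "average"]        # placeholder linkage methods
    for name, sampler in samplers.items():
        for p in (3, 5):
            for n in (30, 50, 100):
                for metric in metrics:
                    for method in methods:
                        # n_iter is reduced here so the demo finishes quickly.
                        c = mean_cophenetic(sampler, n, p, metric, method, n_iter=100)
                        print(f"{name:6s} p={p} n={n:3d} {metric:9s} {method:8s} {c:.4f}")
```

A coefficient closer to 1 means the dendrogram preserves the original pairwise distances more faithfully, so distance and linkage combinations can be ranked by their average coefficient within each (distribution, p, n) setting. Centroid, median, and Ward linkage are left out of the placeholder list because they are only meaningful with Euclidean distances.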
27 August 2024
THE 6TH ISM INTERNATIONAL STATISTICAL CONFERENCE (ISM-VI) 2023
19–20 September 2023
Shah Alam, Malaysia
Research Article
Comparative study of selected clustering algorithm based on Monte Carlo simulation study
Shamshuritawati Sharif a)
School of Quantitative Science, Universiti Utara Malaysia, 06010 UUM Sintok, Kedah, Malaysia
Nurshaziana Mohamad-Shamsuri b)
School of Quantitative Science, Universiti Utara Malaysia, 06010 UUM Sintok, Kedah, Malaysia
a) Corresponding author: [email protected]
AIP Conf. Proc. 3123, 060003 (2024)
Citation
Shamshuritawati Sharif, Nurshaziana Mohamad-Shamsuri; Comparative study of selected clustering algorithm based on Monte Carlo simulation study. AIP Conf. Proc. 27 August 2024; 3123 (1): 060003. https://doi.org/10.1063/5.0224175