Clustering partitions a dataset into groups of similar data points when the groups are not known in advance. It is particularly useful for managing large datasets that have no predetermined classification, which makes it a crucial tool for data analysis. For a given clustering algorithm, the results of clustering or object segmentation depend on the combination of distance measure and linkage method used. It is therefore important to investigate which combinations of distance measure and linkage method outperform the others. A simulation study with 5000 iterations is conducted to evaluate the statistical performance of several distance measures and linkage methods. To measure the quality of the clustering solution, the average cophenetic correlation coefficient is computed in MATLAB for data simulated from two distributions: the multivariate normal distribution and the gamma distribution. The simulated data are generated under p = 3 and p = 5, with sample sizes n = 30, 50, and 100. Overall, the results can serve as a guideline for researchers choosing appropriate distance and linkage methods.
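The kind of simulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. It assumes agglomerative clustering with three common linkages (single, complete, average) and two illustrative distance measures (Euclidean and city-block), since the abstract does not list the ones studied. It also assumes independent standard normal variables as a simple special case of the multivariate normal distribution, and gamma variables with shape 2 and scale 1. The function names (`cophenetic_distances`, `average_ccc`, and so on) are hypothetical, and the number of replicates is kept far below 5000 so the example runs quickly.

```python
import math
import random
from itertools import combinations
from statistics import mean

# Illustrative distance measures (assumed; the abstract does not name them).
DISTANCES = {
    "euclidean": lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))),
    "cityblock": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
}


def cophenetic_distances(points, linkage="average", metric="euclidean"):
    """Run naive agglomerative clustering and return two aligned lists over
    all pairs i < j: the original distances and the cophenetic distances
    (the merge height at which i and j first share a cluster)."""
    dist = DISTANCES[metric]
    n = len(points)
    d = {(i, j): dist(points[i], points[j]) for i, j in combinations(range(n), 2)}
    clusters = {i: [i] for i in range(n)}
    coph = {}

    def cluster_dist(a, b):
        pair = [d[min(i, j), max(i, j)] for i in clusters[a] for j in clusters[b]]
        if linkage == "single":
            return min(pair)
        if linkage == "complete":
            return max(pair)
        return sum(pair) / len(pair)  # average linkage

    while len(clusters) > 1:
        # Merge the two closest clusters under the chosen linkage.
        a, b = min(combinations(clusters, 2), key=lambda ab: cluster_dist(*ab))
        height = cluster_dist(a, b)
        for i in clusters[a]:
            for j in clusters[b]:
                coph[min(i, j), max(i, j)] = height
        clusters[a] = clusters[a] + clusters.pop(b)

    keys = sorted(d)
    return [d[k] for k in keys], [coph[k] for k in keys]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(dist_name, n, p, rng):
    """Draw an n-by-p sample (assumed parameters, see the lead-in)."""
    if dist_name == "normal":
        return [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    return [[rng.gammavariate(2.0, 1.0) for _ in range(p)] for _ in range(n)]


def average_ccc(dist_name, linkage, metric="euclidean", n=30, p=3, reps=10, seed=0):
    """Average cophenetic correlation coefficient over `reps` simulated samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        orig, coph = cophenetic_distances(simulate(dist_name, n, p, rng), linkage, metric)
        values.append(pearson(orig, coph))
    return mean(values)


if __name__ == "__main__":
    for dist_name in ("normal", "gamma"):
        for metric in DISTANCES:
            for linkage in ("single", "complete", "average"):
                print(dist_name, metric, linkage,
                      round(average_ccc(dist_name, linkage, metric), 3))
```

A cophenetic correlation coefficient closer to 1 indicates that the dendrogram preserves the original pairwise distances more faithfully, which is why averaging it over many simulated samples gives a basis for comparing distance and linkage combinations. Raising `reps`, and varying `n` and `p` over the values named in the abstract, moves the sketch closer to the described design.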
