Monday, May 4, 2020 | 4:00pm
Title: Robust and Heterogeneous Clustering Methods with Application to Irish Flower Species
Abstract
K-means clustering is one of the most popular clustering methods, but it is often heavily influenced by outliers in the data. K-medians clustering is more robust to outliers. However, k-medians clustering has its limitations when clusters contain scattered or strongly deviating data points. Heterogeneous clustering is proposed to analyze data with different within-cluster variances. The presentation will focus on comparing these three clustering methods, with application to simulated data as well as real data on Irish flower species. A regression method, the Calinski-Harabasz index, and the Hartigan index are used as cluster validity criteria to choose the true number of clusters in multidimensional data. Sensitivity, specificity, and confusion matrices are used to assess the accuracy of the clustering methods.
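As a small illustration of the outlier sensitivity mentioned in the abstract, the sketch below compares the center-update step of k-means (coordinate-wise mean) with that of k-medians (coordinate-wise median) on one cluster. The data points and the helper name `center` are made up for illustration and are not taken from the presentation.

```python
from statistics import mean, median

def center(points, reducer):
    """Coordinate-wise center of a list of equal-length tuples (hypothetical helper)."""
    return tuple(reducer(coord) for coord in zip(*points))

# Four points near (1, 1) plus one outlier at (10, 10); illustrative data only.
cluster = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (10.0, 10.0)]

mean_center = center(cluster, mean)      # k-means style update
median_center = center(cluster, median)  # k-medians style update

print("mean center:  ", mean_center)
print("median center:", median_center)
```

The mean-based center is pulled toward the outlier, while the median-based center stays with the bulk of the points.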
*Please contact mams-staff@case.edu for Zoom presentation access information.