silhouette_score(X, labels, *, metric='euclidean', sample_size=None, random_state=None, **kwds)

Compute the mean Silhouette Coefficient of all samples.

The Silhouette Coefficient is calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample. The Silhouette Coefficient for a sample is (b - a) / max(a, b). To clarify, b is the distance between a sample and the nearest cluster that the sample is not a part of. Note that the Silhouette Coefficient is only defined if the number of labels is 2 <= n_labels <= n_samples - 1.

This function returns the mean Silhouette Coefficient over all samples. To obtain the values for each sample, use silhouette_samples.

The best value is 1 and the worst value is -1. Negative values generally indicate that a sample has been assigned to the wrong cluster, as a different cluster is more similar.

Parameters :

X : array of shape (n_samples_a, n_samples_a) if metric == "precomputed", or (n_samples_a, n_features) otherwise
    An array of pairwise distances between samples, or a feature array.

metric : str or callable, default='euclidean'
    The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by pairwise_distances. If X is the distance array itself, use metric="precomputed".

sample_size : int, default=None
    The size of the sample to use when computing the Silhouette Coefficient on a random subset of the data. If sample_size is None, no sampling is used.
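To make the formula (b - a) / max(a, b) concrete, here is a minimal pure-Python sketch that computes it by hand with Euclidean distance. It is not scikit-learn's implementation: the helper names silhouette_values and mean_silhouette are hypothetical, the toy data is made up for illustration, and scoring a single-member cluster as 0 is a convention assumed here.

```python
import math


def silhouette_values(points, labels):
    """Per-sample silhouette values, (b - a) / max(a, b), Euclidean distance.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    clusters = sorted(set(labels))
    # The coefficient is only defined for 2 <= n_labels <= n_samples - 1.
    if not 2 <= len(clusters) <= len(points) - 1:
        raise ValueError("need 2 <= n_labels <= n_samples - 1")

    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Mean distance from sample i to the members of each cluster,
        # leaving sample i itself out of its own cluster.
        mean_dist = {}
        for c in clusters:
            dists = [
                math.dist(p, q)
                for j, (q, lab) in enumerate(zip(points, labels))
                if lab == c and j != i
            ]
            if dists:
                mean_dist[c] = sum(dists) / len(dists)

        if own not in mean_dist:
            # Assumption: a sample alone in its cluster scores 0.
            values.append(0.0)
            continue

        a = mean_dist[own]                                      # mean intra-cluster distance
        b = min(v for c, v in mean_dist.items() if c != own)    # mean nearest-cluster distance
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Mean of the per-sample values, analogous to what silhouette_score returns."""
    vals = silhouette_values(points, labels)
    return sum(vals) / len(vals)


# Toy 1-D data: two well-separated groups.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good_labels = [0, 0, 1, 1]   # matches the visible grouping
bad_labels = [0, 1, 1, 0]    # deliberately mixes the two groups

print(mean_silhouette(points, good_labels))
print(silhouette_values(points, bad_labels))
```

With the well-separated labelling every sample scores close to 1, while the mixed labelling produces negative values, matching the interpretation given above. Calling silhouette_score on the same points and labels with metric='euclidean' should agree with mean_silhouette up to floating-point error.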