[Scikit-learn-general] silhouette_score and silhouette_samples

Sebastian Raschka

2015-06-16 03:54:53 UTC

Hi, all,

I am a little confused about the two related metrics silhouette_score and silhouette_samples. silhouette_samples computes the silhouette coefficient for each sample and returns them as an array. However, I am wondering whether I am interpreting silhouette_score correctly. Based on the documentation at http://scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html, I assume that it is just the mean of the per-sample silhouette coefficients, which can be confirmed by running, e.g.,

np.mean(silhouette_samples(X, y, metric='euclidean'))
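
A fuller, self-contained version of that check might look like the sketch below. The make_blobs data and the KMeans labels are only illustrative assumptions standing in for X and y; any feature matrix with cluster labels should behave the same way when sample_size is left at its default.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Illustrative data and cluster labels (assumptions, not from the original post)
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
y = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# One silhouette coefficient per sample
per_sample = silhouette_samples(X, y, metric='euclidean')

# Single summary value over all samples (sample_size left at its default)
overall = silhouette_score(X, y, metric='euclidean')

# The summary value should match the mean of the per-sample coefficients
print(per_sample.shape)
print(np.isclose(per_sample.mean(), overall))
```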

So why does silhouette_score have the additional random_state parameter?

Best,
Sebastian
------------------------------------------------------------------------------

Joel Nothman

2015-06-16 04:34:27 UTC

See the sample_size parameter: the silhouette score can be calculated on a
random subset of the data, presumably for efficiency, and random_state
seeds that subsampling. Feel free to submit a PR improving the docstring.
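
As a rough sketch of how sample_size and random_state interact (the make_blobs data, the sample_size of 200, and the seed values are illustrative assumptions, not taken from this thread):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Illustrative data; the generated blob labels stand in for cluster labels
X, y = make_blobs(n_samples=1000, centers=3, random_state=0)

# Default: the score is computed over all samples, so no randomness is involved
full = silhouette_score(X, y)

# With sample_size set, the score is computed on a random subset of the data;
# random_state fixes which subset is drawn
sub_a = silhouette_score(X, y, sample_size=200, random_state=42)
sub_b = silhouette_score(X, y, sample_size=200, random_state=42)
sub_c = silhouette_score(X, y, sample_size=200, random_state=7)

print(full, sub_a, sub_c)
print(np.isclose(sub_a, sub_b))  # same seed, same subset, same score
```

With sample_size left at None, random_state has no effect, which is why the mean of silhouette_samples matches silhouette_score in that case.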

------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Sebastian Raschka

2015-06-16 05:26:43 UTC

Thanks, Joel, it makes total sense now! Updating the docstring sounds like a good idea; I will get to it in the next couple of days.

Best,
Sebastian

