Hi Andy & Ken,

Thanks Ken for the alternative but I am using a cosine distance.

Andy, concerning the computation of the mean: that function would have to be configurable too, but the default mean also works for cosine distance and Bregman divergences (see Table 8.2, page 501 of http://www-users.cs.umn.edu/~kumar/dmbook/ch8.pdf). Yes, I could easily implement k-means myself, but I would lose a lot of the benefits of the sklearn framework, such as the ability to easily compare several unsupervised algorithms. I was simply expecting the distance function to be configurable, as it is in many other sklearn functions.
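One way to see why the ordinary mean can still be usable with cosine distance: for rows scaled to unit length, the squared Euclidean distance equals twice the cosine distance. The sketch below only checks that identity on random data with NumPy; the variable names are illustrative, and it assumes no row is all zeros.

```python
import numpy as np

# Random example data; any non-zero rows would do.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# Scale every row to unit L2 norm.
X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)

a, b = X_unit[0], X_unit[1]
sq_euclidean = np.sum((a - b) ** 2)   # ||a - b||^2
cosine_dist = 1.0 - a @ b             # 1 - cos(a, b) for unit vectors

# For unit vectors: ||a - b||^2 = 2 * (1 - cos(a, b))
identity_holds = np.isclose(sq_euclidean, 2.0 * cosine_dist)
```

Under this identity, passing X_unit to a Euclidean k-means is a common approximation of cosine k-means. It is not exactly spherical k-means, because the centroids are not renormalized to unit length after each update.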

On the other hand, do you know why metrics.cluster.unsupervised.silhouette_score requires the labels? I understand that we can compute a supervised version of the silhouette score, but I was looking for the unsupervised version. Even the help doesn't mention the labels anywhere.
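For context, the silhouette coefficient is defined relative to a cluster assignment, so the labels argument is normally the predicted cluster labels rather than ground truth. A minimal sketch, assuming a scikit-learn version where KMeans and silhouette_score are importable as shown and using made-up toy data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Toy data: two tight blobs pointing in clearly different directions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(5.0, 0.0), scale=0.3, size=(20, 2)),
    rng.normal(loc=(0.0, 5.0), scale=0.3, size=(20, 2)),
])

# The labels come from the clustering itself, not from ground truth.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# The metric used for scoring is configurable, e.g. cosine distance.
score = silhouette_score(X, labels, metric="cosine")
```

The score lies in [-1, 1]; values near 1 indicate compact, well-separated clusters under the chosen metric.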

I am trying to push for sklearn in my team, and I am quite impressed so far.

Thanks,

Francis

From: Kenneth C. Arnold [mailto:***@seas.harvard.edu]

Sent: April-02-13 3:32 PM

To: scikit-learn-***@lists.sourceforge.net

Subject: Re: [Scikit-learn-general] kmeans distance function not configurable

If you want a Mahalanobis distance, though, you can instead just transform your data using the Cholesky decomposition of the distance matrix.
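A minimal sketch of that transformation, assuming "distance matrix" here means the symmetric positive-definite matrix M that defines the Mahalanobis distance (taken below to be the inverse sample covariance, which is an assumption, not something stated in the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

# Assumed metric matrix: inverse of the sample covariance.
M = np.linalg.inv(np.cov(X, rowvar=False))

# Lower-triangular Cholesky factor, M = L @ L.T
L = np.linalg.cholesky(M)

# Transform the rows so that Euclidean distance on X_t
# equals Mahalanobis distance on X.
X_t = X @ L

x, y = X[0], X[1]
d_mahalanobis = np.sqrt((x - y) @ M @ (x - y))
d_euclidean = np.linalg.norm(X_t[0] - X_t[1])
```

Any Euclidean k-means run on X_t then clusters the original points under the Mahalanobis distance defined by M.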

-Ken

On Tue, Apr 2, 2013 at 3:09 PM, Andreas Mueller <***@ais.uni-bonn.de<mailto:***@ais.uni-bonn.de>> wrote:

Hi Francis.

No. It is highly non-trivial for most distance functions to do k-means as

the computation of the mean has to be replaced by a different computation.

If you know how to do that, implementing k-means in pure numpy is not all that hard.
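As a rough illustration of that point, here is a sketch of k-means in pure numpy where both the distance and the center computation are pluggable. The function names and signatures are hypothetical, not part of scikit-learn, and the cosine distance assumes no data row or center is all zeros.

```python
import numpy as np

def cosine_distance(X, centers):
    """Pairwise cosine distance between rows of X and rows of centers."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Cn = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    return 1.0 - Xn @ Cn.T

def mean_center(points):
    """Default center update: the arithmetic mean of the members."""
    return points.mean(axis=0)

def kmeans(X, k, distance=cosine_distance, center=mean_center,
           n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Assignment step: nearest center under the chosen distance.
        new_labels = distance(X, centers).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each center from its members.
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = center(members)
    return labels, centers

X = np.array([[1.0, 0.1], [1.0, 0.0], [0.9, 0.1],
              [0.1, 1.0], [0.0, 1.0], [0.1, 0.9]])
labels, centers = kmeans(X, k=2)
```

Whether the procedure converges to something meaningful depends on the center function actually minimizing the chosen distance within each cluster, which is the non-trivial part for most distance functions.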

This question comes up quite a lot. Maybe we should do a faq or something.

Cheers,

Andy

On 04/02/2013 09:05 PM, Pieraut, Francis wrote:

Hi guys,

Is there a simple way to change the distance function used in the kmeans implementation?

Thanks,

Francis

------------------------------------------------------------------------------

Minimize network downtime and maximize team effectiveness.

Reduce network management and security costs. Learn how to hire

the most talented Cisco Certified professionals. Visit the

Employer Resources Portal

http://www.cisco.com/web/learning/employer_resources/index.html

_______________________________________________

Scikit-learn-general mailing list

Scikit-learn-***@lists.sourceforge.net<mailto:Scikit-learn-***@lists.sourceforge.net>

https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
