[Scikit-learn-general] k-NN user defined distance

Discussion:

A neuman

2016-01-09 02:19:23 UTC

Hello everyone,

I actually want to use the KNeighboursClassifier, with my own distances.

in the Documentation stands the following:

[callable] : a user-defined function which accepts an array of distances,
and returns an array of the same shape containing the weights.

I just dont know, how should the array looks like?

For example, if I have 100 Samples, the array has a size 100*100?
So for every samples there is a distance to the other 99 samples.

[[0.4, 0.2, ...],[0.3,0.1,...]........[0.9,0.6,...]] something like this?

I would appreciate your help.

best,

Sebastian Raschka

2016-01-09 02:41:21 UTC

Permalink

Post by A neuman

from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree').fit(X)
distances, indices = nbrs.kneighbors(X)
distances

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])

(note that I am using the NearestNeighbors class here, but the same applies to the KNeighborsClassifier)

For example, to compute the distances between samples as Euclidean distance (the default) you could just define a Python function
... return np.sqrt(np.sum((x-y)**2))

Post by A neuman

nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree', metric=eucldist).fit(X)
distances, indices = nbrs.kneighbors(X)
distances

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])

(alt. you could provide it as lambda function)

Best,
Sebastian

Post by A neuman
Hello everyone,
I actually want to use the KNeighboursClassifier, with my own distances.
[callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights.
I just dont know, how should the array looks like?
For example, if I have 100 Samples, the array has a size 100*100?
So for every samples there is a distance to the other 99 samples.
[[0.4, 0.2, ...],[0.3,0.1,...]........[0.9,0.6,...]] something like this?
I would appreciate your help.
best,
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

A neuman

2016-01-09 02:58:48 UTC

Permalink

Ah, that helped me a lot!!!

So i just write my own function that returns an skalar. This function is
used in the metric parameter of the kNN function.

Thank you!!!

You could just need âregular" Python function that outputs a scalar. For

Post by A neuman

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(note that I am using the NearestNeighbors class here, but the same
applies to the KNeighborsClassifier)
For example, to compute the distances between samples as Euclidean
distance (the default) you could just define a Python function
... return np.sqrt(np.sum((x-y)**2))

Post by A neuman

nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree',

metric=eucldist).fit(X)

Post by A neuman

distances, indices = nbrs.kneighbors(X)
distances

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(alt. you could provide it as lambda function)
Best,
Sebastian

Post by A neuman
Hello everyone,
I actually want to use the KNeighboursClassifier, with my own distances.
[callable] : a user-defined function which accepts an array of

distances, and returns an array of the same shape containing the weights.

Post by A neuman
I just dont know, how should the array looks like?
For example, if I have 100 Samples, the array has a size 100*100?
So for every samples there is a distance to the other 99 samples.
[[0.4, 0.2, ...],[0.3,0.1,...]........[0.9,0.6,...]] something like

this?

Post by A neuman
I would appreciate your help.
best,

------------------------------------------------------------------------------

Post by A neuman
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!

http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________

Post by A neuman
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

A neuman

2016-01-12 18:04:29 UTC

Permalink

Hey,

I Have an another problem,

if I'm using my own metric, there are not only the samples in x and y.
I'm using a 10 fold cv with k-NN Classifier.
My Attributes are only 1's and 0's, but if im printing them out, I'll get:

KNeighborsClassifier(metric=myFunc)

def myFunc(x,y):

print x,'\n'
print y,'\n'

I Cutted some values due to the size:

Thats for x:

[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]

[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]

[ 0. 0. 0. 0.02358491 0.00471698 0. 0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0.00943396 0. .............................52358491 0.53773585
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]

[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.
1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 0. 1. 1. 0.
1. 1. 1. 1. 0.]

and for y

[ 0. 0. 0. 0.02358491 0.00471698 0. 0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0. ..........
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]

[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0.
0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 1. 0. 0. 1. 0. 1. 1. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0.
0. 1. 1. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0.
0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 1. 1.
0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0.
0. 0. 0. 0. 1. 0. 0. 1. 1. 1. 1. 0. 0. 0. 0. 0. 1. 0.
0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 1. 0. 1. 1. 1. 1.
1. 1. 1. 1. 0.]

The problem is, I have to count the occurences from 0's and 1's in x and y.
And if there are some other arrays
lik 0.636..... I dont get the right solution. So in general, i only want
the array with 1 and 0

best,

Post by A neuman
Ah, that helped me a lot!!!
So i just write my own function that returns an skalar. This function is
used in the metric parameter of the kNN function.
Thank you!!!

You could just need âregular" Python function that outputs a scalar. For

Post by A neuman

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(note that I am using the NearestNeighbors class here, but the same
applies to the KNeighborsClassifier)
For example, to compute the distances between samples as Euclidean
distance (the default) you could just define a Python function
... return np.sqrt(np.sum((x-y)**2))

Post by A neuman

nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree',

metric=eucldist).fit(X)

Post by A neuman

distances, indices = nbrs.kneighbors(X)
distances

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(alt. you could provide it as lambda function)
Best,
Sebastian

Post by A neuman
Hello everyone,
I actually want to use the KNeighboursClassifier, with my own distances.
[callable] : a user-defined function which accepts an array of

distances, and returns an array of the same shape containing the weights.

this?

Post by A neuman
I would appreciate your help.
best,

------------------------------------------------------------------------------

http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________

Post by A neuman
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

A neuman

2016-01-12 18:08:33 UTC

Permalink

Sorry, thats not right what I wrote:
X:
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]

Y:
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]

X:
[ 0. 0. 0. 0.02358491 0.00471698 0. 0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0.00943396 0. .............................52358491 0.53773585
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]

Y:
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.
1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 0. 1. 1. 0.
1. 1. 1. 1. 0.]

and so on..

but X should be also containing 1's and 0's.

best,

Post by A neuman
Hey,
I Have an another problem,
if I'm using my own metric, there are not only the samples in x and y.
I'm using a 10 fold cv with k-NN Classifier.
KNeighborsClassifier(metric=myFunc)
print x,'\n'
print y,'\n'
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]
[ 0. 0. 0. 0.02358491 0.00471698 0.
0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0.00943396 0. .............................52358491 0.53773585
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.
1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 0. 1. 1. 0.
1. 1. 1. 1. 0.]
and for y
[ 0. 0. 0. 0.02358491 0.00471698 0.
0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0. ..........
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0.
0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 1. 0. 0. 1. 0. 1. 1. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0.
0. 1. 1. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0.
0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 1. 1.
0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0.
0. 0. 0. 0. 1. 0. 0. 1. 1. 1. 1. 0. 0. 0. 0. 0. 1. 0.
0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 1. 0. 1. 1. 1. 1.
1. 1. 1. 1. 0.]
The problem is, I have to count the occurences from 0's and 1's in x and
y. And if there are some other arrays
lik 0.636..... I dont get the right solution. So in general, i only want
the array with 1 and 0
best,

Post by A neuman
Ah, that helped me a lot!!!
So i just write my own function that returns an skalar. This function is
used in the metric parameter of the kNN function.
Thank you!!!

You could just need âregular" Python function that outputs a scalar. For

Post by A neuman

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(note that I am using the NearestNeighbors class here, but the same
applies to the KNeighborsClassifier)
For example, to compute the distances between samples as Euclidean
distance (the default) you could just define a Python function
... return np.sqrt(np.sum((x-y)**2))

Post by A neuman

nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree',

metric=eucldist).fit(X)

Post by A neuman

distances, indices = nbrs.kneighbors(X)
distances

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(alt. you could provide it as lambda function)
Best,
Sebastian

Post by A neuman
Hello everyone,
I actually want to use the KNeighboursClassifier, with my own

distances.

Post by A neuman
[callable] : a user-defined function which accepts an array of

distances, and returns an array of the same shape containing the weights.

this?

Post by A neuman
I would appreciate your help.
best,

------------------------------------------------------------------------------

http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________

Post by A neuman
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Sebastian Raschka

2016-01-12 18:18:09 UTC

Permalink

Hi, I am not sure how your custom metric works, but would a np.where(x >= 0.5, 1., 0.) work in your case?

Post by A neuman
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]
[ 0. 0. 0. 0.02358491 0.00471698 0. 0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0.00943396 0. .............................52358491 0.53773585
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.
1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 0. 1. 1. 0.
1. 1. 1. 1. 0.]
and so on..
but X should be also containing 1's and 0's.
best,
Hey,
I Have an another problem,
if I'm using my own metric, there are not only the samples in x and y.
I'm using a 10 fold cv with k-NN Classifier.
KNeighborsClassifier(metric=myFunc)
print x,'\n'
print y,'\n'
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]
[ 0. 0. 0. 0.02358491 0.00471698 0. 0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0.00943396 0. .............................52358491 0.53773585
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.
1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 0. 1. 1. 0.
1. 1. 1. 1. 0.]
and for y
[ 0. 0. 0. 0.02358491 0.00471698 0. 0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0. ..........
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0.
0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 1. 0. 0. 1. 0. 1. 1. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0.
0. 1. 1. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0.
0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 1. 1.
0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 1. 0.
0. 0. 0. 0. 1. 0. 0. 1. 1. 1. 1. 0. 0. 0. 0. 0. 1. 0.
0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 1. 0. 1. 1. 1. 1.
1. 1. 1. 1. 0.]
The problem is, I have to count the occurences from 0's and 1's in x and y. And if there are some other arrays
lik 0.636..... I dont get the right solution. So in general, i only want the array with 1 and 0
best,
Ah, that helped me a lot!!!
So i just write my own function that returns an skalar. This function is used in the metric parameter of the kNN function.
Thank you!!!

Post by A neuman

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(note that I am using the NearestNeighbors class here, but the same applies to the KNeighborsClassifier)
For example, to compute the distances between samples as Euclidean distance (the default) you could just define a Python function
... return np.sqrt(np.sum((x-y)**2))

Post by A neuman

nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree', metric=eucldist).fit(X)
distances, indices = nbrs.kneighbors(X)
distances

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(alt. you could provide it as lambda function)
Best,
Sebastian

Post by A neuman
Hello everyone,
I actually want to use the KNeighboursClassifier, with my own distances.
[callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights.
I just dont know, how should the array looks like?
For example, if I have 100 Samples, the array has a size 100*100?
So for every samples there is a distance to the other 99 samples.
[[0.4, 0.2, ...],[0.3,0.1,...]........[0.9,0.6,...]] something like this?
I would appreciate your help.
best,
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________ <http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________>
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>

------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140>
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

A neuman

2016-01-12 18:24:00 UTC

Permalink

The custom metric, ist just calculating the tanimoto coef.

a=x.tolist()
b=y.tolist()

c=np.count_nonzero(x==y)
a1=a.count(1.0)
b1=b.count(1.0)

return float(c)/(a1 + b1 - c)

so im Just counting 1's in x and 1's in y

c= are the numer, where 1's are matching ( matching == on the same index
x=[1,0,0,1] and y=[1,0,1,0] would be only matching the first 1.

So in generall all the data samples are 1's and 0's. how are the skalars
are occring in the X vektor? There should be no skalar ins the X vector.
Or am i understanding something wrong?

best,

Post by Sebastian Raschka
Hi, I am not sure how your custom metric works, but would a np.where(x >=
0.5, 1., 0.) work in your case?
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]
[ 0.6371319 0.54557285 0.30214217 0.14690307 0.49778446 0.89183238
0.52445514 0.63379164 0.71873681 0.55008567]
[ 0. 0. 0. 0.02358491 0.00471698 0.
0.
0. 0. 0.00471698 0.00471698 0.00471698 0.02830189
0.00943396 0. .............................52358491 0.53773585
0.63207547 0.51886792 0.66037736 0.75 0.57075472 0.59433962
0.63679245 0.8490566 0.71698113 0.02358491]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.
1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 0. 1. 1. 0.
1. 1. 1. 1. 0.]
and so on..
but X should be also containing 1's and 0's.
best,

Post by A neuman
Ah, that helped me a lot!!!
So i just write my own function that returns an skalar. This function is
used in the metric parameter of the kNN function.
Thank you!!!

You could just need âregular" Python function that outputs a scalar.

Post by A neuman

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(note that I am using the NearestNeighbors class here, but the same
applies to the KNeighborsClassifier)
For example, to compute the distances between samples as Euclidean
distance (the default) you could just define a Python function
... return np.sqrt(np.sum((x-y)**2))

Post by A neuman

nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree',

metric=eucldist).fit(X)

Post by A neuman

distances, indices = nbrs.kneighbors(X)
distances

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(alt. you could provide it as lambda function)
Best,
Sebastian

Post by A neuman
Hello everyone,
I actually want to use the KNeighboursClassifier, with my own

distances.

Post by A neuman
[callable] : a user-defined function which accepts an array of

distances, and returns an array of the same shape containing the weights.

this?

Post by A neuman
I would appreciate your help.
best,

------------------------------------------------------------------------------

http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________

Post by A neuman
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Sebastian Raschka

2016-01-12 18:33:20 UTC

Permalink

Isn't the tanimoto coeff a continuous number between 0 and 1?

how are the skalars are occring in the X vektor?

You are returning a fraction -> "return float(c)/(a1 + b1 - c)"
instead of the count, maybe that's an misunderstanding?

The custom metric, ist just calculating the tanimoto coef.
a=x.tolist()
b=y.tolist()
c=np.count_nonzero(x==y)
a1=a.count(1.0)
b1=b.count(1.0)
return float(c)/(a1 + b1 - c)
so im Just counting 1's in x and 1's in y
c= are the numer, where 1's are matching ( matching == on the same index x=[1,0,0,1] and y=[1,0,1,0] would be only matching the first 1.
So in generall all the data samples are 1's and 0's. how are the skalars are occring in the X vektor? There should be no skalar ins the X vector. Or am i understanding something wrong?
best,
Hi, I am not sure how your custom metric works, but would a np.where(x >= 0.5, 1., 0.) work in your case?

Post by A neuman

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(note that I am using the NearestNeighbors class here, but the same applies to the KNeighborsClassifier)
For example, to compute the distances between samples as Euclidean distance (the default) you could just define a Python function
... return np.sqrt(np.sum((x-y)**2))

Post by A neuman

nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree', metric=eucldist).fit(X)
distances, indices = nbrs.kneighbors(X)
distances

array([[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356],
[ 0. , 1. ],
[ 0. , 1. ],
[ 0. , 1.41421356]])
(alt. you could provide it as lambda function)
Best,
Sebastian

Post by A neuman
Hello everyone,
I actually want to use the KNeighboursClassifier, with my own distances.
[callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights.
I just dont know, how should the array looks like?
For example, if I have 100 Samples, the array has a size 100*100?
So for every samples there is a distance to the other 99 samples.
[[0.4, 0.2, ...],[0.3,0.1,...]........[0.9,0.6,...]] something like this?
I would appreciate your help.
best,
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________ <http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________>
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>

------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140>
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________ <http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________>
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>

Herbert Schulz

2016-01-12 19:01:14 UTC

Permalink

Yes, it is a number between 0 and 1.

but im calculating it, depended on 2 samples. So x and y. And if im
counting the 1's in x and the 1's in y, i should get the coef with return
float(c)/(a1 + b1 - c)

But i cannot count 1's in x if x is a skalar.

Sebastian Raschka

2016-01-12 19:35:35 UTC

Permalink

I see, as far as I know, your X input array should be a n_samples x n_features array. Is this true in your case?

Btw. instead of converting the inputs to Python lists , you could also get the counts via

a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]

And maybe a few extra lines

assert set(np.unique(x)).issubset({0.0, 1.0})
assert set(np.unique(x)).issubset({0.0, 1.0})

would be useful for debugging purposes?!

Post by Herbert Schulz
Yes, it is a number between 0 and 1.
but im calculating it, depended on 2 samples. So x and y. And if im counting the 1's in x and the 1's in y, i should get the coef with return float(c)/(a1 + b1 - c)
But i cannot count 1's in x if x is a skalar.
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Herbert Schulz

2016-01-12 20:19:53 UTC

Permalink

The X array which I'm using to fit for the kNN algorithm is n_samples x n
features big.

but for calculating the distance or in this case the tanimoto.

i thought tanimot(x,y) if I call it with :
KNeighborsClassifier(metric=tanimoto).fit(X_train,y_train) is,

x= is in sample and y= is one sample. So x is somehow converted into this
skalar array.

If I'm printing out your example, it actually works:
[ 0.35592602 0.75653009 0.53405987 0.47403539 0.56255335 0.82097978
0.90163467 0.67498667 0.09987697 0.0341568 ] [ 0.35592602 0.75653009
0.53405987 0.47403539 0.56255335 0.82097978
0.90163467 0.67498667 0.09987697 0.0341568 ]

print
x y
[ 0. 0.] [-1. -1.]
[ 0. 0.] [-2. -1.]
[ 0. 0.] [-3. -2.]
[ 0. 0.] [ 1. 1.]
[ 0. 0.] [ 2. 1.]
[ 0. 0.] [ 3. 2.]
[-1. -1.] [ 0. 0.]
[-1. -1.] [-1. -1.]
......
......

But there are still at the first 2 lines the skalars.

the output from my X_train is:

[array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.,
......
1., 0., 0., 1., 1., 1., 1., 0., 1., 0., 1., 0., 1.,
0., 1., 0., 1., 1., 1., 0., 1., 0., 0., 1., 0., 1.,
1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 0.]), array([
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
......
0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 1., 0., 0.,
1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.,
1., 1., 0., 1., 0., 1., 1., 1., 0., 1., 0., 0., 0.,
0., 1., 0., 1., 1., 0., 1., 1., 1., 1., 0.]), array([
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.,
0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
...
0., 0., 0., 1., 1., 1., 0., 0., 0., 0., 0., 0., 1.,
0., 1., 0., 0., 0., 1., 1., 1., 1., 0., 0., 1., 0.,
1., 0., 1., 0., 1., 1., 1., 1., 1., 1., 0.]), array([
0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.,....

Herbert Schulz

2016-01-13 00:21:08 UTC

Permalink

Here is an example code, where the failure occurs.

sorry for the big tests vector, couldn't show it otherwise.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def tanimoto(x,y):

print "X OUTPUT\n ",x,"B OUTPUT\n",y

c=np.sum(x==y)
a1 = np.sum(x)
b1 = np.sum(y)

return float(c)/(a1 + b1 - c)

tests=[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0,
1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0,
1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0], [0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0,
1.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0,
1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0,
0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 1.0]]

classifiers=NearestNeighbors( n_neighbors=2,metric=tanimoto).fit(tests)

and the output is then what i mentioned

x are only floats (0.573... ) and B are containing 1's and 0's like it
should

best,

Sebastian Raschka

2016-01-13 00:55:36 UTC

Permalink

Hi, Herbert,
sorry, but I am still a bit confused about what you are trying to accomplish when you say

Post by Herbert Schulz
and the output is then what i mentioned
x are only floats (0.573... ) and B are containing 1's and 0's like it should

When I run it on a small test dataset ...

from sklearn.neighbors import NearestNeighbors
import numpy as np

X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])

def tan(x, y):
c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)

nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree', metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
distances

I get

array([[ 0.75],
[-2. ],
[ 0.25]])

which is something I would expect given the function above!?

Maybe you could give us a short excerpt of how your input array looks like (e.g,. a 5x3 matrix or so) and what distances you'd expect to see.

Best,
Sebastian

Post by Herbert Schulz
Here is an example code, where the failure occurs.
sorry for the big tests vector, couldn't show it otherwise.
import numpy as np
from sklearn.neighbors import NearestNeighbors
print "X OUTPUT\n ",x,"B OUTPUT\n",y
c=np.sum(x==y)
a1 = np.sum(x)
b1 = np.sum(y)
return float(c)/(a1 + b1 - c)
tests=[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]
classifiers=NearestNeighbors( n_neighbors=2,metric=tanimoto).fit(tests)
and the output is then what i mentioned
x are only floats (0.573... ) and B are containing 1's and 0's like it should
best,
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Herbert Schulz

2016-01-13 01:33:19 UTC

Permalink

Sorry that i coudln't explained it very well

I thought that

X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0,
1.0]])

def tan(x, y):

print x,y

c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)

example:

[ 1. 0. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 0. 1. 1.] [ 1. 1. 1. 1.]
[ 0. 0. 1. 0.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 0. 0. 1. 0.] [ 1. 0. 1. 1.]
[ 0. 0. 1. 0.] [ 0. 0. 1. 0.]
[ 0. 0. 1. 0.] [ 1. 1. 1. 1.]
[ 1. 1. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 1. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 1. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 1. 1. 1.] [ 1. 1. 1. 1.]

this is the output from x and y printed in the tan(x,y) function.

#If I'm printing x and y in the tanimoto function, i should get something
like ----->

[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 0. 1. 1.] [ 1. 1. 1. 1.]
[ 0. 0. 1. 0.] [ 1. 0. 1. 1.]
[ 0. 0. 1. 0.] [ 0. 0. 1. 0.]
[ 0. 0. 1. 0.] [ 1. 1. 1. 1.]
[ 1. 1. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 1. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 1. 1. 1.] [ 1. 1. 1. 1.]

without the array containing the floats like: [ 0.66666667 0.33333333
1. 0.66666667]

The problem is just, if I'm using the tanimoto metric, im getting bad
predictions... so realy bad like 0.0 accuracy, but maybe this is just an
another problem. I just thought, that im doing something wrong. And
therefore i printed x,y in the tanimoto function to check it. These float
array just confused me, due to may X_train array contains actually only 1's
and 0's

And does the (in my case) KNeighborsClassifier() use these distances
automatically if i pass the matrik=tanimoto? or should i calculate the
distance and give the array to the weights parameter.

best,

Herbert

Post by Sebastian Raschka
Hi, Herbert,
sorry, but I am still a bit confused about what you are trying to accomplish when you say
and the output is then what i mentioned
x are only floats (0.573... ) and B are containing 1's and 0's like it should
When I run it on a small test dataset ...
from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)
nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree',
metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
distances
I get
array([[ 0.75],
[-2. ],
[ 0.25]])
which is something I would expect given the function above!?
Maybe you could give us a short excerpt of how your input array looks like
(e.g,. a 5x3 matrix or so) and what distances you'd expect to see.
Best,
Sebastian
Here is an example code, where the failure occurs.
sorry for the big tests vector, couldn't show it otherwise.
import numpy as np
from sklearn.neighbors import NearestNeighbors
print "X OUTPUT\n ",x,"B OUTPUT\n",y
c=np.sum(x==y)
a1 = np.sum(x)
b1 = np.sum(y)
return float(c)/(a1 + b1 - c)
tests=[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0,
1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0,
1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0], [0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0,
1.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0,
1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0,
0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 1.0]]
classifiers=NearestNeighbors( n_neighbors=2,metric=tanimoto).fit(tests)
and the output is then what i mentioned
x are only floats (0.573... ) and B are containing 1's and 0's like it should
best,
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Herbert Schulz

2016-01-13 01:45:26 UTC

Permalink

ps.

1. I printed the x,y array. And i thougtif these is the output:

[ 0.49178495 0.44239588 0.43451225 0.40576958 0.82022061 0.02921787
0.08832147 0.43397282 0.15083042 0.49916182] [ 0.49178495 0.44239588
0.43451225 0.40576958 0.82022061 0.02921787
0.08832147 0.43397282 0.15083042 0.49916182]
[ 0.66666667 0.33333333 1. 0.66666667] [ 1. 0. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667] [ 0. 0. 1. 0.]
[ 0.66666667 0.33333333 1. 0.66666667] [ 1. 1. 1. 1.]
[ 1. 0. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 0. 1. 1.] [ 1. 1. 1. 1.]
[ 0. 0. 1. 0.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 0. 0. 1. 0.] [ 1. 0. 1. 1.]
[ 0. 0. 1. 0.] [ 0. 0. 1. 0.]
[ 0. 0. 1. 0.] [ 1. 1. 1. 1.]
[ 1. 1. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 1. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 1. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 1. 1. 1.] [ 1. 1. 1. 1.]

and we use the code:

c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)

the check
c=np.sum(x==y)

is not right or? I just want to compare
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]

but not something like, which is also printed out from the tan(x,y)
function.

[ 0.66666667 0.33333333 1. 0.66666667] [ 1. 1. 1. 1.]

Post by Herbert Schulz
Sorry that i coudln't explained it very well
I thought that
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
print x,y
c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)
[ 1. 0. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 0. 1. 1.] [ 1. 1. 1. 1.]
[ 0. 0. 1. 0.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 0. 0. 1. 0.] [ 1. 0. 1. 1.]
[ 0. 0. 1. 0.] [ 0. 0. 1. 0.]
[ 0. 0. 1. 0.] [ 1. 1. 1. 1.]
[ 1. 1. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 1. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 1. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 1. 1. 1.] [ 1. 1. 1. 1.]
this is the output from x and y printed in the tan(x,y) function.
#If I'm printing x and y in the tanimoto function, i should get something
like ----->
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 0. 1. 1.] [ 1. 1. 1. 1.]
[ 0. 0. 1. 0.] [ 1. 0. 1. 1.]
[ 0. 0. 1. 0.] [ 0. 0. 1. 0.]
[ 0. 0. 1. 0.] [ 1. 1. 1. 1.]
[ 1. 1. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 1. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 1. 1. 1.] [ 1. 1. 1. 1.]
without the array containing the floats like: [ 0.66666667 0.33333333
1. 0.66666667]
The problem is just, if I'm using the tanimoto metric, im getting bad
predictions... so realy bad like 0.0 accuracy, but maybe this is just an
another problem. I just thought, that im doing something wrong. And
therefore i printed x,y in the tanimoto function to check it. These float
array just confused me, due to may X_train array contains actually only 1's
and 0's
And does the (in my case) KNeighborsClassifier() use these distances
automatically if i pass the matrik=tanimoto? or should i calculate the
distance and give the array to the weights parameter.
best,
Herbert

Post by Sebastian Raschka
Hi, Herbert,
sorry, but I am still a bit confused about what you are trying to accomplish when you say
and the output is then what i mentioned
x are only floats (0.573... ) and B are containing 1's and 0's like it should
When I run it on a small test dataset ...
from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)
nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree',
metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
distances
I get
array([[ 0.75],
[-2. ],
[ 0.25]])
which is something I would expect given the function above!?
Maybe you could give us a short excerpt of how your input array looks
like (e.g,. a 5x3 matrix or so) and what distances you'd expect to see.
Best,
Sebastian
Here is an example code, where the failure occurs.
sorry for the big tests vector, couldn't show it otherwise.
import numpy as np
from sklearn.neighbors import NearestNeighbors
print "X OUTPUT\n ",x,"B OUTPUT\n",y
c=np.sum(x==y)
a1 = np.sum(x)
b1 = np.sum(y)
return float(c)/(a1 + b1 - c)
tests=[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0,
1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0,
1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0], [0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0,
1.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0,
1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0,
1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0,
0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0,
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0,
0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0,
0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
1.0, 1.0, 1.0, 1.0]]
classifiers=NearestNeighbors( n_neighbors=2,metric=tanimoto).fit(tests)
and the output is then what i mentioned
x are only floats (0.573... ) and B are containing 1's and 0's like it should
best,
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Sebastian Raschka

2016-01-13 21:12:40 UTC

Permalink

I guess I got it now! This behavior (see below) is indeed a bit strange:

from sklearn.neighbors import NearestNeighbors
import numpy as np

X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])

def tan(x, y):
print(y)
return 1

nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree', metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)

[ 0.51786272 0.53042315 0.87815766 0.90239616 0.34253599 0.98631925
0.29768794 0.36593595 0.28956526 0.24720931]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]

It seems to be due to the partitioning via the ball tree algorithm; I am not sure if this is intended. It would be nice to get some feedback on this ...

Switching to "brute" seems to return the expected results:

from sklearn.neighbors import NearestNeighbors
import numpy as np

X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])

def tan(x, y):
print(y)
return 1

nbrs = NearestNeighbors(n_neighbors=1, algorithm='brute', metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)

[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 1. 1. 1. 1.]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]

Post by Herbert Schulz
ps.
[ 0.49178495 0.44239588 0.43451225 0.40576958 0.82022061 0.02921787
0.08832147 0.43397282 0.15083042 0.49916182] [ 0.49178495 0.44239588 0.43451225 0.40576958 0.82022061 0.02921787
0.08832147 0.43397282 0.15083042 0.49916182]
[ 0.66666667 0.33333333 1. 0.66666667] [ 1. 0. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667] [ 0. 0. 1. 0.]
[ 0.66666667 0.33333333 1. 0.66666667] [ 1. 1. 1. 1.]
[ 1. 0. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 0. 1. 1.] [ 1. 1. 1. 1.]
[ 0. 0. 1. 0.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 0. 0. 1. 0.] [ 1. 0. 1. 1.]
[ 0. 0. 1. 0.] [ 0. 0. 1. 0.]
[ 0. 0. 1. 0.] [ 1. 1. 1. 1.]
[ 1. 1. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 1. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 1. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 1. 1. 1.] [ 1. 1. 1. 1.]
c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)
the check
c=np.sum(x==y)
is not right or? I just want to compare
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
but not something like, which is also printed out from the tan(x,y) function.
[ 0.66666667 0.33333333 1. 0.66666667] [ 1. 1. 1. 1.]
Sorry that i coudln't explained it very well
I thought that
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
print x,y
c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)
[ 1. 0. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 0. 1. 1.] [ 1. 1. 1. 1.]
[ 0. 0. 1. 0.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 0. 0. 1. 0.] [ 1. 0. 1. 1.]
[ 0. 0. 1. 0.] [ 0. 0. 1. 0.]
[ 0. 0. 1. 0.] [ 1. 1. 1. 1.]
[ 1. 1. 1. 1.] [ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 1. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 1. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 1. 1. 1.] [ 1. 1. 1. 1.]
this is the output from x and y printed in the tan(x,y) function.
#If I'm printing x and y in the tanimoto function, i should get something like ----->
[ 1. 0. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 0. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 0. 1. 1.] [ 1. 1. 1. 1.]
[ 0. 0. 1. 0.] [ 1. 0. 1. 1.]
[ 0. 0. 1. 0.] [ 0. 0. 1. 0.]
[ 0. 0. 1. 0.] [ 1. 1. 1. 1.]
[ 1. 1. 1. 1.] [ 1. 0. 1. 1.]
[ 1. 1. 1. 1.] [ 0. 0. 1. 0.]
[ 1. 1. 1. 1.] [ 1. 1. 1. 1.]
without the array containing the floats like: [ 0.66666667 0.33333333 1. 0.66666667]
The problem is just, if I'm using the tanimoto metric, im getting bad predictions... so realy bad like 0.0 accuracy, but maybe this is just an another problem. I just thought, that im doing something wrong. And therefore i printed x,y in the tanimoto function to check it. These float array just confused me, due to may X_train array contains actually only 1's and 0's
And does the (in my case) KNeighborsClassifier() use these distances automatically if i pass the matrik=tanimoto? or should i calculate the distance and give the array to the weights parameter.
best,
Herbert
Hi, Herbert,
sorry, but I am still a bit confused about what you are trying to accomplish when you say

Post by Herbert Schulz
and the output is then what i mentioned
x are only floats (0.573... ) and B are containing 1's and 0's like it should

When I run it on a small test dataset ...
from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
c=np.sum(x==y)
a1 = x[x == 1.0].shape[0]
b1 = y[y == 1.0].shape[0]
return float(c)/(a1 + b1 - c)
nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree', metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
distances
I get
array([[ 0.75],
[-2. ],
[ 0.25]])
which is something I would expect given the function above!?
Maybe you could give us a short excerpt of how your input array looks like (e.g,. a 5x3 matrix or so) and what distances you'd expect to see.
Best,
Sebastian

------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Shishir Pandey

2016-02-23 10:44:25 UTC

Permalink

I have been experimenting with the above code. I have noticed the following
things:

1. If we set algorithm = 'brute' the algorithm does not enter the
function tan, i.e., putting a breakpoint at the print statement does not
stop execution on it during the fit method. It does however use this
function when using kneighbors method.
2. I think one cannot use the user defined metric with 'brute'.
3. On the other hand if we set the algorithm = 'ball_tree' the execution
does go through the tan function during the fit method. But if you see the
values of x and y at this time it will be different from the values of x
and y that you entered.
4. Clearly, the ball_tree algorithm is doing some weird stuff. I don't
think it is using the defined metric tan for making the tree.

--
sp

from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
print(y)
return 1
nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree',
metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
[ 0.51786272 0.53042315 0.87815766 0.90239616 0.34253599 0.98631925
0.29768794 0.36593595 0.28956526 0.24720931]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
It seems to be due to the partitioning via the ball tree algorithm; I am
not sure if this is intended. It would be nice to get some feedback on this
...
from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
print(y)
return 1
nbrs = NearestNeighbors(n_neighbors=1, algorithm='brute',
metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 1. 1. 1. 1.]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]

Jacob Vanderplas

2016-02-23 18:56:35 UTC

Permalink

Post by Shishir Pandey
I have been experimenting with the above code. I have noticed the
1. If we set algorithm = 'brute' the algorithm does not enter the
function tan, i.e., putting a breakpoint at the print statement does not
stop execution on it during the fit method. It does however use this
function when using kneighbors method. I think one cannot use the user
defined metric with 'brute'.
This sounds like a bug â can you open an issue?
1. On the other hand if we set the algorithm = 'ball_tree' the
execution does go through the tan function during the fit method. But if
you see the values of x and y at this time it will be different from the
values of x and y that you entered. Clearly, the ball_tree algorithm is
doing some weird stuff. I don't think it is using the defined metric tan
for making the tree.

https://github.com/scikit-learn/scikit-learn/pull/6288

Post by Shishir Pandey
--
sp

from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
print(y)
return 1
nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree', metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
[ 0.51786272 0.53042315 0.87815766 0.90239616 0.34253599 0.98631925
0.29768794 0.36593595 0.28956526 0.24720931]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
It seems to be due to the partitioning via the ball tree algorithm; I am
not sure if this is intended. It would be nice to get some feedback on this
...
from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
print(y)
return 1
nbrs = NearestNeighbors(n_neighbors=1, algorithm='brute',
metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 1. 1. 1. 1.]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]

Shishir Pandey

2016-02-24 04:22:49 UTC

Permalink

Hi Jacob

I went through the code. The 'fit' method in nearest neighbors does not do
any distance calculations. It only initializes the class variables. In that
case this is probably not a bug.

--
sp

On Wed, Feb 24, 2016 at 12:26 AM, Jacob Vanderplas <

Post by Shishir Pandey
I have been experimenting with the above code. I have noticed the

Post by Shishir Pandey
1. If we set algorithm = 'brute' the algorithm does not enter the
function tan, i.e., putting a breakpoint at the print statement does not
stop execution on it during the fit method. It does however use this
function when using kneighbors method. I think one cannot use the user
defined metric with 'brute'.
This sounds like a bug â can you open an issue?
1. On the other hand if we set the algorithm = 'ball_tree' the
execution does go through the tan function during the fit method. But if
you see the values of x and y at this time it will be different from the
values of x and y that you entered. Clearly, the ball_tree algorithm is
doing some weird stuff. I don't think it is using the defined metric tan
for making the tree.

https://github.com/scikit-learn/scikit-learn/pull/6288

Post by Shishir Pandey
--
sp

from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
print(y)
return 1
nbrs = NearestNeighbors(n_neighbors=1, algorithm='ball_tree', metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
[ 0.51786272 0.53042315 0.87815766 0.90239616 0.34253599 0.98631925
0.29768794 0.36593595 0.28956526 0.24720931]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 0.66666667 0.33333333 1. 0.66666667]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
It seems to be due to the partitioning via the ball tree algorithm; I
am not sure if this is intended. It would be nice to get some feedback on
this ...
from sklearn.neighbors import NearestNeighbors
import numpy as np
X = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
print(y)
return 1
nbrs = NearestNeighbors(n_neighbors=1, algorithm='brute',
metric=tan).fit(X)
distances, indices = nbrs.kneighbors(X)
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]
[ 1. 1. 1. 1.]
[ 1. 0. 1. 1.]
[ 0. 0. 1. 0.]
[ 1. 1. 1. 1.]