Discussion:
[Scikit-learn-general] Mutual Info bases on nearest neighbors
Shishir Pandey
2016-02-11 01:07:13 UTC
Permalink
Hi

I want to estimate the mutual information based on nearest neighbor method:
http://arxiv.org/pdf/cond-mat/0305641.pdf


This requires me to use the max norm, for which I have defined a function
max_norm. Now I want NearestNeighbors to fit according to this norm, and when
I call kneighbors I want it to return neighbors based on this max norm,
but instead I am getting results in Euclidean distances. How do I fix this?
Here is the class that I have created.


import numpy as np
from sklearn.neighbors import NearestNeighbors


class MaxNormNN:
    """
    Nearest neighbors based on max norm
    """

    def __init__(self, x_dim, y_dim, x, y):
        self.x_dim = x_dim
        self.y_dim = y_dim
        self.x = x
        self.y = y
        self.z = np.c_[x, y]

    def max_norm(self, z1, z2, ord = 1):
        x_dist = np.linalg.norm(np.array(z1[:self.x_dim]) - \
                                np.array(z2[:self.x_dim]), ord = ord)
        y_dist = np.linalg.norm(np.array(z1[self.x_dim:]) - \
                                np.array(z2[self.x_dim:]), ord = ord)
        return np.max([x_dist, y_dist])

    def NNs(self):
        nn = NearestNeighbors(n_neighbors = 2, func = max_norm)
        nn.fit(self.z)
        # print nn.kneighbors(self.z)



--
sp
Daniel Homola
2016-02-11 01:11:48 UTC
Permalink
Hi,

Mr Mayorov has done a great job and coded this up already:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_selection/mutual_info_.py

If you want to do feature selection based on MI, check out the JMI method:
https://github.com/danielhomola/mifs

Cheers,
d
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
Shishir Pandey
2016-02-11 03:10:29 UTC
Permalink
Thanks.

--
sp

Manoj Kumar
2016-02-11 03:48:59 UTC
Permalink
Hi,

In any case you can just supply metric='chebyshev' to do that for you in
NearestNeighbors.
--
Manoj,
http://github.com/MechCoder
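A minimal sketch of the metric='chebyshev' suggestion above, on toy data (the array z and the brute-force check are illustrative assumptions, not code from the thread). With one column for x and one for y, the Chebyshev distance is max(|x - x'|, |y - y'|):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.RandomState(0)
z = rng.normal(size=(50, 2))  # toy joint samples: column 0 is x, column 1 is y

# Chebyshev (max-norm) neighbors; n_neighbors=2 because each query point
# is returned as its own nearest neighbor at distance 0.
nn = NearestNeighbors(n_neighbors=2, metric="chebyshev")
nn.fit(z)
dist, idx = nn.kneighbors(z)

# Brute-force max-norm distances, for comparison with dist[:, 1].
brute = np.abs(z[:, None, :] - z[None, :, :]).max(axis=2)
np.fill_diagonal(brute, np.inf)
nearest_brute = brute.min(axis=1)
```

Column 0 of dist is the point itself; column 1 is the nearest other point under the max norm.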
Shishir Pandey
2016-02-11 11:12:06 UTC
Permalink
Hi

I would like to know whether
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/feature_selection/mutual_info_.py

supports a matrix-valued Y. From what I see, it seems that Y can only be a
column vector.

--
sp
Shishir Pandey
2016-02-11 11:26:50 UTC
Permalink
Hi Manoj

I am not using the Chebyshev metric. If you look at the code, both X and Y
are vectors, and we create a new vector Z = (X, Y). But the norm we are
looking at is ||Z - Z'|| = max(||X - X'||, ||Y - Y'||), where each of the
sub-parts is an l1 norm between the vectors X, X' and Y, Y'.



--
sp
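One way to get the block-wise norm described above is to pass a callable through the documented metric parameter of NearestNeighbors (rather than func). This is only a sketch: the helper name make_block_max_norm and the toy data are assumptions for illustration. Note that when X and Y are each one-dimensional, this norm coincides with the Chebyshev distance.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def make_block_max_norm(x_dim, ord=1):
    """Hypothetical helper: build d(z, z') = max(||x - x'||, ||y - y'||)."""
    def block_max_norm(z1, z2):
        x_dist = np.linalg.norm(z1[:x_dim] - z2[:x_dim], ord=ord)
        y_dist = np.linalg.norm(z1[x_dim:] - z2[x_dim:], ord=ord)
        return max(x_dist, y_dist)
    return block_max_norm


rng = np.random.RandomState(0)
x = rng.normal(size=(30, 2))  # toy X with 2 columns
y = rng.normal(size=(30, 3))  # toy Y with 3 columns
z = np.c_[x, y]

# A callable metric is evaluated pair by pair in Python, so this is slow
# on large data; brute force keeps the behaviour easy to reason about.
nn = NearestNeighbors(n_neighbors=2, algorithm="brute",
                      metric=make_block_max_norm(x_dim=2))
nn.fit(z)
dist, idx = nn.kneighbors(z)
```

As before, dist[:, 0] is the query point itself and dist[:, 1] is the distance to the nearest other point under the block-wise norm.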
Daniel Homola
2016-02-11 11:39:37 UTC
Permalink
Hi,

If your Y is continuous, you can use Gael's code
(mutual_information(X,Y), k=3) here:
https://gist.github.com/GaelVaroquaux/ead9898bd3c973c40429
to estimate MI between two matrices.

If your Y is discrete, i.e. multiple class labellings for X, you can
only do this column by column separately, at least in my understanding.

Cheers,
d
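The column-by-column approach for a discrete Y could look roughly like the sketch below. It assumes a scikit-learn release that exposes mutual_info_classif in sklearn.feature_selection; the data are made up. It gives the MI between each single feature of X and each label column of Y, not a joint MI between the two matrices.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3))
# Toy discrete Y with two label columns: the first depends on X[:, 0],
# the second is independent noise.
Y = np.c_[(X[:, 0] > 0).astype(int), rng.randint(0, 3, size=200)]

# One call per label column; the result has shape (n_features, n_label_columns).
mi = np.column_stack([
    mutual_info_classif(X, Y[:, j], n_neighbors=3, random_state=0)
    for j in range(Y.shape[1])
])
```

Here mi[i, j] is the estimated MI between feature i and label column j.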
Gael Varoquaux
2016-02-11 11:54:24 UTC
Permalink
I didn't remember that I had put that code online :). The Internet is a
wonderful thing!

Gaël
--
Gael Varoquaux
Researcher, INRIA Parietal
NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
Phone: ++ 33-1-69-08-79-68
http://gael-varoquaux.info http://twitter.com/GaelVaroquaux
Shishir Pandey
2016-02-11 13:57:16 UTC
Permalink
Just to get an idea: do any of the papers point out why the mutual
information cannot be calculated for a discrete-valued Y matrix?

--
sp

Shishir Pandey
2016-02-12 01:11:51 UTC
Permalink
Daniel

You pointed out in the comments section of
https://gist.github.com/GaelVaroquaux/ead9898bd3c973c40429

that you are getting negative MI. How did you overcome this problem?

--
sp
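For context on the negative values: the nearest-neighbor estimator is not constrained to be non-negative, so for weakly dependent variables the estimate fluctuates around zero and can come out slightly negative. The sketch below is my reading of the first estimator in the paper linked at the top of the thread, written with the Chebyshev metric; the function name ksg_mutual_information and the toy data are assumptions, and the O(n^2) distance matrices make it suitable for small samples only.

```python
import numpy as np
from scipy.special import digamma
from sklearn.neighbors import NearestNeighbors


def ksg_mutual_information(x, y, k=3):
    """Sketch of a k-nearest-neighbor MI estimate (in nats)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = x.shape[0]
    z = np.c_[x, y]

    # Distance to the k-th neighbor in the joint space under the max norm
    # (k + 1 because each query point is returned as its own neighbor).
    nn = NearestNeighbors(n_neighbors=k + 1, metric="chebyshev").fit(z)
    eps = nn.kneighbors(z)[0][:, -1]

    def count_within(a):
        # Number of other points strictly closer than eps[i] to point i.
        d = np.abs(a[:, None, :] - a[None, :, :]).max(axis=2)
        return (d < eps[:, None]).sum(axis=1) - 1

    nx = count_within(x)
    ny = count_within(y)
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))


rng = np.random.RandomState(0)
xy = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=1000)
mi_dep = ksg_mutual_information(xy[:, 0], xy[:, 1])  # correlated pair
mi_ind = ksg_mutual_information(rng.normal(size=1000),
                                rng.normal(size=1000))  # independent pair
```

For the correlated Gaussian pair the true value is -0.5 * log(1 - 0.8**2), about 0.51 nats; for the independent pair the true value is 0, and the estimate may land on either side of it. A common workaround is to clip the estimate at zero.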
Daniel Homola
2016-03-11 12:06:37 UTC
Permalink
Hi all,

I'm using GraphLASSO to estimate the graphical model and precision
matrix of my variables. It is well known that GraphLASSO and related
methods are very sensitive to contaminated data and their estimates have
low break-down points:
http://arxiv.org/abs/1501.01219

As suggested by the authors in the above paper I'd like to use
GraphLASSO with robust correlation matrices to estimate the precision
matrix. When I looked at the source code of GraphLASSO however, it
seemed like there's no way to supply the fit method with a
pre-calculated correlation matrix, as it is calculated internally and
automatically using the empirical_covariance() method.

I know that if I rank my data column-wise before applying GraphLASSO
I'll get Spearman correlation. I tried this, and it already improves the
performance of GraphLASSO with outliers, so I'd advise adding this trick
to the documentation to raise awareness of this issue.

But the authors suggest that the Gaussian rank correlation works even better
than Spearman, because it handles normally distributed variables better
while still having a high breakdown point like Spearman. Therefore I'd like
to calculate a Gaussian rank correlation matrix and supply it to the
GraphLASSO method. Is there any way to do this with the current
implementation, or should I rewrite the GraphLASSO class to make this possible?

Cheers,
Daniel
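One possible route, sketched under assumptions: the function form of the estimator in sklearn.covariance (named graph_lasso in older releases and graphical_lasso in newer ones) takes a covariance matrix directly, so a robust correlation matrix can be computed separately and passed in. The helper gaussian_rank_correlation below is hypothetical; it takes the Pearson correlation of the normal scores of the column-wise ranks, which is my understanding of the Gaussian rank correlation.

```python
import numpy as np
from scipy.stats import norm, rankdata
from sklearn.covariance import graphical_lasso


def gaussian_rank_correlation(X):
    """Hypothetical helper: Pearson correlation of normal scores of ranks."""
    n = X.shape[0]
    ranks = np.apply_along_axis(rankdata, 0, X)  # column-wise ranks, 1..n
    scores = norm.ppf(ranks / (n + 1.0))         # map ranks to normal scores
    return np.corrcoef(scores, rowvar=False)


rng = np.random.RandomState(0)
X = rng.multivariate_normal(np.zeros(4), np.eye(4) + 0.5, size=300)
X[:5] += 50.0  # a few gross outliers, for illustration only

corr = gaussian_rank_correlation(X)
# The function form accepts a precomputed matrix in place of the
# empirical covariance; alpha=0.1 is an arbitrary choice here.
covariance, precision = graphical_lasso(corr, alpha=0.1)
```

Replacing the normal scores with the raw ranks in the helper would give the Spearman variant mentioned above.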