Alejandro Weinstein

2011-11-07 02:45:57 UTC

Hi:

I am observing unexpected behavior in Isomap, related to the
dimensions of the transformed data. If I generate random data, say
1000 points each with dimension 10, and fit a transform with the
parameter out_dim=3, the fitted data has shape (1000, 3), as
expected. However, when I repeat the same steps using my own
data set of 427 points, each of dimension 400, the fitted
data has shape (427, 2), i.e., the output dimension is 1 less than
out_dim. Using LLE with the same data set and parameters, the fitted
data has the expected shape (427, 3).

The following code illustrates the phenomenon:

##################################################################
import numpy as np
from sklearn import manifold

# Case 1: random data, 1000 points of dimension 10
n = 1000
m = 10
X = np.random.rand(n, m)

n_neighbors = 5
out_dim = 3

Y = manifold.Isomap(n_neighbors, out_dim).fit_transform(X)
print 'Using random data and Isomap'
print 'X shape:%s, out_dim:%d, Y shape: %s' % (X.shape, out_dim, Y.shape)

# Case 2: my data set (427 points of dimension 400), Isomap
X = np.load('X.npy')
Y = manifold.Isomap(n_neighbors, out_dim).fit_transform(X)
print
print 'Using the data X.npy and Isomap'
print 'X shape:%s, out_dim:%d, Y shape: %s' % (X.shape, out_dim, Y.shape)

# Case 3: same data set and parameters, LLE
Y = manifold.LocallyLinearEmbedding(n_neighbors, out_dim).fit_transform(X)
print
print 'Using the data X.npy and LLE'
print 'X shape:%s, out_dim:%d, Y shape: %s' % (X.shape, out_dim, Y.shape)
##################################################################

And this is the output:

Using random data and Isomap
X shape:(1000, 10), out_dim:3, Y shape: (1000, 3)

Using the data X.npy and Isomap
X shape:(427, 400), out_dim:3, Y shape: (427, 2)

Using the data X.npy and LLE
X shape:(427, 400), out_dim:3, Y shape: (427, 3)
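
The mismatch in the output above can also be stated as a small check.
This is only a sketch: missing_dims is a hypothetical helper name, not
part of scikit-learn, and it works on plain shape tuples so it runs
without the data file. It says nothing about why the column is missing.

```python
def missing_dims(y_shape, n_samples, out_dim):
    """Return how many requested output dimensions are absent.

    y_shape   -- shape tuple of the embedded data, e.g. Y.shape
    n_samples -- number of input points, e.g. X.shape[0]
    out_dim   -- the output dimension that was requested
    """
    # The embedding should be 2-D with one row per input point.
    if len(y_shape) != 2 or y_shape[0] != n_samples:
        raise ValueError('unexpected embedding shape: %r' % (y_shape,))
    return out_dim - y_shape[1]


# Shapes copied from the output reported above.
print(missing_dims((1000, 3), 1000, 3))  # random data, Isomap -> 0
print(missing_dims((427, 2), 427, 3))    # X.npy, Isomap      -> 1
print(missing_dims((427, 3), 427, 3))    # X.npy, LLE         -> 0
```

A result of 0 means the embedding has the requested number of columns;
only the Isomap run on X.npy comes back nonzero.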

The code and the data set are available at

https://github.com/aweinstein/scrapcode

In case it is relevant, the data set consists of documents represented
in the Latent Semantic Analysis space.

Is this the expected behavior of Isomap, or is there something wrong?

Alejandro.
