Discussion:
GSoC - Improving GMM
Manoj Kumar
2014-01-18 19:30:11 UTC
Hello,

I found the idea "Improving Gaussian Mixture Models" repeated in the 2012
and 2013 idea lists, so I assume it is of real interest to the scikit-learn
community. I have a fundamental knowledge of Gaussian mixture models and the
EM algorithm, and I would like to take this project forward as part of GSoC.
I took a quick look at the issue tracker, and I found a number of issues.

I mailed Vlad (since his name was mentioned there as a mentor), and this is
what he had to say:

"
Hey Manoj,

I just noticed I'm listed as a possible mentor for that. I think when I
put my name there I was thinking of HMM instead of GMM, oops!

I'm guessing that the module is not really maintained and it would be good
if somebody who is involved with GMMs actively would take it under their
wing.

I guess the point of the GSoC idea that was on there was that somebody
proposed
to do a GSoC project to implement coresets for GMM fitting (there are two
links
there). I have absolutely no experience with this method.
Of course in order to add a major new feature to a suboptimally maintained
model,
some refactoring needed to be listed as well.

Again, my feeling is that this idea came from a potential student and it
isn't a
burning need. What do you think about it?

Best,
Vlad
"
Can someone clearly explain what the community expects out of such a
project? The project description on the wiki page ("Refurbish the current
GMM code to put it to the scikit's standards") seems a bit vague to me.

Thanks.
--
Regards,
Manoj Kumar,
Mech Undergrad
http://manojbits.wordpress.com
Andy
2014-01-29 20:53:12 UTC
Hey Manoj.
I agree that the description is vague.
I think what Vlad was trying to say is that refurbishing only makes sense
if it comes with long-term support by an active user.

Basically, "refurbishing" means
- have a simple and sklearn-consistent interface (see the sketch after this list)
- be numerically stable, reliable and repeatable
- serve all feasible major usecases
- be easy to apply to the problems that people have in practice
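
To make the first point concrete, a minimal sketch, purely illustrative and
not the real sklearn.mixture code, of the estimator contract that
"sklearn-consistent" implies: hyperparameters stored verbatim in __init__,
learned attributes with trailing underscores, and fit returning self:

import numpy as np

class ToyGaussianModel:
    def __init__(self, reg_covar=1e-6):
        # hyperparameters are stored verbatim; no validation in __init__
        self.reg_covar = reg_covar

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)   # learned attributes end in "_"
        self.var_ = X.var(axis=0) + self.reg_covar
        return self                   # returning self enables chaining

    def score_samples(self, X):
        # per-sample log-likelihood under an axis-aligned Gaussian
        X = np.asarray(X, dtype=float)
        return -0.5 * (np.log(2 * np.pi * self.var_)
                       + (X - self.mean_) ** 2 / self.var_).sum(axis=1)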

While you could certainly do the first, and probably the second given some
familiarity, doing the last two is hard if you are not using the method
actively in your day-to-day data mangling.
And even if the implementation were refurbished but you were not around
afterwards, it is not clear who would be able to maintain it.

I don't think implementing coresets is a good idea, because it is mostly
helpful for cluster computing afaik.
Also, it adds more abstractions on top of a suboptimal interface and
implementation.
Additionally, I would really like to limit the number of additional
estimators before 1.0.

If you feel up to the task of really making this a great implementation,
and also taking care of it in the long run,
please go ahead with the proposal. But I think that might be a bit much
to ask for a GSoC.

Cheers,
Andy

ps: only my opinion ;)
Manoj Kumar
2014-01-30 06:48:20 UTC
Hi Andy.

Thanks for the response :)

I'm looking into the project ideas, but I am unable to zero in on a single
idea for GSoC. My knowledge is limited to linear and clustering models;
however, I am willing to learn and read the literature well before GSoC,
and I am a pretty quick learner. It would be really nice if you or some of
the other sklearn devs could suggest a couple more ideas (maybe 2 or 3
estimators together, or improving existing estimators) that would help me
write a successful GSoC proposal.

Thanks again,
--
Regards,
Manoj Kumar,
Mech Undergrad
http://manojbits.wordpress.com
Andy
2014-02-02 21:42:30 UTC
Hi Manoj.
Unfortunately I cannot give you any advice at the moment. I am way too
swamped to take care of GSoC :-/
I think in both clustering and linear models there is a lot of room for
improvement.
For clustering there was BIRCH (that's the name, right?) that I think
Olivier wanted to implement. Maybe that would be an
interesting GSoC project? You'd have to ask Olivier and possibly Gael,
though.
Gael is also working on some more agglomerative clustering algorithms,
I'm not sure what the status is there.

What kind of linear models did you work on? I think Alex wanted to
improve the Bayesian linear models.

Sorry I can't be of more help.

Andy
Gael Varoquaux
2014-02-03 10:47:12 UTC
I think that linear models could use a lot of improvements in
scikit-learn. Amongst other things, the 'strong rules' could and should
be included in scikit-learn. Moreover, our coordinate descent code can
probably be made more efficient by choosing the coordinates it optimizes
in better ways.

I would personally be excited by a GSoC that makes linear models faster,
and I know that there is a lot of room for improvement.

Specifically, for linear models, I think that the following are
improvements that I would like to see (in increasing order of
difficulty):

- Better strategy to choose coordinates in the coordinate descent.
Chances are that simply randomizing the choice would be better than the
linear traversal that we are doing (see the sketch after this list). My
personal bias would be to benchmark the performance on "wide data": many
features, not many samples, as in bioinformatics; I would be particularly
interested in neuroimaging data.

- Strong rules

- Integrating part of Mathieu Blondel's lightning. Mathieu would need to
be heavily involved here.
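
As a concrete reference for the first item, a minimal pure-Python sketch of
lasso coordinate descent with a randomized coordinate choice; illustrative
only, not the actual cd_fast.pyx code:

import numpy as np

def lasso_cd_random(X, y, alpha, n_iter=1000, seed=0):
    rng = np.random.RandomState(seed)
    n_features = X.shape[1]
    w = np.zeros(n_features)
    R = y - X.dot(w)                  # residual, kept incrementally updated
    col_norms = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        j = rng.randint(n_features)   # random pick instead of cyclic traversal
        w_old = w[j]
        rho = X[:, j].dot(R) + col_norms[j] * w_old
        # soft-thresholding: the closed-form coordinate minimizer under l1
        w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_norms[j]
        R += X[:, j] * (w_old - w[j])
    return w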

However, we have already had a student on a similar project (specifically
the strong rules) and the strong rules never got merged in because the
maths were too hard for the student, and he wasn't able to implement them
cleanly.

Manoj, you are motivated, and you have been producing code. I like that
aspect a lot. I think that you should try and play with the parts of the
codebase that you might be interested in modifying in a GSOC to see if
you master them well enough to propose an enhancement. The more you try
to understand and improve the current codebase, the more you read the
corresponding literature, the more convincing you will sound. Part of
preparing the GSoC is making sure that you can specify a project where
you know where to go and feel you can be successful.

If I feel that you are up to the task, I could be motivated to mentor
you in a GSoC on linear models.

Cheers,

Gaël
Manoj Kumar
2014-02-04 07:39:44 UTC
Hi Gael,

Thanks for considering mentoring me as part of GSoC.
Post by Gael Varoquaux
- Better strategy to choose coordinates in the coordinate descent.
Chances are that simply randomizing the choice would be better than the
linear traversal that we are doing. My personal bias would be to
benchmark the performance on "wide data": many features, not many
samples, as in bioinformatics; I would be particularly interested
in neuroimaging data.
Would you be able to tell me which part of the code-base I should play
with for this project? (I'm assuming it is the cython code in
coordinate_descent.pyx) Some references to literature would definitely help.
Post by Gael Varoquaux
- Strong rules
Alex had provided me a link to this gist,
https://gist.github.com/fabianp/3097107 . Sorry for sounding dumb, but is
this one of the "strong rules"?


And one last question, what about generalized additive models? Would that
be a good GSoC project to do?


Thanks.
Alexandre Gramfort
2014-02-04 08:04:00 UTC
Would you be able to tell me which part of the code-base I should play with
for this project? (I'm assuming it is the cython code in
coordinate_descent.pyx)
yes.
Some references to literature would definitely help.
you have a few on the wiki page.

you can also start from wikipedia:

http://en.wikipedia.org/wiki/Coordinate_descent
http://en.wikipedia.org/wiki/Random_coordinate_descent

you can try to write the maths to see if you can find the update rules.
Post by Gael Varoquaux
- Strong rules
Alex had provided me a link to this gist,
https://gist.github.com/fabianp/3097107 . Sorry for sounding dumb, but is
this one of the "strong rules"?
yes
And one last question, what about generalized additive models? Would that be
a good GSoC project to do?
I am +0 on this now. I suggested MARS/EARTH as there is already some code
which would facilitate success.

Alex
Gael Varoquaux
2014-02-04 08:28:55 UTC
Post by Alexandre Gramfort
Post by Manoj Kumar
Alex had provided me a link to this gist,
https://gist.github.com/fabianp/3097107 . Sorry for sounding dumb, but is
this one of the "strong rules"?
yes
http://arxiv.org/pdf/1011.2234
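
For reference, the basic sequential strong rule from that paper, in the
lasso case: when fitting at \lambda_k on a regularization path, given the
previous solution \hat{w}(\lambda_{k-1}), discard predictor j if

    |x_j^T (y - X \hat{w}(\lambda_{k-1}))| < 2\lambda_k - \lambda_{k-1}.

The rule is not safe, so the discarded predictors must be re-checked
against the KKT conditions after fitting.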
Post by Alexandre Gramfort
Post by Manoj Kumar
And one last question, what about generalized additive models? Would that be
a good GSoC project to do?
I am +0 on this now. I suggested MARS/EARTH as there is already some code
which would facilitate success.
I would personally be more excited about merging in the fast logistic
regression and SVM from lightning https://github.com/mblondel/lightning.

G
Nick Pentreath
2014-02-04 08:32:12 UTC
That does seem like it would be a very worthwhile project - but why was
lightning outside scikit-learn initially? Are some of the algorithms too
cutting edge or not cited enough, or some other reason?



Gael Varoquaux
2014-02-04 08:34:01 UTC
Are some of the algorithms too cutting edge or not cited enough,
Yes
or some other reason?
I think that it is good practice to explore new ideas outside of
scikit-learn. It usually takes a lot of effort and time to figure out
whether an approach will bring a lot of benefit or not.

Gaël
--
Gael Varoquaux
Researcher, INRIA Parietal
Laboratoire de Neuro-Imagerie Assistee par Ordinateur
NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
Phone: ++ 33-1-69-08-79-68
http://gael-varoquaux.info http://twitter.com/GaelVaroquaux
Nick Pentreath
2014-02-04 09:03:23 UTC
Makes sense


Well, from what I understand, just getting the multi-class logistic and SVM losses and the group lasso penalty into scikit-learn seems like a worthwhile undertaking.
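
For context, the group lasso penalty mentioned here, in its common form for
coefficients w partitioned into groups g in a set G, is

    \Omega(w) = \lambda \sum_{g \in G} \sqrt{d_g} \, \|w_g\|_2

with d_g the size of group g; because the \ell_2 norm is not squared,
entire groups are zeroed out at once, the way the \ell_1 norm zeroes out
single coefficients.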

Manoj Kumar
2014-02-04 14:21:17 UTC
A really naive question about Cython. Let's say I make some changes in
cd_fast.pyx and I want to debug something by printing it out; how do I do
it?

I tried recompiling from my top-level directory using "sudo python setup.py
build_ext --inplace", but it doesn't seem to print when I run my script.
What am I doing wrong?
Gael Varoquaux
2014-02-04 14:23:10 UTC
Post by Manoj Kumar
I tried recompiling from my top-level directory using "sudo python setup.py
build_ext --inplace", but it doesn't seem to print when I run my script.
What am I doing wrong?
You need to regenerate the C files from the Cython source, by running
'cython cd_fast.pyx' in the corresponding directory.

G
Andy
2014-02-04 21:00:25 UTC
Post by Manoj Kumar
I tried recompiling from my top-level directory using "sudo python setup.py
build_ext --inplace", but it doesn't seem to print when I run my script.
What am I doing wrong?
Usually just running the setup.py inside the folder you changed is
enough and makes compilation much faster.
Manoj Kumar
2014-02-05 04:57:12 UTC
Hi,

I went through the enet_coordinate_descent function in cd_fast.pyx. I have
some questions which are noobish but I'll go ahead and ask them anyway.

It seems that at L176, in each cycle, each \omega_j is updated as

    \frac{\omega_j \sum_{i=1}^n (X_i^j)^2 - \alpha + \sum_{i=1}^n (y_i - X'\omega)(X_j^i)}{\sum_{i=1}^n X_i^j + \beta}    ...(1)

when the term other than alpha in the numerator is greater than alpha
(correct me if I'm wrong).



When I went through the Wikipedia article, and from my previous knowledge,
don't we just take the partial derivative of the cost function with respect
to \omega_j and equate it to zero for one cycle of iterations?


The cost function is
    1/2 * norm(y - X w, 2)^2 + alpha * norm(w, 1) + beta/2 * norm(w, 2)^2

If we differentiate this with respect to w and equate to zero, shouldn't we
get something like

    -\frac{\alpha + \sum_{i=1}^n (y_i - X'\omega)(X_j^i)}{\beta}  ?

I don't understand where I am going wrong.
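
For reference, the resolution of this question: the \ell_1 term is not
differentiable at zero, so setting an ordinary derivative to zero is not
valid there; the subgradient condition yields the soft-thresholded
coordinate update

    \omega_j \leftarrow \frac{S\left( \sum_{i=1}^n X_i^j \left( y_i - \sum_{k \neq j} X_i^k \omega_k \right),\ \alpha \right)}{\sum_{i=1}^n (X_i^j)^2 + \beta},
    \qquad S(z, \alpha) = \mathrm{sign}(z) \max(|z| - \alpha, 0),

which matches the corrected expression in the follow-up message when the
argument of S is positive.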
Manoj Kumar
2014-02-05 05:02:44 UTC
I'm sorry, it should just be

    \frac{\omega_j \sum_{i=1}^n (X_i^j)^2 - \alpha + \sum_{i=1}^n (y_i - X'\omega)(X_j^i)}{\sum_{i=1}^n (X_i^j)^2 + \beta}

in the first equation.
--
Regards,
Manoj Kumar,
Mech Undergrad
http://manojbits.wordpress.com
Issam
2014-02-05 12:02:24 UTC
Hi Scikit reviewers,

I have been working with scikit-learn on three pull requests - namely,
Multi-layer Perceptron (MLP), Sparse Auto-encoders, and Gaussian
Restricted Boltzmann Machines.

For the upcoming GSoC, I propose to complete these three pull requests.
I would also develop a greedy layer-wise training algorithm for deep
learning, extending MLP to allow for more than one hidden layer, with
weights initialized using sparse auto-encoders or RBMs.

Would this be suitable for GSoC?

I thank you earnestly in advance...

Best Regards,
--Issam
Gael Varoquaux
2014-02-05 15:30:12 UTC
Post by Issam
I have been working with scikit-learn on three pull requests - namely,
Multi-layer Perceptron (MLP), Sparse Auto-encoders, and Gaussian
Restricted Boltzmann Machines.
Yes, you have been doing good work here!
Post by Issam
For the upcoming GSoC, I propose to complete these three pull requests.
I would also develop a greedy layer-wise training algorithm for deep
learning, extending MLP to allow for more than one hidden layer, with
weights initialized using sparse auto-encoders or RBMs.
Would this be suitable for GSoC?
The MLP is almost finished. I would hope that it would be finished before
the GSoC. Actually, I was hoping that it could be finished before the next
release.

For the rest, I'll let someone who knows neural nets better than me
reply, as I don't know the state of the art, and I don't know what is
feasible in deep learning without GPUs.


Cheers,

Gaël
Kyle Kastner
2014-02-05 17:40:02 UTC
Not to bandwagon extra things on this particular effort, but one future
consideration is that if scikit-learn supported multilayer neural networks,
and eventually multilayer convolutional neural networks, it would become
feasible to load pretrained nets à la OverFeat or DeCAF (recent papers with
sweet results) and use them as transforms.

I am doubtful about the ability to train a reasonably deep neural network
without GPU, specialized hardware, or a server, but I think loading
pretrained coefficients and using them as a transform is very reasonable.
It may be too "messy" for adoption in scikit-learn, but an adapter layer
could be very useful - I know this is basically what I and other
competitors used for a recent kaggle competition with great success.

Kyle
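
A minimal sketch of that adapter idea, assuming the weight and bias arrays
have already been exported from an externally trained net (all names here
are hypothetical, not an existing API):

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class PretrainedNetTransform(BaseEstimator, TransformerMixin):
    def __init__(self, weights, biases):
        self.weights = weights   # list of (n_out, n_in) arrays, frozen
        self.biases = biases     # list of (n_out,) arrays, frozen

    def fit(self, X, y=None):
        return self              # nothing to learn: coefficients are fixed

    def transform(self, X):
        H = np.asarray(X, dtype=float)
        for W, b in zip(self.weights, self.biases):
            H = np.maximum(H.dot(W.T) + b, 0.0)   # rectifier activations
        return H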


Andy
2014-02-05 18:54:14 UTC
That was convolutional nets for images, right? Or did you have some
other pretrained net?
I don't think convnets for images are in scope, and an adapter should
rather live in a project like DeCAF.
Andy
2014-02-05 18:58:37 UTC
Post by Gael Varoquaux
Yes, you have been doing good work here!
+1
Post by Gael Varoquaux
The MLP is almost finished. I would hope that it would be finished before
the GSoC. Actually, I was hoping that it could be finished before the next
release.
I'm also still hopeful there.
Unfortunately I will definitely be unable to mentor.

About pretraining: that is really out of style now ;)
Afaik "everybody" is now doing purely supervised training using drop-out.

Implementing pretrained deep nets should be fairly easy for a user if we
support more than one hidden layer, by just doing a pipeline of RBMs /
autoencoders (a sketch follows below). As that is not that popular any
more, I don't think we should put much effort there.
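
A minimal sketch of that pipeline idea, using the BernoulliRBM estimator
that already lives in sklearn.neural_network (the layer sizes here are
arbitrary):

from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Pipeline.fit runs fit_transform on each step in turn, so each RBM is
# trained greedily on the previous layer's output: layer-wise pretraining.
pretrained_net = Pipeline([
    ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05)),
    ("clf", LogisticRegression()),   # supervised model on top
])
# usage: pretrained_net.fit(X, y); pretrained_net.predict(X_new)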

Deeper nets might be interesting, but I'm quite sceptical about doing that
without GPUs.

On the other hand I think it should be possible for you to find a topic
around these general concepts.
But I'm not sure who could mentor.

Cheers,
Andy
Thomas Johnson
2014-02-05 19:17:38 UTC
Apologies if this is slightly offtopic, but is there a high-quality Python
implementation of DropOut / DropConnect available somewhere?
abhishek
2014-02-05 19:23:25 UTC
Hi all,

As this is the topic for the neural networks extension in scikit-learn for
GSoC, I'd like to ask whether GSoC projects can be done in groups of two,
as I'm interested in developing extensions but it would be great to have
some help from @issam.

Regards,
Abhishek
Nicholas Dronen
2014-02-05 19:32:38 UTC
Hi, Thomas:

Pylearn2 supports dropout:


https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/costs/mlp/dropout.py

Regards,

Nick


Kyle Kastner
2014-02-05 20:36:06 UTC
I can say that pylearn2 does NOT (in the main branch, at least) have an
implementation of DropConnect - only dropout as Nick mentioned. A tutorial
on using DropConnect is here:
http://fastml.com/regularizing-neural-networks-with-dropout-and-with-dropconnect/

Kyle
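
Since the question of implementations came up: a minimal numpy sketch of
plain dropout itself (training-time masking with "inverted" scaling), not
taken from pylearn2 or any other library:

import numpy as np

def dropout(H, p_drop=0.5, seed=None, training=True):
    if not training:
        return H                          # test time: use full activations
    rng = np.random.RandomState(seed)
    mask = rng.rand(*H.shape) >= p_drop   # keep each unit with prob 1 - p_drop
    return H * mask / (1.0 - p_drop)      # scaling keeps expectations equal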
Issam
2014-02-24 20:47:07 UTC
Post by Andy
Implementing pretrained deep nets should be fairly easy for a user if we
support more than one hidden layer, by just doing a pipeline of RBMs /
autoencoders. As that is not that popular any more, I don't think we
should put much effort there.
Very true, it is all about having a stack of hidden layers :)
Post by Andy
On the other hand I think it should be possible for you to find a topic
around these general concepts.
I am working on extending the Extreme Learning Machine in my thesis; I
think that would be a good fit. It differs from backpropagation in that,
instead of running gradient descent to find the weights that minimize the
objective function, it solves a least-squares problem. This means that it
is much faster. While we don't hear about ELMs much, they are in fact
highly cited.
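
A minimal sketch of the ELM idea described here (names illustrative):
random, untrained hidden weights, then output weights by least squares:

import numpy as np

def elm_fit(X, Y, n_hidden=100, seed=0):
    rng = np.random.RandomState(seed)
    W = rng.randn(n_hidden, X.shape[1])   # random input weights, never trained
    H = np.tanh(X.dot(W.T))               # hidden-layer activations
    beta = np.linalg.lstsq(H, Y, rcond=None)[0]   # least-squares output weights
    return W, beta

def elm_predict(X, W, beta):
    return np.tanh(X.dot(W.T)).dot(beta)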


*Extreme learning machine: theory and applications*
<http://www.sciencedirect.com/science/article/pii/S0925231206000385>
has 1285 citations and was published in 2006; a large number of
citations for a fairly recent article. I believe scikit-learn could add
such an interesting learning algorithm along with its variations
(weighted ELMs, sequential ELMs, etc.)
Post by Kyle Kastner
Not to bandwagon extra things on this particular effort, but one
future consideration is that if scikit-learn supported multilayer
neural networks, and eventually multilayer convolutional neural
networks, it would become feasible to load pretrained nets à la
OverFeat or DeCAF (recent papers with sweet results) and use them as
transforms.
I have implemented convolutional neural networks, but, like you said,
training them is not feasible without GPUs. What you mentioned sounds
like a great opportunity to make NNs attractive for huge datasets, but it
seems there isn't anyone willing to mentor an NN project.
Post by Andy
Afaik "everybody" is now doing purely supervised training using drop-out.
Interesting how drop-out gained momentum when it is fairly recent :)
Since I have a basic version of the algorithm, I can work on it during GSoC.

Chances are the multi-layer perceptron PR will be completed before the
summer, so it won't be included in the GSoC proposal.

In order to avoid scope creep, I compiled the following list of
algorithms to propose for GSoC 2014:

1) Extreme Learning Machines
(http://sentic.net/extreme-learning-machines.pdf)
1a) Weighted Extreme Learning Machines
1b) Sequential Extreme Learning Machines

2) Completing Sparse Auto-encoders

3) Extending MLP to support multiple hidden layers
3a) Deep Belief Network

Thank you very much!
Gael Varoquaux
2014-02-25 06:52:50 UTC
Post by Issam
I am working on extending the Extreme Learning Machine in my thesis; I
think that would be a good fit. It differs from backpropagation in that,
instead of running gradient descent to find the weights that minimize
the objective function, it solves a least-squares problem. This means
that it is much faster. While we don't hear about ELMs much, they are in
fact highly cited. I believe scikit-learn could add such an interesting
learning algorithm along with its variations (weighted ELMs, sequential
ELMs, etc.)
It does sound like a possible candidate for inclusion.
Post by Issam
In order to avoid scope creep, I compiled the following list of
algorithms to propose for GSoC 2014:
1) Extreme Learning Machines (http://sentic.net/extreme-learning-machines.pdf)
1a) Weighted Extreme Learning Machines
1b) Sequential Extreme Learning Machines
2) Completing Sparse Auto-encoders
3) Extending MLP to support multiple hidden layers
3a) Deep Belief Network
Sounds reasonable. I think that you should open a wiki page, an issue,
or some document where we can keep track of this info and work on
building a full proposal.

Cheers,

Gaël
Lars Buitinck
2014-02-26 12:29:43 UTC
Post by Gael Varoquaux
Post by Issam
Extreme learning machine: theory and applications has 1285 citations
and was published in 2006; a large number of citations for a fairly
recent article. I believe scikit-learn could add such an interesting
learning algorithm along with its variations (weighted ELMs, sequential
ELMs, etc.)
It does sound like a possible candidate for inclusion.
We have a PR that implements them, but in too convoluted a way. My
personal choice for implementing these would be a transformer doing a
random projection + nonlinear activation. That way, you can stack any
linear model on top (think SGDClassifier for large-scale work) and get
a basic ELM. I've toyed with this variant before (typing this from
memory):

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils import check_random_state
from sklearn.utils.extmath import safe_sparse_dot

class RandomHiddenLayer(BaseEstimator, TransformerMixin):
    def __init__(self, n_components=100, random_state=None):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X, y=None):
        # draw the fixed random projection ("hidden layer") weights
        random_state = check_random_state(self.random_state)
        self.components_ = random_state.randn(self.n_components, X.shape[1])
        return self

    def transform(self, X):
        return np.tanh(safe_sparse_dot(X, self.components_.T))

Now, make_pipeline(RandomHiddenLayer(), SGDClassifier()) is an ELM
except with regularized hinge loss instead of least squares. I guess
LDA can be used to get the "real" ELM.

I recently implemented baseline RBF networks in pretty much the same
way: k-means + RBF kernel + linear classifier. I didn't submit a PR
because it's just a pipeline of existing components.
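
A sketch of that baseline as a reusable transformer, assuming a
hypothetical RBFLayer wrapper around the existing KMeans and rbf_kernel
pieces:

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

class RBFLayer(BaseEstimator, TransformerMixin):
    def __init__(self, n_centers=50, gamma=1.0, random_state=None):
        self.n_centers = n_centers
        self.gamma = gamma
        self.random_state = random_state

    def fit(self, X, y=None):
        self.kmeans_ = KMeans(n_clusters=self.n_centers,
                              random_state=self.random_state).fit(X)
        return self

    def transform(self, X):
        # RBF activation of each sample w.r.t. each learned center
        return rbf_kernel(X, self.kmeans_.cluster_centers_, gamma=self.gamma)

rbf_net = make_pipeline(RBFLayer(n_centers=50), LogisticRegression())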
Post by Issam
In order to avoid scope creep, I compiled the following list of
algorithms to propose for GSoC 2014:
1) Extreme Learning Machines (http://sentic.net/extreme-learning-machines.pdf)
1a) Weighted Extreme Learning Machines
1b) Sequential Extreme Learning Machines
Does sequential mean for sequence data?
Gael Varoquaux
2014-02-26 12:32:08 UTC
Post by Lars Buitinck
I recently implemented baseline RBF networks in pretty much the same
way: k-means + RBF kernel + linear classifier. I didn't submit a PR
because it's just a pipeline of existing components.
Your points about transformers and pipelines are all good ones. Part of
the work for 'deep learning' in scikit-learn is documentation and
examples to exhibit these patterns better.

G
Vlad Niculae
2014-02-26 12:40:18 UTC
Post by Gael Varoquaux
documentation and examples
This was exactly my thought. Many such (near-)equivalences are not
obvious, especially for beginners. If Lars's hinge ELM and RBF network
work well (or provide interesting feature visualisations) on some sklearn
dataset, an example would be very awesome.

The KMeans + sparse coding transformer that was lying around in a PR
might also be expressible as a pipeline, I guess.

Vlad
Lars Buitinck
2014-02-26 12:44:50 UTC
ELM on digits works extremely well: https://gist.github.com/larsmans/2493300
Mathieu Blondel
2014-02-26 14:13:40 UTC
+1 for an RBF network transformer (with an option to choose between k-means
and random sampling).

Mathieu
Issam
2014-02-26 12:42:50 UTC
Permalink
Post by Lars Buitinck
We have a PR that implements them, but in too convoluted a way. My
personal choice for implementing these would be a transformer doing a
random projection + nonlinear activation. That way, you can stack any
linear model on top (think SGDClassifier for large-scale work) and get
a basic ELM. I've toyed with this variant before (typing this from
Yes, I am aware of the PR; it is too complex, like you said. The
algorithm can be implemented in a much simpler way, however :).
Post by Gael Varoquaux
Part of the work for 'deep learning' in scikit-learn is
documentation and examples to exhibit these patterns better.
Indeed, neural networks deal a lot with stacking kernels and
classifiers on top of one another; it would be prudent to devote one
complete section in the Neural Networks documentation to this. Or
perhaps special pipelines to simplify such common tasks.
Post by Lars Buitinck
Does sequential mean for sequence data?
Yes, by sequential I mean training on the dataset in small batches,
equivalent to "partial_fit". It would allow ELMs to work on datasets of
over a million rows.
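A hedged sketch of that training pattern, with SGDClassifier standing in
for the (not yet existing) sequential ELM:

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
X = rng.randn(100000, 20)
y = (X[:, 0] > 0).astype(int)

clf = SGDClassifier()
classes = np.unique(y)
# Feed the data in small batches instead of one huge in-memory solve.
for start in range(0, X.shape[0], 1000):
    stop = start + 1000
    clf.partial_fit(X[start:stop], y[start:stop], classes=classes)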
Post by Lars Buitinck
I think that you should open a wiki page or an issue,
or some document where we can keep track of this info and work on
building a full proposal.
That's a really good idea; I will create a wiki page describing the
proposal in much more detail before submitting it :).

Thank you
~I
Gael Varoquaux
2014-02-26 12:51:51 UTC
Permalink
Or perhaps special pipelines to simplify such common tasks.
I'd rather avoid special pipelines. For me, that would mean that we have
an API problem with the pipeline that needs to be identified and
solved.

G
Lars Buitinck
2014-02-26 12:55:11 UTC
Permalink
Post by Gael Varoquaux
Or perhaps special pipelines to simplify such common tasks.
I'd rather avoid special pipelines. For me, that would mean that we have
an API problem with the pipeline that needs to be identified and
solved.
Well, for deep learning, you'd want a generalized backprop on the
final N steps, I guess :p
Gael Varoquaux
2014-02-26 12:56:42 UTC
Permalink
Post by Lars Buitinck
Well, for deep learning, you'd want a generalized backprop on the
final N steps, I guess :p
OK. Point taken!

G
federico vaggi
2014-02-26 15:56:10 UTC
Permalink
As an aside, Lars - I'd actually love to see the recipe, if you don't
mind putting up a gist or notebook.
James Bergstra
2014-02-27 17:12:28 UTC
Permalink
I'd be happy to help define and mentor this PR, if a mentor is needed. I'd
really like to see this nnet work merged into sklearn, and some of the
other ideas that have been mentioned here too (e.g. docs & code for ELM).


Issam
2014-02-28 19:18:30 UTC
Permalink
Post by James Bergstra
I'd be happy to help define and mentor this PR, if a mentor is needed.
I'd really like to see this nnet work merged into sklearn, and some of
the other ideas that have been mentioned here too (e.g. docs & code
for ELM).
Hi James, that would be more than great.

@Gael has sent the message below for those who wish to mentor, so they
could fill in the form at the given link. However, the link doesn't seem
to work.
Post by Gael Varoquaux
Hi,
I have had a number of people telling me that they could pitch in to
help with mentoring for the GSoC. If these people could fill in the PSF
sign-up form:
https://docs.google.com/forms/d/1wEaF22w2sKQY4iWxzFKmuYPf6ZEgV4uUKEKssHFSTBg/viewform
Cheers,
Gaël
--Issam
Olivier Grisel
2014-03-14 14:38:52 UTC
Permalink
Issam, if I am not mistaken, you have not written an official proposal
for this GSoC application.

If you are still interested, there is an official template to follow
for PSF affiliated sub-projects (such as scikit-learn):

https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2014

You can also have a look at Manoj's submission:

https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2014-Application:-Improved-Linear-Models
--
Olivier
Issam
2014-03-15 13:29:04 UTC
Permalink
Thanks Olivier, I will upload the proposal very soon.

While doing so, I will strengthen my proposal by implementing a basic
version of each of the proposed algorithms, which I will cite in my
proposal.

Cheers. :)
Jaidev Deshpande
2014-03-19 18:00:50 UTC
Permalink
Hi Issam,

What's the update on your proposal? I don't mean to rush you at all;
you're probably working hard on the proposal as we speak. This is just a
gentle bump.

All the best.
--
JD
Issam
2014-03-19 19:11:15 UTC
Permalink
Thank you for the reminder; I didn't realize the deadline was so close
:(. I got sidetracked by a quiz I had.

Anyhow, I will finish and upload the proposal today, and hopefully we
can review it thoroughly tomorrow.

Sorry for the delay.

Thanks.
Issam
2014-03-20 00:24:38 UTC
Permalink
Hi all,

I uploaded the Neural Network proposal to this link,

https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2014:-Extending-Neural-Networks-Module-for-Scikit-learn

Please see whether it is detailed enough to be a promising proposal.

Thank you.
~Issam
Jaidev Deshpande
2014-03-20 07:47:15 UTC
Permalink
Hi Issam,

Looks OK at first glance, but please add it as a gist. Those are much
easier to comment on.

Thanks
--
JD
Issam
2014-03-20 09:47:17 UTC
Permalink
True that. I uploaded the proposal as a gist at this link:

https://gist.github.com/IssamLaradji/9660324

Thank you.

~Issam
Issam
2014-03-20 19:44:34 UTC
Permalink
Hi all,

I uploaded the proposal for Neural Networks to Melange; here is the
public link.

http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/issamou/5668600916475904

Thank you.

Regards,
~Issam
Mathieu Blondel
2014-03-21 02:42:45 UTC
Permalink
Naive questions from someone who knows nothing about ELM. What's the
motivation for implementing both non-regularized and regularized ELM? If
the former tends to overfit, I would keep only the latter. And why do you
need 2 weeks for implementing the regularized variant? Is the algorithm
fundamentally different from the non-regularized variant?

Mathieu
Issam
2014-03-21 09:18:53 UTC
Permalink
Hi Mathieu,

The regularized version is fundamentally different from the
non-regularized version, in that it takes the derivative of a
regularized objective function and solves for it with a least-squares
solution. The basic ELM version is the classic one, so it wouldn't hurt
to have them both :). Further, it is not clear that the regularized
version always performs better. And many extensions of ELM assume the
non-regularized version - like kernel-based ELMs and sequential ELMs.
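
To make that concrete, here is a hedged numpy sketch of the two
least-squares solutions in question, following Huang et al.'s
formulation; the names are mine (H is the hidden-layer activation
matrix, T the targets, C the regularization strength):

import numpy as np

def elm_output_weights(H, T, C=None):
    # Basic ELM: minimum-norm least squares via the Moore-Penrose
    # pseudo-inverse.
    if C is None:
        return np.linalg.pinv(H).dot(T)
    # Regularized ELM: ridge-style solution (I/C + H'H)^-1 H'T.
    n_hidden = H.shape[1]
    return np.linalg.solve(np.eye(n_hidden) / C + H.T.dot(H),
                           H.T.dot(T))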

It shouldn't take two weeks, sorry. After implementing the
non-regularized version it would take two more days at worst.

Since I am removing sparse auto-encoders from the proposal, I will add
two other variants of Extreme Learning Machines,

1) Weighted Extreme Learning Machines for imbalanced data

2) Kernel-based Extreme Learning Machines (a sketch of the kernel
solution follows this list), with:
2a) Radial basis function kernel
2b) Polynomial kernel
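
For reference, a hedged sketch of the kernelized decision function,
following Huang et al.'s unified ELM formulation (the notation is mine;
Omega is the kernel matrix over the N training samples):

f(x) = \begin{bmatrix} K(x, x_1) \\ \vdots \\ K(x, x_N) \end{bmatrix}^{\top}
       \left( \frac{I}{C} + \Omega \right)^{-1} T,
\qquad \Omega_{ij} = K(x_i, x_j)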

If there is time, I will implement an Extreme Learning Machine variant
that adds hidden neurons incrementally without recalculating the whole
least-squares solution [1]. In other words, it can quickly find the best
number of hidden neurons for a particular problem.

Thanks.

[1] Error minimized extreme learning machine with growth of hidden
nodes and incremental learning,
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5161346
Arnaud Joly
2014-03-21 10:14:18 UTC
Permalink
Hi Issam,


Why not start by improving the multilayer neural network before adding new algorithms?

To the neural network experts: would it be interesting to have layer configuration à la Torch,
https://github.com/torch/nn/blob/master/README.md ?

Best,
Arnaud
Gael Varoquaux
2014-03-21 10:16:26 UTC
Permalink
Post by Arnaud Joly
To the neural network experts: would it be interesting to have layer configuration à la Torch,
https://github.com/torch/nn/blob/master/README.md ?
I think that this should be done only after implementing a base set of
useful algorithms. I am not sure that there is time in a GSOC to get
there.
Mathieu Blondel
2014-03-21 11:08:46 UTC
Permalink
Hi Issam,

Thanks for the clarification. Another remark. It seems to me that it would
be nice if you allocated, say, two weeks at the beginning of your GSOC to
finish your on-going PRs, especially the MLP one.

Mathieu


Jaidev Deshpande
2014-03-21 11:21:12 UTC
Permalink
Post by Mathieu Blondel
Hi Issam,
Thanks for the clarification. Another remark. It seems to me that it would
be nice if you allocated, say, two weeks at the beginning of your GSOC to
finish your on-going PRs, especially the MLP one.
+1
--
JD
Jaidev Deshpande
2014-03-21 11:28:30 UTC
Permalink
I'm sorry, I did not intend to +1 that. Does it make sense for Issam to
allocate two weeks of GSoC coding time to finish the MLP PRs? I think it
can be done in less time. (Ideally, can't it be done before or during the
community bonding period?)
--
JD
Issam
2014-03-21 11:57:40 UTC
Permalink
How about assigning the first week to finalizing the PR? Because the
documentation hasn't been thoroughly reviewed yet.

Thanks
Jaidev Deshpande
2014-03-21 12:02:57 UTC
Permalink
Good enough.
--
JD
Issam
2014-03-21 12:54:38 UTC
Permalink
Hi all,

I updated the Neural Network proposal in Melange:

http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/issamou/5668600916475904

Thank you.

~Issam
James Bergstra
2014-03-21 13:25:51 UTC
Permalink
The proposal looks good to me! A few small comments:

1. I'm confused by the paragraph on regularized ELMs: I think you mean that
in cases where the hidden weights (the classifier?) are *underdetermined*
because there are far more *unknowns* than *samples*, then you need to
regularize somehow. (Right!?)

2. Testing: no mention of how you will test any of this work. It's hard to
know when an ML algorithm is implemented well. How will you know? Usually
reproducing published results is a good bar to aim for; which ones do you
have in mind? E.g. if there are some results in your PhD thesis that you
want to reproduce, then mention that. How long does it take to train such
things? Do you need access to big computers?

3. If you are just now completing your Masters degree on such models, you
might want to mention that in your proposal's "Past Work" section :)
Issam
2014-03-21 13:50:30 UTC
Permalink
Post by James Bergstra
1. I'm confused by the paragraph on regularized ELMs: I think you mean
that in cases where the hidden weights (the classifier?) are
*underdetermined* because there are far more *unknowns* than *samples*,
then you need to regularize somehow. (Right!?)
I meant the opposite :) - there are usually far more "samples" than
"unknowns". The unknowns depend on the number of hidden neurons and
output neurons, which is usually small.

Typically the hidden weights matrix (the weights going out of the hidden
neurons to the output neuron) is a 150x1 matrix. In other words, there
are 150 hidden neurons and 1 output neuron. This means there are 150
unknown variables. Since least-squares solutions can be considered as
systems of linear equations, solving for 150 unknown variables is
possible with 150 samples. But datasets are usually as large as 10,000
samples, meaning the number of unique solutions is very large as well;
hence, overdetermined (http://en.wikipedia.org/wiki/Overdetermined_system).

Therefore, regularization would constrain the set of solutions by
making sure they satisfy a meaningful constraint - like SVM's
maximization of the margin between classes.

Sorry that this wasn't clear in the proposal.
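
In symbols, the constrained problem I have in mind is the ridge-style
objective (a hedged rendering in my notation, with H the hidden
activations and T the targets):

\min_{\beta} \; \lVert H\beta - T \rVert^2 + \frac{1}{C} \lVert \beta \rVert^2
\quad \Longrightarrow \quad
\beta = \left( H^{\top} H + \frac{I}{C} \right)^{-1} H^{\top} T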
Post by James Bergstra
2. Testing: no mention of how you will test any of this work. It's
hard to know when an ML algorithm is implemented well. How will you
know? Usually reproducing published results is a good bar to aim for;
which ones do you have in mind? E.g. if there are some results in your
PhD thesis that you want to reproduce, then mention that. How long
does it take to train such things? Do you need access to big computers?
That's the main motivation for using Extreme Learning Machines; they take
seconds to train ;). The only obstacle is memory, because they process the
matrices all at once; however, this is where Sequential ELMs come in :).

I will add another section explaining the evaluation of the algorithms.
It would include solving systems of linear equations by hand and
comparing the results with the algorithm's output; how does that sound?
Obviously, this is besides testing for coding issues, like checking
whether the control flow works as intended.
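
For instance, a tiny hand-checkable test could look like this (a hedged
sketch; the numbers are chosen so the exact least-squares solution is
known in advance):

import numpy as np

# Hand-solvable consistent system: beta = [1, 2] reproduces T exactly.
H = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
T = np.array([1., 2., 3.])
beta = np.linalg.lstsq(H, T)[0]
np.testing.assert_allclose(beta, [1., 2.])
np.testing.assert_allclose(H.dot(beta), T)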

A bit cheesy, but I intend to cross-check the algorithms' outputs with
those of the MATLAB versions of the implementations, and Theano's
implementation of deep networks. :)
Post by James Bergstra
3. If you are just now completing your Masters degree on such models,
you might want to mention that in your proposal's "Past Work" section :)
Sure thing :).
James Bergstra
2014-03-21 14:44:10 UTC
Permalink
I get that if you have 10,000 samples and 150 features, then your system
is over-determined.
Where I think you go wrong is in worrying about a large number of unique
solutions. Over-determined typically means 0 solutions! (Have another look
at that page you linked; it's the under-determined systems that need
explicit regularization to find a unique solution.)

SVM has max-margin regularization partly because of the shape of hinge
loss, not because of the number of samples. The hinge loss is a constant 0
for large parts of the input domain, so there isn't a single "best" point
on the loss function. Conventional regularizations like L1 and L2 push the
solution to be closer to the inflection point of the hinge.

Anyway, ML in general deals with noisy data (both in classification and
numeric regression) so that's actually the dominant reason why
regularization is used, even when the system is technically overdetermined.

For your proposal, it would probably be more accurate to explain that when
training data is noisy, regularization during training can lead to more
accurate predictions on test data. That's why the regularized ELM is worth
implementing.


Also, a new topic: did you mention earlier in the thread that you need
derivatives to implement a regularized ELM? Why don't you just use some of
the existing linear (or even non-linear?) regression models in sklearn to
classify the features computed by the initial layers of the ELM? This is a
more detailed question that doesn't really affect your proposal, but I'd
like to hear your thoughts and maybe discuss it.
Post by James Bergstra
2. Testing: no mention of how you will test any of this work. It's hard
to know when an ML algorithm is implemented well. How will you know?
Usually reproducing published results is a good bar to aim for, which ones
do you have in mind? E.g. if there are some results in your PhD thesis that
you want to reproduce, then mention that. How long does it take to train
such things, do you need access to big computers?
That's the main motivation for using Extreme Learning Machines; they take
seconds to train ;). The only obstacle is memory, because it processes the
matrices all at once; however, this is where Sequential ELMs come in :).
I will add another section explaining the evaluation of the algorithms. It
would include solving systems of linear equations by hand and comparing them
with the algorithm's output; how does that sound? Obviously, this is in
addition to testing for coding issues, like checking whether the control
flow works as intended.
A bit cheesy, but I intend to cross-check the algorithms' outputs with
those of the MATLAB versions of the implementations, and Theano's
implementation of deep networks. :)
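
(One way the "solve by hand and compare" test could look, as a sketch: a
tiny system whose solution is worked out on paper, asserted against the same
least-squares routine an ELM readout would use.)

import numpy as np

# Solved by hand:  w0 + w1 = 3  and  w0 - w1 = 1  give  w = (2, 1).
H = np.array([[1.0,  1.0],
              [1.0, -1.0]])
y = np.array([3.0, 1.0])

w = np.linalg.lstsq(H, y, rcond=None)[0]
np.testing.assert_allclose(w, [2.0, 1.0])
print("hand-solved and computed solutions agree")
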
Sounds good, but I wouldn't be so confident they always take seconds to
train. I think some deep vision system models are pretty much just big
convolutional ELMs (e.g.
http://jmlr.org/proceedings/papers/v28/bergstra13.pdf) and they can take up
to, say, an hour of GPU time to (a) compute all of the features for a big
data set and (b) train the linear output model. Depending on your data set
you might want to use more than 150 output neurons! When I was doing those
experiments, it seemed that models got better and better the more outputs I
used, they just take longer to train and eventually don't fit in memory.
Issam
2014-03-21 15:19:13 UTC
Permalink
Post by James Bergstra
I get that if you have 10,000 samples and 150 features, then your
system is over-determined.
Where I think you go wrong is in worrying about a large number of
unique solutions. Over-determined typically means 0 solutions! (Have
another look at that page you linked, it's the under-determined
systems that need explicit regularization to find a unique solution.)
Sorry for the confusion. Yes, it can either mean 0 solutions or
infinitely many solutions; that's why we use the pseudo-inverse to
approximate a solution :).

Sorry again, I meant that constraints help in underdetermined systems
like in the example below; I don't know why I confused the two
:S, thanks for pointing this out.

x1 + x2 + x3 = 5
x1 + x2 - x3 = 2   (two equations, three unknowns)
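
(To make the pseudo-inverse point concrete, a sketch: np.linalg.pinv gives
the minimum-norm least-squares solution whether a system has no exact
solution or infinitely many. Both toy systems below are illustrative.)

import numpy as np

# Inconsistent (no exact solution): x = 5 and x = 2 at once.
A = np.array([[1.0], [1.0]])
b = np.array([5.0, 2.0])
print(np.linalg.pinv(A).dot(b))    # [3.5], the least-squares compromise

# Underdetermined (infinitely many solutions): x1 + x2 + x3 = 5.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([5.0])
print(np.linalg.pinv(A).dot(b))    # ~[1.67 1.67 1.67], the minimum-norm pick
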
Post by James Bergstra
Anyway, ML in general deals with noisy data (both in classification
and numeric regression) so that's actually the dominant reason why
regularization is used, even when the system is technically
overdetermined.
For your proposal, it would probably be more accurate to explain that
when training data is noisy, regularization during training can
lead to more accurate predictions on test data. That's why the
regularized ELM is worth implementing.
You are right.
Post by James Bergstra
Also: new topic: Did you mention earlier in the thread that you need
derivatives to implement a regularized ELM? Why don't you just use
some of the existing linear (or even non-linear?) regression models in
sklearn to classify the features computed by the initial layers of the
ELM? This is a more detailed question that doesn't really affect your
proposal, but I'd like to hear your thoughts and maybe discuss it.
" regression models in sklearn to classify the features computed by the
initial layers of the ELM" I didn't get this, do you mean we can use PCA
or SVDs to get more meaningful hidden features?

The derivative is for solving the dual optimization problem by the KKT
theorem, allowing us to add constraints like in SVM.
Please look at page 6 in,
http://www.ntu.edu.sg/home/egbhuang/pdf/ELM-Unified-Learning.pdf
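
(If I am reading page 6 of that reference correctly, the KKT conditions
reduce the constrained problem to a ridge-style closed form,
beta = (I/C + H'H)^{-1} H'T. A sketch under that assumption; C and the
shapes are illustrative.)

import numpy as np

def elm_output_weights(H, T, C=1.0):
    # Ridge-style closed form the KKT derivation reduces to (my reading):
    # beta = (I / C + H^T H)^{-1} H^T T
    L = H.shape[1]
    return np.linalg.solve(np.eye(L) / C + H.T.dot(H), H.T.dot(T))

rng = np.random.RandomState(0)
H = np.tanh(rng.randn(500, 150))    # stand-in for real hidden activations
T = rng.randn(500)                  # stand-in targets
beta = elm_output_weights(H, T, C=10.0)
print(beta.shape)                   # (150,)
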
Post by James Bergstra
Sounds good, but I wouldn't be so confident they always take seconds
to train. I think some deep vision system models are pretty much just
big convolutional ELMs (e.g.
http://jmlr.org/proceedings/papers/v28/bergstra13.pdf) and they can
take up to, say, an hour of GPU time to (a) compute all of the features
for a big data set and (b) train the linear output model. Depending on
your data set you might want to use more than 150 output neurons! When
I was doing those experiments, it seemed that models got better and
better the more outputs I used, they just take longer to train and
eventually don't fit in memory.
True, well "seconds to train" was in its figurative sense. However, the
time it takes is nothing like backpropagation ;). You could go as large
as 1000 hidden neurons and time is still not an issue. It wouldn't be
slower than SVM, for example. The real issue lies in memory :).

Thank you!
James Bergstra
2014-03-21 17:39:21 UTC
Permalink
Post by Issam
Post by James Bergstra
Also: new topic: Did you mention earlier in the thread that you need
derivatives to implement a regularized ELM? Why don't you just use
some of the existing linear (or even non-linear?) regression models in
sklearn to classify the features computed by the initial layers of the
ELM? This is a more detailed question that doesn't really affect your
proposal, but I'd like to hear your thoughts and maybe discuss it.
" regression models in sklearn to classify the features computed by the
initial layers of the ELM" I didn't get this, do you mean we can use PCA
or SVDs to get more meaningful hidden features?
No that's not what I meant. Maybe we can chat about this off-list after I
read more about ELMs e.g. the reference you gave below. And maybe your
Masters thesis too?
Post by Issam
The derivative is for solving the dual optimization problem by the KKT
theorem, allowing us to add constraints like in SVM.
Please look at page 6 in,
http://www.ntu.edu.sg/home/egbhuang/pdf/ELM-Unified-Learning.pdf
Thanks for the reference!
Post by Issam
Post by James Bergstra
Sounds good, but I wouldn't be so confident they always take seconds
to train. I think some deep vision system models are pretty much just
big convolutional ELMs (e.g.
http://jmlr.org/proceedings/papers/v28/bergstra13.pdf) and they can
take up to, say, an hour of GPU time to (a) compute all of the features
for a big data set and (b) train the linear output model. Depending on
your data set you might want to use more than 150 output neurons! When
I was doing those experiments, it seemed that models got better and
better the more outputs I used, they just take longer to train and
eventually don't fit in memory.
True, well "seconds to train" was in its figurative sense. However, the
time it takes is nothing like backpropagation ;). You could go as large
as 1000 hidden neurons and time is still not an issue. It wouldn't be
slower than SVM, for example. The real issue lies in memory :).
Yep, that's right. Not necessarily trivially fast, but way faster than
backprop. A pretty normal computer should be fine for the ELM and even the
backprop stuff when you get to it.
Issam
2014-03-21 18:40:05 UTC
Permalink
Post by James Bergstra
No that's not what I meant. Maybe we can chat about this off-list
after I read more about ELMs e.g. the reference you gave below. And
maybe your Masters thesis too?
Sure thing, I will give you the thesis as soon as I finish the first
write-up. :)

This is the first paper on Extreme Learning Machines, offering a good
introduction to the topic:

http://www.sciencedirect.com/science/article/pii/S0925231206000385

You might also find its presentation slides useful,

http://www.kovan.ceng.metu.edu.tr/~erol/Courses/CENG569/student-presentations/Yamac%20Kurtulus%20Ceng569%20Slide.pdf

Thanks.
~Issam
Gael Varoquaux
2014-03-21 10:33:43 UTC
Permalink
Post by Issam
You are right, I was supposed to finish my MLP PR before the summer, but
my thesis took over my time, which, fortunately, I am completing this
semester :).
Out of curiosity, when is it due?
Post by Issam
On the other hand, I have already worked on the algorithms I proposed,
meaning I will not face unexpected obstacles :).
That's definitely a good point.

Gaël
Issam
2014-03-21 10:38:46 UTC
Permalink
The thesis is due April 30, 2014, which is 19 days before GSoC starts :).
Post by Gael Varoquaux
Post by Issam
You are right, I was supposed to finish my MLP PR before the summer, but
my thesis took over my time, which, fortunately, I am completing this
semester :).
Out of curiosity, when is it due?
Gael Varoquaux
2014-03-21 10:42:32 UTC
Permalink
Post by Issam
The thesis is due April 30, 2014, which is 19 days before GSoC starts :).
Good luck! I hope you finish on time, to be able to enjoy some rest.

Gaël
Issam
2014-03-21 10:24:53 UTC
Permalink
Hi Arnaud,

You are right, I was supposed to finish my MLP PR before the summer, but
my thesis took over my time, which, fortunately, I am completing this
semester :).

Anyhow, I would start with multi-layer perceptron and deep networks
before delving into developing other algorithms.

For the layer configuration, it is absolutely necessary, as there are
many types of layers. But as @Gael said, it could be too complex and
time-consuming for the summer. And I have not done it before, so I
can't anticipate the time it takes. On the other hand, I have already
worked on the algorithms I proposed, meaning I will not face unexpected
obstacles :).

Thanks,
~Issam
Post by Jaidev Deshpande
Hi Issam,
Why not start by improving the multilayer neural network before adding new algorithms?
To the neural network experts: would it be interesting to have layer
configuration à la Torch
https://github.com/torch/nn/blob/master/README.md ?
Best,
Arnaud
Post by Issam
Hi Mathieu,
The regularized version is fundamentally different from the
non-regularized version, in that it uses the derivative of the
objective function, which it solves via a least-squares solution.
The basic ELM version is classic, so it wouldn't hurt having them
both :). Further, it is not clear that the regularized version always
performs better. And many extensions of ELM assume the
non-regularized version - like kernel-based ELMs and Sequential ELM.
It shouldn't take two weeks, sorry. After implementing the
non-regularized version it would take two more days at worst.
Since I am removing sparse auto-encoders from the proposal, I will
add two other variants of Extreme Learning Machines,
1) Weighted Extreme Learning Machines for Imbalanced Data
2) Kernel-Based Extreme Learning Machines, with,
2a) Radial Basis Function Kernel
2b) Polynomial kernel
If there is time I will implement an Extreme Learning Machine
variant that adds hidden neurons incrementally without recalculating
the whole least-squares solution [1]. In other words, it can quickly find
the best number of hidden neurons for a particular problem.
Thanks.
[1] Error minimized extreme learning machine with growth of hidden
nodes and incremental learning
<http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5161346>
Post by Mathieu Blondel
Naive questions from someone who knows nothing about ELM. What's the
motivation for implementing both non-regularized and regularized
ELM? If the former tends to overfit, I would keep only the latter.
And why do you need 2 weeks for implementing the regularized
variant? Is the algorithm fundamentally different from the
non-regularized variant?
Mathieu
Hi all,
I uploaded the proposal for Neural Networks to melange, here is
the public link.
Thank you.
Regards,
~Issam
Post by abhishek
Hi all,
I uploaded the Neural Network proposal to this link,
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2014:-Extending-Neural-Networks-Module-for-Scikit-learn
Please see if it is detailed enough as a promising proposal.
Thank you.
~Issam
On Sat, Mar 15, 2014 at 6:59 PM, Issam
Thanks Olivier, I will upload the proposal very soon.
While doing so, I will strengthen my proposal by implementing a basic
version of each of the proposed algorithms, which I will cite in my
proposal.
Cheers. :)
Hi Issam,
What's the update on your proposal? I don't mean to rush
you at all, you're probably working hard on the proposal
as we speak, this is just a gentle bump.
All the best.
Post by Olivier Grisel
Issam, if I am not mistaken you have not written an official proposal
for this GSoC application.
If you are still interested, there is an official template to follow
https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2014
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2014-Application:-Improved-Linear-Models
Hi Issam,
Looks OK at first glance, but please add it as a gist. Those
are much easier to comment on.
Thanks
--
JD
Manoj Kumar
2014-02-05 13:08:01 UTC
Permalink
I finally worked it out myself from this research paper,
http://www.stanford.edu/~hastie/Papers/glmnet.pdf
There were two mistakes that I made,

1. We need to add a w_{j} * X[:, j] term to update the residuals. (Eqn 7)

2. When X is not standardised, there should be a sum-of-squares term in
the denominator. (Eqn 10, under weighted updates)

So I understood what actually is happening in coordinate_descent.
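
(A naive numpy transcription of the update as I now understand it, with
both fixes in place: the w[j] * X[:, j] term added back into the residual
before the soft-thresholding step, and the per-feature sum of squares in
the denominator. Purely illustrative, not the cd_fast.pyx code; it also
glosses over the rescaling of alpha and beta done before cd_fast is called.)

import numpy as np

def enet_cd(X, y, alpha, beta, n_iter=100):
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    R = y - X.dot(w)                      # current residual
    for _ in range(n_iter):
        for j in range(n_features):
            R += w[j] * X[:, j]           # fix 1: remove feature j's contribution
            rho = X[:, j].dot(R)
            denom = X[:, j].dot(X[:, j]) + beta   # fix 2: sum of squares
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / denom
            R -= w[j] * X[:, j]
    return w

rng = np.random.RandomState(0)
X = rng.randn(50, 10)
y = X.dot(rng.randn(10)) + 0.1 * rng.randn(50)
print(enet_cd(X, y, alpha=0.1, beta=0.1)[:3])
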
Post by Manoj Kumar
I'm sorry, it should just be

  \frac{\omega_j \sum_{i=1}^n (X_i^j)^2 - \alpha + \sum_{i=1}^n (y_i - x_i'\omega) X_i^j}{\sum_{i=1}^n (X_i^j)^2 + \beta}

in the first equation
On Wed, Feb 5, 2014 at 10:27 AM, Manoj Kumar
Post by Manoj Kumar
Hi,
I went through the enet_coordinate_descent function in cd_fast.pyx. I
have some questions which are noobish but I'll go ahead and ask them anyway.
It seems in L176 that in each cycle, each omega_j is updated as

  \frac{\omega_j \sum_{i=1}^n (X_i^j)^2 - \alpha + \sum_{i=1}^n (y_i - x_i'\omega) X_i^j}{\sum_{i=1}^n X_i^j + \beta}   ... (1)

when the term other than alpha in the numerator is greater than alpha
(correct me if I'm wrong).
When I went through the wikipedia article, and from my previous
knowledge, don't we just take the partial derivative of the cost function
with respect to omega_j and equate it to zero for one cycle of iterations?
The cost function is

  1/2 * ||y - Xw||_2^2 + alpha * ||w||_1 + beta/2 * ||w||_2^2

If we differentiate this with respect to w and equate it to zero, shouldn't
we get something like

  -\frac{\alpha + \sum_{i=1}^n (y_i - x_i'\omega) X_i^j}{\beta}

I don't understand where I am going wrong
--
Regards,
Manoj Kumar,
Mech Undergrad
http://manojbits.wordpress.com
Manoj Kumar
2014-02-05 20:36:58 UTC
Permalink
Hi, (Sorry to be spamming this list)

I just created a wiki page for the project discussion, so that I can dump
my ideas.
https://github.com/scikit-learn/scikit-learn/wiki/Linear-models-project-discussion,-GSoC-2014

I went through the gist and had a quick look at the research paper, and I
understood (or at least I think I did) what is to be done for the "strong
rules corresponding to the Lasso". Integrating this with the present
lasso_path and coordinate_descent will be a part of my project.
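
(For what it's worth, my reading of the sequential strong rule from that
paper, sketched below: when moving from lambda_prev down to lambda_new,
keep only the features with |x_j' r(lambda_prev)| >= 2*lambda_new -
lambda_prev, and re-check the discarded ones against the KKT conditions
after fitting, since the rule can occasionally be wrong. Names are
illustrative.)

import numpy as np

def strong_rule_survivors(X, residual, lam_new, lam_prev):
    # Sequential strong rule screening: features failing the test are
    # very likely to have zero coefficients at lam_new and can be
    # excluded from coordinate descent (subject to a KKT re-check).
    c = np.abs(X.T.dot(residual))
    return np.where(c >= 2.0 * lam_new - lam_prev)[0]

rng = np.random.RandomState(0)
X = rng.randn(100, 50)
y = rng.randn(100)
lam_max = np.abs(X.T.dot(y)).max()     # above this, all lasso coefs are zero
active = strong_rule_survivors(X, y, 0.9 * lam_max, lam_max)
print(len(active), "of", X.shape[1], "features kept")
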

@Gael and Alex
As far as the remaining part goes, "Better strategy to choose coordinates
in the coordinate descent", I'm not exactly sure how to go forward. I
looked at the research paper, but it mentions only "L1-regularised
Logistic Regression" and nothing about linear regression.

And what about "Removing LibLinear dependency so that we can have
Regularised Logistic Regression"? Do you think that would gel better with
the Lasso idea?
Alexandre Gramfort
2014-02-05 20:45:26 UTC
Permalink
the idea of dropping LibLinear for Logistic Regression has been
around for some time now. If we manage to have at least the same
performance, supporting both L1 and L1+L2 regularization, without
penalizing the intercept ... it would be great.
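
(For reference, the objective I have in mind, with the intercept c left
out of the penalty; pure L1 is beta = 0 and pure L2 is alpha = 0:)

  \min_{w, c} \sum_{i=1}^{n} \log\left(1 + \exp\left(-y_i (x_i^T w + c)\right)\right) + \alpha \|w\|_1 + \frac{\beta}{2} \|w\|_2^2
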

@mblondel any thoughts wrt lightning?

Alex
Joel Nothman
2014-02-05 21:05:22 UTC
Permalink
If we manage to have at least the same performance, supporting both L1 and
L1+L2 regularization, without penalizing the intercept

Yes, this would make me happy too!


On 6 February 2014 07:45, Alexandre Gramfort
the idea of dropping LibLinear for Logistic Regression has been
around for some time now. If we manage to have at least the same
performance, supporting both L1 and L1+L2 regularization, without
penalizing the intercept ... it would be great.
@mblondel any thought wrt to lightning ?
Alex
Gael Varoquaux
2014-02-03 10:49:28 UTC
Permalink
Post by Manoj Kumar
I found this idea "Improving Gaussian Mixture Models" , repeating in
2012 and 2013, so I assumed this to be of real interest to the
scikit-learn community. I have a fundamental knowledge of Gaussian
Models, and the EM algorithm. I would like to take this project forward
as part of GSoC. I took a quick look at the issues tracker, and I found
a number of issues.
One problem that we have here is that I don't believe that any of the
current core developers are experts in GMMs. Thus unless you have
good experience with GMMs, or find yourself a mentor who has, I don't
believe that this is a good GSoC project to follow.

G
Sturla Molden
2014-02-03 14:06:54 UTC
Permalink
Post by Gael Varoquaux
One problem that we have here is that I don't believe that any of the
current core developers are experts in GMMs. Thus unless you have
good experience with GMMs, or find yourself a mentor who has, I don't
believe that this is a good GSoC project to follow.
Not a core developer, nor an expert, but I have coded a fair amount
of GMMs for the purpose of automated spike sorting. I thought about
submitting CEM, CEM2 and Gaussian hierarchical clustering to scikit-learn,
but I have never really had the time. In my experience, CEM2 optimizing
the MML score is the most reliable and numerically stable method for
fitting GMMs (cf. Figueiredo & Jain). The most common formulation of the EM
algorithm for fitting GMMs with ML is particularly unstable, as it tends
to converge towards singularities and then blow up. CEM2 is rather robust
against that. I also have code to generate random samples from GMMs (which
I needed for some strange purpose I'm not even sure I remember. I think I
wanted to Monte Carlo integrate to find the KL divergence between two of
them, or something...). I'm also working on a new way to fit GMMs that I
think might find the globally best solution, not just a local optimum, but I
still have to prove it rigorously. The paper has been sitting half written
on my desk for ages... :-)
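
(Since sampling from GMMs came up, a minimal sketch: pick a component
according to the mixture weights, then draw from that component's Gaussian.
All numbers are illustrative.)

import numpy as np

def sample_gmm(weights, means, covs, n_samples, rng):
    # Choose a component for each sample, then draw from its Gaussian.
    comps = rng.choice(len(weights), size=n_samples, p=weights)
    return np.array([rng.multivariate_normal(means[k], covs[k])
                     for k in comps])

rng = np.random.RandomState(0)
samples = sample_gmm(weights=[0.3, 0.7],
                     means=[np.zeros(2), np.array([5.0, 5.0])],
                     covs=[np.eye(2), 2.0 * np.eye(2)],
                     n_samples=1000, rng=rng)
print(samples.shape)                 # (1000, 2)
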

And while we are talking about mixture modelling: would nnclean (cf. Byers
& Raftery) be of any interest to scikit-learn? I have an extremely fast
kd-tree-based implementation of it.

Sturla