Discussion:
[Scikit-learn-general] Multivariate Adaptive Regression Splines (MARS, aka earth)
Jason Rudy
2013-01-10 17:21:01 UTC
Hi there,
I'm working on an implementation of MARS [1] that I'd like to share, and it
seems like sklearn would be a good place for it. The MARS algorithm is
currently available as part of the R package "earth" and is one of the only
reasons I still use R. Would sklearn be a good place for such an
algorithm? Are there any guidelines or procedures I should be aware of
before contributing?

Best,

Jason


[1] Friedman, J. (1991). Multivariate adaptive regression splines. The
Annals of Statistics, 19(1), 1–67.
Lars Buitinck
2013-01-10 17:35:30 UTC
Post by Jason Rudy
I'm working on an implementation of MARS [1] that I'd like to share, and
it seems like sklearn would be a good place for it. The MARS algorithm is
currently available as part of the R package "earth" and is one of the only
reasons I still use R. Would sklearn be a good place for such an
algorithm? Are there any guidelines or procedures I should be aware of
before contributing?

I guess that would fit in scikit-learn, but I'm not an expert on fancy
regression analysis. The contributor guidelines can be found here:

https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md

In addition, make sure that (1) you own the code or your employer is ok
with you publishing it under BSD license terms, and (2) apparently MARS is
a trademark so call the estimator something else, like EarthRegressor or
MARegressionSplines.

--
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam
Peter Prettenhofer
2013-01-10 17:43:49 UTC
Post by Lars Buitinck
Post by Jason Rudy
I'm working on an implementation of MARS [1] that I'd like to share, and
it seems like sklearn would be a good place for it. The MARS algorithm is
currently available as part of the R package "earth" and is one of the only
reasons I still use R. Would sklearn be a good place for such an algorithm?
Are there any guidelines or procedures I should be aware of before
contributing?
I'd love to see MARS in sklearn - is your implementation currently
publicly available?
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
--
Peter Prettenhofer
Andreas Mueller
2013-01-10 19:48:05 UTC
Hi Jason.
Thanks for wanting to contribute MARS to sklearn.

There is even an issue requesting the feature ;)

https://github.com/scikit-learn/scikit-learn/issues/845
I think it would be a great addition.

You should be aware of the fact that contributing to sklearn is a bit
more than just implementing the algorithm.
For your code to be merged, you'd also have to provide documentation,
examples and unit-tests.

Also, it would be good to know if any of the core developers are
familiar enough with the method to take
care of it in the future.


Cheers,
Andy
Jason Rudy
2013-01-11 17:09:31 UTC
I thought it would be a good idea to figure out where I could contribute it
early on in case it affects design decisions. The code is definitely not
done or publicly available yet. It sounds like sklearn is the right place
for this, then. Unit tests and examples are already part of my plan, and I
am certainly willing to write documentation. I will take a look at the
guidelines.

I'm implementing MARS as part of a project for work, but my company is very
much okay with my contributing the MARS portion of the project to an open
source library of some kind. The way they see it, if we meet a few good
Python developers in the long run as a result of getting involved with the
community then it will have been worth the extra effort.

So, I will continue my development with sklearn in mind and will be in
touch. I'll also comment on the issue tracker to that effect.


Gael Varoquaux
2013-01-11 17:11:59 UTC
Post by Jason Rudy
I'm working on an implementation of MARS [1] that I'd like to share, and it
seems like sklearn would be a good place for it.  The MARS algorithm is
currently available as part of the R package "earth" and is one of the only
reasons I still use R.  Would sklearn be a good place for such an algorithm? 
I personally would love to see MARS in the scikit. I think that it is
probably a mouthful to code it well :).

Thanks a lot for offering

G
Jason Rudy
2013-03-05 19:15:25 UTC
So I've finally got something to show. Gael, you were entirely correct
about it being a mouthful. I've been developing it as a separate package
for simplicity, but will be integrating with scikit-learn as soon as I get
the time. Here is what I've got so far in case anyone wants to take a look:

https://github.com/jcrudy/py-earth

I would be very grateful for feedback if anyone has any, and especially for
bug reports. I'm giving a talk about MARS at PyData (perhaps I'll see some
of you there:). Preparing for the talk will take up a lot of time, so my
pull request might have to wait until after.





Andreas Mueller
2013-03-05 19:57:50 UTC
Post by Jason Rudy
So I've finally got something to show. Gael, you were entirely
correct about it being a mouthful. I've been developing it as a
separate package for simplicity, but will be integrating with
scikit-learn as soon as I get the time. Here is what I've got so far
in case anyone wants to take a look:
https://github.com/jcrudy/py-earth
I would be very grateful for feedback if anyone has any, and
especially for bug reports. I'm giving a talk about MARS at PyData
(perhaps I'll see some of you there:). Preparing for the talk will
take up a lot of time, so my pull request might have to wait until after.
woah, awesome :)
Did you compare somehow with R?

Btw, Gael is offline atm ;)
Jason Rudy
2013-03-06 03:05:24 UTC
Just anecdotally I can say the goodness of fit and speed seem comparable,
but the models produced are slightly different. I'm working on making a
more comprehensive comparison.

