Discussion:
future of maxentropy module (was: sparse rmatvec and maxentropy)
Ralf Gommers
2011-01-24 14:35:54 UTC
Permalink
(excuse the cross-post, but this may be of interest to scipy-user and the
scikits.learn crowd)
On Sat, Jan 22, 2011 at 8:50 AM, Ralf Gommers
I picked up the montecarlo code when I was playing around with these.
http://bazaar.launchpad.net/~jsseabold/statsmodels/statsmodels-skipper-maxent/files/head:/scikits/statsmodels/sandbox/maxentropy/
I'm curious if the maxentropy stuff as it is in scipy wouldn't find
more use and maintenance in scikits.learn. The implementation is
somewhat use-specific (natural language processing), though this is
not by any means set in stone.
Probably, but wouldn't it need a lot of work before it could be moved? It
has a grand total of one test, mostly non-working examples, and is
obviously hardly used at all (see r6919 and r6920 for more examples of
broken code).
Perhaps it's worth asking the scikits.learn guys, and otherwise consider
deprecating it if they're not interested?
I haven't seen or heard anyone using it besides Skipper. There are
also still some features that were designed for pysparse and never
fully updated to scipy.sparse.
http://projects.scipy.org/scipy/ticket/856
I also thought deprecating and removing maxentropy would be the best
idea, if nobody volunteers to give it a workout.
So I guess we just have to ask this out loud: is anyone using the
scipy.maxentropy module or interested in doing so? If you are, would you be
interested in putting some work into it, like making the examples work and
adding some tests?

The current status is that 3 out of 4 examples are broken, the module has
only a single test, and from broken code that went unnoticed for a long time
it is clear that there are very few users.

If no one steps up, I propose to deprecate the module for the 0.10 release.
If there are any users out there that missed this email and step up then, we
can always un-deprecate again.

To the scikits.learn developers: would this code fit better and see more use
in scikits.learn than in scipy? Would you be interested in picking it up?

Ralf
Olivier Grisel
2011-01-24 14:56:13 UTC
Permalink
Post by Ralf Gommers
To the scikits.learn developers: would this code fit better and see more use
in scikits.learn than in scipy? Would you be interested in picking it up?
There is already a maxent model in scikits.learn which is a wrapper for
LibLinear:

scikits.learn.linear_model.LogisticRegression

AFAIK, LibLinear is pretty much state of the art, so I don't think the
scikits.learn project is interested in reusing this code.
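
For the curious, a minimal sketch (treat the exact constructor arguments
as assumptions; they may differ between scikits.learn versions):

import numpy as np
from scikits.learn.linear_model import LogisticRegression

# toy two-class problem
X = np.array([[0., 0.], [1., 1.], [2., 2.], [3., 3.]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression(C=1.0, penalty='l2')  # fitted by LibLinear underneath
clf.fit(X, y)
print clf.predict([[1.5, 1.5]])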

Best,
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
xinfan meng
2011-01-24 15:27:57 UTC
Permalink
I guess many people just don't know that logistic regression is a
maximum entropy model.

On Mon, Jan 24, 2011 at 10:56 PM, Olivier Grisel
Post by Olivier Grisel
There is already a maxent model in scikits.learn which is a wrapper for
LibLinear:
scikits.learn.linear_model.LogisticRegression
AFAIK, LibLinear is pretty much state of the art, so I don't think the
scikits.learn project is interested in reusing this code.
--
Best Wishes
--------------------------------------------
Meng Xinfan(蒙新泛)
Institute of Computational Linguistics
Department of Computer Science & Technology
School of Electronic Engineering & Computer Science
Peking University
Beijing, 100871
China
j***@gmail.com
2011-01-24 15:29:52 UTC
Permalink
Post by xinfan meng
I guess many people just don't know that logistic regression is a
maximum entropy model.
Do you have a reference for that? I never heard of the relationship either.

Josef
xinfan meng
2011-01-24 15:39:14 UTC
Permalink
Take a look at this:

http://www.cs.berkeley.edu/~klein/papers/maxent-tutorial-slides-6.pdf
Post by j***@gmail.com
I guess many people just don't know that logistic regression is a
maximum entropy model.
Do you have a reference for that? I never heard of the relationship either.
Josef
Skipper Seabold
2011-01-24 15:48:16 UTC
Permalink
Post by j***@gmail.com
I guess many people just don't know that logistic regression is a
maximum entropy model.
Do you have a reference for that? I never heard of the relationship either.
Sections 6 and 7.3, I believe, detail the (somewhat coincidental?)
relationship between maximum entropy and the likelihood for the logit
model from an econometrics standpoint.

http://www1.american.edu/cas/econ/faculty/golan/Golan%20Review%20Foundations%2008.pdf

Skipper
Jason Rennie
2011-01-24 15:56:06 UTC
Permalink
IIRC, maximum likelihood for an exponential model and maxent are equivalent.
Maxent simply restricts you to exponential models (which is not such a bad
restriction :-). Adam Berger is a good source for info on maxent:

http://www.cs.cmu.edu/~aberger/maxent.html

Ah, here's a slide where he talks about the equivalence:

http://www.cs.cmu.edu/afs/cs/user/aberger/www/html/tutorial/node8.html
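
In symbols (my notation, not Berger's), the conditional maxent model is
the exponential form

p_w(y \mid x) = \frac{\exp\bigl(\sum_j w_j f_j(x, y)\bigr)}
                     {\sum_{y'} \exp\bigl(\sum_j w_j f_j(x, y')\bigr)}

and for binary y, with f_j(x, y) = x_j \cdot \mathbf{1}[y = 1], this
reduces to the familiar logistic sigmoid

p_w(y = 1 \mid x) = \frac{1}{1 + \exp(-w^\top x)}.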

Jason
Post by j***@gmail.com
Post by xinfan meng
I guess many people just don't know that logistic regression is a
maximum entropy model.
Do you have a reference for that? I never heard of the relationship either.
Josef
--
Jason Rennie
Research Scientist, ITA Software
617-714-2645
http://www.itasoftware.com/
Mathieu Blondel
2011-01-24 16:30:52 UTC
Permalink
Post by j***@gmail.com
I guess many people just don't know that logistic regression is a
maximum entropy model.
Do you have a reference for that? I never heard of the relationship either.
Strictly speaking Logistic Regression is a model and Maximum Entropy
is a parameter estimation method... But like Jason said, the MLE/MAP
and Maximum Entropy solutions of Logistic Regression are the same.

The reason for the confusion is probably because the Logistic
Regression and Maximum Entropy communities usually present things
differently. People in the Logistic Regression community work directly
with features and use one D-dimensional weight vector for each of the
M classes. People in the Maximum Entropy community (NLP people)
usually work with feature functions f(x,y) and a single
(DxM)-dimensional vector. But they are essentially the same: applying
the feature functions is equivalent to creating vectors where all
components are 0 except those in the block corresponding to the class
the example is labeled with.
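
To make that concrete, a tiny sketch (my own toy construction, not any
library's API): the two parameterizations give identical scores.

import numpy as np

D, M = 4, 3                       # D features, M classes
rng = np.random.RandomState(0)
x = rng.randn(D)                  # one example
W = rng.randn(M, D)               # LR view: one D-dimensional weight vector per class

def f(x, y):
    # Maxent view: place x in the block for class y, zeros elsewhere
    v = np.zeros(M * D)
    v[y * D:(y + 1) * D] = x
    return v

w = W.ravel()                     # single (D*M)-dimensional weight vector

for y in range(M):
    assert np.allclose(np.dot(W[y], x), np.dot(w, f(x, y)))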

Some good posts on the lingpipe blog:
http://lingpipe-blog.com/2008/04/03/logistic-regression-by-any-other-name/
http://lingpipe-blog.com/2009/04/30/max-entropy-logistic-regressionfeature-extraction-coefficient-encoding/

Ryan McDonald has excellent slides about linear models for NLP (he
uses the single-vector representation):
http://www.ryanmcd.com/courses/gslt2009/gslt2009.html

Mathieu
Gael Varoquaux
2011-01-24 17:10:19 UTC
Permalink
Wow, I didn't know either.

Could someone who understands the link between the two reasonably well
add a note (using the '.. note::' directive) to the scikit's
documentation, so that it appears when we Google?

Hum, OK, the logistic regression isn't even documented. I'll add a stub
so that 'someone' can add the note.
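
Something along these lines, say (just a suggested wording, to be
checked by whoever writes it):

.. note::

   The "maximum entropy" (maxent) classifier known from the NLP
   literature is equivalent to multinomial logistic regression: the
   maximum likelihood and maximum entropy solutions coincide for this
   model.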

Mathieu (Blondel), would you do that, please, as you seem to be the
person who has the best understanding? I'll review your branch in return
:$.

G
--
Gael Varoquaux
Research Fellow, INSERM
Associate researcher, INRIA
Laboratoire de Neuro-Imagerie Assistee par Ordinateur
NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
Phone: ++ 33-1-69-08-78-35
Mobile: ++ 33-6-28-25-64-62
http://gael-varoquaux.info
Gael Varoquaux
2011-01-24 17:15:25 UTC
Permalink
Post by Gael Varoquaux
I'll add a stub so that 'someone' can add the note.
Post by Gael Varoquaux
Mathieu (Blondel), would you do that, please, as you seem to be the
person who has the best understanding? I'll review your branch in return
:$.
Haha, a cheap two-bit attempt at bargaining :)
Granted!
j***@gmail.com
2011-01-24 17:53:31 UTC
Permalink
Post by Mathieu Blondel
Post by j***@gmail.com
I guess many people just don't know that logistic regression is a
maximum entropy model.
Do you have a reference for that? I never heard of the relationship either.
Thanks for all the references.

Skimming and reading for a bit, it's a lot clearer than trying to
figure it out from the scipy.maxentropy examples.
Still, I hit some major language barriers, as in the first lingpipe-blog article.
Post by Mathieu Blondel
Strictly speaking Logistic Regression is a model and Maximum Entropy
is a parameter estimation method... But like Jason said, the MLE/MAP
and Maximum Entropy solutions of Logistic Regression are the same.
I think Logistic Regression here means a multinomial logit model rather
than a binary one, as I understand it.

The equivalence looks to me (now) like it holds only in a special
case, which might be the main application in NLP and the only thing
that scipy.maxentropy does.
But as an estimation method Maximum Entropy is more general, and I
have seen lots of variations on multinomial logit where, I guess, the
relationship breaks down.

All the NLP references that I have looked at and Golan's chapter 7
work with (finite) discrete sample spaces, but I guess the equivalence
extends to the more common econometrics case of continuous explanatory
variables. I never tried to figure out the big model in
scipy.maxentropy.
Post by Mathieu Blondel
The reason for the confusion is probably because the Logistic
Regression and Maximum Entropy communities usually present things
differently. People in the Logistic Regression community work directly
with features and use one D-dimensional weight vector for each of the
M classes. People in the Maximum Entropy community (NLP people)
usually work with feature functions f(x,y) and a single
(DxM)-dimensional vector. But they are essentially the same: applying
the feature functions is equivalent to creating vectors where all
components are 0 except those in the block corresponding to the class
the example is labeled with.
http://lingpipe-blog.com/2008/04/03/logistic-regression-by-any-other-name/
http://lingpipe-blog.com/2009/04/30/max-entropy-logistic-regressionfeature-extraction-coefficient-encoding/
These two are fun. I think the problem of differing parameterizations
is the same in econometrics, with its variety of multinomial and
conditional logit specifications.

Josef
Post by Mathieu Blondel
Ryan McDonald has excellent slides about linear models for NLP (he
http://www.ryanmcd.com/courses/gslt2009/gslt2009.html
Mathieu
j***@gmail.com
2011-01-24 20:35:33 UTC
Permalink
Post by j***@gmail.com
These two are fun. I think the problem of differing parameterizations
is the same in econometrics, with its variety of multinomial and
conditional logit specifications.
Just as a footnote:
http://lingpipe-blog.com/2010/01/12/nobel-memorial-prize-for-logistic-regression-aka-discrete-choice-analysis/
I was fighting with these parameterizations last summer.

Josef
xinfan meng
2011-01-25 01:04:30 UTC
Permalink
Thanks for your further clarification. But I still have one question
that has been bothering me for a long time: if logistic regression and
maximum entropy are the same, why are they solved with totally different
methods? GIS or IIS is used for maximum entropy, but IRLS is used for
logistic regression.
Skipper Seabold
2011-01-25 01:36:38 UTC
Permalink
Post by xinfan meng
Thanks for your further clarification. But I still have one question
that has been bothering me for a long time: if logistic regression and
maximum entropy are the same, why are they solved with totally different
methods? GIS or IIS is used for maximum entropy, but IRLS is used for
logistic regression.
Bear in mind I don't have a machine learning background, so someone
might disagree with me (please do). But I would not say that they are
the same, since the motivations for the two are different. I would
instead say that the solutions coincide in this particular case (if the
entropy is measured by the Kullback-Leibler divergence with a uniform
prior). Maximum entropy is actually more general and makes fewer
assumptions. When I first saw the proof, I thought of how maximum
likelihood assuming a normal distribution and the least-squares
solution coincide, though the motivations (and solution methods) are a
bit different.
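
A quick numerical illustration of that analogy (toy data, my own
construction):

import numpy as np
from scipy import optimize

rng = np.random.RandomState(42)
X = np.column_stack([np.ones(50), rng.randn(50)])
y = np.dot(X, [1.0, 2.0]) + rng.randn(50)

def negloglike(beta):
    # Gaussian negative log-likelihood with sigma fixed: up to
    # constants, just the residual sum of squares
    resid = y - np.dot(X, beta)
    return 0.5 * np.sum(resid ** 2)

beta_mle = optimize.fmin_bfgs(negloglike, np.zeros(2), disp=False)
beta_ols = np.linalg.lstsq(X, y)[0]
print np.allclose(beta_mle, beta_ols, atol=1e-4)  # True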

FWIW,

Skipper
Jason Rennie
2011-01-25 02:40:51 UTC
Permalink
It's easier to get a paper published if you have a cute name. In fact, a
*lot* of estimation methods are variants of gradient descent. A good
gradient-descent-type solver, such as conjugate gradients or L-BFGS, will
often do better than a more commonly used method. Here's one example:

@inproceedings{Salakhutdinov03
,author = "Ruslan Salakhutdinov and Sam Roweis and Zoubin Ghahramani"
,title = "Optimization with {EM} and Expectation-Conjugate-Gradient"
,year = 2003
,booktitle = "Proceedings of the Twentieth International Conference on
Machine Learning (ICML-2003)"
}
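
In the same spirit, a rough sketch (my own toy code, nothing
library-specific): the L2-regularized logistic / maxent objective can
simply be handed to a generic L-BFGS solver instead of GIS or IIS.

import numpy as np
from scipy import optimize

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
w_true = np.ones(5)
# labels in {-1, +1} drawn from a logistic model
y = np.where(rng.rand(100) < 1.0 / (1.0 + np.exp(-np.dot(X, w_true))),
             1.0, -1.0)

def negloglike(w, alpha=1e-3):
    # log-loss sum_i log(1 + exp(-y_i * w.x_i)) plus an L2 penalty
    margins = y * np.dot(X, w)
    return np.sum(np.log1p(np.exp(-margins))) + alpha * np.dot(w, w)

w_hat, obj, info = optimize.fmin_l_bfgs_b(negloglike, np.zeros(5),
                                          approx_grad=True)
print w_hat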

Cheers,

Jason
Post by xinfan meng
Thanks for your further clarification. But I still have one question
that has been bothering me for a long time: if logistic regression and
maximum entropy are the same, why are they solved with totally different
methods? GIS or IIS is used for maximum entropy, but IRLS is used for
logistic regression.
Gael Varoquaux
2011-01-25 06:17:35 UTC
Permalink
Post by Jason Rennie
It's easier to get a paper published if you have a cute name. In fact, a
*lot* of estimation methods are variants of gradient descent. A good
gradient-descent-type solver, such as conjugate gradients or L-BFGS, will
often do better than a more commonly used method.
http://acl.ldc.upenn.edu/W/W02/W02-2018.pdf

The academic world does have its warts.

Gaël
xinfan meng
2011-01-25 07:00:28 UTC
Permalink
Amazing. I have never read this paper before. We really should add it
to the documentation references.

On Tue, Jan 25, 2011 at 2:17 PM, Gael Varoquaux
It's easier to get a paper published if you have a cute name. In fact, a
*lot* of estimation methods are variants of gradient descent. A good
gradient-descent-type solver, such as conjugate gradients or L-BFGS, will
often do better than a more commonly used method.
http://acl.ldc.upenn.edu/W/W02/W02-2018.pdf
The academic world does have its warts.
Gaël
Gael Varoquaux
2011-01-25 07:40:14 UTC
Permalink
Go for it: if you send a pull request adding it to the docs, I'll pull!

----- Original message -----
Post by xinfan meng
Amazing. I have never read this paper before. We really should add it
to the documentation references.
xinfan meng
2011-01-24 15:21:36 UTC
Permalink
I agree that the maxent model might be a little awkward as part of
scipy, so many NLP people just don't notice it there. I myself only
recently found that scipy has such a module. Previously, my friends and
I just used the maxent toolkit by Zhang Le. Though I have to say, the
examples provided in scipy.maxentropy are very nice as a companion to
the maxent paper by Adam Berger.
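
For anyone who hasn't seen them, the flavor of that example, quoted from
memory (so treat the exact call signatures as assumptions): pick the
maximum-entropy distribution over five French translations of "in",
subject to moment constraints.

from scipy import maxentropy

samplespace = ['dans', 'en', 'à', 'au cours de', 'pendant']

def f0(x): return x in samplespace            # normalization: E[f0] = 1.0
def f1(x): return x == 'dans' or x == 'en'    # constraint:    E[f1] = 0.3
def f2(x): return x == 'dans' or x == 'à'     # constraint:    E[f2] = 0.5

model = maxentropy.model([f0, f1, f2], samplespace)
model.fit([1.0, 0.3, 0.5])
print model.probdist()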

On Mon, Jan 24, 2011 at 10:35 PM, Ralf Gommers
Post by Ralf Gommers
So I guess we just have to ask this out loud: is anyone using the
scipy.maxentropy module or interested in doing so? If you are, would you be
interested in putting some work into it, like making the examples work and
adding some tests?