Discussion:
[Scikit-learn-general] LogisticRegression regularizes intercept
David Ojeda
2016-03-15 09:03:07 UTC
Hello scikit-learners,

A while back, I used this wonderful library to replicate some work that was
previously done in R. I really liked the design of this library; kudos!
I mainly used a LogisticRegression with L1 regularization, but I ran into
some problems when trying to understand why my results were slightly
different. In fact, I found out that scikit-learn does not regularize as
advertised in the user guide (cf.
http://scikit-learn.org/stable/modules/linear_model.html#logistic-regression).
The guide states that the objective function is C * (the entropy part) +
||w||_1, where ||w||_1 is the L1 norm of the weight vector. In this formula,
the intercept c is not regularized. However, the internal code of scikit-learn
computes ||W||_1 where W is [w0,...,wn,c]. In other words, c is regularized.
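
For reference, the two objectives being contrasted can be written out as
below. This is only a sketch: the symbols x_i, y_i and n are assumptions
introduced here (labels y_i in {-1, +1}, n samples), and the second form
assumes the intercept enters the penalty with a factor of one, as described
above.

```latex
% Form given in the user guide: only w is penalized
\min_{w, c} \; \|w\|_1 + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i (x_i^\top w + c)\right) + 1\right)

% Form described above, with W = [w_0, \dots, w_n, c]: c is penalized as well
\min_{w, c} \; \|w\|_1 + |c| + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i (x_i^\top w + c)\right) + 1\right)
```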

I have two questions regarding this:
1. Does anyone know the effect of regularizing the intercept? To me, it
doesn't seem entirely correct.
2. Shouldn't the user guide show the correct formula?

Have a nice day...

David
Tom DLT
2016-03-15 10:36:58 UTC
Hi David,

Indeed, the "liblinear" solver does regularize the intercept, which is not
entirely correct and should probably be documented in more detail.
To lessen this effect, you can set the "intercept_scaling" parameter to a
fairly large value.
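
The effect of "intercept_scaling" can be checked with a small experiment.
The sketch below is only an illustration: the synthetic data, the value
C=0.05 and the helper name fitted_intercept are assumptions made for this
example, not something taken from David's setup. The true intercept is
deliberately large so that any shrinkage is easy to see.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (an assumption for illustration): the true intercept is 3.0,
# so most samples fall in the positive class.
rng = np.random.RandomState(0)
X = rng.randn(200, 3)
logits = X @ np.array([1.0, -1.0, 0.5]) + 3.0
y = (rng.rand(200) < 1.0 / (1.0 + np.exp(-logits))).astype(int)


def fitted_intercept(intercept_scaling):
    """Fit an L1-penalized model with liblinear and return its intercept."""
    clf = LogisticRegression(
        penalty="l1",
        solver="liblinear",
        C=0.05,  # strong regularization, to make the shrinkage visible
        intercept_scaling=intercept_scaling,
    )
    clf.fit(X, y)
    return clf.intercept_[0]


intercept_default = fitted_intercept(1.0)   # default scaling: intercept penalized
intercept_scaled = fitted_intercept(100.0)  # large scaling: penalty much weaker
print(intercept_default, intercept_scaled)
```

With the default scaling, the fitted intercept should come out noticeably
smaller than with the large scaling, since liblinear penalizes the weight of
a constant feature equal to intercept_scaling rather than the intercept
itself.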

Note that if you use L2 regularization instead, you can also try the three
other solvers ("newton-cg", "lbfgs" and "sag"), which do not regularize the
intercept and also handle the multinomial loss.
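
A quick way to see the difference between solvers under L2 regularization is
to fit the same data with "liblinear" and "lbfgs" and compare the intercepts.
Again, this is a rough sketch on made-up data (the data and C=0.01 are
assumptions for illustration), with "intercept_scaling" left at its default.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (an assumption for illustration) with a large true intercept.
rng = np.random.RandomState(0)
X = rng.randn(200, 3)
logits = X @ np.array([1.0, -1.0, 0.5]) + 3.0
y = (rng.rand(200) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

intercepts = {}
for solver in ("liblinear", "lbfgs"):
    # The default penalty is L2; a small C means strong regularization.
    clf = LogisticRegression(solver=solver, C=0.01)
    clf.fit(X, y)
    intercepts[solver] = clf.intercept_[0]

print(intercepts)
```

The "lbfgs" intercept should stay close to the value implied by the class
balance, while the "liblinear" intercept is pulled toward zero along with the
other weights.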

Best,

Tom