Thanks again.
Post by j***@gmail.com
Just to chime in here as an econometrician and statsmodels maintainer.
statsmodels intentionally doesn't enforce binary data for Logit or
similar models; any data between 0 and 1 is fine.
Logistic Regression/Logit or similar Binomial/Bernoulli models can
consistently estimate the expected value (predicted mean) for a continuous
variable that is between 0 and 1 like a proportion. (Binomial belongs to
the exponential family where quasi-maximum likelihood method works well.)
Inference has to be adjusted, for example with robust (sandwich) standard
errors, because a logit model cannot be "true" if the data is not binary.
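
A minimal sketch of the idea above, written in plain NumPy rather than with statsmodels itself: the Bernoulli log-likelihood is maximized by Newton's method on a target that is a proportion, and a sandwich (HC0-style) covariance stands in for the adjusted inference. The function name `fit_fractional_logit`, the simulated Beta-distributed data, and the precision value 10 are all assumptions made for illustration, not anything from this thread or from the statsmodels code base.

```python
import numpy as np


def fit_fractional_logit(X, y, n_iter=50, tol=1e-10):
    """Quasi-maximum likelihood logit fit for a target y in [0, 1].

    Hypothetical helper for illustration only.
    Returns the coefficients and a sandwich covariance estimate.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))            # predicted mean
        score = X.T @ (y - mu)                          # gradient of the log-likelihood
        info = X.T @ (X * (mu * (1.0 - mu))[:, None])   # negative Hessian
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break

    # Sandwich covariance: the model-based covariance is not valid
    # when y is not actually binary.
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    bread = np.linalg.inv(X.T @ (X * (mu * (1.0 - mu))[:, None]))
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])
    return beta, bread @ meat @ bread


# Simulated proportions whose mean follows a logistic curve (assumed setup).
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.0])
mean = 1.0 / (1.0 + np.exp(-X @ true_beta))
y = rng.beta(mean * 10.0, (1.0 - mean) * 10.0)  # continuous values in (0, 1)

beta_hat, cov_robust = fit_fractional_logit(X, y)
robust_se = np.sqrt(np.diag(cov_robust))
print(beta_hat, robust_se)
```

The point of the sketch is only that the mean parameters are recovered even though y is never 0 or 1; the standard errors come from the sandwich formula rather than from the inverse information matrix alone.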
I have references and examples for this use case somewhere.
statsmodels doesn't do "classification", i.e. hard thresholding; users
can do it themselves if they need to.
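
For what it's worth, the thresholding step users would do themselves is a one-liner. This is only a sketch: the array `predicted_mean` and the 0.5 cutoff are made-up values for illustration, not output from any model in this thread.

```python
import numpy as np

# Hypothetical predicted means from a fitted logit-type model.
predicted_mean = np.array([0.10, 0.48, 0.50, 0.93])

# Assumed cutoff; any value in (0, 1) could be chosen depending on the application.
threshold = 0.5

# Hard thresholding turns the predicted means into class labels.
labels = (predicted_mean >= threshold).astype(int)
print(labels)
```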
That means we leave classification to scikit-learn and only do
regression, even for funny data, and statsmodels doesn't have methods that
take advantage of the classification structure of a model.
Josef
On Sat, Oct 3, 2015 at 10:50 PM, Sebastian Raschka <
Post by Sebastian Raschka
Hi, George,
logistic regression is a binary classifier by nature (class labels 0
and 1). Scikit-learn supports multi-class classification via One-vs-One or
One-vs-All though; and there is a generalization (softmax) that gives you
meaningful probabilities for multiple classes (i.e., class probabilities
sum up to 1). In any case, logistic regression works with nominal class
labels - categorical class labels with no order implied.
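
To illustrate the softmax generalization mentioned above, here is a small NumPy sketch. The helper name `softmax` and the example scores are assumptions for illustration; this is not scikit-learn's internal implementation.

```python
import numpy as np


def softmax(scores):
    """Map rows of real-valued class scores to probabilities that sum to 1."""
    scores = np.asarray(scores, dtype=float)
    # Subtracting the row maximum avoids overflow and leaves the result unchanged.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Two samples, three classes (made-up scores).
scores = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(scores)
print(probs)
print(probs.sum(axis=1))
```

Each row of `probs` is a proper probability distribution over the classes, which is what distinguishes softmax from combining independent one-vs-all scores.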
To keep a long story short: Logistic regression is a classifier, not a
regressor; the name is misleading, I agree. I think you may want to look
into regression analysis for your continuous target variable.
Best,
Sebastian
Post by George Bezerra
Hi there,
I would like to train a logistic regression model on a continuous
(i.e., not categorical) target variable. The target is a probability, which
is why I am using a logistic regression for this problem. However, the
sklearn function tries to find the class labels by running a unique() on
the target values, which is disastrous if y is continuous.
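
A quick sketch of why taking unique values of a continuous target goes wrong. It calls `np.unique` directly on simulated data and does not reproduce sklearn's internals; the variable names and sample size are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A continuous target (probabilities) versus an ordinary binary target.
y_continuous = rng.uniform(0.0, 1.0, size=100)
y_binary = rng.integers(0, 2, size=100)

# Treating unique target values as class labels:
n_classes_continuous = np.unique(y_continuous).size
n_classes_binary = np.unique(y_binary).size

print(n_classes_continuous)  # one "class" per sample
print(n_classes_binary)      # the two classes 0 and 1
```

With a continuous target, essentially every sample becomes its own class, so the classifier has nothing meaningful to learn.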
Is there a way to train logistic regression on a continuous target
variable in sklearn?
Any help is highly appreciated.
Best,
George.
--
George Bezerra
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general