Discussion:
[Scikit-learn-general] bootstrap deprecation warning
Arman Eshaghi
2014-08-15 09:31:28 UTC
Hi,

I'm wondering why I'm getting deprecation warnings for
`sklearn.cross_validation.Bootstrap`. It seems to be a very useful
feature, so maybe you are planning to move the class somewhere else?
I'm writing some code that I will need to run for a long time, and I
would be very interested to know what the plan is for the future.

Thanks
Arman
Gael Varoquaux
2014-08-15 11:03:49 UTC
We haven't been able to understand where, in the context of machine learning, this object was useful. Could you please give us an example of its use?

Gaël

Arman Eshaghi
2014-08-17 15:55:15 UTC
I use it to get more stable results in cross-validation, but I'm sure there
is something more important that I'm not understanding here; I will use
permutation (shuffle and split) from now on.

Thanks
Arman


Sebastian Raschka
2014-08-17 17:58:14 UTC
I wouldn't remove bootstrapping. It is maybe not as commonly used as k-fold cross validation, but it is quite an established sampling technique. It would be good to keep this "sampling with replacement" alternative to cross validation.

Sebastian
Gael Varoquaux
2014-08-17 21:29:15 UTC
Post by Sebastian Raschka
I wouldn't remove bootstrapping. It is maybe not as commonly used as k-fold
cross validation, but it is quite an established sampling technique.
Indeed, I use bootstrap almost every day, but not in the context of
measuring predictive performance.
Post by Sebastian Raschka
It would be good to keep this "sampling with replacement" alternative
to cross validation.
How exactly would you use the Bootstrap object? I am worried that it is
getting misused. It does not belong to the same conceptual class as the
other CV iterators.

Gaël
Sebastian Raschka
2014-08-17 21:38:16 UTC
Post by Gael Varoquaux
How exactly would you use the Bootstrap object? I am worried that it is
getting misused. It does not belong to the same conceptual class as the
other CV iterators.
I would use it similarly to a KFold CV object, but to address different "questions", e.g., the determination of the variance of the estimated accuracy (or recall, precision, etc.).
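For instance, something along these lines (a minimal sketch with the cross_validation API of the time, using ShuffleSplit in place of the deprecated Bootstrap; the dataset and estimator are just placeholders):

import numpy as np
from sklearn import datasets
from sklearn.cross_validation import ShuffleSplit, cross_val_score
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Re-split the data many times, score once per iteration, then look at
# the spread of those scores to gauge the variance of the estimate.
cv = ShuffleSplit(len(y), n_iter=100, test_size=0.3, random_state=0)
scores = cross_val_score(LogisticRegression(), X, y, cv=cv)
print("accuracy: %0.3f +/- %0.3f" % (scores.mean(), scores.std()))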
Olivier Grisel
2014-08-18 07:46:28 UTC
But the sklearn.cross_validation.Bootstrap currently implemented in sklearn
is a cross validation iterator, not a generic resampling method to estimate
variance or confidence intervals. Don't be misled by the name. If we chose
to deprecate and then remove this class, it's precisely because it causes
confusion.
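To see what it actually yields, you can iterate over it like any other CV object; a quick sketch against the deprecated API (parameter names from memory, they may differ slightly between versions):

from sklearn.cross_validation import Bootstrap

# Bootstrap(n, ...) behaves like KFold or ShuffleSplit: each iteration
# yields a (train, test) pair of index arrays (and, in recent versions,
# a DeprecationWarning). Indices are drawn with replacement from two
# disjoint halves of a random permutation.
bs = Bootstrap(9, n_iter=3, random_state=0)
for train_index, test_index in bs:
    print("TRAIN:", train_index, "TEST:", test_index)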
Arman Eshaghi
2014-08-18 07:57:06 UTC
Thanks for the discussion. Could you please explain what the right way of
using bootstrapping for confidence interval calculation (or other
statistics) would be? I mean, what would you do to get, as Olivier said, a
"generic resampling method to estimate variance or confidence intervals"?
I'm under the impression that I need to define my own function for this,
as the existing class is not exactly what I had in mind?

Also, it seems that shuffle and split (I call it permutation) is also an
iterator for cross-validation (the same confusion as with bootstrapping)?
Olivier Grisel
2014-08-18 08:38:54 UTC
Post by Arman Eshaghi
Thanks for the discussion. Could you please explain what the right way of
using bootstrapping for confidence interval calculation (or other
statistics) would be? I mean, what would you do to get, as Olivier said, a
"generic resampling method to estimate variance or confidence intervals"?
I'm under the impression that I need to define my own function for this,
as the existing class is not exactly what I had in mind?

For generic bootstrap confidence intervals you can use scikits.bootstrap (a
small separate project). I would personally be in favor of having such
tools in scipy.stats by default, though.
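For example, something like this sketch (it assumes scikits.bootstrap is installed and uses its ci helper; the data here is made up):

import numpy as np
import scikits.bootstrap as bootstrap

rng = np.random.RandomState(0)
data = rng.beta(8, 2, size=200)  # pretend these are observed values

# BCa (bias-corrected, accelerated) 95% confidence interval for the mean.
ci_low, ci_high = bootstrap.ci(data, statfunction=np.mean,
                               alpha=0.05, n_samples=10000)
print("95%% CI for the mean: [%.3f, %.3f]" % (ci_low, ci_high))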
Post by Arman Eshaghi
Also, it seems that shuffle and split (I call it permutation) is also an
iterator for cross-validation (the same confusion as with bootstrapping)?

Yes, but contrary to our deprecated Bootstrap class, the shuffle & split
strategy is a standard way to prepare folds for cross validation. You can
see it as a generalization of iterated randomized k-fold cross validation
where you decouple the test fold size from the number of folds / iterations.
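Concretely (a sketch with the module of the time): 5-fold CV is 5 iterations with test folds of n/5 samples each, while ShuffleSplit lets you pick the two numbers independently:

from sklearn.cross_validation import KFold, ShuffleSplit

n = 100
# 5-fold CV: the number of iterations (5) and the test fold size (20)
# are tied together by k.
kf = KFold(n, n_folds=5)
# ShuffleSplit: 50 random splits, each with a 10-sample test fold --
# the number of iterations and the test size are decoupled.
ss = ShuffleSplit(n, n_iter=50, test_size=10, random_state=0)
for train, test in ss:
    assert len(test) == 10 and len(train) == 90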
Arman Eshaghi
2014-08-18 09:17:08 UTC
Thanks, very informative.
Sebastian Raschka
2014-08-18 14:15:44 UTC
Post by Olivier Grisel
But the sklearn.cross_validation.Bootstrap currently implemented in sklearn is a cross validation iterator, not a generic resampling method to estimate variance or confidence intervals. Don't be misled by the name. If we chose to deprecate and then remove this class, it's precisely because it causes confusion.
Hm, I can kind of see why the Bootstrap class was initially put into sklearn.cross_validation; technically, the "approaches" (cross validation, bootstrap, jackknife) are closely related. The only difference is that you have sampling "with replacement" in the bootstrap approach and that you would typically want to have >1000 iterations.
So, the suggestion would be to remove Bootstrap and use sklearn.utils.resample in the future? I would say that it is good that the Bootstrap is implemented like a CV object, since it would make the "estimate" and "error" calculation more convenient, right?
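For concreteness, drawing one bootstrap replicate with that helper would look something like this (a sketch; I'm not sure it is meant as public API):

import numpy as np
from sklearn.utils import resample

X = np.arange(10).reshape(5, 2)
y = np.arange(5)

# One bootstrap sample: rows drawn with replacement, X and y kept
# aligned; loop this to build up >1000 replicates.
X_boot, y_boot = resample(X, y, random_state=0)
print(X_boot)
print(y_boot)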
Olivier Grisel
2014-08-18 16:15:37 UTC
Post by Sebastian Raschka
Post by Olivier Grisel
But the sklearn.cross_validation.Bootstrap currently implemented in
sklearn is a cross validation iterator, not a generic resampling method to
estimate variance or confidence intervals. Don't be misled by the name. If
we chose to deprecate and then remove this class, it's precisely because it
causes confusion.
Post by Sebastian Raschka
Hm, I can kind of see why the Bootstrap class was initially put into
sklearn.cross_validation; technically, the "approaches" (cross validation,
bootstrap, jackknife) are closely related. The only difference is that you
have sampling "with replacement" in the bootstrap approach and that you
would typically want to have >1000 iterations.
Post by Sebastian Raschka
So, the suggestion would be to remove Bootstrap and use
sklearn.utils.resample in the future?

Well, it depends on what you want to use bootstrapping for. If it's for model
evaluation (estimation of some validation score), then the recommended way
is to use ShuffleSplit or StratifiedShuffleSplit instead. If you want
generic bootstrap estimation features such as confidence interval
estimation (which do not exist in scikit-learn, by the way), then I would
recommend you have a look at scikits.bootstrap [1], which also implements
bias correction for skewed distributions, something that is non-trivial to
do manually.

[1] https://scikits.appspot.com/bootstrap

sklearn.utils is meant only for internal use in the scikit-learn project.
For instance, sklearn.utils.resample is used to implement resampling
internally in bagging models, if I remember correctly.
Post by Sebastian Raschka
I would say that it is good that the Bootstrap is implemented like a CV
object,

I think precisely the opposite. There is no point in using sampling with
replacement vs. sampling without replacement to estimate the validation
error of a model. Traditional strategies for cross-validation, as
implemented in Shuffle & Split, are as flexible and simpler to interpret
than our weird Bootstrap cross-validation iterator.

See also: http://youtu.be/BzHz0J9a6k0

Post by Sebastian Raschka
since it would make the "estimate" and "error" calculation more
convenient, right?

I don't understand what you mean by "estimate" and "error". A model's
parameters, its individual predictions, and its cross-validation scores or
errors can all be called "estimates": anything that is derived from sampled
data points is an estimate.
--
Olivier
j***@gmail.com
2014-08-18 16:28:48 UTC
Just a remark from the sidelines,
(I hope to get bootstrap and cross-validation iterators into the next
version of statsmodels, borrowing some ideas and code from
scikit-learn, but the emphasis in statsmodels will be on bootstrap and
permutation iterators.)

What I think sklearn doesn't have is early stopping with randomized
selection for cross-validation iterators. If LOO/jackknife is expensive to
calculate for all LOO sets, can you randomly select among the LOO sets, or
do something similar for other iterators?
Similarly, permutation inference is often difficult because the set of
permutations gets too large; bootstrap is then the usual alternative
for larger samples.

(I may be incorrect since I only briefly looked at the changes to your
cross-validation.)

Josef
Olivier Grisel
2014-08-18 16:43:25 UTC
Post by j***@gmail.com
What I think sklearn doesn't have is early stopping with randomized
selection for cross-validation iterators. If LOO/jackknife is expensive to
calculate for all LOO sets, can you randomly select among the LOO sets, or
do something similar for other iterators?
No, but that would be a good idea for ShuffleSplit as well. If I
understand correctly, you would pass something like a tolerance parameter
(e.g. "I want a validation score precise to 2 decimals") and use as few
iterations as possible to reach that precision, then stop sampling. Is
that right?
Post by j***@gmail.com
Similarly, permutation inference is often difficult because the set of
permutations gets too large; bootstrap is then the usual alternative
for larger samples.
One thing to keep in mind is that sklearn.cross_validation.Bootstrap
is not the real bootstrap: it's a random permutation + split + random
sampling with replacement within each part:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py#L718
This 2-step procedure is done to make sure that no test sample is
part of the training fold at each iteration. A more natural way to
respect that constraint would be to sample with replacement from the
full dataset and then use the out-of-bag samples for the validation set.
But then you would lose control over the size of the test fold. This
second strategy is more like the real bootstrap and is the one I
should have implemented initially instead of the weird beast that
sklearn.cross_validation.Bootstrap currently is.
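In code, that out-of-bag strategy would be something like this sketch (plain NumPy; the helper name is made up, this is not a scikit-learn API):

import numpy as np

def oob_bootstrap_splits(n, n_iter=10, random_state=0):
    """Yield (train, test) index arrays: train is drawn with replacement
    from the full dataset, test is whatever was never drawn (the
    out-of-bag samples). The test fold size varies from split to split.
    """
    rng = np.random.RandomState(random_state)
    for _ in range(n_iter):
        train = rng.randint(0, n, size=n)  # sample with replacement
        oob = np.ones(n, dtype=bool)
        oob[train] = False                 # drop everything that was drawn
        yield train, np.where(oob)[0]

for train, test in oob_bootstrap_splits(10, n_iter=3):
    print("TRAIN:", train, "OOB TEST:", test)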
j***@gmail.com
2014-08-18 17:37:59 UTC
Post by Olivier Grisel
Post by j***@gmail.com
What I think sklearn doesn't have is early stopping with randomized
selection for cross-validation iterators. If LOO/jackknife is expensive to
calculate for all LOO sets, can you randomly select among the LOO sets, or
do something similar for other iterators?
No, but that would be a good idea for ShuffleSplit as well. If I
understand correctly, you would pass something like a tolerance parameter
(e.g. "I want a validation score precise to 2 decimals") and use as few
iterations as possible to reach that precision, then stop sampling. Is
that right?
That's open to API decisions.
So far I have been going both ways: let users specify the number of
permutations and provide helper functions to check precision, or allow
continuing until a precision is reached. (My examples were usually
targeting p-values.)

I haven't made up my mind about one or the other or both.
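A rough sketch of the "continue until a precision is reached" variant (all names and the stopping rule are illustrative, not statsmodels API):

import numpy as np

def bootstrap_until(data, statfunction=np.mean, tol=1e-3,
                    batch=200, max_iter=20000, random_state=0):
    """Accumulate bootstrap replicates of a statistic in batches until
    the Monte Carlo standard error of their mean drops below `tol`."""
    rng = np.random.RandomState(random_state)
    data = np.asarray(data)
    reps = []
    while len(reps) < max_iter:
        for _ in range(batch):
            reps.append(statfunction(rng.choice(data, size=len(data))))
        if np.std(reps) / np.sqrt(len(reps)) < tol:
            break
    return np.asarray(reps)

reps = bootstrap_until(np.random.RandomState(1).randn(100))
print(len(reps), reps.mean(), reps.std())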
Post by Olivier Grisel
Post by j***@gmail.com
Similarly, permutation inference is often difficult because the set of
permutations gets too large; bootstrap is then the usual alternative
for larger samples.
One thing to keep in mind is that sklearn.cross_validation.Bootstrap
is not the real bootstrap: it's a random permutation + split + random
sampling with replacement within each part:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py#L718
This 2-step procedure is done to make sure that no test sample is
part of the training fold at each iteration. A more natural way to
respect that constraint would be to sample with replacement from the
full dataset and then use the out-of-bag samples for the validation set.
But then you would lose control over the size of the test fold. This
second strategy is more like the real bootstrap and is the one I
should have implemented initially instead of the weird beast that
sklearn.cross_validation.Bootstrap currently is.
I would have thought of a slightly simplified version, where the test set
is always the full set, so you have bootstrap sampling only on the
training sample.

Or, even simpler, keep the split between the train and test samples fixed.
I might be thinking of different applications: the main focus for
statsmodels, to complement what is in scikit-learn, will be on data without
independent observations, or with a natural sequence: time series,
correlated data, ...

But I've never seen the bootstrap used for cross-validation.

Josef
Sebastian Raschka
2014-08-18 18:44:28 UTC
Post by Olivier Grisel
Post by Sebastian Raschka
since it would make the "estimate" and "error" calculation more convenient, right?
I don't understand what you mean by "estimate" and "error". A model's parameters, its individual predictions, and its cross-validation scores or errors can all be called "estimates": anything derived from sampled data points is an estimate.
For example, the calculation of the mean accuracy over all iterations, and the calculation of the standard deviation/error of the mean (just like in regular KFold cross-validation). I have to agree that there are probably better approaches and techniques, as you mentioned, but I wouldn't remove it just because very few people use it in practice.
Olivier Grisel
2014-08-19 19:15:12 UTC
Post by Sebastian Raschka
Post by Olivier Grisel
Post by Sebastian Raschka
since it would make the "estimate" and "error" calculation more convenient, right?
I don't understand what you mean by "estimate" and "error". A model's
parameters, its individual predictions, and its cross-validation scores or
errors can all be called "estimates": anything derived from sampled data
points is an estimate.
For example, the calculation of the mean accuracy over all iterations, and
the calculation of the standard deviation/error of the mean
Well, this is not what sklearn.cross_validation.Bootstrap is doing.
It's doing some weird cross-validation splits that I made up a couple
of years ago (and that I now deeply regret) and that nobody uses in
the literature. Again, read its docstring and have a look at the source
code:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py#L718

Nowhere will you see an estimate of the standard deviation of the
validation score, nor of the standard error of the mean validation
score across folds.
Post by Sebastian Raschka
(just like in regular KFold cross-validation).
The KFold cross-validation iterator in sklearn does not compute the
standard error of the mean score itself. The cross_val_score function
with cv=KFold(5) returns the score computed on each validation fold.
It would be interesting to estimate the standard deviation of the
validation score (or better, a 95% confidence interval for it), but:

- this is not what sklearn.cross_validation.Bootstrap is doing: it
just computes CV folds like all the other iterators in the
sklearn.cross_validation module;
- estimating the standard error of the mean of 5 points (for 5-fold
CV, for instance) using a bootstrapping procedure is prone to give
bad results.

Empirically I found that bootstrapping works fine to estimate
confidence intervals with *at least* 50 samples (and thousands of
bootstrap iterations).

Therefore, to obtain good confidence intervals on CV scores, the right
approach (in my opinion) would be the following (sketched in code below):

1- have some kind of cross_val_predictions function that would return
the individual predictions for each sample in any of the validation folds
of a CV procedure, instead of the score on each fold as our
cross_val_score function does;

2- use a bootstrapping procedure by re-sampling many times with
replacement from those predictions so as to compute a bootstrapped
distribution of the validation score;

3- take a confidence interval on that bootstrapped distribution of the
validation score.
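As a sketch (cross_val_predictions is hypothetical, so step 1 is emulated by collecting per-fold predictions by hand; step 3 uses a naive percentile interval, with the caveat about skew below):

import numpy as np
from sklearn import datasets
from sklearn.cross_validation import KFold
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X, y = iris.data, iris.target

# 1- one out-of-fold prediction per sample
pred = np.empty_like(y)
for train, test in KFold(len(y), n_folds=5, shuffle=True, random_state=0):
    model = LogisticRegression().fit(X[train], y[train])
    pred[test] = model.predict(X[test])

# 2- bootstrap the per-sample correctness to get a score distribution
rng = np.random.RandomState(0)
correct = (pred == y).astype(float)
scores = [correct[rng.randint(0, len(y), size=len(y))].mean()
          for _ in range(10000)]

# 3- naive 95% percentile interval on that distribution
print(np.percentile(scores, [2.5, 97.5]))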

Furthermore, as typical scoring functions are censored (for instance
the accuracy score is bounded by 0 and 1), it is very likely that the
bootstrapped distribution of the validation score is going to be
skewed (for instance, a validation accuracy score distribution could
have a 95% confidence interval between 0.94 and 1.00 with a mean at
0.99). For skewed distributions a naive percentile interval is
typically wrong because of the bias introduced by the skewness. In
that case the bias can be corrected by using the bias-corrected and
accelerated (BCa) non-parametric bootstrap procedure as implemented in
scikits.bootstrap:

https://github.com/cgevans/scikits-bootstrap/blob/master/scikits/bootstrap/bootstrap.py#L70

Having BCa bootstrap confidence intervals in scipy.stats would
certainly make it simpler to implement this kind of feature in
scikit-learn. But again, what I just described here is completely
different from what we have in the sklearn.cross_validation.Bootstrap
class, which cannot be changed to implement this as it does not even
have the right API to do so. It would have to be an entirely new
function or class.
Post by Sebastian Raschka
I have to agree that there are probably better approaches and techniques,
as you mentioned, but I wouldn't remove it just because very few people
use it in practice.
We don't remove the sklearn.cross_validation.Bootstrap class because
few people are using it, but because too many people are using
something that is non-standard (I made it up) and very, very likely not
what they expect if they just read its name. At best it causes
confusion when our users read the docstring and/or its source code. At
worst it causes silent modeling errors in our users' code bases.
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel