Discussion:
Implementation of a cross_val_predict function in the cross_val module
Vincent Dubourg
2010-12-13 07:24:13 UTC
Hi list,

Alex (Gramfort) and I have found that the cross_val_score function behaves
inconsistently when the regressors' default score function (r2_score) is
coupled with the LeaveOneOut iterator. Indeed, cross_val_score then returns
an array full of -Inf, because r2_score computes the variance of a single
sample on each fold, so y_true - y_true.mean() = 0 and R2 = -Inf due to the
division by zero...
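For illustration, here is a minimal sketch with made-up numbers that mirrors
what the r2_score formula computes on a single left-out sample:

import numpy as np

y_true = np.array([2.5])   # the single sample left out by a LeaveOneOut fold
y_pred = np.array([2.3])   # the corresponding prediction

ss_res = np.sum((y_true - y_pred) ** 2)          # 0.04
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # 0.0 for a single sample
R2 = 1.0 - ss_res / ss_tot                       # division by zero -> -Inf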

And we also came up with a solution, isn't it nice?!

Why not implement a cross_val_predict function that would return an
array of shape (n_folds, fold_size) containing the cross-validated
predictions of the estimator on the folds? Using this function in
conjunction with e.g. the LeaveOneOut iterator would allow one to:
- perform and retrieve an exhaustive list of leave-one-out predictions
to build a predicted-vs-true plot: y_pred_on_folds vs. y_true;
- compute a cross-validated estimate of any score function in the metrics
module.
from scikits.learn.gaussian_process import GaussianProcess
from scikits.learn.cross_val import cross_val_predict, LeaveOneOut
from scikits.learn.metrics import r2_score

# X, y: the full dataset (not shown here)
gp = GaussianProcess()
gp.fit(X, y)
y_pred_on_folds = cross_val_predict(gp, X, y,
                                    cv=LeaveOneOut(y.size), n_jobs=-1)
R2 = r2_score(y, y_pred_on_folds)
Cheers,
Vincent
Gael Varoquaux
2010-12-13 09:37:24 UTC
Post by Vincent Dubourg
Hi list,
Alex (Gramfort) and I have found that the cross_val_score function behaves
inconsistently when the regressors' default score function (r2_score) is
coupled with the LeaveOneOut iterator. Indeed, cross_val_score then returns
an array full of -Inf, because r2_score computes the variance of a single
sample on each fold, so y_true - y_true.mean() = 0 and R2 = -Inf due to the
division by zero...
And we also came up with a solution, isn't it nice?!
Why not implement a cross_val_predict function that would return an
array of shape (n_folds, fold_size) containing the cross-validated
predictions of the estimator on the folds? Using this function in
conjunction with e.g. the LeaveOneOut iterator would allow one to:
- perform and retrieve an exhaustive list of leave-one-out predictions
to build a predicted-vs-true plot: y_pred_on_folds vs. y_true;
- compute a cross-validated estimate of any score function in the metrics
module.
This makes me kinda sad, because I find cross_val_predict much more
specialized than cross_val_score. Namely, there is no need for an
estimator to implement a predict method in order to be able to score data.
These problems are orthogonal. In addition, cross_val_predict makes little
sense to me.

Does the problem that you have uncovered mean that the R2 score is not a
good model selection score? I don't have enough insight into the problem,
as I don't do regression for a living :). If so, I would suggest backing
out of the modification of the default score for regression estimators
for the release. In other words:

1. Restore the explained variance in the metrics.py module

2. Revert the changes in the base.py module

3. Keep r2_score in metrics.py and add a "see also" in the
   score method of regression estimators.

One thing that this tells us is that we should be really careful about the
changes we make before a release, as we are now debating fairly important
issues a few days before a scheduled release.

Gaël
Gael Varoquaux
2010-12-13 11:45:25 UTC
Alexandre Gramfort
2010-12-13 14:32:02 UTC
OK, I was indeed wondering why r2_score was subject to this problem, and
not explained_variance. As the problem does not seem like a regression
and requires a bit of thinking, I suggest that we don't try to squeeze in
a fix before the release.
explained_variance had the same problem.

I think the use case mentioned here by Vincent is:
- compute all y_pred for each fold
- concatenate them
- apply the score on the concatenation

For now, cross_val_score computes a score on each fold.

The iid keyword was supposed to do the trick, but it fails in regression settings
due to the variance computed in the denominator of r2_score.

makes sense?
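To make this concrete, here is a minimal hand-rolled sketch (not the actual
scikits.learn code) of the two strategies, assuming an estimator with
fit/predict and a cv iterator yielding train/test index arrays or masks.
With LeaveOneOut and score_func=r2_score, the first strategy yields one
(possibly -Inf) score per sample, while the second yields a single,
well-defined R2:

import numpy as np

def per_fold_scores(estimator, X, y, cv, score_func):
    # current cross_val_score behaviour: one score per fold
    scores = []
    for train, test in cv:
        estimator.fit(X[train], y[train])
        scores.append(score_func(y[test], estimator.predict(X[test])))
    return np.array(scores)

def concatenated_score(estimator, X, y, cv, score_func):
    # proposed behaviour: a single score on all out-of-fold predictions
    y_true, y_pred = [], []
    for train, test in cv:
        estimator.fit(X[train], y[train])
        y_true.append(y[test])
        y_pred.append(estimator.predict(X[test]))
    return score_func(np.concatenate(y_true), np.concatenate(y_pred))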
I would say that running a cross-validation loop with a test set on which you
can't compute a score seems wrong to me. I don't know how others feel
about this, however.
   I am just pointing out that looping cross_val directly over these built-in
   score functions is maybe not the best idea, because a cross-validation
   loop is intrusive by nature. In my opinion, you need to do the
   cross-validation predictions first and then evaluate a score function
   imported from the metrics module on the cross-validated predictions.
Can you think of a way to add an option to cross_val_score so that it
works with your use case? There might be a bit of effort to put into the
code so that it doesn't look too ugly and doesn't blow up memory by
keeping huge objects around.
I think it is possible with duck typing and overriding the score_func.

I'll give it a try when I find the time.
PS: please keep the discussion on the list: the opinions of the various
users and developers are very important in my eyes for making informed
decisions.
I agree. Vincent and I were discussing on GitHub. We should send an email
to the list whenever a thread starts there.

Alex
Vincent Dubourg
2010-12-15 07:59:13 UTC
Post by Alexandre Gramfort
OK, I was indeed wondering why r2_score was subject to this problem, and
not explained_variance. As the problem does not seem like a regression
and requires a bit of thinking, I suggest that we don't try to squeeze in
a fix before the release.
explained_variance had the same problem.
I think the use case mentioned here by Vincent is:
- compute all y_pred for each fold
- concatenate them
- apply the score on the concatenation
For now, cross_val_score computes a score on each fold.
The iid keyword was supposed to do the trick, but it fails in regression settings
due to the variance computed in the denominator of r2_score.
makes sense?
I would say that running a cross-validation loop with a test set on which you
can't compute a score seems wrong to me. I don't know how others feel
about this, however.
I am just pointing out that looping cross_val directly over these built-in
score functions is maybe not the best idea, because a cross-validation
loop is intrusive by nature. In my opinion, you need to do the
cross-validation predictions first and then evaluate a score function
imported from the metrics module on the cross-validated predictions.
Can you think of a way to add an option to cross_val_score so that it
works with your use case? There might be a bit of effort to put into the
code so that it doesn't look too ugly and doesn't blow up memory by
keeping huge objects around.
I think it is possible with duck typing and overriding the score_func.
I'll give it a try when I find the time.
PS: please keep the discussion on the list: the opinions of the various
users and developers are very important in my eyes for making informed
decisions.
I agree. Vincent and I were discussing on GitHub. We should send an email
to the list whenever a thread starts there.
Alex
Hello again,

A colleague and I have been thinking about a workaround for the
inconsistency between the regressors' default score function and the LeaveOneOut
cross-validation loop. It would consist in making the RegressorMixin
default score function return mean_square_error(y_true, y_pred). The
reason is that it remains well-defined for a leave-one-out loop, and it is compatible
with the existing cross_val_score machinery:
from scikits.learn.gaussian_process import GaussianProcess
from scikits.learn.cross_val import cross_val_score, LeaveOneOut

# X, y: the full dataset (not shown here)
gp = GaussianProcess()
gp.fit(X, y)
all_squared_deviations = cross_val_score(gp, X, y, cv=LeaveOneOut(y.size))
# R2 = 1 - SS_res / SS_tot; with one squared deviation per left-out sample,
# this is 1 - mean(squared deviations) / var(y)
R2 = 1 - all_squared_deviations.mean() / y.var()
The iid keyword might then be useful for other cross-validation iterators to
return an array of unaveraged sums of squared deviations.

However, you may not like the fact that I name "score" a function that
actually returns a "loss"... And I have a feeling that losing the
cross-validation predictions is a bit of a shame, as we may spend a lot of
time estimating them... And you also need to compute the r2_score by hand,
because you have already done half the job by computing the squared deviations.

@Alex: What do you mean by "duck typing" and "score_func overriding"?

Cheers,
Vincent

PS: It's clearly too late to think about incorporating such changes in the
forthcoming release... But still we can discuss this issue until we come up
with a nice solution.
Alexandre Gramfort
2010-12-15 13:11:04 UTC
Hi,

Indeed, the problem is that r2_score is not linear. If you have two folds,
(y1_pred, y1_true) and (y2_pred, y2_true), you have:

R2(y1_pred + y2_pred, y1_true + y2_true) != R2(y1_pred, y1_true) + R2(y2_pred, y2_true)

(where the + inside R2 denotes concatenation of the fold arrays), whereas the
equality holds for a loss and for the zero_one_score.
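A tiny numerical illustration with made-up two-fold data:

import numpy as np

def r2(y_true, y_pred):
    # plain R2 = 1 - SS_res / SS_tot (diverges when SS_tot == 0, e.g. a single sample)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

y1_true, y1_pred = np.array([1.0, 2.0]), np.array([1.1, 1.9])
y2_true, y2_pred = np.array([3.0, 4.0]), np.array([2.8, 4.2])

sum_of_fold_scores = r2(y1_true, y1_pred) + r2(y2_true, y2_pred)   # 0.96 + 0.84 = 1.80
score_on_concatenation = r2(np.concatenate([y1_true, y2_true]),
                            np.concatenate([y1_pred, y2_pred]))    # 0.98, not 1.80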
The only way to do what you want, Vincent, is to allow cross_val_score
to return the concatenation of all the y_pred for each fold and then
apply the score.
This is what the iid keyword should do.

if iid:
    return score_func(y1_pred + .... , y1_true + ....)
else:
    return sum(score_func(y1_pred, y1_true) + score_func(....))

I tried to do that but stopped when trying to get it working in the
unsupervised case.

BTW, personally I'd rather have the behavior above than go
back to the MSE.

hope this clarifies the problem

Alex


On Wed, Dec 15, 2010 at 2:59 AM, Vincent Dubourg
Post by Vincent Dubourg
Post by Alexandre Gramfort
OK, I was indeed wondering why r2_score was subject to this problem, and
not explained_variance. As the problem does not seem like a regression
and requires a bit of thinking, I suggest that we don't try to squeeze in
a fix before the release.
explained_variance had the same problem.
I think the use case mentioned here by Vincent is:
- compute all y_pred for each fold
- concatenate them
- apply the score on the concatenation
For now, cross_val_score computes a score on each fold.
The iid keyword was supposed to do the trick, but it fails in regression settings
due to the variance computed in the denominator of r2_score.
makes sense?
I would say that running a cross-validation loop with a test set on which you
can't compute a score seems wrong to me. I don't know how others feel
about this, however.
   I am just pointing out that looping cross_val directly over these built-in
   score functions is maybe not the best idea, because a cross-validation
   loop is intrusive by nature. In my opinion, you need to do the
   cross-validation predictions first and then evaluate a score function
   imported from the metrics module on the cross-validated predictions.
Can you think of a way to add an option to cross_val_score so that it
works with your use case? There might be a bit of effort to put into the
code so that it doesn't look too ugly and doesn't blow up memory by
keeping huge objects around.
I think it is possible with duck typing and overriding the score_func.
I'll give it a try when I find the time.
PS: please keep the discussion on the list: the opinions of the various
users and developers are very important in my eyes for making informed
decisions.
I agree. Vincent and I were discussing on GitHub. We should send an email
to the list whenever a thread starts there.
Alex
Hello again,
A colleague and I have been thinking about a workaround for the
inconsistency between the regressors' default score function and the LeaveOneOut
cross-validation loop. It would consist in making the RegressorMixin
default score function return mean_square_error(y_true, y_pred). The
reason is that it remains well-defined for a leave-one-out loop, and it is compatible
with the existing cross_val_score machinery:
from scikits.learn.gaussian_process import GaussianProcess
from scikits.learn.cross_val import cross_val_score, LeaveOneOut

# X, y: the full dataset (not shown here)
gp = GaussianProcess()
gp.fit(X, y)
all_squared_deviations = cross_val_score(gp, X, y, cv=LeaveOneOut(y.size))
# R2 = 1 - SS_res / SS_tot; with one squared deviation per left-out sample,
# this is 1 - mean(squared deviations) / var(y)
R2 = 1 - all_squared_deviations.mean() / y.var()
The iid keyword might then be useful for other cross-validation iterators to
return an array of unaveraged sums of squared deviations.
However, you may not like the fact that I name "score" a function that
actually returns a "loss"... And I have a feeling that losing the
cross-validation predictions is a bit of a shame, as we may spend a lot of
time estimating them... And you also need to compute the r2_score by hand,
because you have already done half the job by computing the squared deviations.
@Alex: What do you mean by "duck typing" and "score_func overriding"?
Cheers,
Vincent
PS: It's clearly too late to think about incorporating such changes in the
forthcoming release... But still we can discuss this issue until we come up
with a nice solution.
Olivier Grisel
2010-12-15 14:41:34 UTC
Post by Alexandre Gramfort
Hi,
Indeed, the problem is that r2_score is not linear. If you have two folds,
(y1_pred, y1_true) and (y2_pred, y2_true), you have:
R2(y1_pred + y2_pred, y1_true + y2_true) != R2(y1_pred, y1_true) + R2(y2_pred, y2_true)
(where the + inside R2 denotes concatenation of the fold arrays), whereas the
equality holds for a loss and for the zero_one_score.
The only way to do what you want, Vincent, is to allow cross_val_score
to return the concatenation of all the y_pred for each fold and then
apply the score.
This is what the iid keyword should do.
if iid:
    return score_func(y1_pred + .... , y1_true + ....)
else:
    return sum(score_func(y1_pred, y1_true) + score_func(....))
I tried to do that but stopped when trying to get it working in the
unsupervised case.
BTW, personally I'd rather have the behavior above than go
back to the MSE.
+1
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Gael Varoquaux
2010-12-15 14:56:52 UTC
I am ok with this option, but I wonder if we shouldn't rename the iid keyword argument. In particular, your suggestion will not work with unsupervised, and the name 'iid' does not reflect this. However, I can't think of a better name.

:)

Gael

----- Original message -----
Post by Olivier Grisel
Post by Alexandre Gramfort
Hi,
Indeed, the problem is that r2_score is not linear. If you have two folds,
(y1_pred, y1_true) and (y2_pred, y2_true), you have:
R2(y1_pred + y2_pred, y1_true + y2_true) != R2(y1_pred, y1_true) + R2(y2_pred, y2_true)
(where the + inside R2 denotes concatenation of the fold arrays), whereas the
equality holds for a loss and for the zero_one_score.
The only way to do what you want, Vincent, is to allow
cross_val_score to return the concatenation of all the y_pred for each
fold and then apply the score.
This is what the iid keyword should do.
if iid:
    return score_func(y1_pred + .... , y1_true + ....)
else:
    return sum(score_func(y1_pred, y1_true) + score_func(....))
I tried to do that but stopped when trying to get it working in the
unsupervised case.
BTW, personally I'd rather have the behavior above than go
back to the MSE.
+1
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Alexandre Gramfort
2010-12-15 14:56:02 UTC
Post by Gael Varoquaux
your suggestion will not work with unsupervised,
I'm afraid so... hence the cross_val_predict for supervised settings.
Post by Gael Varoquaux
and the name 'iid' does not reflect this. However, I can't think of a better name.
yes iid is not a good name. Maybe scoring_full, scoring_global equal
to true or false... Not much inspiration either ...

Alex
Gael Varoquaux
2010-12-15 15:05:08 UTC
----- Original message -----
Post by Alexandre Gramfort
Post by Gael Varoquaux
your suggestion will not work with unsupervised,
I'm afraid so... hence the cross_val_predict for supervised settings.
Yes, but cross_val_predict is a meaningless function to me: I can't interpret its return values without inspecting the cv object used. I think that it is simply exposing an implementation detail rather than addressing an end-user need.

Gael, not listening to Vincent's defence :)
Post by Alexandre Gramfort
Post by Gael Varoquaux
and the name 'iid' does not reflect this. However, I can't think of a better name.
yes iid is not a good name. Maybe scoring_full, scoring_global equal
to true or false... Not much inspiration either ...
Alex
Olivier Grisel
2010-12-15 16:05:28 UTC
Post by Alexandre Gramfort
Post by Gael Varoquaux
your suggestion will not work with unsupervised,
I'm afraid so... hence the cross_val_predict for supervised settings.
Post by Gael Varoquaux
and the name 'iid' does not reflect this. However, I can't think of a better name.
yes iid is not a good name. Maybe scoring_full, scoring_global equal
to true or false... Not much inspiration either ...
+1 for scoring_full, scoring_global or full_scoring or global_scoring
boolean flags (I have no preference as long as there is some docstring
documentation).
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel