Discussion:
[Scikit-learn-general] getting different results with sklearn gridsearchCV
Pagliari, Roberto
2014-09-12 14:20:57 UTC
Permalink
I am comparing the results of sklearn cross-validation and my own cross validation.

I tested linearSVC under the following conditions:

- Data scaling per grid search

- Data scaling + 2-level quantization, per grid search

Specifically, I have done the following:
Sklearn gridSearchCV

- Create a pipeline with [StandardScaler, LinearSVC] if no binning is used, or [StandardScaler, Binarizer, LinearSVC], if binning is used

- Invoke sklearn gridsearch (only C is provided as a parameter to optimize over)

- When done with gridsearch,

o Scale entire training set

o Scale test set (with mean/std found on training set)

o Quantize, if quantization is used

o run LinearSVC, with best C value found
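
Roughly, that setup looks like this (the toy data and the C grid are just placeholders, and the Binarizer step is present only when binning is used):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, Binarizer
from sklearn.svm import LinearSVC
from sklearn.grid_search import GridSearchCV

X_train, y_train = make_classification(n_samples=200, random_state=0)  # placeholder data

steps = [('scaler', StandardScaler()),
         ('binarizer', Binarizer()),        # only included when binning is used
         ('linear_svm', LinearSVC())]
pipe = Pipeline(steps)

param_grid = {'linear_svm__C': np.logspace(-3, 3, 7)}   # placeholder range; only C is searched
grid = GridSearchCV(pipe, param_grid=param_grid)
grid.fit(X_train, y_train)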

My own grid search

- Search over all possible values of C (same range as above)

- For each value of C, use StratifiedKFold with the random seed set to a random number

o Scale the training cross-validation dataset, and scale the test cross-validation dataset with the training CV mean and std

o If binning is used, apply binary binning (my own function), on top of StandardScaler

o For each value of C, compute the average score over all partitions, where the score is defined as number of correctly classified samples / total number of samples

- When done with gridsearch,

o Scale entire training set

o Scale test set (with mean/std found on training set)

o Quantize, if quantization is used

o run LinearSVC, with best C value found
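
Roughly, the hand-rolled search looks like this (binning omitted; the toy data, fold count, and C grid are placeholders):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.cross_validation import StratifiedKFold   # module location in this sklearn version
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, random_state=0)   # placeholder data
C_grid = np.logspace(-3, 3, 7)                               # placeholder range

mean_scores = {}
for C in C_grid:
    fold_scores = []
    for train_idx, test_idx in StratifiedKFold(y, n_folds=5):
        scaler = StandardScaler().fit(X[train_idx])          # mean/std from the training folds only
        X_tr = scaler.transform(X[train_idx])
        X_te = scaler.transform(X[test_idx])                 # test fold scaled with training mean/std
        clf = LinearSVC(C=C).fit(X_tr, y[train_idx])
        # score = correctly classified samples / total samples (plain accuracy)
        fold_scores.append(np.mean(clf.predict(X_te) == y[test_idx]))
    mean_scores[C] = np.mean(fold_scores)

best_C = max(mean_scores, key=mean_scores.get)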

For some reason, I'm getting different results. In particular, sklearn gridsearch performs better than my own gridsearch when not using quantization, and it gets worse with quantization. With my own gridsearch I'm getting the opposite trend.

Is my understanding of sklearn gridsearch wrong, or are there any issues with it?

Thank you,
Pagliari, Roberto
2014-09-12 15:09:28 UTC
Permalink
Regarding my previous question, I suspect the difference lies in the scoring function.

What is the default scoring function used by gridsearch?

In my own implementation I am using
number of correctly classified samples (no weighting) / total number of samples

The sklearn gridsearch function must be using something else, or maybe the same metric but with weighting?

Thanks,


Andy
2014-09-12 16:11:58 UTC
Permalink
Hi Roberto.
GridSearchCV uses accuracy for selection if no other scoring method is
specified, so there should be no difference.
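
For a classifier the default falls back to the estimator's own score method, which is plain accuracy; you can also request it explicitly (toy example, placeholder C values):

from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC
from sklearn.grid_search import GridSearchCV

X, y = make_classification(n_samples=100, random_state=0)

# default scoring: the classifier's .score(), i.e. mean accuracy
grid_default = GridSearchCV(LinearSVC(), param_grid={'C': [0.1, 1, 10]}).fit(X, y)
# equivalent, with the metric spelled out
grid_explicit = GridSearchCV(LinearSVC(), param_grid={'C': [0.1, 1, 10]}, scoring='accuracy').fit(X, y)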

Could you provide code?
Do you also create a pipeline when using your own grid search? I would
imagine there is some difference in how you do the fitting in the pipeline.

Cheers,
Andy
Pagliari, Roberto
2014-09-12 16:31:12 UTC
Permalink
Hi Andy,
I don't think the accuracy is an issue. I explicitly provided a score function and the problem persists.
With my own gridsearch I don't use a pipeline, just StratifiedKFold and averaging for every combination of the parameters.

This is an example with scaling+svm using sklearn pipeline:

from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn import svm, grid_search

estimators = [('scaler', StandardScaler()),
              ('linear_svm', svm.LinearSVC(class_weight='auto'))]

clf_pipeline = Pipeline(estimators)
params = dict(linear_svm__C=<some array of values>)
clf = grid_search.GridSearchCV(clf_pipeline, param_grid=params)
clf.fit(X_train, y_train)  # not scaling here, since I assume gridsearch will do it while searching

After this I make the predictions:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
y_predictions = clf.predict(X_test)

With binning, I would just add the Binarizer to the pipeline, and also apply it right before computing y_predictions.
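
For the binning case the pipeline looks roughly like this (the Binarizer threshold is left at its default here, which may not match my actual binning function):

from sklearn.preprocessing import StandardScaler, Binarizer
from sklearn import svm

estimators = [('scaler', StandardScaler()),
              ('binarizer', Binarizer()),   # default threshold=0.0, i.e. binarize around the scaled mean
              ('linear_svm', svm.LinearSVC(class_weight='auto'))]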

Is there anything wrong with what I'm doing?

Thank you


Laurent Direr
2014-09-12 16:56:36 UTC
Permalink
Hi Roberto,

You do not need to scale here (you can remove the first three lines); the
whole point of the pipeline is that you do not have to do this:

After this I make the predictions

scaler = StandardScaler()

X_train = scaler.fit_transform(X_train)

X_test = scaler.transform(X_test)

y_predictions = clf.predict(X_test)

You are calling the predict method of the (grid-searched) *Pipeline*
object, which already includes the scaler: predict will call the
'transform' method of every step in the pipeline (in your example
'scaler' and possibly 'binarizer') on X_test and then use the
transformed data to make the predictions with the final step's (here
'linear_svm') predict method.
This is actually from the doc:

Definition: Pipeline.predict(self, X)
Docstring:
Applies transforms to the data, and the predict method of the
final estimator. Valid only if the final estimator implements
predict.


This means that when you do this:

X_test = scaler.transform(X_test)

y_predictions = clf.predict(X_test)

X_test is scaled twice, once by your scaler on the first line and once
by the Pipeline (not a big deal).
However, if you also binarize the data before calling clf.predict, I think
it would cause a problem, as your already binarized data would be scaled
and binarized again (I would expect weird behaviour here!).
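
So after the grid search, prediction reduces to a single call (assuming clf is the fitted GridSearchCV from your snippet):

# The refitted best pipeline (refit=True is the default) handles scaling
# (and binarizing, if that step is in the pipeline) internally:
y_predictions = clf.predict(X_test)   # no manual StandardScaler / Binarizer calls needed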

I don't think this answers all your questions, but I'd suggest you clean
up this part so that we can see your other issues more clearly ;).

Cheers,

Laurent
Andy
2014-09-12 17:18:29 UTC
Permalink
As Laurent said, using StandardScaler again is not necessary.
If you don't provide code for your custom grid-search, it is hard to say
what the difference might be ;)
Are the same parameters selected and are the scores during the
grid-search the same?
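
For example (a sketch, with clf being your fitted GridSearchCV; grid_scores_ is the attribute name in current releases):

print(clf.best_params_, clf.best_score_)
for params, mean_score, cv_scores in clf.grid_scores_:
    # one entry per C value: parameters, mean CV score, per-fold scores
    print(params, mean_score, cv_scores)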
Pagliari, Roberto
2014-09-12 17:30:26 UTC
Permalink
Thanks for the suggestions.

With that fix, scaling + grid search gives me the same results as my own grid search. I will try to add binning as well.

Thank you again!


