A neuman
2016-05-12 09:53:13 UTC
Hello everyone,
I'm having a bit of trouble with the parameters I got from GridSearchCV.
For example:
If I use the parameters I got from GridSearchCV, for example for RF or
k-NN, and test the model on the training set, I get an AUC value of about
1.00 or 0.99 every time.
The dataset has 1200 samples.
Does that mean I can't use the parameters I got from GridSearchCV? Because
this happened in practically every case. I have already tried nested CV to
compare the algorithms.
Example for RF with the values I got from GridSearchCV (10-fold):
RandomForestClassifier(n_estimators=200, oob_score=True, max_features=None,
                       random_state=1, min_samples_leaf=2,
                       class_weight='balanced_subsample')
Then I'm just using clf.predict(X_train) and evaluating it against y_train.
The AUC value from clf.predict(X_test) is about 0.73, so there is a big
difference between the training set and the test set.
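
A minimal sketch of my setup (the data loading and parameter grid below are
just placeholders, and I'm assuming a recent scikit-learn with
sklearn.model_selection and roc_auc_score):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import roc_auc_score

# placeholder data: the real dataset has 1200 samples
X = np.random.rand(1200, 20)
y = np.random.randint(0, 2, 1200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# 10-fold grid search on the training set only (placeholder grid)
param_grid = {'min_samples_leaf': [1, 2, 5], 'max_features': [None, 'sqrt']}
grid = GridSearchCV(
    RandomForestClassifier(n_estimators=200, oob_score=True, random_state=1,
                           class_weight='balanced_subsample'),
    param_grid, cv=10, scoring='roc_auc')
grid.fit(X_train, y_train)
clf = grid.best_estimator_

# AUC on train vs. test -- this is where I see ~1.00 vs. ~0.73
print(roc_auc_score(y_train, clf.predict_proba(X_train)[:, 1]))
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))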
best,