Mamun Rashid
2016-03-15 11:44:20 UTC
Hi All,
I asked this question a couple of weeks ago on the list. I have a two-class problem where my positive class (Class 1) and negative class (Class 0)
are imbalanced. In addition, I care much less about the negative class, so I specified both a class weight (on the random forest classifier) and a sample weight
(on the fit function) to give more importance to my positive class.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

cl_weight = {0: weight1, 1: weight2}
clf = RandomForestClassifier(n_estimators=400, max_depth=None, min_samples_split=2, random_state=0, oob_score=True, class_weight=cl_weight, criterion="gini")
# Up-weight the positive class (Class 1) at the sample level as well
sample_weight = np.array([weight if m == 1 else 1 for m in df_tr[label_column]])
y_pred = clf.fit(X_tr, y_tr, sample_weight=sample_weight).predict(X_te)
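In case a runnable example helps, here is a self-contained sketch of the same setup; the synthetic data and the weight values below are made up for illustration and are not my actual data or numbers:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for my real data frame (roughly 10% positives)
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Placeholder weights, chosen only for the example
cl_weight = {0: 0.001, 1: 0.999}
weight = 10  # extra per-sample emphasis on the positive class

clf = RandomForestClassifier(n_estimators=400, max_depth=None, min_samples_split=2, random_state=0, oob_score=True, class_weight=cl_weight, criterion="gini")
sample_weight = np.array([weight if m == 1 else 1 for m in y_tr])
y_pred = clf.fit(X_tr, y_tr, sample_weight=sample_weight).predict(X_te)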
Despite specifying dramatically different class weights, I do not observe much difference.
Example: cl_weight = {0:0.001, 1:0.999} vs. cl_weight = {0:0.50, 1:0.50}.
Am I passing the class weights correctly?
Below are two example folds from each of these two runs: Fold 1 and Fold 5.
## cl_weight = {0:0.001, 1:0.999}
Fold_1 Confusion Matrix (rows = true class, columns = predicted class)
        Pred 0  Pred 1
True 0    1681      26
True 1     636     149

Fold_5 Confusion Matrix
        Pred 0  Pred 1
True 0    1670      15
True 1     734     160
## cl_weight = {0:0.50, 1:0.50}
Fold_1 Confusion Matrix
        Pred 0  Pred 1
True 0    1690      15
True 1     630     163

Fold_5 Confusion Matrix
        Pred 0  Pred 1
True 0    1676      14
True 1     709     170
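For completeness, the matrices above were produced per cross-validation fold roughly like the sketch below; it continues from the synthetic example above (reusing X, y, clf and weight) and is not my exact pipeline:

from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr_idx, te_idx) in enumerate(skf.split(X, y), start=1):
    X_tr, X_te, y_tr, y_te = X[tr_idx], X[te_idx], y[tr_idx], y[te_idx]
    sw = np.array([weight if m == 1 else 1 for m in y_tr])
    y_pred = clf.fit(X_tr, y_tr, sample_weight=sw).predict(X_te)
    # confusion_matrix puts the true class on the rows and the predicted class on the columns
    print("Fold_%d Confusion Matrix" % fold)
    print(confusion_matrix(y_te, y_pred))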
Thanks,
Mamun