Sheila the angel
2014-08-21 10:32:08 UTC
Hi,
Using GridSearchCV, I am trying to optimize the values of two parameters.
In total, I have 8 parameter combinations and am doing 4-fold cross-validation.
I want to run it in a parallel environment.
My questions are:
1. What should the n_jobs value be: 8 or (8*4=) 32?
(I know I can specify n_jobs=-1, but for technical reasons I want
to know how many jobs GridSearchCV will start.)
2. If I use a classifier such as RandomForestClassifier, where 'n_jobs'
can be specified, will it make any difference if I also specify 'n_jobs' at the
classifier level? For example:
clf = RandomForestClassifier(n_jobs=-1)
grid_search = GridSearchCV(clf, param_grid, n_jobs = -1)
Will this be faster compared to GridSearchCV(RandomForestClassifier())?
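To make the numbers in question 1 concrete, here is a minimal sketch that counts the (parameter combination, fold) pairs behind the 8*4 = 32 figure. The parameter names and values are placeholders chosen only for illustration, not my actual grid, and the sketch uses plain Python rather than scikit-learn. Whether each of these pairs is started as a separate job is exactly what I am asking.

```python
from itertools import product

# Hypothetical grid: names and values are placeholders for illustration only.
param_grid = {
    "n_estimators": [50, 100, 200, 400],
    "max_features": ["sqrt", "log2"],
}
n_folds = 4

# Every combination of parameter values (4 * 2 = 8 here).
names = sorted(param_grid)
combinations = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

# One model fit per (combination, fold) pair.
fits = [(params, fold) for params in combinations for fold in range(n_folds)]

print(len(combinations))
print(len(fits))
```

With this placeholder grid, the first count is the 8 combinations and the second is the 32 individual fits mentioned above.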
Thanks
--
Sheila