I meant: how do I make sure that I don't miss the "good" combination that you mentioned?
Here, we are back to an exhaustive search on an infinitely fine grid :). It's really about finding the "sweet spot" that is "practical" given your problem and available resources.
Also, for the second point: maybe I should consider the computational time and then make sure that I have a large enough number of estimators in the parametric study?
What do you mean by parametric study, exactly? Do you mean that you are doing the hyperparameter search for an empirical comparison study, or do you just want to get a good model?
Thanks for your reply. So this means I should start with, e.g., "max_depth": [1, 4, 10, 15], "min_samples_leaf": [1, 10, 20, 30], and if the best result is max_depth=10 and min_samples_leaf=10, then I should explore values close to these. Am I right?
Yes, this would work. However, keep in mind that you may be missing a "good" combination this way. And if you have a large n_estimators, tuning a random forest can be "relatively" expensive. Plus, you typically don't want or need to prune the trees here; that's basically the whole idea behind RF.
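To make the coarse-then-fine idea concrete, here is a minimal sketch. It is only an illustration under assumptions: the data is a synthetic stand-in generated with make_regression (6 inputs, 1 output, as in the original post), n_estimators=20 and cv=3 are picked only to keep it fast, and the offsets used to build the finer grid are arbitrary.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption): 6 inputs, 1 output.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)

# Small forest with a fixed seed so both searches are comparable.
forest = RandomForestRegressor(n_estimators=20, random_state=0)

# Step 1: coarse grid, using the values from the question.
coarse_grid = {
    "max_depth": [1, 4, 10, 15],
    "min_samples_leaf": [1, 10, 20, 30],
}
coarse = GridSearchCV(forest, coarse_grid, cv=3)
coarse.fit(X, y)
best_depth = coarse.best_params_["max_depth"]
best_leaf = coarse.best_params_["min_samples_leaf"]

# Step 2: finer grid around the coarse winner.
# The offsets are arbitrary; values are clipped at 1, the smallest valid setting.
fine_grid = {
    "max_depth": sorted({max(1, best_depth + d) for d in (-2, -1, 0, 1, 2)}),
    "min_samples_leaf": sorted({max(1, best_leaf + d) for d in (-4, -2, 0, 2, 4)}),
}
fine = GridSearchCV(forest, fine_grid, cv=3)
fine.fit(X, y)

print("coarse best:", coarse.best_params_, round(coarse.best_score_, 3))
print("fine best:  ", fine.best_params_, round(fine.best_score_, 3))
```

The fine grid contains the coarse winner, so the second search cannot score worse on the same folds. It can still miss a good region that the coarse grid skipped over.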
So how do I make sure that I don't miss the "good" combination?
Shall I use a small number of estimators while conducting this parametric study, and then use a higher value when fitting my model?
Also here, the parameters that you tuned may only be good for the model with that specific number of estimators. In general, I would maybe advise against tuning the hyperparameters at all and instead use the computational time to increase n_estimators.
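One possible way to spend the compute on more trees rather than on a grid is sketched below. This is only an illustration under assumptions: the data is a synthetic stand-in, the tree counts are arbitrary, and the out-of-bag R^2 score is used as a cheap check that needs no extra cross-validation. With warm_start=True, each call to fit keeps the existing trees and adds new ones.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption): 6 inputs, 1 output.
X, y = make_regression(n_samples=500, n_features=6, noise=10.0, random_state=0)

# warm_start=True reuses the trees already grown when n_estimators is raised;
# oob_score=True scores each sample using only trees that did not see it.
forest = RandomForestRegressor(warm_start=True, oob_score=True, random_state=0)

oob_scores = {}
for n in (25, 50, 100, 200):  # arbitrary tree counts, for illustration only
    forest.set_params(n_estimators=n)
    forest.fit(X, y)
    oob_scores[n] = forest.oob_score_

for n, score in oob_scores.items():
    print(f"n_estimators={n:4d}  OOB R^2={score:.3f}")
```

If the out-of-bag score stops changing noticeably between two tree counts, adding more trees is unlikely to help much further.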
Maybe I should consider the computational time and then make sure that I have a large enough number of estimators in the parametric study?
Hi Sebastian,
Thanks for your reply. So this means I should start with, e.g., "max_depth": [1, 4, 10, 15], "min_samples_leaf": [1, 10, 20, 30], and if the best result is max_depth=10 and min_samples_leaf=10, then I should explore values close to these. Am I right?
Shall I use a small number of estimators while conducting this parametric study, and then use a higher value when fitting my model? Will this change the other parameters, i.e., does n_estimators depend on the other parameters?
Also, should I use early stopping while running GridSearchCV?
Thanks again.
Regards
Waseem
Hi, Waseem,
with a fine-enough grid, GridSearchCV would be more "thorough" than the randomized search. However, the problem is essentially some sort of combinatorial explosion. Typically, I start with a "rougher" grid (the values of each parameter are more "spaced out" relative to each other). After that, I use a "finer" grid around the parameters that came up in the previous search.
However, it all comes down to computational time vs. being thorough. Or in other words, grid search is an exhaustive search whereas randomized search is a computationally "more efficient" approach.
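As a rough illustration of the trade-off, the sketch below counts the combinations in a small grid and then runs a randomized search with a fixed budget. Everything here is an assumption made for the example: the synthetic data, the extra max_features parameter (added only to show how the grid grows), the sampling ranges, and the budget of n_iter=10.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import ParameterGrid, RandomizedSearchCV

# Synthetic stand-in data (assumption): 6 inputs, 1 output.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)

# A grid grows multiplicatively: 4 * 4 * 3 = 48 combinations here.
grid = {
    "max_depth": [1, 4, 10, 15],
    "min_samples_leaf": [1, 10, 20, 30],
    "max_features": [2, 4, 6],  # illustrative extra parameter
}
n_grid = len(ParameterGrid(grid))

# Randomized search samples from distributions under a fixed budget instead.
# randint(low, high) draws integers from low to high - 1.
distributions = {
    "max_depth": randint(1, 16),
    "min_samples_leaf": randint(1, 31),
    "max_features": randint(1, 7),
}
search = RandomizedSearchCV(
    RandomForestRegressor(n_estimators=20, random_state=0),
    distributions,
    n_iter=10,  # budget: 10 sampled settings instead of 48 grid points
    cv=3,
    random_state=0,
)
search.fit(X, y)

print("grid combinations:", n_grid)
print("sampled settings: ", len(search.cv_results_["params"]))
print("best sampled:     ", search.best_params_)
```

The cost of the randomized search is set by n_iter, not by how many parameters or values are considered, which is why it scales better when the grid would explode.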
Post by muhammad waseem
Hello All,
I am new to scikit-learn and ML, and I am trying to train my model using random forest and gradient boosting tree regressors. I was wondering what the best way to do hyperparameter tuning is: shall I use GridSearchCV or RandomizedSearchCV? I have read that the performance of RandomizedSearchCV is almost the same as that of GridSearchCV (most of the time). If I go with RandomizedSearchCV, what should the range of values for the different parameters be? How will I know that the range I am selecting is the correct one?
Also, what about the number of estimators? In GridSearchCV or RandomizedSearchCV, shall I start with a low value and then, after selecting the other parameters, choose a large number of estimators for the final fit? Am I right?
Shall I always use early stopping, no matter whether I use grid search or randomized search?
P.S.: Training data: Number of Inputs = 6
Number of Outputs = 1
Number of samples (rows) = 8526
Testing data: Number of samples (rows) = 1416
Thanks
Kindest Regards
Waseem
------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________ <http://pubads.g.doubleclick.net/gampad/clk?id=267308311&iu=/4140_______________________________________________>
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>
--
Dr Muhammad Waseem Ahmad
Research Associate,
BRE Center for Sustainable Construction,
School of Engineering,
Cardiff University,
Cardiff, UK.