Pedro Rodriguez
2016-01-27 17:01:02 UTC
Hi,
I am considering working on a project which would result in a PR to
scikit-learn, but would like to check that something like it doesn't
already exist or is in progress (in or out of SKLearn).
Goal: Implement the algorithm (TuPAQ) described here:
http://web.cs.ucla.edu/~ameet/tupaq_socc.pdf to make something similar to
GridSearchCV
Result: Potentially much faster training time over the parameter/model
space than GridSearchCV
Description of Algorithm:
1. Train all models for some number of iterations to kick-start the search
2. Drop all models that are not within some margin of the best model
3. Repeat steps 1 and 2 based on some heuristic
4. Return best model
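
To make the four steps concrete, here is a rough sketch in plain Python. It is
only my reading of the steps above, not the paper's implementation:
ToyLinearModel, early_termination_search, iters_per_round, margin and
max_rounds are all hypothetical names, and the "heuristic" in step 3 is simply
a fixed number of rounds. The toy model stands in for any learner that can be
trained a few iterations at a time.

```python
class ToyLinearModel:
    """Hypothetical stand-in for an incrementally trainable model.

    Fits y = w * x by per-sample gradient descent on squared error.
    """

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate
        self.w = 0.0

    def partial_fit(self, X, y, n_iter):
        # Run n_iter more passes over the data, continuing from the current w.
        for _ in range(n_iter):
            for x_i, y_i in zip(X, y):
                grad = 2.0 * (self.w * x_i - y_i) * x_i
                self.w -= self.learning_rate * grad

    def score(self, X, y):
        # Negative mean squared error, so that higher is better.
        return -sum((self.w * x_i - y_i) ** 2 for x_i, y_i in zip(X, y)) / len(X)


def early_termination_search(candidates, X_train, y_train, X_val, y_val,
                             iters_per_round=5, margin=0.1, max_rounds=10):
    """Train all candidates in rounds, dropping those far behind the best."""
    alive = dict(candidates)
    scores = {}
    for _ in range(max_rounds):
        # Step 1: train every surviving model a little further.
        scores = {}
        for name, model in alive.items():
            model.partial_fit(X_train, y_train, iters_per_round)
            scores[name] = model.score(X_val, y_val)
        # Step 2: keep only models within `margin` of the current best score.
        best_score = max(scores.values())
        alive = {name: model for name, model in alive.items()
                 if scores[name] >= best_score - margin}
        # Step 3: repeat until one model is left or the rounds run out.
        if len(alive) == 1:
            break
    # Step 4: return the best surviving model.
    best_name = max(alive, key=lambda name: scores[name])
    return best_name, alive[best_name]


# Toy data: y = 3x, with separate training and validation points.
X_train = [i / 10 for i in range(1, 11)]
y_train = [3.0 * x for x in X_train]
X_val = [(2 * i + 1) / 20 for i in range(10)]
y_val = [3.0 * x for x in X_val]

candidates = {"lr=%g" % lr: ToyLinearModel(lr) for lr in (0.0001, 0.01, 0.5)}
best_name, best_model = early_termination_search(
    candidates, X_train, y_train, X_val, y_val)
print(best_name, best_model.w)
```

The point of the sketch is that slow candidates stop consuming training time
after the first round they fall outside the margin, which is where the
speedup over an exhaustive grid search would come from.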
Existing Code:
I didn't find anything like this in SKLearn. The closest thing I found was
https://github.com/hyperopt/hyperopt-sklearn but it doesn't include some of
the other methods used in the paper (like early model termination).
Thanks!
--
Pedro Rodriguez
PhD Student in Distributed Machine Learning | CU Boulder
UC Berkeley AMPLab Alumni
***@gmail.com | pedrorodriguez.io | 909-353-4423
Github: github.com/EntilZha | LinkedIn:
https://www.linkedin.com/in/pedrorodriguezscience