Olivier Grisel

2010-06-29 20:05:50 UTC

Hi all again,

It would be great if we could set up a standard API for defining the
admissible ranges and settings of hyper-parameters. That would allow us
to perform automated parameter tuning. For instance one could have:

class MyClassyClassifier(object):

    # Admissible values for each tunable hyper-parameter
    hyperparameters = {
        'l1': {'type': float,
               'range': [0, 1e-3, 1e-2, 1e-1, 1, 1e1, 1e2, 1e3]},
        'hidden_layer_units': {'type': int, 'range': [10, 50, 100, 500]},
        'intercept': {'type': bool},
        'normalize': {'type': bool},
    }

    def __init__(self, l1=10, hidden_layer_units=100, normalize=True,
                 intercept=True):
        pass

    def fit(self, X, y):
        return self

    def predict(self, X):
        return -1

And then have a multiprocessing pool executor that queues
cross-validation jobs for every combination of the parameters (like the
grid_search.py script of the libsvm project). I think such a general
API + a default autotuner implementation would lower the barrier to
entry for newcomers and help scikits.learn concretely reach the goal
of "machine learning without learning the machinery".
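
To make the idea a bit more concrete, here is a rough sketch of what such
an autotuner could look like. Nothing in it exists in scikits.learn:
ThresholdClassifier, expand_grid, cv_score and autotune are hypothetical
names made up for illustration, the folds are naive interleaved splits, and
I am assuming that a bool parameter without an explicit 'range' means "try
both values".

```python
import itertools
from multiprocessing import Pool


class ThresholdClassifier(object):
    """Hypothetical toy estimator declaring its tunable hyper-parameters."""

    hyperparameters = {
        'threshold': {'type': float, 'range': [0.0, 0.5, 1.0]},
        'invert': {'type': bool},
    }

    def __init__(self, threshold=0.5, invert=False):
        self.threshold = threshold
        self.invert = invert

    def fit(self, X, y):
        return self  # nothing to learn in this toy example

    def predict(self, X):
        return [int((x > self.threshold) != self.invert) for x in X]


def candidate_values(entry):
    """Return the values to try for one hyper-parameter declaration."""
    if 'range' in entry:
        return entry['range']
    if entry['type'] is bool:
        # Assumption: a bool without an explicit range means both values.
        return [False, True]
    raise ValueError("no admissible range declared")


def expand_grid(spec):
    """Return one dict of parameter values per point of the grid."""
    names = sorted(spec)
    value_lists = [candidate_values(spec[name]) for name in names]
    return [dict(zip(names, combo))
            for combo in itertools.product(*value_lists)]


def cv_score(job):
    """Mean accuracy of one parameter setting over k interleaved folds."""
    estimator_class, params, X, y, k = job
    scores = []
    for fold in range(k):
        test_idx = list(range(fold, len(X), k))
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = estimator_class(**params).fit(
            [X[i] for i in train_idx], [y[i] for i in train_idx])
        predicted = model.predict([X[i] for i in test_idx])
        hits = sum(int(p == y[i]) for p, i in zip(predicted, test_idx))
        scores.append(hits / float(len(test_idx)))
    return sum(scores) / k


def autotune(estimator_class, X, y, k=2, pool=None):
    """Score every grid point, optionally in parallel; return the best."""
    grid = expand_grid(estimator_class.hyperparameters)
    jobs = [(estimator_class, params, X, y, k) for params in grid]
    run = pool.map if pool is not None else map
    scores = list(run(cv_score, jobs))
    best = max(range(len(grid)), key=lambda i: scores[i])
    return grid[best], scores[best]


if __name__ == '__main__':
    X = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    pool = Pool(processes=2)
    try:
        print(autotune(ThresholdClassifier, X, y, pool=pool))
    finally:
        pool.close()
        pool.join()
```

Each job is a plain picklable tuple, so the same cv_score function can be
mapped either sequentially (pool=None) or over a multiprocessing pool
without the estimator having to know anything about parallelism.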

Any opinion?

--

Olivier

http://twitter.com/ogrisel - http://github.com/ogrisel

