2010-11-29 14:39:11 UTC
Two suggestions in this mail (from crazy, to crazier):
* While we are talking of module renaming, and as this is going to terribly
annoy our users, I was thinking that I would prefer:
feature_extraction -> feature_extract
feature_selection -> feature_select
The reason is two-fold: it would be more consistent with the rest of the
scikit and scipy (as 'cluster' and not 'clustering'), and it is exactly 3
characters shorter (think of the keyboard!).
But maybe I am really nitpicking and wasting my time (I should be
working on my deadline,... ooops, my boss reads this mailing list). What
do others think?
* Finally (let's go crazy about API changes), I would like to have a
sub-package for model selection, because we have a lot of code related to
this that is scattered a bit. This brings a few questions:
1. Is it a good idea at all?
2. How should it be named (model_select seems long and heavy, maybe
'tune', but that seems a bit 'Jacky', and not very scientific)?
3. What should go in there? I was thinking of:
In the long run, I also see the option of having other wrapping
cross-validation estimators, like GridSearchCV, that would implement
strategies other than grid search.
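
To make that last point a bit more concrete, here is a rough sketch of what
such a wrapping estimator could look like if it sampled parameter settings at
random instead of walking a full grid. Everything in it is hypothetical and
only for illustration: the RandomSearchCV name, its arguments, the simple
interleaved folds and the toy ThresholdClassifier are assumptions of mine, not
existing scikit code. It is pure Python so that it runs on its own.

```python
import random
from statistics import mean


class ThresholdClassifier:
    """Toy stand-in estimator (hypothetical): predicts 1 when x > threshold."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def fit(self, X, y):
        # Nothing to learn: the threshold is the only (hyper)parameter.
        return self

    def predict(self, X):
        return [int(x > self.threshold) for x in X]

    def score(self, X, y):
        # Plain accuracy.
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


class RandomSearchCV:
    """Hypothetical wrapper: cross-validates randomly sampled settings."""

    def __init__(self, make_estimator, param_choices, n_iter=10, n_folds=3,
                 seed=0):
        self.make_estimator = make_estimator
        self.param_choices = param_choices
        self.n_iter = n_iter
        self.n_folds = n_folds
        self.seed = seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        n = len(X)
        # Interleaved folds: sample i goes to fold i % n_folds.
        folds = [list(range(k, n, self.n_folds)) for k in range(self.n_folds)]
        best_score, best_params = None, None
        for _ in range(self.n_iter):
            # Draw one candidate value per parameter.
            params = {name: rng.choice(values)
                      for name, values in self.param_choices.items()}
            scores = []
            for test_idx in folds:
                held_out = set(test_idx)
                train_idx = [i for i in range(n) if i not in held_out]
                est = self.make_estimator(**params)
                est.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
                scores.append(est.score([X[i] for i in test_idx],
                                        [y[i] for i in test_idx]))
            score = mean(scores)
            if best_score is None or score > best_score:
                best_score, best_params = score, params
        self.best_score_ = best_score
        self.best_params_ = best_params
        # Refit on all the data with the best setting found.
        self.best_estimator_ = self.make_estimator(**best_params).fit(X, y)
        return self


X = list(range(10))
y = [int(x > 4) for x in X]
search = RandomSearchCV(ThresholdClassifier, {"threshold": [0, 4, 8]},
                        n_iter=20).fit(X, y)
print(search.best_params_, search.best_score_)
```

The point of the sketch is only that the search strategy lives in the
wrapper, while the wrapped estimator keeps the usual fit / predict / score
interface, so swapping grid search for another strategy would not touch the
estimators themselves.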
Of course, all the objects useful for the end-user would be imported in
the __init__ of this sub-package, so that the user wouldn't have to go 2