Olivier Grisel
2011-03-27 13:26:45 UTC
For the twitter-impaired lurking around the mailing list, here is a
very interesting post by Alexander Smola to correct distribution
discrepancies between training set and test set using a simple
logistic regression model that is used to re-weight the training
samples:
http://blog.smola.org/post/4110255196/real-simple-covariate-shift-correction
This means that this approach could be implemented straightforwardly
using the SVC and SGD models which now both support sample
re-weighting. Does anyone have an idea of a good dataset to
demonstrate this with an example?
One could use an artificial dataset but it does not feel right :)
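
For reference, here is a rough sketch of how the re-weighting could look
with the current estimators. It is only my reading of the idea in the
post, not a tested implementation: the helper name
covariate_shift_weights, the synthetic Gaussian data and the labeling
rule are all made up for illustration, and the n_train / n_test factor
is my assumption for correcting unequal set sizes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC


def covariate_shift_weights(X_train, X_test):
    """Estimate p_test(x) / p_train(x) for each training sample.

    Hypothetical helper: fits a train-vs-test classifier and turns its
    log-odds into importance weights.
    """
    # Label training points 0 and test points 1, then learn to tell them apart.
    X = np.vstack([X_train, X_test])
    domain = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    clf = LogisticRegression().fit(X, domain)

    # decision_function gives the log-odds log(p(test | x) / p(train | x)).
    log_odds = clf.decision_function(X_train)

    # Assumed correction for unequal set sizes, so that the odds estimate
    # the density ratio rather than the class posterior ratio.
    return np.exp(log_odds) * len(X_train) / len(X_test)


# Synthetic stand-in data (illustration only): the test inputs are shifted
# to the right of the training inputs.
rng = np.random.RandomState(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 1))
X_test = rng.normal(loc=1.0, scale=1.0, size=(500, 1))
y_train = (X_train[:, 0] > 0.5).astype(int)  # made-up labeling rule

weights = covariate_shift_weights(X_train, X_test)

# Training samples that look like test samples now count for more.
model = SVC(kernel="linear").fit(X_train, y_train, sample_weight=weights)
```

The same weights array could be passed as sample_weight to the SGD
classifier's fit method instead of SVC.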
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel