Which dataset are you referring to?
From: Mathieu Blondel [mailto:***@mblondel.org]
Sent: Saturday, October 04, 2014 10:13 AM
Subject: Re: [Scikit-learn-general] error when using linear SVM with AdaBoost
On Sat, Oct 4, 2014 at 1:09 AM, Andy <***@gmail.com<mailto:***@gmail.com>> wrote:
I'm pretty sure that is wrong, unless you use the "decision_function"
and not "predict_proba" or "predict".
Mathieu said "predict" is used. Then it is still like a (very old
school) neural network with a thresholding layer,
and not like a linear model at all.
I don't think this is exactly like a neural network. In a neural network, the non-linear activation functions are part of the objective function, so they directly affect parameter estimation. Here, a linear SVC is first fitted, *then* its weight in the ensemble is estimated with the predictions held fixed. Since np.sign (or predict_proba, when available) is applied post hoc, it affects neither the linear SVC model itself nor its weight in the ensemble.
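To make the fit-then-weight point concrete, here is a minimal sketch of the discrete AdaBoost loop (binary case) with LinearSVC as the weak learner. It is an illustration, not scikit-learn's implementation: the number of rounds, the C value and the dataset are arbitrary choices. Note that each SVC is fitted first, and only afterwards is its ensemble weight alpha computed from its fixed hard (sign-thresholded) predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, random_state=0)
y_pm = np.where(y == 1, 1, -1)  # labels in {-1, +1}

n_rounds = 5  # illustrative choice
w = np.full(len(X), 1.0 / len(X))  # example weights, uniform at first
ensemble = []

for _ in range(n_rounds):
    # Step 1: fit the weak learner on the current example weights.
    clf = LinearSVC(C=0.1).fit(X, y, sample_weight=w * len(X))
    # Hard predictions: the sign-thresholded output discussed above.
    pred = np.where(clf.predict(X) == 1, 1, -1)
    # Step 2: only now compute the ensemble weight, predictions held fixed.
    err = np.clip(np.sum(w * (pred != y_pm)), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)
    ensemble.append((alpha, clf))
    # Up-weight misclassified examples for the next round.
    w *= np.exp(-alpha * y_pm * pred)
    w /= w.sum()

# Final prediction: sign of the alpha-weighted vote.
F = sum(a * np.where(c.predict(X) == 1, 1, -1) for a, c in ensemble)
accuracy = np.mean(np.sign(F) == y_pm)
```

Changing alpha never re-enters the SVC's own optimization, which is the difference from backprop through an activation in a neural network.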
The main idea of AdaBoost is to increasingly focus on the difficult examples. This suggests that the weak learners should be diverse enough, i.e., they should disagree in their predictions on many examples. My intuition is that a linear SVC doesn't fulfill this requirement; I would rather use a weak learner (oracle) with high variance and low bias.
I would be curious to see how AdaBoost + LinearSVC fares on MNIST. Since non-linear models outperform linear ones on this dataset, the results would be a good indicator of whether the boosted ensemble actually behaves non-linearly.
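A rough sketch of how one might run that comparison, using scikit-learn's small built-in digits dataset as a stand-in for MNIST (to avoid the large download). The discrete "SAMME" variant is needed because LinearSVC has no predict_proba; the C value, n_estimators and split are illustrative choices, and no results are claimed here.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Small 8x8 digits dataset as a cheap proxy for MNIST.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Baseline: a single linear SVC.
svc = LinearSVC(C=0.01).fit(X_train, y_train)

# Boosted linear SVCs; "SAMME" uses hard predict() outputs.
boosted = AdaBoostClassifier(
    LinearSVC(C=0.01), n_estimators=10, algorithm="SAMME",
).fit(X_train, y_train)

svc_acc = svc.score(X_test, y_test)
boost_acc = boosted.score(X_test, y_test)
```

If boosting the linear SVC closed a meaningful part of the gap to non-linear models, that would argue against the "not diverse enough" intuition above; if the two accuracies come out nearly identical, it would support it.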