It should. If not, please report a bug.
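
A minimal sketch of what to expect, assuming a toy one-feature dataset and a DecisionTreeClassifier (both picked here for illustration, not taken from the thread). The fitted estimator keeps the sorted training labels in classes_, and predict only returns labels from that set, so 8 and 10 are never predicted:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one feature, training labels 0, 3, 4, 6, 7.
X_train = np.array([[0.0], [0.1], [3.0], [3.1], [4.0],
                    [4.1], [6.0], [6.1], [7.0], [7.1]])
y_train = np.array([0, 0, 3, 3, 4, 4, 6, 6, 7, 7])

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Sorted unique training labels; internal indices 0..n_classes-1
# are mapped back to these values at prediction time.
print(clf.classes_)  # [0 3 4 6 7]

# Test samples whose true labels would be 0, 3, 4, 8 and 10.
X_test = np.array([[0.0], [3.0], [4.0], [8.0], [10.0]])
y_pred = clf.predict(X_test)

# Every prediction is one of the training labels.
all_seen = bool(np.isin(y_pred, clf.classes_).all())
print(all_seen)  # True
```

Samples that truly belong to class 8 or 10 are therefore always misclassified; no error is raised for them at predict time.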
On 05/01/2015 11:16 AM, Pagliari, Roberto wrote:
> I agree with you.
> I'm just not sure whether scikit-learn would handle that or not.
>
> thank you,
>
>
> ------------------------------------------------------------------------
> *From:* Michael Eickenberg [***@gmail.com]
> *Sent:* Friday, May 01, 2015 11:13 AM
> *To:* scikit-learn-***@lists.sourceforge.net
> *Subject:* Re: [Scikit-learn-general] class label hashing
>
> What do you expect a classifier to predict on a label that it has never
> seen during training? If there were structure in the target, such as
> an order, then an appropriate regression may be able to infer unseen
> targets due to this structure. But in classification this information
> is entirely absent.
>
> Michael
>
> On Fri, May 1, 2015 at 5:07 PM, Pagliari, Roberto <***@appcomsci.com> wrote:
>
> Hi Sebastian,
> if classes/labels are the same for both training and test, that
> should not be a problem. I've done that and never seen any issues.
>     As far as I can see, scikit-learn automatically maps classes to
>     integers from 0 to n_classes - 1, which is something Spark,
>     for example, does not do.
>
>     With a different set of classes, the simplest thing is to remove the
> ones in the test that do not appear in the training, to avoid
> messing with the confusion matrix [ in my case, different label
> numbers are really different classes ]
>
>
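
The filtering step described above could be sketched as follows. The arrays are made-up illustration data, and passing labels= to confusion_matrix is one possible way to pin the matrix to the training classes:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical data: labels seen in training, true test labels,
# and predictions from some already-fitted classifier.
train_labels = np.array([0, 3, 4, 6, 7])
y_test = np.array([0, 3, 4, 8, 10, 3])
y_pred = np.array([0, 3, 3, 7, 6, 3])

# Keep only test samples whose true label was seen during training.
keep = np.isin(y_test, train_labels)
y_test_kept = y_test[keep]
y_pred_kept = y_pred[keep]

# Fix the row/column order to the training labels so the matrix
# always has the same shape, even if a class is absent from the test set.
cm = confusion_matrix(y_test_kept, y_pred_kept, labels=train_labels)
print(cm.shape)  # (5, 5)
```
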
> ________________________________________
>     From: Sebastian Raschka [***@gmail.com]
>     Sent: Thursday, April 30, 2015 11:08 PM
>     To: scikit-learn-***@lists.sourceforge.net
>     Subject: Re: [Scikit-learn-general] class label hashing
>
>     Roberto, I am not sure if this causes problems regarding the
>     implementation, but in any case, I'd recommend using the
>     LabelEncoder to have your classes mapped to a fixed range, e.g.,
>     0, 1, 2, 3, 4, 5. Having different class labels in the training
>     and test sets that refer to the same class is not good practice
>     and could cause all kinds of problems. I just wouldn't risk it
>     even if it works.
>
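
A short sketch of the LabelEncoder suggestion, using the label values from the original question (the variable names are illustrative). It also shows that the encoder refuses labels it did not see in fit:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

y_train = np.array([0, 3, 4, 6, 7, 3])

# Fit on the training labels only; classes_ holds the sorted unique labels.
le = LabelEncoder().fit(y_train)
print(le.classes_)  # [0 3 4 6 7]

# Map labels to the contiguous range 0..n_classes-1 and back again.
y_enc = le.transform(y_train)
y_back = le.inverse_transform(y_enc)

# Labels that were not present during fit are rejected with a ValueError.
unseen_rejected = False
try:
    le.transform(np.array([0, 8, 10]))
except ValueError:
    unseen_rejected = True
print(unseen_rejected)  # True
```
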
>     > On Apr 30, 2015, at 11:02 PM, Pagliari, Roberto <***@appcomsci.com> wrote:
> >
> > Suppose I train a classifier with dataset1, which contains labels
> >
> > 0
> > 3
> > 4
> > 6
> > 7
> >
> > and then predict over dataset2 with labels
> >
> > 0
> > 3
> > 4
> > 8
> > 10
> >
>     > Will the hashing be the same for labels 0, 3 and 4? And will
>     > scikit-learn get confused by seeing new labels such as 8 and 10?
> >
> > Thank you,
> >
> >
>     > ------------------------------------------------------------------------------
>     > One dashboard for servers and applications across Physical-Virtual-Cloud
>     > Widest out-of-the-box monitoring support with 50+ applications
>     > Performance metrics, stats and reports that give you Actionable Insights
>     > Deep dive visibility with transaction tracing using APM Insight.
>     > http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>     > _______________________________________________
>     > Scikit-learn-general mailing list
>     > Scikit-learn-***@lists.sourceforge.net
>     > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general