http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html
can work with.
Post by Enise Basaran
Hi,
I'm working on web page classification and I have 32 categories such as
'Adult', 'Business&Economy', 'Education', etc.
import numpy as np

X_train = np.array(["new york is a hell of a town",
                    "new york was originally dutch",
                    "the big apple is great",
                    "new york is also called the big apple",
                    "nyc is nice",
                    "people abbreviate new york city as nyc",
                    "the capital of great britain is london",
                    "london is in the uk",
                    "london is in england",
                    "london is in great britain",
                    "it rains a lot in london",
                    "london hosts the british museum",
                    "new york is great and so is london",
                    "i like london better than new york"])
# The last two documents carry both labels (0 and 1).
y_train = [[0],[0],[0],[0],[0],[0],[1],[1],[1],[1],[1],[1],[0,1],[0,1]]
But I don't want to label data as above with [0,1], because, as you know, *it's very difficult to find multilabelled data*. So I generated 32 binary datasets, one for each of the 32 categories. When test content arrives for prediction, it is sent to all the classifiers, and I take into account only the classifiers that return 'Yes'. That way I can do multilabel classification with my own dataset.
I can evaluate precision, recall and F-measure for each classifier (that is, for each category), but how can I evaluate my whole dataset (all classifiers together)? Thanks in advance for your help.
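One possible way to score all the per-category classifiers together, sketched here under the assumption that each classifier's Yes/No outputs are stored as one 0/1 column, is to stack the columns into an indicator matrix and use scikit-learn's averaged metrics. The small arrays below are made up for illustration (4 test pages, 3 categories) and are not data from this thread.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, hamming_loss,
                             precision_recall_fscore_support)

# Hypothetical ground truth: one row per test page, one column per category.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
# Hypothetical Yes/No (1/0) answers of the three binary classifiers.
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 0, 0],
                   [0, 0, 1]])

# Micro-averaging pools the true/false positives of all classifiers.
micro_p, micro_r, micro_f, _ = precision_recall_fscore_support(
    y_true, y_pred, average="micro")
# Macro-averaging computes each classifier's score, then takes the mean.
macro_p, macro_r, macro_f, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro")

# Subset accuracy: a page counts only if every category is right.
subset_acc = accuracy_score(y_true, y_pred)
# Hamming loss: fraction of individual Yes/No decisions that are wrong.
h_loss = hamming_loss(y_true, y_pred)

print("micro P/R/F:", micro_p, micro_r, micro_f)
print("macro P/R/F:", macro_p, macro_r, macro_f)
print("subset accuracy:", subset_acc, "hamming loss:", h_loss)
```

Micro-averaged scores are dominated by the frequent categories, while macro-averaged scores weight all categories equally, so which one to report depends on whether rare categories matter as much as common ones.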
Post by Joel Nothman
OneVsRestClassifier already implements Binary Relevance. What is unclear
about our documentation on model evaluation and metrics?
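As a rough sketch of what that could look like, the example below wraps a binary classifier in OneVsRestClassifier and encodes the label lists with MultiLabelBinarizer. The choice of TfidfVectorizer and LinearSVC, the shortened training set, and the two test sentences are assumptions made for illustration; they are not taken from the thread.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# A shortened version of the toy data: label 0 = New York, label 1 = London.
X_train = ["new york is a hell of a town",
           "nyc is nice",
           "london is in england",
           "it rains a lot in london",
           "new york is great and so is london"]
y_train = [[0], [0], [1], [1], [0, 1]]

# Turn the label lists into a 0/1 indicator matrix, one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(y_train)

# OneVsRestClassifier fits one independent binary classifier per column.
clf = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LinearSVC()))
clf.fit(X_train, Y)

# Predictions come back as an indicator matrix with the same columns.
Y_pred = clf.predict(["nice day in nyc", "welcome to london"])
predicted_labels = mlb.inverse_transform(Y_pred)
print(Y_pred)
print(predicted_labels)
```

Because the predictions are already in indicator form, they can be passed directly to the multilabel-aware functions in sklearn.metrics.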
Post by Enise Basaran
Hi everyone,
I would like to learn about evaluation metrics for binary classifiers. I
implemented the "Binary Relevance" method [1] for multilabel classification.
My classifiers say "Yes" or "No". How can I calculate the accuracy
score on my dataset, and what metrics can I use for my binary classifiers?
Thanks in advance.
*[1] Binary Relevance (BR)* is one of the most popular transformation
methods. It creates k datasets (k = |L|, the total number of classes),
one for each class label, and trains a classifier on each of these
datasets. Each of these datasets contains the same number of instances as
the original data, but each dataset D_λj, 1 ≤ j ≤ k, positively labels the
instances that belong to class λ_j and negatively labels all others.
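The transformation described in [1] can be illustrated with a few lines of plain Python. This is only a minimal sketch: the helper name binary_relevance_targets and the three example pages are hypothetical, and only the category names come from the question above.

```python
def binary_relevance_targets(label_sets, labels):
    """Build one 0/1 target list per label (hypothetical helper).

    Every list has one entry per instance: 1 if the instance carries
    that label, 0 otherwise.
    """
    return {label: [1 if label in label_set else 0
                    for label_set in label_sets]
            for label in labels}


# Hypothetical label sets for three pages.
label_sets = [{"Adult"},
              {"Business&Economy"},
              {"Education", "Business&Economy"}]
labels = ["Adult", "Business&Economy", "Education"]

targets = binary_relevance_targets(label_sets, labels)
for label in labels:
    print(label, targets[label])
```

Each of the k target lists has the same length as the original data, and a separate binary classifier would then be trained on each one.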
Sincerely,
------------------------------------------------------------------------------
Transform Data into Opportunity.
Accelerate data analysis in your applications with
Intel Data Analytics Acceleration Library.
Click to learn more.
http://pubads.g.doubleclick.net/gampad/clk?id=278785351&iu=/4140
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
--
*Enise Başaran*
*Software Developer*