Discussion:
which methods do I need to implement for a regressor?
Pagliari, Roberto
2015-02-16 05:50:24 UTC
I'd like to implement my own regressor/classifier and possibly use it in a pipeline.

Do I need to implement all of the methods below, or can some of them be missing?

Methods documented for sklearn.linear_model.SGDRegressor (http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDRegressor.html):

decision_function(X): Predict using the linear model.
densify(): Convert coefficient matrix to dense array format.
fit(X, y[, coef_init, intercept_init, ...]): Fit linear model with Stochastic Gradient Descent.
fit_transform(X[, y]): Fit to data, then transform it.
get_params([deep]): Get parameters for this estimator.
partial_fit(X, y[, sample_weight]): Fit linear model with Stochastic Gradient Descent.
predict(X): Predict using the linear model.
score(X, y[, sample_weight]): Returns the coefficient of determination R^2 of the prediction.
set_params(*args, **kwargs)
sparsify(): Convert coefficient matrix to sparse format.
transform(X[, threshold]): Reduce X to its most important features.
Gael Varoquaux
2015-02-16 05:55:56 UTC
You need fit, predict, and set_params. You can get set_params by
inheriting from sklearn.base.BaseEstimator.

G
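
As a rough illustration of the advice above, here is a minimal sketch of a custom regressor. The class name MeanRegressor and the toy data are hypothetical, not from the thread. It assumes a reasonably recent scikit-learn, where BaseEstimator supplies get_params/set_params and RegressorMixin supplies score, so only fit and predict are written by hand.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class MeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical toy regressor: always predicts the training mean of y."""

    def fit(self, X, y):
        # Learned state is stored with a trailing underscore.
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        # One prediction per input row, all equal to the stored mean.
        return np.full(np.asarray(X).shape[0], self.mean_)


# Toy data, only for illustration.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 2.0, 3.0, 6.0])

pipe = Pipeline([("scale", StandardScaler()), ("reg", MeanRegressor())])
pipe.fit(X, y)
predictions = pipe.predict(X)
```

The mixin is listed before BaseEstimator here because that is the order recent scikit-learn documentation recommends; none of the other SGDRegressor-specific methods (densify, sparsify, partial_fit, ...) are needed for the pipeline to run.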
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
--
Gael Varoquaux
Researcher, INRIA Parietal
Laboratoire de Neuro-Imagerie Assistee par Ordinateur
NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
Phone: ++ 33-1-69-08-79-68
http://gael-varoquaux.info http://twitter.com/GaelVaroquaux
Pagliari, Roberto
2015-02-16 15:20:05 UTC
Broadly speaking, I would like to add my own custom function into a pipeline.
However, my function is not really a classifier, nor a regressor.
What do you think would be the best way to go about it? Is there a shortcut that does not require implementing the functions below?


Thank you,


Michael Eickenberg
2015-02-16 16:04:57 UTC
Can I conclude that you are looking to implement a transformer? Note that
scikit-learn transformers currently act only on X, not on y.
If this is what you need, then you need to implement a transform method for
your class. fit will still be necessary, though, as the pipeline always
calls it.

HTH,
Michael
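
A minimal sketch of what such a transformer might look like follows. ColumnCenterer and its data are hypothetical names made up for this example. It assumes TransformerMixin is used to obtain fit_transform, and that fit accepts y only so that a pipeline can pass it along.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class ColumnCenterer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: subtracts per-column means learned in fit."""

    def fit(self, X, y=None):
        # y is accepted for pipeline compatibility but is not used.
        self.means_ = np.asarray(X, dtype=float).mean(axis=0)
        return self

    def transform(self, X):
        # Subtraction builds a new array; the caller's X is not modified.
        return np.asarray(X, dtype=float) - self.means_


X = np.array([[1.0, 10.0], [3.0, 30.0]])
centered = ColumnCenterer().fit_transform(X)  # fit_transform comes from the mixin
```

Only X is transformed here; y passes through fit untouched, in line with the remark above.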
Pagliari, Roberto
2015-02-16 17:52:55 UTC
I looked into some examples I found online but I’m a bit confused.

Suppose I want to implement my own transformer, something similar to the standard scaler. Would this be sufficient to be used in a pipeline, or should it be done differently?


from sklearn.base import TransformerMixin

class ModelTransformer(TransformerMixin):

    def __init__(self, model):
        self.my_param = None

    def fit(self, *args, **kwargs):
        # do some stuff
        self.my_param = something  # placeholder for a value computed from the data
        return self

    def transform(self, X, **transform_params):
        # do something with self.my_param and X.copy()
        return X.copy()

    def set_params(self, **params):
        return self

    def get_params(self, deep=True):
        return None

Thank you,
Vlad Niculae
2015-02-16 18:04:08 UTC
Hi Roberto,

This is all documented in more detail here: [1]

The transform looks good (just that you might want to add a flag to avoid memory copies when you can afford to destroy the original data).

It’s not clear what the intention of `my_param` is here. It’s not user-specified, right? Conventionally, fitted attributes are suffixed with an underscore (`self.my_param_`) and you shouldn’t initialize them in `__init__` (see [2]).

Also, if you do intend to have user-specified attributes, this would break grid search, because your `set_params` function does nothing. There are implementations of `set_params` and `get_params` in `sklearn.base.BaseEstimator`, as Gael said. Just inherit from `BaseEstimator` and those should work, as long as you respect the scikit-learn convention that the `__init__` function doesn’t change the parameters (see [3]).

Hope this helps!

Yours,
Vlad

[1] http://scikit-learn.org/stable/developers/index.html#rolling-your-own-estimator
[2] http://scikit-learn.org/stable/developers/index.html#estimated-attributes
[3] http://scikit-learn.org/stable/developers/index.html#parameters-and-init
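
The conventions described in this reply could be sketched roughly as follows. ShiftTransformer, offset, and min_ are hypothetical names invented for this example, and the copy flag is only one possible way to offer the "avoid memory copies" option. The inherited get_params and set_params are assumed to work because __init__ stores its arguments unchanged.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class ShiftTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer illustrating the conventions above."""

    def __init__(self, offset=0.0, copy=True):
        # __init__ only stores user-specified parameters, unmodified.
        self.offset = offset
        self.copy = copy

    def fit(self, X, y=None):
        # Fitted attribute: created in fit, named with a trailing underscore.
        self.min_ = np.asarray(X, dtype=float).min(axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        if self.copy:
            X = X.copy()  # protect the caller's data
        # In-place arithmetic: overwrites the input when copy=False
        # and X is already a float array.
        X -= self.min_
        X += self.offset
        return X


X = np.array([[1.0, 5.0], [3.0, 9.0]])
shifter = ShiftTransformer(offset=1.0)
params = shifter.get_params()   # inherited from BaseEstimator
shifter.set_params(offset=2.0)  # inherited; this is what grid search relies on
shifted = shifter.fit(X).transform(X)
```

With copy=True the input array is left as it was; passing copy=False trades that safety for one fewer allocation.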
Pagliari, Roberto
2015-02-16 19:01:42 UTC
Hi Vlad/All,
Thanks for the pointers. I return a copy of X because I don't want to modify the dataset during grid search with cross-validation (I'm not sure whether the argument of transform is a deep copy or a shallow copy).

I implemented the class as shown below. It is basically a transformer that does nothing, with no parameters.

from sklearn.base import BaseEstimator, TransformerMixin

class myTransformer(BaseEstimator, TransformerMixin):
    def __init__(self):
        pass

    def fit(self, *args, **kwargs):
        return self

    def transform(self, X, **transform_params):
        return X.copy()

    def set_params(self, **params):
        return self

    def get_params(self, deep=True):
        return None

I'm getting this error when using it in a pipeline (during grid search CV, where the pipeline is standard scaler + myTransformer + SVM):
'NoneType' object has no attribute 'iteritems'

Do you know what the issue might be?


Thank you,


Gael Varoquaux
2015-02-16 22:39:19 UTC
Your get_params looks wrong to me: it is not returning a dictionary.

Sent from my phone. Please forgive brevity and mis spelling
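
To make the point about get_params concrete, here is a small sketch with a hypothetical IdentityTransformer that defines neither get_params nor set_params. The inherited get_params returns a dictionary (empty when __init__ takes no arguments), which is presumably what the pipeline tried to iterate over when it hit None.

```python
from sklearn.base import BaseEstimator, TransformerMixin, clone


class IdentityTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical no-op transformer with no parameters."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X.copy()


identity = IdentityTransformer()
params = identity.get_params()    # inherited; returns a dict, never None
copy_of_identity = clone(identity)  # clone relies on get_params returning a dict
```

Dropping the hand-written get_params and set_params overrides is therefore likely the simplest fix for a parameter-free transformer.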
Pagliari, Roberto
2015-02-17 03:15:37 UTC
Hi Gael,
I think using a list may have caused problems with what I was doing, so I changed things so that I only need a scalar, and everything works now.

Thanks for your help!

Pagliari, Roberto
2015-02-17 03:20:03 UTC
Hi Vlad,
Thanks for your help. I'm requiring a scalar number now (not a list) and I think it is working.
I did not implement get_params and set_params because I don't need them, and there were no complaints from sklearn; I guess that is because they are inherited from the base class.

Regarding the copy, I'm using fit and transform in grid search, so I cannot afford to destroy the data. However, if you have any suggestions, they are welcome.

Thank you,
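
For reference, the setup described in this thread (standard scaler + custom transformer + SVM under grid search) might look roughly like the sketch below. Multiplier, its factor parameter, and the random data are hypothetical. The import path sklearn.model_selection assumes a newer scikit-learn than the one current in 2015, when GridSearchCV lived in sklearn.grid_search.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


class Multiplier(TransformerMixin, BaseEstimator):
    """Hypothetical transformer with a single scalar parameter."""

    def __init__(self, factor=1.0):
        self.factor = factor  # stored unchanged so grid search can set it

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Multiplication allocates a new array, so X itself is untouched.
        return np.asarray(X) * self.factor


# Synthetic data, only for illustration.
rng = np.random.RandomState(0)
X = rng.randn(40, 3)
y = (X[:, 0] > 0).astype(int)
X_before = X.copy()

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("mult", Multiplier()),
    ("svm", SVC()),
])
# Step name and parameter name are joined with a double underscore.
search = GridSearchCV(pipe, {"mult__factor": [0.5, 1.0, 2.0]}, cv=3)
search.fit(X, y)
```

After the search, X can be compared with X_before to confirm that nothing was overwritten. As far as I know, the cross-validation splits are produced by indexing into X, which for NumPy arrays already yields new arrays, so an extra copy inside transform mainly matters when the transformer modifies its input in place.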

