[Scikit-learn-general] Adaline (adaptive linear neuron) classifier

Discussion:

Sebastian Raschka

2015-04-04 04:16:24 UTC

Hi,

maybe I overlooked something, but I couldn't find the classic adaline (ADAptive LInear NEuron) in scikit-learn. It's probably not that useful (anymore) since we have logistic regression and support vector machines, but maybe it would not be a bad idea to add for the sake of completeness (and since scikit-learn also has a perceptron)?

The implementation would be similar to logistic regression, but the cost function is the sum of the squared errors like in linear regression. It could be added to the SGDClassifier as loss='linear' or loss='adaline' plus a separate implementation using liblinear.
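
For readers who want the cost function spelled out, here is a short sketch of the objective described above. The notation is an assumption and does not come from the thread: targets y_i are coded as -1/+1, the bias is folded into the weight vector w, and \eta is a learning rate.

```latex
% Adaline: linear activation with a sum-of-squared-errors cost
\begin{align}
  J(\mathbf{w}) &= \frac{1}{2} \sum_{i=1}^{n} \left( y_i - \mathbf{w}^\top \mathbf{x}_i \right)^2 \\
  \nabla J(\mathbf{w}) &= -\sum_{i=1}^{n} \left( y_i - \mathbf{w}^\top \mathbf{x}_i \right) \mathbf{x}_i \\
  \mathbf{w} &\leftarrow \mathbf{w} + \eta \sum_{i=1}^{n} \left( y_i - \mathbf{w}^\top \mathbf{x}_i \right) \mathbf{x}_i \\
  \hat{y} &= \operatorname{sign}\left( \mathbf{w}^\top \mathbf{x} \right)
\end{align}
```

The error is measured on the raw linear output, and the threshold is applied only at prediction time. That is the difference from the perceptron, which updates on the thresholded output.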

The reference would be:
B. Widrow et al. Adaptive "Adaline" neuron using chemical "memistors". Technical Report 1553-2, Stanford Electron. Labs., Stanford, CA, October 1960.

What do you think?

Best,
Sebastian

Andy

2015-04-05 15:00:38 UTC

Hi Sebastian.
First off, if this is a classification algorithm with sum of squared
errors, you can just do it using linear regression + OvRClassifier, right?

In general, I (and I think most of the rest of the project) am wary
about adding something for "completeness".
Any algorithm we add creates a significant amount of maintenance burden.
See:
http://scikit-learn.org/dev/faq.html#can-i-add-this-classical-algorithm-from-the-80s
and
http://scikit-learn.org/dev/faq.html#why-are-you-so-selective-on-what-algorithms-you-include-in-scikit-learn

Furthermore, I have not heard of this algorithm, and it is not mentioned
in any of the prominent textbooks (ESL, Bishop, Murphy).
So while it might be foundational, I don't think it is necessary for
"completeness".

Andy

------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

Sebastian Raschka

2015-04-05 19:26:02 UTC

After reading your message, I agree, Andy,
it's better to focus on the newer and more useful methods rather than adding those remnants from the past that no one really uses in practice anymore anyway. During the implementation of the VotingClassifier I kind of realized how much work is involved and thus it would be better to focus the energy and time on more important things.
While it's probably not a bad idea to have the perceptron for educational purposes, Bishop mentions the adaline only in a side note at the end of the perceptron article, so I think no one would really miss this algorithm in scikit-learn ;)

Best,
Sebastian

Sturla Molden

2015-04-05 22:05:28 UTC

Post by Sebastian Raschka
While it's probably not a bad idea to have the perceptron for educational
purposes, also Bishop mentions the adaline only in a side note at the end
of the perceptron article, thus I think no one would really miss this
algorithm in scikit-learn ;)

ADALINE was state of the art in 1960.

:-)

Sturla Molden

2015-04-05 22:38:21 UTC

Post by Sturla Molden

ADALINE was state of the art in 1960.

Also it was not just an algorithm, but also an analog device. This is what
an Adaline looked like:

https://www.dropbox.com/s/py7m9jsfquhb6w4/Photo%2006.04.15%2000%2024%2020.png

We could implement this with a GUI, with knobs to turn and stuff :-D

Sturla

Sebastian Raschka

2015-04-05 23:34:01 UTC

Haha, nice! I was always looking for a good reason to get some Raspberry Pis :). No, seriously, I'll probably tackle this in the second half of the year on rainy weekends, once I am done with my currently tedious projects :)

Post by Sturla Molden

ADALINE was state of the art in 1960.

Also it was not just an algorithm, but also an analog device. This is what
an Adaline looked like:
https://www.dropbox.com/s/py7m9jsfquhb6w4/Photo%2006.04.15%2000%2024%2020.png
We could implement this with a GUI, with knobs to turn and stuff :-D
Sturla

Mathieu Blondel

2015-04-06 04:27:10 UTC

Post by Andy
Hi Sebastian.
First off, if this is a classification algorithm with sum of squared
errors, you can just do it using linear regression + OvRClassifier, right?

This is also what RidgeClassifier does, only in a smarter way (Cholesky
decomposition is done only once regardless of the number of classes).
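
To make the comparison concrete, here is a small sketch that fits RidgeClassifier next to a hand-rolled "linear regression on -1/+1 one-vs-rest targets" baseline. The toy data (three Gaussian blobs), the seed, and alpha=1.0 are illustrative assumptions, not anything from the thread.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RidgeClassifier

# Toy data (assumed for illustration): three well-separated 2-D blobs.
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.randn(20, 2) for c in centers])
y = np.repeat([0, 1, 2], 20)

# RidgeClassifier: regularized least squares on -1/+1 class indicators.
ridge = RidgeClassifier(alpha=1.0).fit(X, y)
ridge_pred = ridge.predict(X)

# The same idea by hand: one regression target column per class
# (+1 for "this class", -1 for "rest"), then pick the largest output.
Y = np.where(y[:, None] == np.arange(3), 1.0, -1.0)
ols = LinearRegression().fit(X, Y)
ols_pred = ols.predict(X).argmax(axis=1)

print("RidgeClassifier accuracy:", (ridge_pred == y).mean())
print("Least-squares OvR accuracy:", (ols_pred == y).mean())
```

Both models minimize a squared error on class indicators, so on data this clean they should agree. They differ only in the ridge penalty and in how the linear system is solved.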

Mathieu

Sturla Molden

2015-04-06 10:46:21 UTC

Post by Mathieu Blondel
This is also what RidgeClassifier does, only in a smarter way (Cholesky
decomposition is done only once regardless of the number of classes).

ADALINE used a gradient descent learning rule. The idea was to turn the
knobs randomly, update on pen and paper, adjust the knobs, update, adjust
again, etc. Then you could collect patterns in a ring binder and use the
same "adaline" box for multiple pattern recognition tasks. This was e.g.
used on submarines to process sonar data.
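
The knob-turning procedure described above corresponds roughly to per-sample Widrow-Hoff (LMS) updates. A minimal pure-Python sketch follows. The function names, the learning rate, the epoch count, and the toy AND pattern are all assumptions made for illustration.

```python
import random


def train_adaline(samples, targets, eta=0.05, epochs=50, seed=0):
    """Per-sample Widrow-Hoff (LMS) updates on a single linear unit."""
    rng = random.Random(seed)
    # "Turn the knobs randomly": start from small random weights.
    weights = [rng.uniform(-0.1, 0.1) for _ in samples[0]]
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            # The error is taken on the raw linear output, not on its sign.
            output = bias + sum(w * xi for w, xi in zip(weights, x))
            error = target - output
            # "Adjust the knobs": move each weight along the negative gradient.
            weights = [w + eta * error * xi for w, xi in zip(weights, x)]
            bias += eta * error
    return weights, bias


def predict(weights, bias, x):
    """Threshold the linear output to get a -1/+1 class label."""
    output = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if output >= 0.0 else -1


# Toy pattern (assumed): logical AND with inputs and targets coded as -1/+1.
X = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
y = [-1, -1, -1, 1]

weights, bias = train_adaline(X, y)
predictions = [predict(weights, bias, x) for x in X]
print(predictions)
```

With a constant learning rate the weights settle into a small neighbourhood of the least-squares solution rather than converging to it exactly. That is enough for the thresholded predictions here.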

Not really relevant today though, except in a museum.

:-)

Sturla