[Scikit-learn-general] Difference in PLS regression with respect to Matlab

Fernando Quivira

2016-06-20 15:08:01 UTC

Oops, I forgot to ask the question:

Does anyone know what could be the problem? Am I using the methods
incorrectly, or is there a difference in formulation? I know that sklearn's
PLSRegression uses the NIPALS algorithm while Matlab uses SIMPLS. However, I
figured that a change in algorithm wouldn't change the outputs by a
non-negligible scale factor.

Thanks

- Fernando

Post by Fernando Quivira
Hi
I'm trying to replicate some dimension reduction results computed with
Matlab's plsregress using sklearn's PLSRegression. However, I'm finding
that the output of the transform method in sklearn's PLSRegression differs
from Matlab's results by a scale factor that is constant within each
component (constant across features but different across components).
I used some dummy data that I could load in Matlab to test this. I found
that if I normalized both sklearn's and Matlab's outputs (with z-scores), I
got the same results (see attached figures). I have attached the code that
can replicate this. The whole test can be run from testPLS.m (you need
Matlab 2014+).
I'm using 64-bit Python 3.5 on Windows with the Anaconda environment and
sklearn 0.17.1-np110py35_1
Thanks
- Fernando