I don't have a strong opinion. I'm just wondering why not go all the way to VBGMM instead.
Post by Wei Xue
@Andreas, on second thought, MAP EM seems not so important. It
just has more theoretical support. We might skip this.
Wei
Sorry for the confusion.
I am just saying that min_covar, which prevents singular
covariances, may not be flexible. The value of min_covar is
sometimes too large for the estimated covariance. For example, a
user first tries a small subset of the training data using GMM
with the default min_covar = 0.001, then uses a larger data set
but keeps min_covar = 0.001, even though a smaller min_covar
would do for the larger data set. In MAP EM, when we have more
data instances, the effect of min_covar is *automatically*
diminished.
min_covar is just a regularization technique. We can justify it
using MAP estimation, though there is a slight difference in the
scalar coefficient in front of \alpha. So MAP EM is better
motivated than simply setting min_covar. I am not saying MAP EM
is preferable to VBGMM, only that it is preferable to plain EM
for GMM. Does that make it clear?
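As a hypothetical illustration (a sketch, not the actual scikit-learn source; the function name is my own), the fixed min_covar regularization amounts to adding a constant to the diagonal of the empirical covariance, independent of the sample size:

```python
import numpy as np

# Hypothetical sketch of the fixed min_covar floor in the classic ML
# M-step: a constant is added to the covariance diagonal regardless of
# how many samples were observed.
def ml_covariance(X, min_covar=1e-3):
    """Empirical (ML) covariance plus a fixed diagonal floor."""
    d = X.shape[1]
    cov = np.cov(X, rowvar=False, bias=True)  # classic ML estimate
    return cov + min_covar * np.eye(d)        # correction independent of n

rng = np.random.default_rng(0)
X_small = rng.normal(size=(20, 2))
X_large = rng.normal(size=(20000, 2))

# The added regularization is identical for both sample sizes:
corr_small = ml_covariance(X_small) - np.cov(X_small, rowvar=False, bias=True)
corr_large = ml_covariance(X_large) - np.cov(X_large, rowvar=False, bias=True)
```

Whether the data set has 20 or 20,000 points, the diagonal correction stays at min_covar, which is exactly the inflexibility being pointed out here.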
Wei
Sorry, I'm not following.
I'm not sure what you are arguing for. I know how VBGMM works,
but I'm not sure how MAP EM would work, and why it would be
preferable over VBGMM.
Post by Wei Xue
VBGMM is a fully Bayesian estimation in both the 'E-step' and
the 'M-step' (although there are no such concepts in VB). The
parameters in VB are random variables, described by a posterior
distribution, which is proportional to the product of the
likelihood and the prior distribution. On the other hand,
although MAP estimation uses the posterior distribution as well,
each parameter is still represented by a single value, as in the
M-step of EM. For example, if we use the inverse Wishart
distribution W^{-1}(\Sigma | \Phi, \nu) as the prior
distribution for the covariance matrix and set the parameter
\Phi to \alpha*I, we get

\tilde{\Sigma} = \frac{1}{\nu + d + 1 + n} (n \hat{\Sigma} + \alpha*I),

where \hat{\Sigma} is the classic (maximum likelihood) estimate
of the covariance matrix. As you can see, as the number of data
instances increases, \tilde{\Sigma} approaches \hat{\Sigma} and
the effect of \alpha diminishes. Therefore the effect of
min_covar (\alpha) is not fixed in advance; it also depends on
the amount of training data we have.
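A minimal numerical sketch of the MAP update above (the function name and the choice nu = d + 2 are my own assumptions, not from the thread), showing the prior's effect shrinking as n grows:

```python
import numpy as np

# Sketch of the MAP covariance update under the inverse-Wishart prior
# W^{-1}(Sigma | alpha*I, nu) described above. Posterior mode:
#   (n * Sigma_hat + alpha * I) / (nu + d + 1 + n)
def map_covariance(X, alpha=1e-3, nu=None):
    n, d = X.shape
    if nu is None:
        nu = d + 2  # weakly informative choice (my assumption)
    emp = np.cov(X, rowvar=False, bias=True)  # classic MLE \hat{Sigma}
    return (n * emp + alpha * np.eye(d)) / (nu + d + 1 + n)

rng = np.random.default_rng(1)
X_small = rng.normal(size=(10, 3))
X_large = rng.normal(size=(100000, 3))

# Distance between the MAP estimate and the plain MLE, for each n:
gap_small = np.abs(map_covariance(X_small)
                   - np.cov(X_small, rowvar=False, bias=True)).max()
gap_large = np.abs(map_covariance(X_large)
                   - np.cov(X_large, rowvar=False, bias=True)).max()
# gap shrinks with more data: the prior's correction is not fixed,
# unlike a constant min_covar.
```

With 10 samples the MAP estimate is pulled noticeably toward the prior; with 100,000 samples it is essentially the empirical covariance, matching the claim that the effect of \alpha diminishes with n.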
Wei
On Wed, Mar 25, 2015 at 3:18 PM, Andreas Mueller
Thanks for your feedback.
Post by Wei Xue
Thanks Andreas, Kyle, Vlad and Olivier for the detailed review.
1. For the part /Implementing VBGMM,/ do you mean it
would be better if I add specific functions to be
I just felt the paragraph was a bit unclear, and would
benefit from saying what exactly you want to do.
Post by Wei Xue
6. I would like to add a variant of EM estimation to the
GMM module: MAP estimation. Currently, the M-step uses
maximum likelihood estimation with min_covariance, which
prevents singular covariance estimates. I think it would
be better to add MAP estimation for the M-step, because the
fixed min_covariance in ML estimation might be too
aggressive in some cases. In MAP, the correction to the
covariance decreases as the number of data instances
increases.
How is this different from the VBGMM?
Post by Wei Xue
7. I would also like to add some functionality to deal
with missing values in GMM. Missing values in training
data are not uncommon, and the PRML book also mentions
this.
I think this is outside the scope of this project, as we
generally have avoided dealing with missing values in
sklearn estimators directly.
------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go
Parallel Website, sponsored
by Intel and developed in partnership with Slashdot
Media, is your hub for all
things parallel software development, from weekly thought
leadership blogs to
news, videos, case studies, tutorials and more. Take a
look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general