Which version of scikit-learn are you using?
greatly reduced the size in some cases.
Post by Piotr Płoński
Thanks for the comments! I put more details about my problem here:
http://stackoverflow.com/questions/36523989/why-sklearn-randomforest-model-take-a-lot-of-disk-space-after-save
Indeed, saving with joblib takes less space but there is still a lot
of space used on the disk.
Best,
Piotr
You may also want to save your model using joblib (possibly with
compression enabled) instead of cPickle.
Mathieu
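
A minimal sketch of the joblib suggestion above, not code from this thread. It assumes Python 3 (where cPickle is replaced by pickle), the standalone joblib package rather than the copy bundled with older scikit-learn releases, and a synthetic dataset from make_classification standing in for the real one. The names rf_model, pickle_path and joblib_path are made up for illustration.

```python
import os
import pickle
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real dataset (21 features, as in the thread).
X, y = make_classification(n_samples=2000, n_features=21, random_state=0)
rf_model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

tmp_dir = tempfile.mkdtemp()
pickle_path = os.path.join(tmp_dir, "rf.pkl")
joblib_path = os.path.join(tmp_dir, "rf.joblib")

# Baseline: plain pickle, no compression.
with open(pickle_path, "wb") as f:
    pickle.dump(rf_model, f, protocol=pickle.HIGHEST_PROTOCOL)

# joblib with compression enabled; compress accepts an int from 0 to 9.
joblib.dump(rf_model, joblib_path, compress=3)

pickle_size = os.path.getsize(pickle_path)
joblib_size = os.path.getsize(joblib_path)
print(f"pickle:            {pickle_size} bytes")
print(f"joblib compress=3: {joblib_size} bytes")

# The compressed file loads back into an equivalent model.
restored = joblib.load(joblib_path)
```

Higher compress values trade longer save and load times for smaller files; how much is saved depends on the model.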
On Sun, Apr 10, 2016 at 9:13 AM, Piotr Płoński wrote:
Hi All,
I am saving a RandomForestClassifier model from the sklearn library
with the code below:

    with open('/tmp/rf.model', 'wb') as f:
        cPickle.dump(RF_model, f)

It takes a lot of space on my hard drive. There are only 50
trees in the model, but it takes over 50 MB on disk
(the analyzed dataset is ~20 MB, with 21 features). Does anybody
have an idea why? I observe similar behavior for
ExtraTreesClassifier.
Best,
Piotr
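
One way to investigate the size question above is to count the nodes stored in the forest. This is a sketch, not a diagnosis given in this thread. It relies on the fact that, with default settings, RandomForestClassifier grows each tree until its leaves are pure, so the number of stored nodes grows with the training set. The dataset is synthetic, and forest_stats, full and capped are hypothetical names used only for illustration.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real dataset (21 features, as in the thread).
X, y = make_classification(n_samples=2000, n_features=21, random_state=0)


def forest_stats(model):
    """Return (total node count, pickled size in bytes) for a fitted forest."""
    total_nodes = sum(est.tree_.node_count for est in model.estimators_)
    pickled_bytes = len(pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL))
    return total_nodes, pickled_bytes


# Default settings: trees are grown until the leaves are pure.
full = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Requiring at least 5 samples per leaf stops the trees earlier.
capped = RandomForestClassifier(
    n_estimators=50, min_samples_leaf=5, random_state=0
).fit(X, y)

full_nodes, full_bytes = forest_stats(full)
capped_nodes, capped_bytes = forest_stats(capped)
print(f"default:            {full_nodes} nodes, {full_bytes} bytes")
print(f"min_samples_leaf=5: {capped_nodes} nodes, {capped_bytes} bytes")
```

Limiting tree growth this way can change accuracy, so any such setting would need to be validated on the real data.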
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general