EnsembleSVMhttp://mloss.orgUpdates and additions to EnsembleSVMenWed, 13 Jul 2016 16:58:17 -0000EnsembleSVM 2.0<html><p>EnsembleSVM is an open-source machine learning project. The EnsembleSVM library offers functionality to perform ensemble learning using Support Vector Machine (SVM) base models. In particular, we offer routines for binary ensemble models using SVM base classifiers. </p> <p>The library enables users to efficiently train models for large data sets. Through a divide-and-conquer strategy, base models are trained on subsets of the data, which makes training feasible for large data sets even when using nonlinear kernels. Base models are combined into ensembles with high predictive performance through a bagging strategy. Experimental results have shown the predictive performance to be comparable with standard SVM models but with drastically reduced training time. </p> <p>For more information, please refer to our website, which contains a detailed manual of all available tools and some use cases to get familiar with the software. </p> <h2>Useful links:</h2> <ol> <li> <a href="">EnsembleSVM homepage</a> </li> <li> <a href="">EnsembleSVM @ GitHub</a> </li> <li> <a href="">EnsembleSVM GitHub wiki</a> </li> </ol></html>Marc Claesen, Frank De Smet, Johan A.K. Suykens, Bart De MoorMon, 31 Mar 2014 08:06:20 -0000 vector machinekernelclassificationlarge scale learninglibsvmbaggingensemble learning<b>Comment by Krzysztof Sopyla on 2013-03-27 22:42</b><p>Hi,</p> <p>Could you elaborate on the ensemble method that is implemented in this software? Is it averaging or majority voting? 
How many base classifiers are created for a particular problem?</p> <p>Maybe this information is available in the article?</p> <p>Thanks in advance</p> Krzysztof SopylaWed, 27 Mar 2013 22:42:18 -0000<b>Comment by Marc Claesen on 2013-03-27 22:49</b><p>Hi Krzysztof,</p> <p>Currently aggregation is performed through majority voting, but future releases will feature more flexibility in this regard.</p> <p>The software itself is versatile and can be used for all sorts of learning tasks. The optimal number (and size) of base classifiers depends on the problem. In the use cases listed on our home page, you may find some complete examples.</p> <p>EnsembleSVM enables users to perform cross-validation, which is useful to tune SVM parameters but also to find optimal base classifiers, so you may find this interesting.</p> <p>Generally, using more base classifiers will not degrade predictive performance, but it will bloat the models and consequently reduce prediction speed.</p> <p>Best regards,</p> <p>Marc Claesen</p> Marc ClaesenWed, 27 Mar 2013 22:49:31 -0000<b>Comment by vu ha on 2014-06-27 02:29</b><p>Hi Marc, </p> <p>Do you have instructions or an example on using RESVM for learning with positive &amp; unlabeled data as described in this paper: (A Robust Ensemble Approach to Learn From Positive and Unlabeled Data Using SVM Base Models)?</p> <p>Thanks, Vu~</p> vu haFri, 27 Jun 2014 02:29:36 -0000<b>Comment by Marc Claesen on 2014-07-03 08:35</b><p>Dear Vu Ha,</p> <p>You can find a Python example (including a data set based on MNIST with label noise) in the following GitHub repository:</p> <p>This repository should be updated soon with more examples and a Python implementation that does not require EnsembleSVM (though the current one will remain available).</p> <p>Best regards, Marc</p> Marc ClaesenThu, 03 Jul 2014 08:35:44 -0000<b>Comment by Girish Ramachandra on 2014-08-02 05:06</b><p>Hi Marc,</p> <p>Just wanted to get a couple of queries clarified before I try out your RESVM 
Python script: 1) I have my data in the format -- {label,feature_1,...,feature_25}. Can I use commonly available scripts to convert CSV to LIBSVM format and run your script on it? 2) Are there any limitations on the number of cases? E.g., I have about 31K cases with positive labels, and 200K unlabeled cases.</p> <p>Thanks much! -Girish</p> Girish RamachandraSat, 02 Aug 2014 05:06:37 -0000<b>Comment by Marc Claesen on 2014-08-04 11:55</b><p>Hi Girish,</p> <p>You can use the <em>sparse</em> tool which is included in EnsembleSVM to convert CSV files to LIBSVM format. Here's an example (labels must be in the first column of your CSV file):</p> <pre><code>sparse -data data.csv -o data.libsvm -labeled -delim , </code></pre> <p>There is no hard limit on the number of cases the script can deal with (within reasonable bounds). A couple of million instances should be no problem for the current implementation.</p> <p>Note that the RESVM script is just a reference implementation, i.e. it is not optimized for large-scale use (though this should not pose problems for you). You will need to have EnsembleSVM installed to run the script.</p> <p>Best regards, Marc</p> Marc ClaesenMon, 04 Aug 2014 11:55:55 -0000<b>Comment by Girish Ramachandra on 2014-08-06 06:56</b><p>Thanks, Marc! It worked just fine for my data. </p> <p>Quick question -- I understand that in the case of a traditional SVM, the value of the SVM decision function based on the Lagrange multipliers and support vectors should only be interpreted by its sign, i.e., if decision value &gt; 0, then assign label: +1, else assign label: -1. Does that change in your case? Because I did see cases where the decision value was &gt; 0, and the label was -1.</p> <p>Thanks! 
-Girish</p> Girish RamachandraWed, 06 Aug 2014 06:56:44 -0000<b>Comment by Marc Claesen on 2014-08-06 08:51</b><p>Hi Girish,</p> <p>The decision values used in the RESVM script are explained in this manuscript:</p> <p>Briefly: the decision values are the fraction of base models that predict positive (the default threshold for positive predictions is therefore 0.5 instead of 0.0). </p> <p>In case of unanimous votes by all base models, we use the sum of the SVM decision values of all base models. Effectively this means that RESVM decision values range from -infinity to +infinity, though they are usually between 0 and 1.</p> <p>Regards,</p> <p>Marc</p> Marc ClaesenWed, 06 Aug 2014 08:51:52 -0000<b>Comment by Ehsan Sadrfaridpour on 2016-07-13 16:58</b><p>Hi Marc,</p> <p>Have you used any model selection technique for training the models?</p> <p>Best, Ehsan</p> Ehsan SadrfaridpourWed, 13 Jul 2016 16:58:17 -0000
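<p>The decision-value rule Marc describes in his reply to Girish (fraction of base models voting positive, with the sum of SVM decision values used for unanimous votes) can be sketched in Python as follows. This is only an illustrative sketch, not the RESVM script itself: the function names <code>resvm_decision_value</code> and <code>resvm_label</code> are hypothetical, and the handling of a base decision value of exactly 0 and of an exact tie at the threshold are assumptions.</p>

```python
from typing import Sequence


def _count_positive(base_values: Sequence[float]) -> int:
    # Assumption: a base model votes positive only when its own SVM
    # decision value is strictly above 0.
    return sum(1 for value in base_values if value > 0.0)


def resvm_decision_value(base_values: Sequence[float]) -> float:
    """Combine per-model SVM decision values into one ensemble value (hypothetical helper)."""
    if not base_values:
        raise ValueError("at least one base model decision value is required")
    n_models = len(base_values)
    n_positive = _count_positive(base_values)
    if n_positive in (0, n_models):
        # Unanimous vote: use the sum of the SVM decision values of all base models.
        return float(sum(base_values))
    # Mixed vote: fraction of base models that predict positive.
    return n_positive / n_models


def resvm_label(base_values: Sequence[float], threshold: float = 0.5) -> int:
    """Majority-vote label; an exact tie at the threshold is treated as negative (assumption)."""
    if not base_values:
        raise ValueError("at least one base model decision value is required")
    fraction_positive = _count_positive(base_values) / len(base_values)
    return 1 if fraction_positive > threshold else -1


if __name__ == "__main__":
    votes = [0.8, -0.3, -1.2, -0.4]  # one positive vote out of four
    print(resvm_decision_value(votes), resvm_label(votes))
```

<p>With one positive vote out of four, the ensemble decision value is a positive fraction while the label is -1, which matches the behaviour Girish observed: the default threshold is 0.5, not 0.</p>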