SMPyBandits | http://mloss.org | Updates and additions to SMPyBandits | Tue, 20 Mar 2018 20:12:13 -0000 | SMPyBandits 0.9.2<html><h1><em>SMPyBandits</em></h1> <p><strong>Open-Source Python package for Single- and Multi-Player Multi-Armed Bandit algorithms</strong>. </p> <p>This repository contains the code of <a href="">Lilian Besson's</a> numerical environment, written in <a href="">Python (2 or 3)</a>, for numerical simulations of <em>single</em>-player and <em>multi</em>-player <a href="">Multi-Armed Bandit (MAB)</a> algorithms. </p> <p>A complete Sphinx-generated documentation is available at <a href=""></a>. </p> <h2>Quick presentation</h2> <p><em>SMPyBandits</em> contains the most complete collection of single-player (classical) bandit algorithms on the Internet (<a href="">over 65!</a>), as well as implementations of all the state-of-the-art <a href="">multi-player algorithms</a>. </p> <p>I actively follow the latest publications related to Multi-Armed Bandit (MAB) research, and usually implement new algorithms quickly. For instance, <a href="">Exp3++</a>, <a href="">CORRAL</a> and <a href="">SparseUCB</a> were each introduced by articles (<a href="">for Exp3++</a>, <a href="">for CORRAL</a>, <a href="">for SparseUCB</a>) presented at COLT in July 2017; <a href="">LearnExp</a> comes from a <a href="">NIPS 2017 paper</a>; and <a href="">kl-UCB++</a> from an <a href="">ALT 2017 paper</a>. </p> <ul> <li> Classical MAB has many applications: clinical trials, A/B testing, game-tree exploration, and online content recommendation (my framework does <em>not</em> implement contextual bandits yet). 
</li> <li> <a href="">Multi-player MAB</a> has applications in Cognitive Radio, and my framework implements <a href="">all the collision models</a> found in the literature, as well as all the algorithms from the last 10 years or so (<a href="">rhoRand</a> from 2009, <a href="">MEGA</a> from 2015, <a href="">MusicalChair</a>, and our state-of-the-art algorithms <a href="">RandTopM</a> and <a href="">MCTopM</a>). </li> </ul> <p>With this numerical framework, simulations can run on a single CPU or a multi-core machine, and summary plots are automatically saved as high-quality PNG, PDF and EPS files (ready to be used in research articles). Making new simulations is very easy: one only needs to write a configuration script, and basically no code! See <a href=";q=configuration&amp;type=&amp;utf8=%E2%9C%93">these examples</a> (files named <code></code>). </p> <p>A complete <a href="">Sphinx</a> documentation for each algorithm and every piece of code (including constants in the configurations!) is available here: <a href=""></a>. </p> <p><img src="" alt="PyPI implementation"/> <img src="" alt="PyPI pyversions"/> <a href=""><img src="" alt="Maintenance"/></a> <a href=""><img src="" alt="Ask Me Anything"/></a> </p> <blockquote><p>Note: <br /> <a href="">I (Lilian Besson)</a> <a href="">started my PhD</a> in October 2016, and this is part of my <strong>ongoing</strong> research since December 2016. I launched the <a href="">documentation</a> in March 2017, wrote my first research articles using this framework in 2017, and decided to (finally) open-source my project in February 2018. 
</p> </blockquote><hr /> <h2>How to cite this work?</h2> <p>If you use this package for your own work, please consider citing it with <a href="">this piece of BibTeX</a>: </p> <pre><code>@misc{SMPyBandits,
    title = {{SMPyBandits: an Open-Source Research Framework for Single and Multi-Players Multi-Arms Bandits (MAB) Algorithms in Python}},
    author = {Lilian Besson},
    year = {2018},
    url = {},
    howpublished = {Online at: \url{}},
    note = {Code at, documentation at}
}</code></pre> <p>I also wrote a small paper to present <em>SMPyBandits</em>, and I will send it to <a href="">JMLR MLOSS</a>. The paper can be consulted <a href="">here on my website</a>. </p> <hr /> <h2>Other interesting things</h2> <h3><a href="">Single-player Policies</a></h3> <ul> <li> More than 65 algorithms, including all known variants of the <a href="">UCB</a>, <a href="">kl-UCB</a>, <a href="">MOSS</a> and <a href="">Thompson Sampling</a> algorithms, as well as other less-known algorithms (<a href="">OCUCB</a>, <a href="">BESA</a>, <a href="">OSSB</a>, etc.). </li> <li> <a href="">SparseWrapper</a> is a generalization of <a href="">the SparseUCB from this article</a>. </li> <li> Implementation of very recent Multi-Armed Bandit algorithms, e.g., <a href="">kl-UCB++</a> (from <a href="">this article</a>), <a href="">UCB-dagger</a> (from <a href="">this article</a>), or <a href="">MOSS-anytime</a> (from <a href="">this article</a>). </li> <li> Experimental policies: <a href="">BlackBoxOpt</a> or <a href="">UnsupervisedLearning</a> (using Gaussian processes to learn the arm distributions). </li> </ul> <h3>Arms and problems</h3> <ul> <li> My framework mainly targets stochastic bandits, with arms following <a href="">Bernoulli</a>, bounded (truncated) or unbounded <a href="">Gaussian</a>, <a href="">Exponential</a>, <a href="">Gamma</a> or <a href="">Poisson</a> distributions. </li> <li> The default configuration is to use a fixed problem for N repetitions (e.g. 
1000 repetitions, use <a href="">MAB.MAB</a>), but there is also full support for "Bayesian" problems where the mean vector µ1,...,µK changes <em>at every repetition</em> (see <a href="">MAB.DynamicMAB</a>). </li> <li> There is also good support for Markovian problems, see <a href="">MAB.MarkovianMAB</a>, even though I didn't implement any policies tailored for Markovian problems. </li> </ul> <h3>Other remarks</h3> <ul> <li> Everything here is done in an imperative, object-oriented style. The API of the Arms, Policy and MultiPlayersPolicy classes is documented <a href="">on this page</a>. </li> <li> The code is <a href="">clean</a>, valid for both <a href="">Python 2</a> and <a href="">Python 3</a>. </li> <li> Some pieces of code come from the <a href="">pymaBandits</a> project, but most of them were refactored. Thanks to the initial project! </li> <li> <a href="">G.Varoquaux</a>'s <a href="">joblib</a> is used for the <a href="">Evaluator</a> and <a href="">EvaluatorMultiPlayers</a> classes, so the simulations are easily parallelized on multi-core machines. (Put <code>n_jobs = -1</code> or <code>PARALLEL = True</code> in the config file to use all your CPU cores, as is the default.) </li> </ul> <hr /> <h2><a href="">How to run the experiments?</a></h2> <blockquote><p>See <a href="">this document</a> for more details. </p> </blockquote><p>TL;DR: this short bash snippet shows how to clone the code, install the requirements for Python 3 (in a <a href="">virtualenv</a>), and start some simulations for N=100 repetitions of the default non-Bayesian Bernoulli-distributed problem, for K=9 arms, a horizon of T=10000, on 4 CPUs (it should take about 20 minutes for each simulation): </p> <pre><code>cd /tmp/
# just be sure you have the latest virtualenv from Python 3
sudo pip3 install --upgrade --force-reinstall virtualenv
# create and activate the virtualenv
virtualenv venv
. venv/bin/activate
type pip     # check it is /tmp/venv/bin/pip
type python  # check it is /tmp/venv/bin/python

pip install SMPyBandits  # pulls latest version from
# or you can also
pip install git+[full]  # pulls latest version from

# run a single-player simulation!
N=100 T=10000 K=9 N_JOBS=4 make single
# run a multi-player simulation!
N=100 T=10000 M=3 K=9 N_JOBS=4 make more</code></pre> <ul> <li> If speed matters to you and you want to use algorithms based on <a href="">kl-UCB</a>, you should take the time to build and install the fast C implementation of the KL utility functions. The default is to use <a href=""></a>, but using <a href="">the C version from Policies/C/</a> really speeds up the computations. Just follow the instructions; it should work well (you need <code>gcc</code> to be installed). </li> <li> And if speed matters, be sure that you have a working version of <a href="">Numba</a>: it is used by many small functions to (try to automatically) speed up the computations. </li> </ul> <hr /> <h3>Warning</h3> <ul> <li> This work is still <strong>experimental</strong>! It's <a href="">active research</a>. It should be completely bug-free and every single module/file should work perfectly (as <a href="">this pylint log</a> and <a href="">this other one</a> say), but bugs are sometimes hard to spot, so if you encounter any issue, <a href="">please file a bug ticket</a>. </li> <li> Whenever I add a new feature, I run experiments to check that nothing is broken. But <em>there are no unit tests</em> (I don't have time). You would have to trust me! </li> <li> This project is NOT meant to be a library that you can use elsewhere, but a research tool. In particular, I don't ensure that any of the Python modules can be imported from a directory other than the main directory. 
</li> </ul> <h2>Contributing?</h2> <p>Contributions (issues, questions, pull requests) are of course welcome, but this project is and will stay a personal environment designed for quick research experiments, and will never try to be an industry-ready module for applications of Multi-Armed Bandit algorithms. </p> <p>If you want to contribute, please have a look at the <a href="">CONTRIBUTING</a> page, and if you want to be more seriously involved, read the <a href="">CODE_OF_CONDUCT</a> page. </p> <ul> <li> You are welcome to <a href="">submit an issue</a>, if it was not previously answered. </li> <li> If you have an interesting example of use of SMPyBandits, please share it! (<a href="">Jupyter Notebooks</a> are preferred.) And file a pull request to <a href="">add it to the notebook examples</a>. </li> </ul> <hr /> <h2>License? <a href=""><img src="" alt="GitHub license"/></a></h2> <p><a href="">MIT Licensed</a> (file <a href="">LICENSE</a>). </p> <p>© 2016-2018 <a href="">Lilian Besson</a>. </p> <p><a href=""><img src="" alt="Maintenance"/></a> <a href=""><img src="" alt="Ask Me Anything"/></a> <a href=""><img src="" alt="Analytics"/></a> <img src="" alt="PyPI implementation"/> <img src="" alt="PyPI pyversions"/> <a href=""><img src="" alt="Documentation Status"/></a> <a href=""><img src="" alt="ForTheBadge uses-badges"/></a> <a href=""><img src="" alt="ForTheBadge uses-git"/></a> <a href=""><img src="" alt="forthebadge made-with-python"/></a> <a href=""><img src="" alt="ForTheBadge built-with-science"/></a> </p></html>Lilian Besson | Tags: learning, multi armed bandits, bandits, distributed machine learning, stochastic optimization
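For readers new to the setting, the single-player bandit loop that the policies above all implement can be sketched in a few lines of pure Python. The following is an illustrative toy implementation of the classical UCB1 index policy on Bernoulli arms; it is NOT SMPyBandits code (the package's Policies classes are far more complete and optimized), just a self-contained sketch of the technique:

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Toy UCB1 on Bernoulli arms with the given means.

    Returns (cumulative reward, number of pulls of each arm)."""
    rng = random.Random(seed)
    K = len(means)
    counts = [0] * K    # number of pulls of each arm
    sums = [0.0] * K    # cumulated reward of each arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= K:
            arm = t - 1  # initialization: play each arm once
        else:
            # UCB index = empirical mean + exploration bonus
            arm = max(range(K), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2.0 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts

reward, counts = ucb1([0.1, 0.5, 0.9], horizon=10000)
# the best arm (mean 0.9) ends up pulled far more often than the others
```

For real experiments one would of course use the package's own implementations and a configuration script, as described above, which also give parallelism, plotting, and the 65+ other policies.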