<html><h1><em>SMPyBandits</em> 0.9.2</h1> <p><strong>Open-source Python package for single-player and multi-player multi-armed bandit algorithms.</strong> </p> <p>This repository contains the code of <a href="http://perso.crans.org/besson/">Lilian Besson's</a> numerical environment, written in <a href="https://www.python.org/">Python (2 or 3)</a>, for numerical simulations of <em>single</em>-player and <em>multi</em>-player <a href="https://en.wikipedia.org/wiki/Multi-armed_bandit">Multi-Armed Bandit (MAB)</a> algorithms. </p> <p>A complete Sphinx-generated documentation is available at <a href="https://smpybandits.github.io/">SMPyBandits.GitHub.io</a>. </p> <h2>Quick presentation</h2> <p><em>SMPyBandits</em> contains the most complete collection of single-player (classical) bandit algorithms on the Internet (<a href="https://smpybandits.github.io/docs/Policies.html">over 65!</a>), as well as implementations of all the state-of-the-art <a href="https://smpybandits.github.io/docs/PoliciesMultiPlayers.html">multi-player algorithms</a>. </p> <p>I follow the latest publications on Multi-Armed Bandit (MAB) research very actively, and usually implement new algorithms quickly. For instance, <a href="https://smpybandits.github.io/docs/Policies.Exp3PlusPlus.html">Exp3++</a>, <a href="https://smpybandits.github.io/docs/Policies.CORRAL.html">CORRAL</a> and <a href="https://smpybandits.github.io/docs/Policies.SparseUCB.html">SparseUCB</a> were each introduced in articles (<a href="https://arxiv.org/pdf/1702.06103">for Exp3++</a>, <a href="https://arxiv.org/abs/1612.06246v2">for CORRAL</a>, <a href="https://arxiv.org/abs/1706.01383">for SparseUCB</a>) presented at COLT in July 2017, <a href="https://smpybandits.github.io/docs/Policies.LearnExp.html">LearnExp</a> comes from a <a href="https://arxiv.org/abs/1702.04825">NIPS 2017 paper</a>, and <a href="https://smpybandits.github.io/docs/Policies.klUCBPlusPlus.html">kl-UCB++</a> from an <a href="https://hal.inria.fr/hal-01475078">ALT 2017 paper</a>. </p> <ul> <li> Classical MAB have many applications, from clinical trials and A/B testing to game tree exploration and online content recommendation (my framework does <em>not</em> implement contextual bandits yet). </li> <li> <a href="MultiPlayers.md">Multi-player MAB</a> have applications in Cognitive Radio, and my framework implements <a href="https://smpybandits.github.io/docs/Environment/CollisionModels.html">all the collision models</a> found in the literature, as well as all the algorithms from the last 10 years or so (<a href="https://smpybandits.github.io/docs/PoliciesMultiPlayers/rhoRand.html">rhoRand</a> from 2009, <a href="https://smpybandits.github.io/docs/Policies/MEGA.html">MEGA</a> from 2015, <a href="https://smpybandits.github.io/docs/Policies/MusicalChair.html">MusicalChair</a>, and our state-of-the-art algorithms <a href="https://smpybandits.github.io/docs/PoliciesMultiPlayers/RandTopM.html">RandTopM</a> and <a href="https://smpybandits.github.io/docs/PoliciesMultiPlayers/MCTopM.html">MCTopM</a>). </li> </ul> <p>With this numerical framework, simulations can run on a single CPU or a multi-core machine, and summary plots are automatically saved as high-quality PNG, PDF and EPS files (ready to be used in research articles). Making new simulations is very easy: one only needs to write a configuration script, and basically no code! See <a href="https://github.com/SMPyBandits/SMPyBandits/search?l=Python&amp;q=configuration&amp;type=&amp;utf8=%E2%9C%93">these examples</a> (files named <code>configuration_...py</code>) and the illustrative sketch below. </p>
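<p>For illustration, here is a minimal sketch of what such a configuration script could look like. The dictionary keys and the package-level imports are modelled on the bundled <code>configuration_...py</code> examples, but this is an assumed, simplified outline (the file name and the chosen arm means are hypothetical); refer to the real example files and the documentation for the authoritative format. </p>
<pre><code># configuration_minimal.py -- hypothetical file name, modelled on the shipped configuration_*.py examples
from SMPyBandits.Arms import Bernoulli
from SMPyBandits.Policies import UCB, klUCB, Thompson

configuration = {
    "horizon": 10000,       # T: number of time steps per simulation
    "repetitions": 100,     # N: number of repetitions
    "n_jobs": -1,           # use all CPU cores through joblib (same effect as PARALLEL = True)
    "verbosity": 6,
    # One stochastic problem with K = 9 Bernoulli arms (means chosen arbitrarily here)
    "environment": [{
        "arm_type": Bernoulli,
        "params": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    }],
    # Policies to compare on this problem
    "policies": [
        {"archtype": UCB,      "params": {}},
        {"archtype": Thompson, "params": {}},
        {"archtype": klUCB,    "params": {}},
    ],
}
</code></pre>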
<p>A complete <a href="http://sphinx-doc.org/">Sphinx</a> documentation, for every algorithm and every piece of code (including the constants in the configurations!), is available here: <a href="https://smpybandits.github.io/">SMPyBandits.GitHub.io</a>. </p> <p><img src="https://img.shields.io/pypi/implementation/smpybandits.svg" alt="PyPI implementation"/> <img src="https://img.shields.io/pypi/pyversions/smpybandits.svg" alt="PyPI pyversions"/> <a href="https://GitHub.com/SMPyBandits/SMPyBandits/graphs/commit-activity"><img src="https://img.shields.io/badge/Maintained%3F-yes-green.svg" alt="Maintenance"/></a> <a href="https://GitHub.com/Naereen/ama"><img src="https://img.shields.io/badge/Ask%20me-anything-1abc9c.svg" alt="Ask Me Anything"/></a> </p> <blockquote><p>Note: <a href="http://perso.crans.org/besson/">I (Lilian Besson)</a> <a href="http://perso.crans.org/besson/phd/">started my PhD</a> in October 2016, and this is part of my <strong>ongoing</strong> research since December 2016. I launched the <a href="https://smpybandits.github.io/">documentation</a> in March 2017, wrote my first research articles using this framework in 2017, and decided to (finally) open-source my project in February 2018. </p> </blockquote><hr /> <h2>How to cite this work?</h2> <p>If you use this package for your own work, please consider citing it with <a href="https://github.com/SMPyBandits/SMPyBandits/raw/master/SMPyBandits.bib">this piece of BibTeX</a>: </p> <pre><code>@misc{SMPyBandits,
    title = {{SMPyBandits: an Open-Source Research Framework for Single and Multi-Players Multi-Arms Bandits (MAB) Algorithms in Python}},
    author = {Lilian Besson},
    year = {2018},
    url = {https://github.com/SMPyBandits/SMPyBandits/},
    howpublished = {Online at: \url{github.com/SMPyBandits/SMPyBandits}},
    note = {Code at https://github.com/SMPyBandits/SMPyBandits/, documentation at https://smpybandits.github.io/}
}
</code></pre> <p>I also wrote a short paper presenting <em>SMPyBandits</em>, which I will submit to <a href="http://jmlr.org/mloss/">JMLR MLOSS</a>. The paper can be consulted <a href="https://perso.crans.org/besson/articles/SMPyBandits.pdf">here on my website</a>. </p> <hr /> <h2>Other interesting things</h2> <h3><a href="https://smpybandits.github.io/docs/Policies.html">Single-player Policies</a></h3> <ul> <li> More than 65 algorithms, including all known variants of the <a href="https://smpybandits.github.io/docs/Policies/UCB.html">UCB</a>, <a href="https://smpybandits.github.io/docs/Policies/klUCB.html">kl-UCB</a>, <a href="https://smpybandits.github.io/docs/Policies/MOSS.html">MOSS</a> and <a href="https://smpybandits.github.io/docs/Policies/Thompson.html">Thompson Sampling</a> algorithms, as well as other lesser-known algorithms (<a href="https://smpybandits.github.io/docs/Policies/OCUCB.html">OCUCB</a>, <a href="https://smpybandits.github.io/docs/Policies/OCUCB.html">BESA</a>, <a href="https://smpybandits.github.io/docs/Policies/OSSB.html">OSSB</a>, etc.); they all share a common interface, sketched after this list. </li> <li> <a href="https://smpybandits.github.io/docs/Policies.SparseWrapper.html#module-Policies.SparseWrapper">SparseWrapper</a> is a generalization of <a href="https://arxiv.org/pdf/1706.01383/">the SparseUCB from this article</a>. </li> <li> Implementations of very recent Multi-Armed Bandit algorithms, e.g., <a href="https://smpybandits.github.io/docs/Policies.klUCBPlusPlus.html">kl-UCB++</a> (from <a href="https://hal.inria.fr/hal-01475078">this article</a>), <a href="https://smpybandits.github.io/docs/Policies.UCBdagger.html">UCB-dagger</a> (from <a href="https://arxiv.org/pdf/1507.07880">this article</a>), and <a href="https://smpybandits.github.io/docs/Policies.MOSSAnytime.html">MOSS-anytime</a> (from <a href="http://proceedings.mlr.press/v48/degenne16.pdf">this article</a>). </li> <li> Experimental policies: <a href="https://smpybandits.github.io/docs/Policies.BlackBoxOpt.html">BlackBoxOpt</a> and <a href="https://smpybandits.github.io/docs/Policies.UnsupervisedLearning.html">UnsupervisedLearning</a> (using Gaussian processes to learn the arm distributions). </li> </ul>
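<p>The common single-player policy interface is documented on the <a href="https://smpybandits.github.io/API.html">API page</a>. As an illustration, the snippet below drives one policy by hand on a toy Bernoulli problem; the method names used (<code>startGame</code>, <code>choice</code>, <code>getReward</code>) follow that API page, but treat this as an assumed sketch rather than an official example. </p>
<pre><code># Assumed sketch of the common policy interface, not an official example -- check the API page.
from SMPyBandits.Arms import Bernoulli
from SMPyBandits.Policies import UCB

# Toy problem: K = 3 Bernoulli arms with (unknown) means 0.3, 0.5, 0.8
arms = [Bernoulli(0.3), Bernoulli(0.5), Bernoulli(0.8)]

policy = UCB(nbArms=len(arms))   # every policy is created from the number of arms
policy.startGame()               # reset the internal state before a new run

horizon = 1000
for t in range(horizon):
    arm = policy.choice()          # index of the arm to pull at time t
    reward = arms[arm].draw()      # sample a reward from that arm
    policy.getReward(arm, reward)  # give the observation back to the policy
</code></pre>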
<h3>Arms and problems</h3> <ul> <li> My framework mainly targets stochastic bandits, with arms following <a href="https://smpybandits.github.io/docs/Arms/Bernoulli.html">Bernoulli</a>, bounded (truncated) or unbounded <a href="https://smpybandits.github.io/docs/Arms/Gaussian.html">Gaussian</a>, <a href="https://smpybandits.github.io/docs/Arms/Exponential.html">Exponential</a>, <a href="https://smpybandits.github.io/docs/Arms/Gamma.html">Gamma</a> or <a href="https://smpybandits.github.io/docs/Arms/Poisson.html">Poisson</a> distributions. </li> <li> The default configuration is to use a fixed problem for N repetitions (e.g. 1000 repetitions; use <a href="https://smpybandits.github.io/docs/Environment/MAB.html">MAB.MAB</a>), but there is also full support for "Bayesian" problems, where the mean vector µ1, ..., µK changes <em>at every repetition</em> (see <a href="https://smpybandits.github.io/docs/Environment/MAB.html">MAB.DynamicMAB</a>). </li> <li> There is also good support for Markovian problems (see <a href="https://smpybandits.github.io/docs/Environment/MAB.html">MAB.MarkovianMAB</a>), even though I did not implement any policy tailored for Markovian problems. </li> </ul> <h3>Other remarks</h3> <ul> <li> Everything here is written in an imperative, object-oriented style. The API of the Arms, Policy and MultiPlayersPolicy classes is documented <a href="https://smpybandits.github.io/API.html">on this page</a>. </li> <li> The code is <a href="https://smpybandits.github.io/logs/main_pylint_log.txt">clean</a>, and valid for both <a href="https://smpybandits.github.io/logs/main_pylint_log.txt">Python 2</a> and <a href="https://smpybandits.github.io/logs/main_pylint3_log.txt">Python 3</a>. </li> <li> Some pieces of code come from the <a href="http://mloss.org/software/view/415/">pymaBandits</a> project, but most of it was refactored. Thanks to the initial project! </li> <li> <a href="http://gael-varoquaux.info/">G. Varoquaux</a>'s <a href="https://pythonhosted.org/joblib/">joblib</a> is used by the <a href="https://smpybandits.github.io/docs/Environment/Evaluator.html">Evaluator</a> and <a href="https://smpybandits.github.io/docs/Environment/EvaluatorMultiPlayers.html">EvaluatorMultiPlayers</a> classes, so the simulations are easily parallelized on multi-core machines (set <code>n_jobs = -1</code> or <code>PARALLEL = True</code> in the configuration file to use all your CPU cores, as is the default); see the outline after this list. </li> </ul>
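<p>The simulations themselves are driven by the <a href="https://smpybandits.github.io/docs/Environment/Evaluator.html">Evaluator</a> class (or <a href="https://smpybandits.github.io/docs/Environment/EvaluatorMultiPlayers.html">EvaluatorMultiPlayers</a> for the multi-player case), which receives a configuration dictionary like the one sketched earlier. The outline below is only an assumption about that flow: the constructor takes the configuration, but the method names shown here (<code>startOneEnv</code>, <code>plotRegrets</code>) and the hypothetical <code>configuration_minimal</code> module should be checked against the Evaluator documentation before use. </p>
<pre><code># Assumed outline of how a configuration is fed to the Evaluator -- method names to be checked against the docs.
from SMPyBandits.Environment import Evaluator
from configuration_minimal import configuration   # hypothetical module: the configuration sketch shown earlier

evaluation = Evaluator(configuration)

# Run every environment; the repetitions are parallelized with joblib when n_jobs != 1
for envId, env in enumerate(evaluation.envs):
    evaluation.startOneEnv(envId, env)

# Plot and save the summary figures (method name assumed, check the Evaluator documentation)
for envId in range(len(evaluation.envs)):
    evaluation.plotRegrets(envId)
</code></pre>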
<hr /> <h2><a href="How_to_run_the_code.md">How to run the experiments?</a></h2> <blockquote><p>See this document, <a href="https://smpybandits.github.io/How_to_run_the_code.html">How_to_run_the_code.md</a>, for more details. </p> </blockquote><p>TL;DR: this short bash snippet shows how to install the package (from PyPI or from GitHub) with the requirements for Python 3 in a <a href="https://virtualenv.pypa.io/en/stable/">virtualenv</a>, and how to start some simulations, for N=100 repetitions of the default non-Bayesian Bernoulli-distributed problem, with K=9 arms, a horizon of T=10000, on 4 CPUs (each simulation should take about 20 minutes): </p> <pre><code>cd /tmp/

# just be sure you have the latest virtualenv from Python 3
sudo pip3 install --upgrade --force-reinstall virtualenv

# create and activate the virtualenv
virtualenv venv
. venv/bin/activate
type pip     # check it is /tmp/venv/bin/pip
type python  # check it is /tmp/venv/bin/python

# pull the latest release from https://pypi.org/project/SMPyBandits/
pip install SMPyBandits

# or you can also pull the latest version from https://github.com/SMPyBandits/SMPyBandits/
pip install git+https://github.com/SMPyBandits/SMPyBandits/#egg=SMPyBandits[full]

# run a single-player simulation!
N=100 T=10000 K=9 N_JOBS=4 make single

# run a multi-player simulation!
N=100 T=10000 M=3 K=9 N_JOBS=4 make more
</code></pre> <ul> <li> If speed matters to you and you want to use algorithms based on <a href="https://smpybandits.github.io/docs/Policies/klUCB.html">kl-UCB</a>, you should take the time to build and install the fast C implementation of the KL utility functions. The default is to use <a href="https://smpybandits.github.io/docs/Policies/kullback.html">kullback.py</a>, but using <a href="https://github.com/SMPyBandits/SMPyBandits/tree/master/SMPyBandits/Policies/C/">the C version from Policies/C/</a> really speeds up the computations. Just follow the instructions; it should work well (you need <code>gcc</code> to be installed). </li> <li> And if speed matters, be sure that you have a working version of <a href="https://numba.pydata.org/">Numba</a>: it is used by many small functions to (try to automatically) speed up the computations. </li> </ul> <hr /> <h3>Warning</h3> <ul> <li> This work is still <strong>experimental</strong>! It is <a href="https://github.com/SMPyBandits/SMPyBandits/graphs/contributors">active research</a>. It should be completely bug-free and every single module/file should work perfectly (as <a href="https://smpybandits.github.io/logs/main_pylint_log.txt">this pylint log</a> and <a href="https://smpybandits.github.io/logs/main_pylint3_log.txt">this other one</a> suggest), but bugs are sometimes hard to spot, so if you encounter any issue, <a href="https://github.com/SMPyBandits/SMPyBandits/issues/new">please file a bug ticket</a>. </li> <li> Whenever I add a new feature, I run experiments to check that nothing is broken. But <em>there are no unit tests</em> (I don't have the time). You will have to trust me! </li> <li> This project is NOT meant to be a library that you can use elsewhere, but a research tool. In particular, I do not ensure that any of the Python modules can be imported from a directory other than the main one. </li> </ul> <h2>Contributing?</h2> <p>Contributions (issues, questions, pull requests) are of course welcome, but this project is and will stay a personal environment designed for quick research experiments, and will never try to be an industry-ready module for applications of Multi-Armed Bandit algorithms.
</p> <p>If you want to contribute, please have a look at the <a href="https://smpybandits.github.io/CONTRIBUTING.html">CONTRIBUTING</a> page, and if you want to be more seriously involved, read the <a href="https://smpybandits.github.io/CODE_OF_CONDUCT.html">CODE_OF_CONDUCT</a> page. </p> <ul> <li> You are welcome to <a href="https://github.com/SMPyBandits/SMPyBandits/issues/new">submit an issue</a>, if it has not been answered before. </li> <li> If you have an interesting example of use of SMPyBandits, please share it! (<a href="https://www.jupyter.org/">Jupyter Notebooks</a> are preferred.) And file a pull request to <a href="https://smpybandits.github.io/notebooks/README.html">add it to the notebook examples</a>. </li> </ul> <hr /> <h2>License? <a href="https://github.com/SMPyBandits/SMPyBandits/blob/master/LICENSE"><img src="https://img.shields.io/github/license/SMPyBandits/SMPyBandits.svg" alt="GitHub license"/></a></h2> <p><a href="https://lbesson.mit-license.org/">MIT Licensed</a> (file <a href="https://smpybandits.github.io/LICENSE">LICENSE</a>). </p> <p>© 2016-2018 <a href="https://GitHub.com/Naereen">Lilian Besson</a>. </p> <p><a href="https://GitHub.com/SMPyBandits/SMPyBandits/graphs/commit-activity"><img src="https://img.shields.io/badge/Maintained%3F-yes-green.svg" alt="Maintenance"/></a> <a href="https://GitHub.com/Naereen/ama"><img src="https://img.shields.io/badge/Ask%20me-anything-1abc9c.svg" alt="Ask Me Anything"/></a> <img src="https://img.shields.io/pypi/implementation/smpybandits.svg" alt="PyPI implementation"/> <img src="https://img.shields.io/pypi/pyversions/smpybandits.svg" alt="PyPI pyversions"/> <a href="https://smpybandits.readthedocs.io/en/latest/?badge=latest"><img src="https://readthedocs.org/projects/smpybandits/badge/?version=latest" alt="Documentation Status"/></a> <a href="http://ForTheBadge.com"><img src="http://ForTheBadge.com/images/badges/uses-badges.svg" alt="ForTheBadge uses-badges"/></a> <a href="https://GitHub.com/"><img src="http://ForTheBadge.com/images/badges/uses-git.svg" alt="ForTheBadge uses-git"/></a> <a href="https://www.python.org/"><img src="http://ForTheBadge.com/images/badges/made-with-python.svg" alt="forthebadge made-with-python"/></a> <a href="https://GitHub.com/Naereen/"><img src="http://ForTheBadge.com/images/badges/built-with-science.svg" alt="ForTheBadge built-with-science"/></a> </p></html>