Project details for RLPy

RLPy 1.2

by bobklein2 - October 9, 2013, 03:33:43 CET

Description:

RLPy is a framework for performing sequential decision-making experiments in Python. RLPy provides a fine-grained view of learning agents, breaking them into modular components and providing a library for each. Additionally, RLPy provides a wide variety of problem domains on which to test these agents; these are listed under "Problem Domains" below.

Parallelization: Easily scale experiments by running them in parallel on multiple cores of a single machine (the user only needs to specify the number of cores) or on an HTCondor computing cluster.
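
The multi-core case can be illustrated with a minimal sketch that uses only Python's standard multiprocessing module. This is not RLPy's own interface: run_experiment and run_parallel are hypothetical names, and the "experiment" is a placeholder that averages random numbers. The point is only that each run gets its own seed and the user supplies the number of cores.

```python
import multiprocessing
import random


def run_experiment(seed):
    """Hypothetical stand-in for one experiment run; returns a placeholder score."""
    rng = random.Random(seed)  # a per-run generator keeps each run reproducible
    return sum(rng.random() for _ in range(1000)) / 1000


def run_parallel(seeds, num_cores):
    """Run one experiment per seed, spread across num_cores worker processes."""
    with multiprocessing.Pool(processes=num_cores) as pool:
        return pool.map(run_experiment, seeds)


if __name__ == "__main__":
    # Eight independent runs on four cores (both numbers are arbitrary here).
    results = run_parallel(seeds=range(1, 9), num_cores=4)
    print(results)
```

Seeding each run separately means the same seed gives the same result regardless of which worker process executes it.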

Hyperparameter Optimization: Built-in support for optimizing hyperparameters with state-of-the-art methods (using the hyperopt package). The user only needs to specify the parameters and their bounds.
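
To make the "parameters and their bounds" idea concrete, here is a minimal sketch that uses plain random search from the standard library. It is a deliberately simpler stand-in, not the hyperopt-based method RLPy uses, and random_search, toy_objective and the parameter names are all hypothetical.

```python
import random


def random_search(objective, bounds, num_trials, seed=0):
    """Sample each parameter uniformly within its bounds; keep the best trial."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(num_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = objective(params)
        if score < best_score:  # lower is better in this sketch
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Hypothetical loss with its minimum at learning_rate=0.1, epsilon=0.2."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["epsilon"] - 0.2) ** 2


# The user supplies only the parameter names and their (low, high) bounds.
bounds = {"learning_rate": (0.0, 1.0), "epsilon": (0.0, 1.0)}
best_params, best_score = random_search(toy_objective, bounds, num_trials=200)
print(best_params, best_score)
```

In a real experiment the objective would run the agent and return a performance measure; a model-based optimizer would replace the uniform sampling step.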

Code Profiling: Easily identify performance bottlenecks in the code with built-in profiling support. A color-coded call graph of the execution reveals slow functions.
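
As a rough illustration of what profiling reports, the sketch below uses Python's standard cProfile and pstats modules to produce a plain-text listing sorted by cumulative time. It does not generate the color-coded call graph described above, and the profiled functions (slow_feature_computation, run_episode) are hypothetical placeholders.

```python
import cProfile
import io
import pstats


def slow_feature_computation(n):
    """Hypothetical hot spot: a deliberately wasteful sum of squares."""
    return sum(i * i for i in range(n))


def run_episode():
    """Hypothetical workload that calls the hot spot repeatedly."""
    return [slow_feature_computation(2000) for _ in range(50)]


def profile_call(func):
    """Profile func() and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)  # show the top 10 entries
    return stream.getvalue()


report = profile_call(run_episode)
print(report)
```

Functions near the top of the report account for most of the run time and are the first candidates for optimization.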

Plotting: The user specifies the experimental configuration and the number of runs needed for statistical significance. With the merger.py tool, the user then only needs to specify which quantities should appear on the graph; runs of the same configuration are automatically grouped and averaged, and several configurations can be plotted together with confidence intervals.
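
The grouping and averaging step can be sketched in a few lines of standard-library Python. This is not the merger.py interface: merge_runs is a hypothetical helper, the run data is made up for illustration, and the interval uses a normal approximation (1.96 standard errors), which is an assumption rather than the tool's documented method.

```python
import math
import statistics
from collections import defaultdict


def merge_runs(runs):
    """Group (config, value) pairs by config; return {config: (mean, half_width)}."""
    grouped = defaultdict(list)
    for config, value in runs:
        grouped[config].append(value)
    summary = {}
    for config, values in grouped.items():
        mean = statistics.mean(values)
        if len(values) > 1:
            # Approximate 95% interval: 1.96 times the standard error of the mean.
            half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
        else:
            half_width = 0.0  # a single run gives no spread estimate
        summary[config] = (mean, half_width)
    return summary


# Made-up final returns from repeated runs of two configurations.
runs = [("tabular", 10.0), ("tabular", 12.0), ("tabular", 14.0),
        ("rbf", 20.0), ("rbf", 22.0)]
summary = merge_runs(runs)
for config, (mean, half_width) in summary.items():
    print(f"{config}: {mean:.2f} +/- {half_width:.2f}")
```

With only a handful of runs, a Student-t interval would be wider and more appropriate than the normal approximation used here.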

Learning Agent Components:

Value Function Representations:

  • Bellman Error Basis Functions
  • Fourier Basis Functions
  • Incremental Feature Dependency Discovery (iFDD)
  • Radial Basis Functions
  • Tabular
  • Tile Coding

Exploration Policies:

  • Epsilon-Greedy
  • Gibbs
  • Uniform Random

Learning Algorithms:

  • Greedy-GQ
  • Least-Squares Policy Iteration
  • Natural Actor-Critic
  • Policy Iteration
  • Q-Learning
  • SARSA
  • Trajectory-based Value Iteration
  • Value Iteration
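
To show how the three component types listed above fit together, here is a minimal self-contained sketch pairing a tabular representation with an epsilon-greedy policy and Q-learning on a tiny chain problem. The class and function names are hypothetical and do not reflect RLPy's actual classes or signatures; the five-state chain is a toy stand-in, not one of the listed domains.

```python
import random


class TabularRepresentation:
    """Value function representation: one Q-value per (state, action) pair."""

    def __init__(self, num_states, num_actions):
        self.q = [[0.0] * num_actions for _ in range(num_states)]


class EpsilonGreedyPolicy:
    """Exploration policy: random action with probability epsilon, else greedy."""

    def __init__(self, representation, epsilon, rng):
        self.representation, self.epsilon, self.rng = representation, epsilon, rng

    def pick(self, state):
        values = self.representation.q[state]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(values))
        best = max(values)
        # Break ties at random so the untrained agent still explores.
        return self.rng.choice([a for a, v in enumerate(values) if v == best])


class QLearning:
    """Learning algorithm: one-step Q-learning update on the representation."""

    def __init__(self, representation, alpha, gamma):
        self.representation, self.alpha, self.gamma = representation, alpha, gamma

    def learn(self, s, a, r, ns, terminal):
        q = self.representation.q
        target = r if terminal else r + self.gamma * max(q[ns])
        q[s][a] += self.alpha * (target - q[s][a])


def chain_step(state, action, num_states):
    """Toy chain: action 1 moves right, action 0 moves left; reward 1 at the end."""
    next_state = min(state + 1, num_states - 1) if action == 1 else max(state - 1, 0)
    terminal = next_state == num_states - 1
    return next_state, (1.0 if terminal else 0.0), terminal


def train(num_states=5, episodes=200, seed=0):
    rng = random.Random(seed)
    rep = TabularRepresentation(num_states, num_actions=2)
    policy = EpsilonGreedyPolicy(rep, epsilon=0.1, rng=rng)
    learner = QLearning(rep, alpha=0.5, gamma=0.9)
    for _ in range(episodes):
        state, terminal, steps = 0, False, 0
        while not terminal and steps < 100:  # cap episode length
            action = policy.pick(state)
            next_state, reward, terminal = chain_step(state, action, num_states)
            learner.learn(state, action, reward, next_state, terminal)
            state, steps = next_state, steps + 1
    return rep


rep = train()
print(rep.q)
```

Because each component only touches the shared representation, any one of them could be swapped (for example, a different exploration policy) without changing the others.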

Problem Domains:

  • Acrobot
  • Bicycle Balancing
  • BlocksWorld
  • CartPole Balancing (2-state or 4-state)
  • CartPole Swingup (2-state or 4-state)
  • Fifty-State ChainMDP
  • FlipBoard
  • GridWorld
  • HIV Treatment
  • Helicopter Hovering
  • Intruder Monitoring
  • MountainCar
  • MultiTrack
  • Persistent Search and Track
  • Pac-Man
  • Pinball
  • PuddleWorld
  • RC Car
  • System Administrator

Changes to previous version:

Initial Announcement on mloss.org.

Supported Operating Systems: Platform Independent
Data Formats: Any
Tags: Python, Scalable, Reinforcement Learning Library, Modular, Parallelizable

Other available revisions

Version 1.3a (August 28, 2014, 14:34:35)

  • Fixed a bug where results using the same random seed differed depending on whether visualization was turned on or off
  • Created the RLPy package on PyPI (available at https://pypi.python.org/pypi/rlpy)
  • Switched from a custom logger class to Python's default logger
  • Added unit tests
  • Code readability improvements (formatting, variable names/ordering)
  • Restructured the TD Learning hierarchy
  • Updated tutorials

Version 1.2 (October 9, 2013, 03:33:43)

Initial Announcement on mloss.org.