
RLLib Lightweight On or Off Policy Reinforcement Learning Library 1.6

by saminda - December 1, 2013, 10:21:06 CET




(C++ Template Library to Learn Behaviors and Represent Learnable Knowledge using On/Off Policy Reinforcement Learning)

RLLib is a lightweight C++ template library that implements incremental, standard, and gradient temporal-difference learning algorithms in Reinforcement Learning. It is a highly optimized library that is designed and written specifically for robotic applications. The implementation of the RLLib library is inspired by the RLPark API, which is a library of temporal-difference learning algorithms written in Java.
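
As a rough illustration of what an incremental temporal-difference update looks like, the following sketch implements linear TD(lambda) with accumulating traces in plain C++. It does not use RLLib's headers or class names: LinearTDLambda, predict, update, and resetTraces are hypothetical names chosen for this example, and the parameter names (alpha, gamma, lambda) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative linear TD(lambda) predictor with accumulating traces.
// All names here are hypothetical; this is not RLLib's class hierarchy.
class LinearTDLambda
{
  public:
    LinearTDLambda(std::size_t nbFeatures, double alpha, double gamma, double lambda) :
        alpha(alpha), gamma(gamma), lambda(lambda), w(nbFeatures, 0.0), e(nbFeatures, 0.0)
    {
    }

    // Value estimate v(x) = w . x
    double predict(const std::vector<double>& x) const
    {
      assert(x.size() == w.size());
      double v = 0.0;
      for (std::size_t i = 0; i < w.size(); ++i)
        v += w[i] * x[i];
      return v;
    }

    // One incremental update for the transition x_t -> (r_tp1, x_tp1).
    // A terminal state is represented by an all-zero feature vector.
    // Returns the TD error delta.
    double update(const std::vector<double>& x_t, double r_tp1,
        const std::vector<double>& x_tp1)
    {
      const double delta = r_tp1 + gamma * predict(x_tp1) - predict(x_t);
      for (std::size_t i = 0; i < w.size(); ++i)
      {
        e[i] = gamma * lambda * e[i] + x_t[i]; // accumulating trace
        w[i] += alpha * delta * e[i];          // weight update along the trace
      }
      return delta;
    }

    // Clear the traces, e.g., at the start of each episode.
    void resetTraces()
    {
      e.assign(e.size(), 0.0);
    }

  private:
    double alpha, gamma, lambda;
    std::vector<double> w, e;
};
```

With a single feature, alpha = 0.5, gamma = 1, and lambda = 0, one update on a transition with reward 1 into a terminal state moves the prediction for that state from 0 to 0.5.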


  • Off-policy prediction algorithms:

    • GTD(lambda)
    • GQ(lambda)
  • Off-policy control algorithms:

    • Greedy-GQ(lambda)
    • Softmax-GQ(lambda)
    • Off-PAC (can be used in on-policy setting)
  • On-policy algorithms:

    • TD(lambda)
    • TD(lambda)AlphaBound
    • TD(lambda)True
    • Sarsa(lambda)
    • Sarsa(lambda)AlphaBound
    • Sarsa(lambda)True
    • Sarsa(lambda)Expected
    • Actor-Critic (continuous actions, discrete actions, discounted reward setting, average reward setting, and so on)
  • Supervised learning algorithms:

    • Adaline
    • IDBD
    • KI
    • SemiLinearIDBD
    • Autostep
  • Policies: Random, RandomX%Bias, Greedy, Epsilon-greedy, Boltzmann, Normal, and Softmax.
  • Dot product: An efficient implementation of the dot product for tile coding based feature representations (with culling traces).
  • Benchmarking environments: Mountain Car, Mountain Car 3D, Swinging Pendulum, Continuous Grid World, Bicycle, Cart Pole, Acrobot, Non-Markov Pole Balancing, and Helicopter.
  • Optimization: Optimized for very fast duty cycles (e.g., with culling traces). RLLib has been tested on the RoboCup 3D simulator agent and on the NAO V4 (cognition thread).
  • Usage: The algorithm usage closely follows RLPark, so the learning curve is short for RLPark users.
  • Examples: Numerous examples demonstrate on-policy and off-policy control experiments.
  • Visualization: We provide a Qt4 based application to visualize benchmark problems.
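
The dot product and culling-trace items above can be pictured with the following sketch. It assumes binary tile-coding features given as a list of active indices, and a sparse eligibility trace that drops entries whose magnitude falls below a threshold. The names dotActive and CulledTrace and the threshold parameter are hypothetical and are not taken from RLLib's headers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Dot product between dense weights and a binary feature vector given by
// the indices of its active tiles: cost is O(#active tiles), not O(#weights).
double dotActive(const std::vector<double>& w, const std::vector<std::size_t>& active)
{
  double sum = 0.0;
  for (std::size_t index : active)
  {
    assert(index < w.size());
    sum += w[index];
  }
  return sum;
}

// Sparse eligibility trace: only non-zero entries are stored, and entries
// that decay below the threshold are removed (culled).
class CulledTrace
{
  public:
    explicit CulledTrace(double threshold) :
        threshold(threshold)
    {
    }

    // e <- decay * e + x, where x is binary with the given active indices.
    void update(double decay, const std::vector<std::size_t>& active)
    {
      for (auto it = entries.begin(); it != entries.end();)
      {
        it->second *= decay;
        if (std::fabs(it->second) < threshold)
          it = entries.erase(it); // cull a negligible entry
        else
          ++it;
      }
      for (std::size_t index : active)
        entries[index] += 1.0; // accumulating trace on active tiles
    }

    double get(std::size_t index) const
    {
      auto it = entries.find(index);
      return it == entries.end() ? 0.0 : it->second;
    }

    std::size_t nbActive() const
    {
      return entries.size();
    }

  private:
    double threshold;
    std::unordered_map<std::size_t, double> entries;
};
```

Culling keeps the number of stored trace entries small, so each update touches only recently active tiles rather than the full weight vector. The price is a small approximation error that depends on the chosen threshold.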


[Figures: Off-PAC on ContinuousGridworld; AverageRewardActorCritic on SwingPendulum (continuous actions)]


RLLib is a C++ template library. The header files are located in the src directory. Add this directory to your project's include path, e.g., -I./src, to access the algorithms.

To access the control algorithms:

#include "ControlAlgorithm.h"

To access the prediction algorithms:

#include "PredictorAlgorithm.h"

To access the supervised learning algorithms:

#include "SupervisedAlgorithm.h"

RLLib uses the namespace:

using namespace RLLib;


RLLib provides a flexible testing framework. Follow these steps to quickly write a test case.

  • To access the testing framework: #include "HeaderTest.h"


#include "HeaderTest.h"

class YourTest: public YourTestBase
{
  public:
    YourTest() {}
    virtual ~YourTest() {}
    void run();

  private:
    void testYourMethod();
};

void YourTest::testYourMethod() { /* Your test code */ }

void YourTest::run() { testYourMethod(); }

  • Add YourTest to the test/test.cfg file.
  • You can use @YourTest to execute only YourTest. For example, if you need to execute only MountainCar test cases, use @MountainCarTest.
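
The "@" selection described above might behave like the following sketch. The one-name-per-entry format and the selectTests helper are assumptions made for illustration only; the actual parsing of test/test.cfg in RLLib may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: given the entries of a test configuration, return the
// names of the tests to run. If any entry is prefixed with '@', only the
// '@'-prefixed tests are selected; otherwise every listed test is selected.
std::vector<std::string> selectTests(const std::vector<std::string>& entries)
{
  std::vector<std::string> all;
  std::vector<std::string> only;
  for (const std::string& entry : entries)
  {
    if (entry.empty())
      continue; // skip blank entries
    if (entry[0] == '@')
    {
      assert(entry.size() > 1); // '@' must be followed by a test name
      only.push_back(entry.substr(1));
    }
    else
      all.push_back(entry);
  }
  return only.empty() ? all : only;
}
```

Under these assumptions, a configuration listing MountainCarTest, @YourTest, and SwingPendulumTest would select only YourTest.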

Test Configuration

The test cases are executed using:

  • 64-bit machines:

    • ./configure_m64
    • make
    • ./RLLibTest
  • 32-bit machines:

    • ./configure_m32
    • make
    • ./RLLibTest
  • Debugging:

    • ./configure_debug
    • make
    • ./RLLibTest


RLLib provides RLLibViz, a Qt 4.8 based tool for visualizing reinforcement learning problems and algorithms. RLLibViz currently visualizes the following problems and algorithms:

  • On-policy:

    • SwingPendulum problem with continuous actions, using the AverageRewardActorCritic algorithm.
  • Off-policy:

    • ContinuousGridworld and MountainCar problems with discrete actions, using the Off-PAC algorithm.
  • To run the visualization tool, you need Qt 4.8 installed on your system.

  • To install RLLibViz:

    • Change directory to visualization/RLLibViz
    • ./configure
    • ./RLLibVizSwingPendulum
    • ./RLLibVizContinuousGridworld
    • ./RLLibVizMountainCar



Dynamic Role Assignment using General ValueFunctions


Saminda Abeyruwan

Changes to previous version:

Current release version is v1.6.

Supported Operating Systems: Linux, Platform Independent, Windows Under Cygwin
Data Formats: Bin
Tags: Lightweight, Off Policy, On Policy, Reinforcement Learning Library, Standard

