SGD
http://mloss.org
Updates and additions to SGD
en
Tue, 11 Oct 2011 21:01:16 -0000
SGD 2.0
<html><p>Learning algorithms based on stochastic gradient approximations are known for their poor performance on optimization tasks and their extremely good performance on machine learning tasks (Bottou and Bousquet, 2008). Despite these proven capabilities, there were lingering concerns about the difficulty of setting the adaptation gains and achieving robust performance. Stochastic gradient algorithms have historically been associated with back-propagation in multilayer neural networks, which can be very challenging non-convex problems. Stochastic gradient algorithms are also notoriously hard to debug because they often appear to work despite the bugs. Experimenters are then led to believe, incorrectly, that the algorithm itself is flawed.</p>
<p>It is therefore useful to see how stochastic gradient descent performs on simple linear and convex problems such as linear Support Vector Machines (SVMs) or Conditional Random Fields (CRFs). This page offers simple code examples illustrating the good properties of stochastic gradient descent algorithms. The provided source code values clarity over speed.</p>
<p>The second major release of this code includes a robust implementation of the averaged stochastic gradient descent algorithm (Ruppert, 1988), which consists of performing stochastic gradient descent iterations and simultaneously averaging the parameter vectors over time. When the stochastic gradient gains decrease with an appropriately slow schedule, Polyak and Juditsky (1992) have shown that the algorithm converges asymptotically as fast as a second-order stochastic gradient descent, but at a much smaller computational cost. One can therefore hope to match the batch optimization performance after a single pass over the randomly shuffled training set (Fabian, 1978; Bottou and LeCun, 2004).
Achieving one-pass learning in practice remains difficult because one often needs more than one pass simply to reach this favorable asymptotic regime. The gain schedule has a deep impact on this convergence. Finer analyses (Xu, 2010; Bach and Moulines, 2011) provide useful guidelines for setting these learning rates. Xu (2010) also describes a wonderful way to efficiently perform the averaging operation when the training data is sparse. The resulting algorithm reaches near-optimal test set performance after only a couple of passes.</p></html>
Leon Bottou
Tue, 11 Oct 2011 20:59:41 -0000
scale, svm, crf, stochastic gradient descent
<b>Comment by Leon Bottou on 2008-09-23 21:42</b><p>Version 1.2 fixes a bug in the preprocessing program.</p>
Leon Bottou
Tue, 23 Sep 2008 21:42:34 -0000
<b>Comment by Olivier Grisel on 2009-01-26 14:33</b><p>Apparently the download link of this mloss entry points to version 1.1 while the description mentions version 1.2.</p>
Olivier Grisel
Mon, 26 Jan 2009 14:33:55 -0000
<b>Comment by Olivier Grisel on 2009-01-26 15:07</b><p>Also, to build it with gcc 4.3.1 I had to explicitly add:</p>
<p>#include &lt;memory.h&gt;</p>
<p>in file lib/pstream.cpp to find the memcpy function definition, and I also get the following (harmless) deprecation warning on the hash_map:</p>
<p>g++ -g -O2 -Wall -I../lib -c -o preprocess.o preprocess.cpp
In file included from /usr/include/c++/4.3/ext/hash_map:64,
from preprocess.cpp:38:
/usr/include/c++/4.3/backward/backward_warning.h:33:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed without further notice at a future date. Please use a non-deprecated interface with equivalent functionality instead. For a listing of replacement headers and interfaces, consult the file backward_warning.h.
To disable this warning use -Wno-deprecated.</p>
Olivier Grisel
Mon, 26 Jan 2009 15:07:28 -0000
<b>Comment by Leon Bottou on 2009-01-30 13:40</b><p>According to POSIX and ANSI C, function memcpy is defined in header &lt;string.h&gt;. See <a href=""></a>. If your compilation needs memory.h, then something must be wrong with your compiler.</p>
<p>Regarding the warning, it suggests replacing the gcc-specific hash_map with an unordered_map. This is a C++0x extension. If you make the change, you'll get the following error:
/usr/include/c++/4.3/c++0x_warning.h:36:2: error: #error This file requires compiler and library support for the upcoming ISO C++ standard, C++0x. This support is currently experimental, and must be enabled with the -std=c++0x or -std=gnu++0x compiler options.</p>
Leon Bottou
Fri, 30 Jan 2009 13:40:40 -0000
<b>Comment by Olivier Grisel on 2009-01-30 23:27</b><p>Indeed, but neither pstream.cpp nor pstream.h includes string.h either.</p>
<p>Including string.h instead of memory.h as mentioned previously makes it build as expected.</p>
Olivier Grisel
Fri, 30 Jan 2009 23:27:24 -0000
<b>Comment by Leon Bottou on 2009-01-31 14:01</b><p>Ooops.
I thought that &lt;cstring&gt; was already included in lib/pstream.cpp.</p>
<p>Here is the patch (to be included in the next release):</p>
<pre><code>Index: pstream.cpp
===================================================================
RCS file: /home/cvs/cvsroot/sgd/lib/pstream.cpp,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -4 -p -r1.4 -r1.5
--- pstream.cpp 2 Oct 2007 20:40:05 -0000 1.4
+++ pstream.cpp 31 Jan 2009 12:28:44 -0000 1.5
@@ -15,13 +15,14 @@
 // You should have received a copy of the GNU General Public License
 // along with this program; if not, write to the Free Software
 // Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA
-// $Id: pstream.cpp,v 1.4 2007/10/02 20:40:05 cvs Exp $
+// $Id: pstream.cpp,v 1.5 2009/01/31 12:28:44 cvs Exp $
 #include "pstream.h"
-#include &lt;stdio.h&gt;
+#include &lt;cstdio&gt;
+#include &lt;cstring&gt;
 pstreambuf* pstreambuf::open( const char *cmd, int open_mode)
</code></pre>
Leon Bottou
Sat, 31 Jan 2009 14:01:59 -0000
<b>Comment by Leon Bottou on 2009-01-31 14:18</b><p>Updated in version 1.3</p>
Leon Bottou
Sat, 31 Jan 2009 14:18:42 -0000
<b>Comment by ApplyCreditCards on 2009-05-28 04:41</b><p>Hi, good post. I have been wondering about this issue, so thanks for posting.</p>
ApplyCreditCards
Thu, 28 May 2009 04:41:43 -0000
<b>Comment by Leon Bottou on 2011-10-11 21:01</b><p>Released sgd-2.0 featuring ASGD.</p>
Leon Bottou
Tue, 11 Oct 2011 21:01:16 -0000
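<p>For readers unfamiliar with ASGD, the averaging idea summarized in the SGD 2.0 description can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the sgd package: the <code>Example</code> struct, the <code>dot</code> helper, the <code>asgd_train</code> function, the dense vectors, and the choice of an L2-regularized hinge loss are all hypothetical. The gain schedule eta0 / (1 + lambda * eta0 * t)^0.75 is one possible schedule that decreases more slowly than 1/t; the exponent is an assumption as well. For simplicity, averaging starts at the very first iteration.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense training example (not a type from the sgd package).
struct Example {
    std::vector<double> x;  // feature vector
    double y;               // label, +1 or -1
};

// Dense dot product of two vectors of equal length.
static double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Averaged SGD on the objective
//   lambda/2 * |w|^2 + max(0, 1 - y * <w, x>)
// Visits the examples in the given order (the caller is expected to shuffle)
// for `epochs` passes and returns the average of the SGD iterates.
std::vector<double> asgd_train(const std::vector<Example>& data,
                               double lambda, double eta0, int epochs) {
    assert(!data.empty());
    const std::size_t dim = data[0].x.size();
    std::vector<double> w(dim, 0.0);  // plain SGD iterate
    std::vector<double> a(dim, 0.0);  // running average of the iterates
    double t = 0.0;                   // number of updates performed so far

    for (int e = 0; e < epochs; ++e) {
        for (std::size_t k = 0; k < data.size(); ++k) {
            const Example& ex = data[k];
            // Assumed gain schedule: decreases more slowly than 1/t.
            const double eta = eta0 / std::pow(1.0 + lambda * eta0 * t, 0.75);
            const double margin = ex.y * dot(w, ex.x);

            // Subgradient step: regularizer, plus loss term if margin < 1.
            for (std::size_t i = 0; i < dim; ++i) {
                double g = lambda * w[i];
                if (margin < 1.0) g -= ex.y * ex.x[i];
                w[i] -= eta * g;
            }

            // Incremental mean: a <- a + (w - a) / (t + 1).
            const double mu = 1.0 / (t + 1.0);
            for (std::size_t i = 0; i < dim; ++i) a[i] += mu * (w[i] - a[i]);

            t += 1.0;
        }
    }
    return a;
}
```

<p>A call such as <code>asgd_train(data, 0.01, 0.1, 10)</code> returns the averaged weight vector, and the sign of <code>dot(a, x)</code> gives the predicted label. The sketch updates every coordinate of the average at each step, which is wasteful on sparse data; it does not implement the efficient sparse averaging of Xu (2010) mentioned in the description.</p>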