ICML Workshop on Machine Learning Open Source Software 2010

The ICML Workshop on Machine Learning Open Source Software (MLOSS) will be held in Haifa, Israel, on the 25th of June, 2010, at the Dan Panorama Haifa, Room Erez.

Important Dates

  • Submission Date: April 23rd, 2010, Samoa time (closed)
  • Notification of Acceptance: May 8th, 2010
  • Workshop date: June 25th, 2010

Description

We believe that the widespread adoption of open source software policies will have a tremendous impact on the field of machine learning. The goal of this workshop is to further support the current developments in this area and give it new impetus. Following the success of the inaugural NIPS-MLOSS workshop held at NIPS 2006, the Journal of Machine Learning Research (JMLR) has started a new track for machine learning open source software, initiated by the workshop's organizers. Many prominent machine learning researchers have co-authored a position paper advocating the need for open source software in machine learning. To date, 11 machine learning open source software projects have been published in JMLR. Furthermore, the workshop's organizers have set up a community website, mloss.org, where people can register their software projects, rate existing projects and initiate discussions about projects and related topics. This website currently lists 221 such projects, including many prominent projects in the area of machine learning.

The main goal of this workshop is to bring the main practitioners in the area of machine learning open source software together in order to initiate processes which will help to further improve the development of this area. In particular, we have to move beyond a mere collection of more or less unrelated software projects and provide a common foundation to stimulate cooperation and interoperability between different projects. An important step in this direction will be a common data exchange format such that different methods can exchange their results more easily.

This year's workshop sessions will consist of three parts.

  • We have two invited speakers: Gary Bradski and Victoria Stodden.
  • Researchers are invited to submit their open source project to present it at the workshop.
  • In discussion sessions, important questions regarding the future development of this area will be discussed. In particular, we will discuss what makes a good machine learning software project and how to improve interoperability between programs. In addition, the question of how to deal with data sets and reproducibility will also be addressed.

Because a large number of key research groups attend ICML, decisions and agreements made at the workshop have the potential to significantly impact the future of machine learning software.

Workshop Program:

The one-day workshop will be a mixture of talks (including a mandatory demo of the software) and panel/open/hands-on discussions.

Morning session: 9:00 - 12:30

Afternoon session: 14:00 - 18:00

  • Contributed Talks
  • Spotlight Talks
  • 15:35 - 16:25 Poster Session and Coffee Break
  • 16:25 - Invited Talk: (Victoria Stodden)

    Reproducible Research in Computational Science: Problems and Solutions For Data and Code Sharing

    Scientific computation is emerging as absolutely central to the scientific method, but the prevalence of very relaxed practices is leading to a credibility crisis. Reproducible computational research, in which all details of computations—code and data—are made conveniently available to others, is a necessary response to this crisis. Results from a 2009 survey of the Machine Learning community (NIPS participants) designed to elucidate factors that affect data and code sharing will be presented. Intellectual property concerns create a significant barrier to sharing, and I will also present work on the “Reproducible Research Standard” giving open licensing options designed to create an intellectual property framework for scientists consonant with longstanding scientific norms and facilitating reproducible research.

  • 17:10 - Discussion: Reproducible research

Invited Speakers

  • Gary Bradski

    Gary Bradski was previously responsible for the Open Source Computer Vision Library (OpenCV) that is used globally in research, government and commercial applications. He has also been responsible for the open source statistical Machine Learning Library and the Probabilistic Network Library. More recently Dr. Bradski led the vision team for Stanley, the Stanford robot that won the DARPA Grand Challenge autonomous race in 2005 and most recently helped found the Stanford Artificial Intelligence Robot (STAIR) project under the leadership of Professor Andrew Ng. Dr. Bradski recently published a new book for O'Reilly Press: Learning OpenCV: Computer Vision with the OpenCV Library.

  • Victoria Stodden

    Victoria is a Postdoctoral Associate in Law and a Kauffman Fellow in Law at the Information Society Project at Yale Law School. After completing her PhD in statistics at Stanford University in 2006 with advisor David Donoho, she obtained a Master in Legal Studies in 2007 from Stanford Law School. She is developing a new licensing structure for computational research and is the author of the award-winning paper "Reproducible Research Standard", which describes her ideas.

Call for Contributions

The organizing committee is currently seeking abstracts for talks at MLOSS 2010. MLOSS is a great opportunity for you to tell the community about your use, development, or philosophy of open source software in machine learning. This includes (but is not limited to) numeric packages (e.g. R, Octave, NumPy), machine learning toolboxes and implementations of ML algorithms. The committee will select several submitted abstracts for 20-minute talks.

The submission process is very simple:

  • Tag your mloss.org project with the tag icml2010.
  • Ensure that your project has a good description (limited to 500 words).
  • Put any additional material on your own project page, and link to that page from mloss.org.

On April 23rd, 2010, we will collect all projects tagged with icml2010 for review.

Note: Projects must adhere to a recognized Open Source License (cf. http://www.opensource.org/licenses/) and the source code must have been released at the time of submission. Submissions will be reviewed based on the status of the project at the time of the submission deadline.

Program Committee

All confirmed

Organizers:

  • Soeren Sonnenburg, Mikio Braun

    Technische Universität Berlin, Franklinstr. 28/29, FR 6-9, 10587 Berlin, Germany

  • Cheng Soon Ong

    ETH Zürich, Universitätstr. 6, 8092 Zürich, Switzerland

  • Patrik Hoyer

    Helsinki Institute for Information Technology, Gustaf Hällströmin katu 2b, 00560 Helsinki, Finland

Funding

The workshop is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning).