Project details for sparkcrowd

sparkcrowd 0.1.5

by enriquegrodrigo - December 13, 2017, 13:13:35 CET


Using crowdsourcing to label data for machine learning introduces several complications: annotators may misunderstand the problem, may lack the required expertise, may answer at random, or may even try to degrade the results. To learn from these labels in Big Data contexts, practitioners need to account for the quality of the annotators labelling the data, which is crucial when annotations are scarce. This package implements several methods for dealing with these situations using Apache Spark, to ease the transition to large-scale problems.
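
As a rough illustration of why annotator quality matters, the sketch below aggregates crowdsourced labels by majority vote, estimates how often each annotator agrees with that consensus, and then re-votes using those agreement rates as weights. It is plain Python written for this page, not the sparkcrowd API and not necessarily one of the methods the package implements; the function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(annotations):
    """Return the most frequent label per example.

    annotations: iterable of (example_id, annotator_id, label) tuples.
    """
    votes = defaultdict(Counter)
    for example, _annotator, label in annotations:
        votes[example][label] += 1
    return {example: counts.most_common(1)[0][0]
            for example, counts in votes.items()}


def annotator_agreement(annotations, consensus):
    """Fraction of each annotator's labels that match the consensus."""
    hits, totals = Counter(), Counter()
    for example, annotator, label in annotations:
        totals[annotator] += 1
        hits[annotator] += int(label == consensus[example])
    return {annotator: hits[annotator] / totals[annotator]
            for annotator in totals}


def weighted_vote(annotations, weights):
    """Vote again, counting each label with its annotator's weight."""
    scores = defaultdict(Counter)
    for example, annotator, label in annotations:
        scores[example][label] += weights[annotator]
    return {example: max(counts, key=counts.get)
            for example, counts in scores.items()}


# Hypothetical toy data: annotators "a" and "b" agree with each other,
# while "c" always answers the opposite.
annotations = [
    (1, "a", 1), (1, "b", 1), (1, "c", 0),
    (2, "a", 0), (2, "b", 0), (2, "c", 1),
    (3, "a", 1), (3, "b", 1), (3, "c", 0),
]

consensus = majority_vote(annotations)
quality = annotator_agreement(annotations, consensus)
weighted = weighted_vote(annotations, quality)

print(consensus)
print(quality)
print(weighted)
```

With only three labels per example, the estimated agreement rates are already enough to discount annotator "c" entirely. On real data the estimates are noisier, which is one reason scarce annotations make annotator quality hard to ignore.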

Changes to previous version:

- Minor improvements in code and documentation

Supported Operating Systems: Linux, Windows, Osx
Data Formats: Agnostic
Tags: Distributed, Machine Learning, Crowdsourcing, Spark

Other available revisions

Date: December 13, 2017, 13:13:35
Changelog:
- Minor improvements in code and documentation

Date: December 12, 2017, 19:33:09
Changelog:
- Improves project readme and documentation.
- Improves results on corner cases where labels are not complete.

Date: November 8, 2017, 13:42:08
Changelog:
- Initial Announcement on

