- Description:
Using crowdsourcing to label data for machine learning introduces several complications: annotators may not understand the problem correctly, may lack the required expertise, may label at random, or may even try to degrade the results. To learn from these labels in Big Data contexts, practitioners need to take into account the quality of the annotators labelling the data, as this is crucial when annotations are scarce. This package implements several methods for dealing with these situations using Apache Spark, to ease the transition to large-scale problems.
- Changes to previous version:
Changes: - Minor improvements in code and documentation
- BibTeX Entry: Download
- Supported Operating Systems: Linux, Windows, OS X
- Data Formats: Agnostic
- Tags: Distributed, Machine Learning, Crowdsourcing, Spark
- Archive: download here
Other available revisions
Version (Date): Changelog
- 0.1.5 (December 13, 2017, 13:13:35): Minor improvements in code and documentation.
- 0.1.4 (December 12, 2017, 19:33:09): Improves project readme and documentation. Improves results on corner cases where labels are not complete.
- 0.1.3 (November 8, 2017, 13:42:08): Initial Announcement on mloss.org.