
 Description:
JEMLA is a Java package for calculating entropy, which is essential in several machine learning applications. JEMLA implements six algorithms for handling missing attribute values:
1: Ignore missing values.
2: For categorical features, replace missing values with the most common value; for numerical features, replace missing values with the mean value.
3: Apply method 2 separately within each given class or concept.
4: For each given class, find the values the feature takes and sort them by how often they are used. Missing values are then replaced with several different values rather than only the most common one: each value fills a fraction of the missing values equal to its rate of use.
5: Proceed as in method 3, except that for numerical features missing values are replaced with the mid value.
6: Apply the “closest fit” algorithm. “Closest fit” is a preprocessing algorithm that, for each instance in the data set, finds the instance with the minimum Euclidean distance to it and then replaces missing values with the matching values from that other instance.
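To make methods 1 and 2 concrete, here is a minimal sketch in plain Java. It does not use JEMLA's own classes, whose API is not described on this page; the class name EntropySketch and its method names are hypothetical, missing categorical values are assumed to be encoded as null, and missing numerical values as NaN.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not part of JEMLA.
class EntropySketch {

    /** Counts non-null values; nulls (missing) are skipped. */
    static Map<String, Integer> countObserved(String[] values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : values) {
            if (v != null) {
                counts.merge(v, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Method 1: Shannon entropy in bits over observed values only. */
    static double entropy(String[] values) {
        Map<String, Integer> counts = countObserved(values);
        int n = 0;
        for (int c : counts.values()) {
            n += c;
        }
        if (n == 0) {
            return 0.0; // no observed values at all
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / n;
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    /** Method 2, categorical: replace nulls with the most common value. */
    static String[] fillWithMode(String[] values) {
        String mode = null;
        int best = 0;
        // Ties are broken arbitrarily (HashMap iteration order).
        for (Map.Entry<String, Integer> e : countObserved(values).entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                mode = e.getKey();
            }
        }
        String[] out = values.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] == null) {
                out[i] = mode; // stays null if nothing was observed
            }
        }
        return out;
    }

    /** Method 2, numerical: replace NaN with the mean of observed values. */
    static double[] fillWithMean(double[] values) {
        double sum = 0.0;
        int n = 0;
        for (double v : values) {
            if (!Double.isNaN(v)) {
                sum += v;
                n++;
            }
        }
        double[] out = values.clone();
        if (n == 0) {
            return out; // nothing observed, leave NaN in place
        }
        double mean = sum / n;
        for (int i = 0; i < out.length; i++) {
            if (Double.isNaN(out[i])) {
                out[i] = mean;
            }
        }
        return out;
    }
}
```

Method 3 could be obtained from this sketch by grouping the instances by class label and calling fillWithMode or fillWithMean on each group separately.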
 Changes to previous version:
Discretization of numerical values has been added; it is used to calculate the mode of values and for the fractional replacement of missing ones. A class diagram is available on the web: http://profs.basu.ac.ir/bathaeian/free_space/jemla.rar
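As an illustration of why discretization makes a mode available for numerical features, the sketch below bins values into equal-width intervals and picks the most populated bin. Equal-width binning is an assumption made here for simplicity; this page does not say which discretization scheme JEMLA uses, and the class and method names are hypothetical.

```java
// Hypothetical illustration, not part of JEMLA.
class DiscretizeSketch {

    /**
     * Maps each value to an equal-width bin index in [0, bins - 1].
     * Missing values (NaN) are mapped to -1.
     */
    static int[] equalWidthBins(double[] values, int bins) {
        if (bins <= 0) {
            throw new IllegalArgumentException("bins must be positive");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            if (!Double.isNaN(v)) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        double width = (max - min) / bins;
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            if (Double.isNaN(values[i])) {
                out[i] = -1;
            } else if (width == 0.0) {
                out[i] = 0; // all observed values are identical
            } else {
                int idx = (int) ((values[i] - min) / width);
                out[i] = Math.min(idx, bins - 1); // the maximum falls in the last bin
            }
        }
        return out;
    }

    /** Index of the most populated bin, or -1 if every value is missing. */
    static int modeBin(int[] binIndex, int bins) {
        int[] counts = new int[bins];
        for (int b : binIndex) {
            if (b >= 0) {
                counts[b]++;
            }
        }
        int best = -1;
        int bestCount = 0;
        for (int i = 0; i < bins; i++) {
            if (counts[i] > bestCount) { // ties go to the lowest bin
                bestCount = counts[i];
                best = i;
            }
        }
        return best;
    }
}
```

The per-bin counts computed in modeBin are also the kind of frequencies that fractional replacement (method 4) would need, since each bin's share of the observed values gives the fraction of missing values it should fill.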
 Supported Operating Systems: Linux, Windows, Platform Independent
 Data Formats: Ascii, Arff
 Tags: Machine Learning, Missing Data, Java, Entropy
Comments

 NargesSadat Bathaeian (on February 4, 2015, 11:37:00)
Today I posted a simple user guide for JEMLA. Please download it from http://profs.basu.ac.ir/bathaeian/index.php?L=free
Any documentation?