- Description:
AMIDST is a Java toolbox for scalable probabilistic machine learning.
In AMIDST, you model your problem using a flexible probabilistic language based on graphical models, then fit the model to data using a Bayesian approach that accounts for modelling uncertainty.
AMIDST provides tailored parallel and distributed implementations of Bayesian parameter learning for batch and streaming data (multi-core and distributed processing). This processing is based on flexible and scalable message passing algorithms.
MAIN FEATURES
Probabilistic Graphical Models: You can specify your model using probabilistic graphical models with latent variables and temporal dependencies.
Scalable inference: Perform inference on your probabilistic models with powerful approximate and scalable algorithms based on novel variational message passing schemes.
Data Streams: Update your models when new data is available. This makes our toolbox appropriate for learning from (massive) data streams.
Large-scale Data: Use your defined models to process massive data sets on a distributed computing cluster using Flink or Spark.
Extensible: Code your own models or algorithms within AMIDST and extend the toolbox's functionality. This makes it a flexible toolbox for researchers running machine learning experiments.
Interoperability: Leverage existing functionalities and algorithms by interfacing to other software tools such as Hugin, MOA, Weka, R, etc.
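The Data Streams feature above rests on Bayesian updating: the posterior obtained from one batch serves as the prior for the next, so the model never needs to revisit old data. The following is a minimal conceptual sketch of that idea in plain Java 8. It does not use the AMIDST API. The class name StreamingBetaBernoulli and its methods are hypothetical, and the conjugate Beta-Bernoulli model is a toy stand-in for the variational message passing schemes the toolbox actually provides. The parallel stream only hints at how per-batch sufficient statistics can be computed on several cores.

```java
import java.util.Arrays;

/**
 * Conceptual sketch only (hypothetical class, not part of AMIDST):
 * streaming Bayesian learning of a Bernoulli parameter with a Beta prior.
 */
class StreamingBetaBernoulli {
    // Parameters of the current Beta(alpha, beta) posterior.
    private double alpha;
    private double beta;

    StreamingBetaBernoulli(double priorAlpha, double priorBeta) {
        this.alpha = priorAlpha;
        this.beta = priorBeta;
    }

    /** Updates the posterior with one batch of 0/1 observations. */
    void update(int[] batch) {
        // The sufficient statistic (number of ones) is counted in parallel.
        long successes = Arrays.stream(batch).parallel().filter(x -> x == 1).count();
        alpha += successes;
        beta += batch.length - successes;
    }

    /** Posterior mean of the success probability. */
    double posteriorMean() {
        return alpha / (alpha + beta);
    }

    public static void main(String[] args) {
        StreamingBetaBernoulli model = new StreamingBetaBernoulli(1.0, 1.0);
        int[][] stream = { {1, 0, 1, 1}, {0, 0, 1}, {1, 1, 1, 0, 1} };
        for (int[] batch : stream) {
            model.update(batch); // the previous posterior acts as the new prior
            System.out.printf("posterior mean: %.3f%n", model.posteriorMean());
        }
    }
}
```

Because the update only adds counts, processing the batches one at a time gives the same posterior as processing all observations at once, which is what makes this style of learning suitable for streams.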
PUBLICATIONS: (per year)
2016
[1] Masegosa, A. R., Martinez, A. M. and Borchani, H. (2016). Probabilistic Graphical Models on Multi-Core CPUs Using Java 8. In IEEE Computational Intelligence Magazine, Vol. 11, No. 2, pages 41-54. DOI: 10.1109/mci.2016.2532267.
[2] Masegosa, A. R., Martinez, A. M., Ramos-López, D., Langseth, H., Nielsen, T. D., Salmerón, A., Cabañas, R., and Madsen, A. L. (2016). A Java Toolbox for Analysis of MassIve Data STreams using Probabilistic Graphical Models. Poster. Presented at the European Data Forum.
[3] Salmerón, A., Madsen, A. L., Jensen, F., Langseth, H., Nielsen, T. D., Ramos-López, D., Martinez, A. M., and Masegosa, A. R. (2016). Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block Designs. Accepted for ECAI.
2015
[1] Salmerón, A., Rumi, R., Langseth, H., Madsen, A. L., Nielsen, T. D. (2015). MPE Inference in Conditional Linear Gaussian Networks. In proceedings of ECSQARU, 15-17 July 2015, Compiegne, France, pages 407-416. DOI: 10.1007/978-3-319-20807-7.
[2] Madsen, A. L. and Salmerón, A. (2015). Analysis of massive data streams using R and AMIDST. In book of abstracts of useR!2015, 30 June - 3 July 2015, Aalborg, Denmark, page 171.
[3] Borchani, H., Martinez, A. M., Masegosa, A., Langseth, H., Nielsen, T. D., Salmerón, A., Fernández, A., Madsen, A. L., Sáez, R. (2015). Modeling concept drift: A probabilistic graphical model based approach. In proceedings of The Fourteenth International Symposium on Intelligent Data Analysis, 22-24 October 2015, Saint-Etienne, France, pages 72-83.
[4] Salmerón, A., Ramos-López, D., Borchani, H., Martinez, A. M., Masegosa, A., Fernández, A., Langseth, H., Madsen, A. L., Nielsen, T. D. (2015). Parallel importance sampling in conditional linear Gaussian networks. The XVI Conference of the Spanish Association for Artificial Intelligence (CAEPIA'15), pages 36-46.
[5] Madsen, A. L., Jensen, F., Salmerón, A., Langseth, H., Nielsen, T. D. (2015). Parallelization of the PC Algorithm. The XVI Conference of the Spanish Association for Artificial Intelligence (CAEPIA'15), pages 14-24.
[6] Borchani, H., Martinez, A. M., Masegosa, A., Langseth, H., Nielsen, T. D., Salmerón, A., Fernández, A., Madsen, A. L., Sáez, R. (2015). Dynamic Bayesian modeling for risk prediction in credit operations. The 13th Scandinavian Conference on Artificial Intelligence, Halmstad, Sweden, November 5-6, 2015, pages 72-83.
[7] Masegosa, A., Martinez, A. M., Borchani, H., Ramos-López, D., Nielsen, T. D., Langseth, H., Salmerón, A., Madsen, A. L. (2015). AMIDST: Analysis of MassIve Data STreams. In proceedings of The 27th Benelux Conference on Artificial Intelligence, Hasselt, Belgium, November 5-6, 2015.
- Changes to previous version:
The newly added functionality includes support for Flink for distributed learning of probabilistic graphical models, and support for Latent Dirichlet Allocation models for text analysis.
Detailed information can be found on the toolbox's web page.
- BibTeX Entry: Download
- Corresponding Paper BibTeX Entry: Download
- Supported Operating Systems: Platform Independent
- Data Formats: ARFF
- Tags: Approximate Inference, Bayesian Networks, Data Streams, Multi Core, Bayesian Learning, Hidden Markov Models, Importance Sampling, Maximum Likelihood, Parallelisation, Variational Message Passing, Kalma
- Archive: download here