Project details for AMIDST Toolbox

by ana - October 14, 2016, 19:35:27 CET

Description:

AMIDST is a Java toolbox for scalable probabilistic machine learning.

In AMIDST, you can model your problem using a flexible probabilistic language based on graphical models, then fit the model to data using a Bayesian approach that accounts for modelling uncertainty.

AMIDST provides tailored parallel and distributed implementations of Bayesian parameter learning for batch and streaming data (multi-core and distributed processing). This processing is based on flexible and scalable message passing algorithms.
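
To make the idea of multi-core Bayesian parameter learning concrete, here is a minimal sketch in plain Java 8. It is not the AMIDST API: the class `ParallelBetaBernoulli` and its methods are hypothetical names chosen for illustration, and the model is a simple Beta-Bernoulli pair rather than a full graphical model. The sufficient statistic (the number of ones) is computed with a parallel stream and then combined with the prior in closed form.

```java
import java.util.Arrays;

// Illustrative only: a Beta-Bernoulli model, NOT the AMIDST API.
class ParallelBetaBernoulli {

    /**
     * Returns the posterior Beta parameters {alpha, beta} after observing
     * binary data. Assumes every entry of data is 0 or 1.
     */
    static double[] posterior(double priorAlpha, double priorBeta, int[] data) {
        // Sufficient statistic: the number of ones, counted across cores.
        long ones = Arrays.stream(data).parallel().filter(x -> x == 1).count();
        long zeros = data.length - ones;
        // Conjugate update: add the counts to the prior pseudo-counts.
        return new double[] { priorAlpha + ones, priorBeta + zeros };
    }

    /** Posterior mean of the success probability. */
    static double posteriorMean(double[] alphaBeta) {
        return alphaBeta[0] / (alphaBeta[0] + alphaBeta[1]);
    }

    public static void main(String[] args) {
        int[] data = {1, 0, 1, 1, 0, 1, 1, 1};
        double[] post = posterior(1.0, 1.0, data); // uniform Beta(1, 1) prior
        System.out.println("alpha = " + post[0] + ", beta = " + post[1]
                + ", mean = " + posteriorMean(post));
    }
}
```

With a uniform Beta(1, 1) prior and the eight observations above (six ones, two zeros), the posterior is Beta(7, 3). Because the update depends on the data only through counts, the counting step can be split across cores or machines without changing the result.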

MAIN FEATURES

  • Probabilistic Graphical Models: You can specify your model using probabilistic graphical models with latent variables and temporal dependencies.

  • Scalable inference: Perform inference on your probabilistic models with powerful approximate and scalable algorithms based on novel variational message passing schemes.

  • Data Streams: Update your models when new data is available. This makes our toolbox appropriate for learning from (massive) data streams.

  • Large-scale Data: Use your defined models to process massive data sets on a distributed computing cluster using Flink or Spark.

  • Extensible: Code your models or algorithms within AMIDST and expand the toolbox's functionality. This makes it a flexible toolbox for researchers running machine learning experiments.

  • Interoperability: Leverage existing functionalities and algorithms by interfacing to other software tools such as Hugin, MOA, Weka, R, etc.
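
The Data Streams feature above rests on a general property of Bayesian learning: the posterior after one batch serves as the prior for the next. The following minimal sketch shows this for a Gaussian mean with known noise variance. It is plain Java and not the AMIDST API; `StreamingGaussianMean` and its methods are hypothetical names used only for illustration.

```java
// Illustrative only: conjugate Normal-Normal updating, NOT the AMIDST API.
class StreamingGaussianMean {
    private double mean;                 // posterior mean of the unknown mean
    private double precision;            // posterior precision (1 / variance)
    private final double noisePrecision; // known observation precision

    StreamingGaussianMean(double priorMean, double priorVariance, double noiseVariance) {
        this.mean = priorMean;
        this.precision = 1.0 / priorVariance;
        this.noisePrecision = 1.0 / noiseVariance;
    }

    /** Folds one batch into the posterior, which becomes the prior for the next batch. */
    void update(double[] batch) {
        double sum = 0.0;
        for (double x : batch) {
            sum += x;
        }
        double newPrecision = precision + batch.length * noisePrecision;
        mean = (precision * mean + noisePrecision * sum) / newPrecision;
        precision = newPrecision;
    }

    double mean() { return mean; }

    double variance() { return 1.0 / precision; }

    public static void main(String[] args) {
        StreamingGaussianMean model = new StreamingGaussianMean(0.0, 100.0, 1.0);
        model.update(new double[] {1.0, 2.0}); // first batch arrives
        model.update(new double[] {3.0});      // later batch arrives
        System.out.println("mean = " + model.mean() + ", variance = " + model.variance());
    }
}
```

Processing the two batches one after the other gives the same posterior as processing all three observations at once, so earlier data never needs to be stored or revisited. For models without conjugate structure this exactness is lost, which is where approximate schemes such as variational message passing come in.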

PUBLICATIONS (by year)

2016

[1] Masegosa, A. R., Martinez, A. M. and Borchani, H. (2016). Probabilistic Graphical Models on Multi-Core CPUs Using Java 8. In IEEE Computational Intelligence Magazine, Vol. 11, No. 2, pages 41-54. DOI: 10.1109/mci.2016.2532267.

[2] Masegosa, A. R., Martinez, A. M., Ramos-López, D., Langseth, H., Nielsen, T. D., Salmerón, A., Cabañas, R., and Madsen, A. L. (2016). A Java Toolbox for Analysis of MassIve Data STreams using Probabilistic Graphical Models. Poster. Presented at the European Data Forum.

[3] Salmerón, A., Madsen, A. L., Jensen, F., Langseth, H., Nielsen, T. D., Ramos-López, D., Martinez, A. M., and Masegosa, A. R. (2016). Parallel Filter-Based Feature Selection Based on Balanced Incomplete Block Designs. Accepted for ECAI.

[4] Madsen, A. L., Jensen, F., Salmerón, A., Langseth, H., Nielsen, T. D. (2016). A Parallel Algorithm for Bayesian Network Structure Learning from Large Data Sets. Accepted for Knowledge-Based Systems.

[5] Masegosa, A. R., Martinez, A. M., Langseth, H., Nielsen, T. D., Salmerón, A., Ramos-López, D., Madsen, A. L. (2016). d-VMP: Distributed Variational Message Passing. Accepted for PGM.

[6] Ramos-López, D., Salmerón, A., Rumi, R., Martinez, A. M., Nielsen, T. D., Masegosa, A. R., Langseth, H., Madsen, A. L. (2016). Scalable MAP inference in Bayesian networks based on a Map-Reduce approach. Accepted for PGM.

2015

[1] Salmerón, A., Rumi, R., Langseth, H., Madsen, A. L., Nielsen, T. D. (2015). MPE Inference in Conditional Linear Gaussian Networks. In proceedings of ECSQARU, 15-17 July 2015, Compiegne, France, pages 407-416. DOI: 10.1007/978-3-319-20807-7.

[2] Madsen, A. L. and Salmerón, A. (2015). Analysis of massive data streams using R and AMIDST. In book of abstracts of useR!2015, 30 June - 3 July 2015, Aalborg, Denmark, page 171.

[3] Borchani, H., Martinez, A. M., Masegosa, A., Langseth, H., Nielsen, T. D., Salmerón, A., Fernández, A., Madsen, A. L., Sáez, R. (2015). Modeling concept drift: A probabilistic graphical model based approach. In proceedings of The Fourteenth International Symposium on Intelligent Data Analysis, 22-24 October 2015, Saint-Etienne, France, pages 72-83.

[4] Salmerón, A., Ramos-López, D., Borchani, H., Martinez, A. M., Masegosa, A., Fernández, A., Langseth, H., Madsen, A. L., Nielsen, T. D. (2015). Parallel importance sampling in conditional linear Gaussian networks. The XVI Conference of the Spanish Association for Artificial Intelligence (CAEPIA'15), pages 36-46.

[5] Madsen, A. L., Jensen, F., Salmerón, A., Langseth, H., Nielsen, T. D. (2015). Parallelization of the PC Algorithm. The XVI Conference of the Spanish Association for Artificial Intelligence (CAEPIA'15), pages 14-24.

[6] Borchani, H., Martinez, A. M., Masegosa, A., Langseth, H., Nielsen, T. D., Salmerón, A., Fernández, A., Madsen, A. L., Sáez, R. (2015). Dynamic Bayesian modeling for risk prediction in credit operations. The 13th Scandinavian Conference on Artificial Intelligence, Halmstad, Sweden, November 5-6, 2015, pages 72-83.

[7] Masegosa, A., Martinez, A. M., Borchani, H., Ramos-López, D., Nielsen, T. D., Langseth, H., Salmerón, A., Madsen, A. L. (2015). AMIDST: Analysis of MassIve Data STreams. In proceedings of The 27th Benelux Conference on Artificial Intelligence, Hasselt, Belgium, November 5-6, 2015.

Changes to previous version:
  • Added sparklink module implementing the integration with Apache Spark.
  • Fluent pattern in latent-variable-models
  • Predefined model implementing concept drift detection

Detailed information can be found on the toolbox's web page.

Supported Operating Systems: Platform Independent
Data Formats: Arff, Json, Parquet
Tags: Approximate Inference, Bayesian Networks, Data Streams, Multi Core, Bayesian Learning, Hidden Markov Models, Importance Sampling, Maximum Likelihood, Parallelisation, Variational Message Passing, Kalma

Other available revisions

Version 0.6.0 (October 14, 2016, 19:35:27)
  • Added sparklink module implementing the integration with Apache Spark.
  • Fluent pattern in latent-variable-models
  • Predefined model implementing concept drift detection

  Detailed information can be found on the toolbox's web page.

Version 0.5.1 (August 8, 2016, 13:45:24)
  New functionality includes Flink support for distributed learning of probabilistic graphical models and support for Latent Dirichlet Allocation models for text analysis.

  Detailed information can be found on the toolbox's web page.

Version 0.4.1 (April 20, 2016, 09:44:23)
  • Bugs fixed.
  • Improved usability.

Version 0.4.0 (April 8, 2016, 10:28:41)
  Initial Announcement on mloss.org.

Version 2.0 (March 31, 2016, 09:51:01)
  Initial Announcement on mloss.org.
