- Description:
Aika is a Java library that automatically extracts semantic information from text and annotates the text with it. When this information is ambiguous, Aika generates several hypothetical interpretations of the text's meaning and picks the most likely one. The Aika algorithm combines ideas and approaches from several fields of AI, such as artificial neural networks, frequent pattern mining, and logic-based expert systems, in a single algorithm, and can be applied to a broad spectrum of text analysis tasks.
Aika allows linguistic concepts such as words, word meanings (entities), categories (e.g. person name, city), and grammatical word types to be modeled as neurons in a neural network. By choosing appropriate synapse weights, these neurons can take on different functions within the network. For instance, neurons whose synapse weights mimic a logical AND can be used to match an exact phrase, while neurons with an OR characteristic can connect a large list of word-entity neurons to represent a category such as 'city' or 'profession'.
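The AND/OR distinction above comes down to how the bias relates to the synapse weights. The following is a minimal sketch of that idea using a plain weighted-sum threshold; the class and method names are invented for illustration and are not part of Aika's API.

```java
// Hypothetical sketch (not Aika's real API): how synapse weights plus a
// bias give a neuron AND-like or OR-like behavior under a threshold.
public class AndOrDemo {
    // Fires if the weighted sum of the active inputs plus the bias is positive.
    public static boolean fires(double[] weights, double bias, boolean[] inputs) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            if (inputs[i]) sum += weights[i];
        }
        return sum > 0.0;
    }

    public static void main(String[] args) {
        double[] w = {10.0, 10.0};

        // AND-like neuron: the bias is so strongly negative that every
        // input is mandatory -- suitable for matching an exact phrase.
        System.out.println(fires(w, -19.0, new boolean[]{true, true}));  // true
        System.out.println(fires(w, -19.0, new boolean[]{true, false})); // false

        // OR-like neuron: a mild bias lets any single input suffice --
        // suitable for a category fed by a large list of entity neurons.
        System.out.println(fires(w, -5.0, new boolean[]{false, true}));  // true
    }
}
```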
Aika is based on non-monotonic logic, meaning that it initially draws only tentative conclusions. In other words, Aika is able to generate multiple mutually exclusive interpretations of a word, phrase, or sentence, and to select the most likely one. For example, a neuron representing one meaning of a given word can be linked through a negatively weighted synapse to a neuron representing an alternative meaning, so that the two neurons exclude each other. These synapses may even be cyclic. Aika resolves such recurrent feedback links by making tentative assumptions and searching for the highest-ranking interpretation.
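The mutual-exclusion mechanism can be illustrated with a toy model (invented names, not Aika's actual search): each candidate meaning contributes a positive score when active, every pair of active, mutually exclusive meanings pays a negative penalty for the link between them, and an exhaustive search keeps the highest-scoring assignment.

```java
import java.util.Arrays;

// Toy sketch of interpretation search under mutual exclusion: negatively
// weighted links between options penalize activating both at once.
public class ExclusionDemo {
    // Score an assignment: active options add their weight; each pair of
    // simultaneously active options pays the (negative) exclusion penalty.
    public static double score(boolean[] active, double[] weight, double penalty) {
        double s = 0.0;
        for (int i = 0; i < active.length; i++) if (active[i]) s += weight[i];
        for (int i = 0; i < active.length; i++)
            for (int j = i + 1; j < active.length; j++)
                if (active[i] && active[j]) s += penalty; // feedback link
        return s;
    }

    // Exhaustively try every on/off combination and return the best one.
    public static boolean[] best(double[] weight, double penalty) {
        int n = weight.length;
        boolean[] bestA = new boolean[n];
        double bestS = Double.NEGATIVE_INFINITY;
        for (int mask = 0; mask < (1 << n); mask++) {
            boolean[] a = new boolean[n];
            for (int i = 0; i < n; i++) a[i] = (mask & (1 << i)) != 0;
            double s = score(a, weight, penalty);
            if (s > bestS) { bestS = s; bestA = a; }
        }
        return bestA;
    }

    public static void main(String[] args) {
        // Meaning A scores 2.0, meaning B scores 3.0; a strong penalty
        // forbids activating both, so the search selects B alone.
        System.out.println(Arrays.toString(best(new double[]{2.0, 3.0}, -10.0)));
        // prints [false, true]
    }
}
```

Aika's actual search over recurrent feedback links is of course more involved; this only shows why a negatively weighted synapse makes two interpretations exclude each other.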
In contrast to conventional neural networks, Aika propagates activation objects through its network, not just activation values. These activation objects refer to a text segment and an interpretation.
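A minimal data-structure sketch of such an activation object might look as follows; the field names are assumptions for illustration, not Aika's actual class.

```java
// Hypothetical activation object: unlike a bare activation value, it also
// carries the covered text segment and the interpretation it supports.
public class ActivationDemo {
    public static class Activation {
        public final int begin, end;          // covered text segment
        public final String interpretation;   // which reading it belongs to
        public final double value;            // conventional activation value

        public Activation(int begin, int end, String interpretation, double value) {
            this.begin = begin;
            this.end = end;
            this.interpretation = interpretation;
            this.value = value;
        }
    }

    public static void main(String[] args) {
        Activation a = new Activation(0, 6, "Berlin as a city", 0.93);
        System.out.println(a.begin + "-" + a.end + ": " + a.interpretation);
    }
}
```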
Aika consists of two layers: the neural layer, containing all the neurons and their continuously weighted synapses, and underneath it the discrete logic layer, containing a boolean representation of all the neurons. The logic layer uses a frequent pattern lattice to store the individual logic nodes efficiently. This architecture allows Aika to process extremely large networks, since only neurons that are activated by a logic node need to compute their weighted sum and activation value. As a result, the vast majority of neurons stays inactive during the processing of a given text.
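The efficiency argument can be sketched with a toy stand-in for the logic layer (invented code, not Aika's internals): an index from input features to candidate neurons guarantees that only neurons actually reachable from the input ever compute anything, while the rest of the network is never touched.

```java
import java.util.*;

// Toy sketch of lazy evaluation: only neurons indexed under an observed
// input token are evaluated; all other neurons stay inactive.
public class SparseEvalDemo {
    public static int evaluated = 0; // counts weighted-sum computations

    public static Set<String> process(Map<String, List<String>> index,
                                      List<String> tokens) {
        Set<String> activated = new HashSet<>();
        for (String token : tokens) {
            // Only candidates listed for this token are ever touched.
            for (String neuron : index.getOrDefault(token, List.of())) {
                evaluated++;               // stand-in for the weighted sum
                activated.add(neuron);
            }
        }
        return activated;
    }

    public static void main(String[] args) {
        Map<String, List<String>> index = Map.of(
            "berlin", List.of("city"),
            "baker", List.of("profession", "surname"));
        // Even though the index knows three neurons, only one is evaluated.
        Set<String> active = process(index, List.of("berlin", "visited"));
        System.out.println(active + " evaluated=" + evaluated);
    }
}
```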
To avoid keeping the whole network in memory during processing, Aika uses the provider pattern to suspend individual neurons or logic nodes to an external storage such as MongoDB.
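The provider pattern mentioned here can be sketched as follows; the class names are hypothetical and are not Aika's actual classes. A provider holds only an id and materializes the neuron from external storage on demand, so it can be suspended again at any time to free memory.

```java
// Hypothetical illustration of the provider pattern: lazy loading from
// external storage plus explicit suspension.
public class ProviderDemo {
    public interface Storage { String load(int id); }

    public static class NeuronProvider {
        private final int id;
        private final Storage storage;
        private String neuron; // null while suspended

        public NeuronProvider(int id, Storage storage) {
            this.id = id;
            this.storage = storage;
        }

        public String get() {                 // load lazily on first access
            if (neuron == null) neuron = storage.load(id);
            return neuron;
        }

        public void suspend() { neuron = null; } // drop the in-memory copy

        public boolean isSuspended() { return neuron == null; }
    }

    public static void main(String[] args) {
        Storage db = id -> "neuron-" + id;    // stand-in for e.g. MongoDB
        NeuronProvider p = new NeuronProvider(7, db);
        System.out.println(p.isSuspended());  // true (not yet loaded)
        System.out.println(p.get());          // neuron-7
        p.suspend();
        System.out.println(p.isSuspended());  // true again
    }
}
```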
- Changes to previous version:
Aika Version 0.17 2018-05-14
- Introduction of synapse relations. Previously, the relation between synapses was modeled implicitly through word positions (RIDs). Now such relations can be modeled explicitly, for example: the end position of input activation 1 equals the begin position of input activation 2. Two types of relations are currently supported: range relations and instance relations. Range relations compare the input activation range of a given synapse with that of the linked synapse. Instance relations also compare the input activations of two synapses, but compare the dependency relations of these activations instead of their ranges.
- Removed the norm term from the interpretation objective function.
- Introduction of an optional distance function for synapses. It allows a signal to be modeled as weakening with the distance between the activation ranges.
- Example implementation of a context free grammar.
- Example implementation for co-reference resolution.
- Work on a syllable identification experiment based on the meta network implementation.
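Two of the v0.17 changes above can be sketched in toy form (invented names, not Aika's API): a range relation that requires the end position of one input activation to equal the begin position of the other, and a distance function that weakens a signal as the gap between two activation ranges grows.

```java
// Toy sketch of a range relation and a synapse distance function.
public class RelationDemo {
    public static class Range {
        public final int begin, end;
        public Range(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Range relation: holds when activation a ends exactly where
    // activation b begins, i.e. the two text segments are adjacent.
    public static boolean endToBegin(Range a, Range b) {
        return a.end == b.begin;
    }

    // Distance function (assumed exponential decay over the gap between
    // ranges): adjacent activations keep full weight, distant ones fade.
    public static double decay(Range a, Range b, double lambda) {
        int gap = Math.max(0, b.begin - a.end);
        return Math.exp(-lambda * gap);
    }

    public static void main(String[] args) {
        Range first = new Range(0, 5), second = new Range(5, 9);
        System.out.println(endToBegin(first, second)); // true (adjacent)
        System.out.println(decay(first, second, 0.5)); // 1.0 (no gap)
    }
}
```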
Aika Version 0.15 2018-03-16
- Simplified interpretation handling by removing the InterpretationNode class and moving the remaining logic to the Activation class.
- Moved the activation linking and activation selection code to separate classes.
- Ongoing work on the training algorithms.
- BibTeX Entry: Download
- Supported Operating Systems: Platform Agnostic
- Data Formats: Txt
- Tags: Information Extraction, Inference, Neural Network, Text Mining
- Archive: download here