Overview

The Learning Framework is GATE’s most recent machine learning plugin. It is still under active development, but stable enough to use. However, future versions may introduce changes that are not backwards compatible: pipelines may only work with the older version, or saved models may not be compatible between versions.

It offers a wider variety of more up-to-date ML algorithms than the earlier machine learning plugins. The following are currently supported natively (directly integrated in the plugin code):

The following libraries and tools are available in the LearningFramework through a wrapper (see below):

A wrapper is software that runs the machine learning tool or library in a separate process. The LearningFramework communicates with the wrapper for training and application by providing a file or by sending and receiving data. This approach is used for either or both of two reasons:

  1. the license of the machine learning library or tool is not compatible with the license of the LearningFramework (e.g. Weka), so it cannot be distributed with it
  2. the machine learning tool is written in a different language such as Python (e.g. Keras, PyTorch, scikit-learn).
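
The wrapper arrangement described above can be illustrated with a minimal sketch. It is not the LearningFramework's actual wrapper code or message format: the child script, the JSON-lines exchange, and the field names `values`, `label` and `score` are all assumptions made for illustration. The parent starts a separate process and exchanges one JSON message per line with it.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a wrapped ML tool (not a real LearningFramework
# wrapper): reads one JSON request per line on stdin and writes one JSON
# response per line on stdout.
CHILD_SOURCE = r'''
import json, sys
for line in iter(sys.stdin.readline, ""):
    request = json.loads(line)
    # Toy "model": label is "pos" if the feature values sum to more than 0.
    score = sum(request["values"])
    response = {"label": "pos" if score > 0 else "neg", "score": score}
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
'''


def classify_with_wrapper(feature_vectors):
    """Start the child process, send each vector, and collect the replies."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    results = []
    try:
        for values in feature_vectors:
            # One request per line; flush so the child sees it immediately.
            proc.stdin.write(json.dumps({"values": values}) + "\n")
            proc.stdin.flush()
            results.append(json.loads(proc.stdout.readline()))
    finally:
        # Closing stdin signals end of input, which lets the child exit.
        proc.stdin.close()
        proc.wait()
    return results
```

Calling `classify_with_wrapper([[1.0, 2.0], [-3.0, 1.0]])` sends two feature vectors to the child process and returns one prediction dictionary per vector. Keeping the tool in its own process is what allows it to be installed separately and written in another language.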

Finally, a trained model can also be applied through an HTTP model application server. The LearningFramework supports a very simple HTTP protocol: it sends feature vectors to the server in JSON format, receives the model predictions, and applies them to the document being processed. See ServerForApplication.
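
As a rough illustration of the idea of posting feature vectors as JSON and reading predictions back, here is a minimal sketch built only on the Python standard library. The request field `values`, the response field `preds`, and the toy model are assumptions for this example and may not match the actual protocol; the ServerForApplication page is the authority on the real format.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(vectors):
    """Toy model: label each vector by the sign of the sum of its values."""
    return ["pos" if sum(v) > 0 else "neg" for v in vectors]


class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body; the "values" field name is hypothetical.
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        body = json.dumps({"preds": predict(request["values"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), PredictionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_predictions(url, vectors):
    """POST the feature vectors as JSON and return the list of predictions."""
    data = json.dumps({"values": vectors}).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req) as resp:
        return json.loads(resp.read())["preds"]
```

To try it, call `start_server()`, build the URL from `server.server_address`, and pass it to `request_predictions` together with a list of feature vectors. In a real deployment the client side is the LearningFramework itself, so only the server half would need to be written.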

Supported Machine Learning Tasks

The Learning Framework supports the following tasks:

These are provided in separate processing resources (PRs), with separate PRs for training and for application, plus evaluation plugins for classification and regression. Get started here!

In addition, the plugin contains PRs that help with the creation of features to use for a machine learning task:

Note that PRs from other plugins can also be very useful to generate features:

Feature Overview

Processing Resources:

Example pipelines, tutorials, etc.:

Other important documentation pages:

Miscellaneous other pages: