Using Neural Networks

The LearningFramework allows the use of Neural Networks for classification and sequence tagging through two different backends: the Pytorch wrapper and the Keras wrapper.

Overview

Support for neural networks through the Pytorch and Keras wrappers follows the same basic design. Both wrappers use the same representation of training data: a file on disk (so the corpus does not need to fit into memory) that contains one JSON representation of an instance or sequence per line.
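The following is a minimal sketch of what such a one-JSON-object-per-line data file could look like and how it can be read back. The file name and the field names used here are purely illustrative assumptions, not the exact format produced by the wrappers:

```python
import json

# Write two illustrative instances, one JSON object per line
# (field names "features"/"target" are assumed for this sketch).
with open("data.jsonl", "w") as f:
    f.write('{"features": [0.1, 0.0, 1.0], "target": "Person"}\n')
    f.write('{"features": [0.0, 0.2, 0.5], "target": "Organization"}\n')

# Stream the file line by line, so the whole corpus never has to be
# held in memory at once.
with open("data.jsonl") as f:
    instances = [json.loads(line) for line in f]
```

Because each line is an independent JSON document, a backend can iterate over the file once per training epoch without loading it completely.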

When a PR for training classification or chunking is used in GATE with the Pytorch or Keras wrapper, the training data and meta information are written as files into the dataDirectory, and the actual training is then carried out by running the train.sh script in that directory.

When a PR for application is run, the trained model saved in the dataDirectory is loaded and used to annotate new documents.

Training when using a Neural Network backend

Note that training a neural network on a large corpus can take a very long time. In some cases, and with very complex network architectures, it may be necessary to train on a different computer from the one on which the GATE LearningFramework was run. Exploring different architectures or hyperparameters may require many training runs, and it is sometimes convenient to run those in parallel on several computers.

Note also that re-training variations of the network, or re-training with different hyperparameters, does not actually require repeating the step in which the data and meta files are created from the original GATE document corpus: unless the feature specification is changed, these files will be identical.

The way the neural network backends are implemented makes it easy to concentrate on re-running just the actual training step (running train.sh) directly from the command line. This can be done either on the same computer on which the content of the dataDirectory was created, or, by copying the whole directory, on a different computer that has Python and the required Python packages installed. In this way the user can experiment with modified networks or hyperparameters until the validation accuracy looks good. The model created in this way can then be transferred back to the dataDirectory from which application to new documents is carried out with GATE.
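The round trip described above can be sketched as follows. This is only an illustration under stated assumptions: the directory name, the stub train.sh, and the model file name are stand-ins, since the real train.sh and model files are generated by the backend wrapper:

```python
import pathlib
import shutil
import subprocess

# Stand-in dataDirectory; in real use this is produced by the GATE
# training PR and already contains train.sh plus the data and meta files.
data_dir = pathlib.Path("dataDirectory")
data_dir.mkdir(exist_ok=True)
# Stub train.sh for this sketch: the real script runs the actual training.
(data_dir / "train.sh").write_text("#!/bin/sh\necho model > model.bin\n")

# Copy the whole directory (or scp it to another machine that has Python
# and the required packages) and re-run only the training step there.
work_dir = pathlib.Path("experiment1")
if work_dir.exists():
    shutil.rmtree(work_dir)
shutil.copytree(data_dir, work_dir)
subprocess.run(["sh", "train.sh"], cwd=work_dir, check=True)

# Transfer the trained model back into the dataDirectory from which
# application to new documents is carried out with GATE.
shutil.copy(work_dir / "model.bin", data_dir / "model.bin")
```

Because the copied directory is self-contained, each experiment (different architecture or hyperparameters) can use its own copy, and only the winning model needs to be copied back.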