PytorchWrapper Modules - predefined networks

This page gives an overview of the pre-defined network architectures that are available in the PytorchWrapper. These modules can be used in place of the automatically generated network architectures, but each of them only supports a specific kind of task with specific features and targets. While the default, automatically generated architecture is adapted to whatever features are defined for the task, each pre-defined module works only for a specific feature definition and expects a fixed input format.

Remember that before training starts with the PytorchWrapper, the LearningFramework first copies the Python code it needs into the data directory:

The training PR creates a copy of the gate-lf-pytorch-json library as dataDirectory/FileJsonPyTorch/gate-lf-pytorch-json. This library has the standard Python library directory structure; the actual package is in the subdirectory gatelfpytorchjson, and all pre-defined modules are defined in the subdirectory modules within that package directory. So the modules are in the directory dataDirectory/FileJsonPyTorch/gate-lf-pytorch-json/gatelfpytorchjson/modules.
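
In other words, the relevant part of the data directory looks roughly like this (only the directories and files discussed on this page are shown):

dataDirectory/
  FileJsonPyTorch/
    gate-lf-pytorch-json/            (copy of the Python library)
      gatelfpytorchjson/             (the actual Python package)
        modules/                     (the pre-defined network modules)
          SeqTaggerLSTMSimple.py
          ...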

A module is a file that contains the definition of a Python class with the same name as the file (without the .py extension); for example, the file SeqTaggerLSTMSimple.py contains the class SeqTaggerLSTMSimple. In order to use that module, specify the algorithm parameter --module SeqTaggerLSTMSimple.
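
For example, a new module could be added as an extra file in that directory; the names MyTagger.py and MyTagger below are hypothetical and only illustrate the naming convention:

# File: dataDirectory/FileJsonPyTorch/gate-lf-pytorch-json/gatelfpytorchjson/modules/MyTagger.py
from gatelfpytorchjson import CustomModule

class MyTagger(CustomModule):
    # the class name must match the file name (without .py);
    # the module would then be selected with --module MyTagger
    # (a real module still needs the constructor and methods described below)
    pass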

The class defined for the module must inherit from gatelfpytorchjson.CustomModule, which in turn inherits from torch.nn.Module. It is thus a Pytorch module which works essentially like a standard Pytorch module and can be saved and re-loaded like one, but it needs to follow a few additional conventions: the constructor must accept the dataset and a config dictionary, and the class must provide the methods forward, get_lossfunction and get_optimizer (all of which are illustrated in the example below).

IMPORTANT/TODO: consolidate the interface and document it here; we probably need to return more values from forward and loss.

Example

The following shows the SeqTaggerLSTMSimple module, a simple example module which is included by default (some non-essential code is omitted here):

import sys
import logging
import torch
import torch.nn
import torch.nn.functional as F
import torch.optim
from gatelfpytorchjson import CustomModule, EmbeddingsModule

# set up a minimal logger that logs to sys.stderr
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(sys.stderr))

class SeqTaggerLSTMSimple(CustomModule):

    def __init__(self, dataset, config={}):
        super().__init__(config=config)
        self.n_classes = dataset.get_info()["nClasses"]
        feature = dataset.get_index_features()[0]
        vocab = feature.vocab
        self.layer_emb = EmbeddingsModule(vocab)
        emb_dims = self.layer_emb.emb_dims

        self.lstm_hiddenunits = 200
        self.lstm_nlayers = 1
        self.lstm_is_bidirectional = False
        self.layer_lstm = torch.nn.LSTM( input_size=emb_dims, hidden_size=self.lstm_hiddenunits,
            num_layers=self.lstm_nlayers, dropout=0.0, bidirectional=self.lstm_is_bidirectional,
            batch_first=True)
        lin_units = self.lstm_hiddenunits*2 if self.lstm_is_bidirectional else self.lstm_hiddenunits
        self.lstm_totalunits = lin_units
        self.layer_lin = torch.nn.Linear(lin_units, self.n_classes)
        logger.info("Network created: %s" % (self, )) # display the network

    def forward(self, batch):
        batch = torch.LongTensor(batch[0])  # only expect/use a single feature: a batch of index sequences
        if self.on_cuda():
            batch = batch.cuda()  # .cuda() is not in-place, so re-assign the result
        tmp_embs = self.layer_emb(batch)  # shape: [batch, seq, emb_dims]
        # the LSTM returns the full output sequence plus the last hidden and cell states
        lstm_out, (lstm_h_last, lstm_c_last) = self.layer_lstm(tmp_embs)  # lstm_out: [batch, seq, units]
        tmp_lin = self.layer_lin(lstm_out)  # shape: [batch, seq, n_classes]
        out = F.log_softmax(tmp_lin, dim=2)  # per-token log-probabilities
        return out

    def get_lossfunction(self, config={}):
        # targets with the value -1 are ignored when computing the loss
        return torch.nn.NLLLoss(ignore_index=-1)

    def get_optimizer(self, config={}):
        # only hand trainable parameters (those that require a gradient) to the optimizer
        parms = filter(lambda p: p.requires_grad, self.parameters())
        optimizer = torch.optim.Adam(parms, lr=0.015, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
        return optimizer
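
To illustrate the tensor shapes that flow through the forward method above, here is a small standalone sketch; it uses a plain torch.nn.Embedding as a stand-in for EmbeddingsModule and arbitrary example sizes, so it is not the code the wrapper actually runs:

import torch
import torch.nn.functional as F

vocab_size, emb_dims, hidden_units, n_classes = 100, 50, 200, 5
layer_emb = torch.nn.Embedding(vocab_size, emb_dims)
layer_lstm = torch.nn.LSTM(input_size=emb_dims, hidden_size=hidden_units,
                           num_layers=1, batch_first=True)
layer_lin = torch.nn.Linear(hidden_units, n_classes)

batch = torch.randint(0, vocab_size, (4, 7))     # 4 sequences of 7 token indices each
embs = layer_emb(batch)                          # shape [4, 7, 50]
lstm_out, (h_last, c_last) = layer_lstm(embs)    # lstm_out has shape [4, 7, 200]
out = F.log_softmax(layer_lin(lstm_out), dim=2)  # shape [4, 7, 5]: log-probabilities per token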

MORE DOCUMENTATION TO BE DONE, BUT THE IMPLEMENTATION OF THIS WILL CHANGE A BIT FIRST!