Web Service Client Annotators

Web service client annotators annotate documents by calling a web service: for each document that gets processed, data is sent to an HTTP endpoint and processed there, and the information sent back is then used to annotate the document.
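The round trip described above can be illustrated with a minimal, library-independent sketch. It is not how gatenlp implements its client annotators: the JSON payload layout, the function names annotate_via_service and fake_service, and the stand-in "service" are all assumptions made for illustration, and no real network request is sent.

```python
import json
from typing import Callable, Dict, List


def annotate_via_service(text: str, send: Callable[[bytes], bytes]) -> List[Dict]:
    """Send the text to a service and return the annotations it reports.

    `send` is any callable that takes the request body and returns the
    response body; in real use it would wrap an HTTP POST to the endpoint.
    """
    request_body = json.dumps({"text": text}).encode("utf-8")
    response_body = send(request_body)
    payload = json.loads(response_body.decode("utf-8"))
    annotations = []
    for item in payload["annotations"]:
        start, end = item["start"], item["end"]
        # Ignore anything the service reports outside the document bounds
        if 0 <= start < end <= len(text):
            annotations.append(
                {"type": item["type"], "start": start, "end": end, "text": text[start:end]}
            )
    return annotations


def fake_service(request_body: bytes) -> bytes:
    """Hypothetical stand-in for a remote endpoint: finds one hard-coded name."""
    text = json.loads(request_body.decode("utf-8"))["text"]
    found = []
    name = "Barack Obama"
    start = text.find(name)
    if start >= 0:
        found.append({"type": "Person", "start": start, "end": start + len(name)})
    return json.dumps({"annotations": found}).encode("utf-8")


result = annotate_via_service(
    "Barack Obama visited Microsoft in New York last May.", fake_service
)
print(result)
```

Passing the transport in as a callable keeps the sketch runnable offline; a real client would replace fake_service with a function that posts the body to the service URL and returns the response.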

Currently the following client annotators are implemented:

from gatenlp import Document
from gatenlp.processing.client import GateCloudAnnotator

Let's try annotating a document with the English Named Entity Recognizer on GATE Cloud (https://cloud.gate.ac.uk/shopfront/displayItem/annie-named-entity-recognizer).

The information page for that service shows that the following annotation types can be requested: Address, Date, Location, Organization, Person, Money, Percent, Token, SpaceToken and Sentence. The first five are requested by default if no alternative list is specified.

We create a GateCloudAnnotator and specify the full list of all supported annotation types. We also specify the URL of the service endpoint as provided on the info page, and specify that the annotations should be put into the annotation set “ANNIE”. Note that a limited number of documents can be annotated for free and without authentication, so we do not need to specify the api_key and api_password parameters.
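When credentials are available, the constructor arguments could be assembled by a small helper such as the sketch below. The helper names ann_types_string and gate_cloud_kwargs and the environment variable names GATE_CLOUD_API_KEY and GATE_CLOUD_API_PASSWORD are hypothetical; only the parameter names url, out_annset, ann_types, api_key and api_password are taken from this page. The sketch only builds a dictionary and does not contact the service.

```python
import os
from typing import Dict, Iterable, Mapping, Optional

SERVICE_URL = "https://cloud-api.gate.ac.uk/process-document/annie-named-entity-recognizer"


def ann_types_string(types: Iterable[str]) -> str:
    """Join type names into the ':Type,:Type' form used on this page."""
    return ",".join(":" + t for t in types)


def gate_cloud_kwargs(
    types: Iterable[str],
    out_annset: str = "ANNIE",
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build keyword arguments, adding credentials only if both are set."""
    env = os.environ if env is None else env
    kwargs = {
        "url": SERVICE_URL,
        "out_annset": out_annset,
        "ann_types": ann_types_string(types),
    }
    # Hypothetical variable names; pick whatever fits your deployment
    key = env.get("GATE_CLOUD_API_KEY")
    password = env.get("GATE_CLOUD_API_PASSWORD")
    if key and password:
        kwargs["api_key"] = key
        kwargs["api_password"] = password
    return kwargs


anonymous = gate_cloud_kwargs(["Person", "Location"], env={})
print(anonymous)
```

The resulting dictionary would then be passed on as GateCloudAnnotator(**kwargs). Leaving the credentials out when either one is missing falls back to the free, unauthenticated access described above.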

annotator = GateCloudAnnotator(
    url="https://cloud-api.gate.ac.uk/process-document/annie-named-entity-recognizer",
    out_annset="ANNIE",
    ann_types=":Address,:Date,:Location,:Organization,:Person,:Money,:Percent,:Token,:SpaceToken,:Sentence"
)
# an example document to annotate
doc = Document("Barack Obama visited Microsoft in New York last May.")
# Run the annotator and show the annotated document
doc = annotator(doc)
doc
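Outside a notebook, where the document is not rendered interactively, the result can be inspected as plain text. The following is a minimal sketch under stated assumptions: summarize is a hypothetical helper, Span is a stand-in exposing the type, start and end attributes that gatenlp annotations also carry, and the sample offsets were written by hand for the example sentence rather than returned by the service.

```python
from collections import namedtuple

# Stand-in for an annotation: only type, start and end are needed here
Span = namedtuple("Span", ["type", "start", "end"])


def summarize(annotations, text, skip_types=("Token", "SpaceToken")):
    """Return (type, covered text) pairs in document order, skipping token-level types."""
    rows = []
    for ann in sorted(annotations, key=lambda a: (a.start, a.end)):
        if ann.type in skip_types:
            continue
        rows.append((ann.type, text[ann.start:ann.end]))
    return rows


text = "Barack Obama visited Microsoft in New York last May."
# Hand-written sample annotations, not actual service output
sample = [
    Span("Organization", 21, 30),
    Span("Person", 0, 12),
    Span("Location", 34, 42),
    Span("Token", 0, 6),
]
rows = summarize(sample, text)
for ann_type, covered in rows:
    print(ann_type, covered)
```

With an annotated gatenlp document, the same helper could presumably be called as summarize(doc.annset("ANNIE"), doc.text).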