# Stanza pipeline

If `gatenlp` has been installed with the stanza extra (`pip install gatenlp[stanza]` or `pip install gatenlp[all]`), you can run a Stanford Stanza pipeline on a document and get the result as `gatenlp` annotations.
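Before running the cells below, it can help to confirm that the `stanza` package is actually importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of `gatenlp` or Stanza.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


# `has_module` is an illustrative helper, not a gatenlp function.
if not has_module("stanza"):
    print("stanza not found; install it with: pip install gatenlp[stanza]")
```

The check only looks the module up and does not import it, so it stays fast even though Stanza pulls in heavy dependencies.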



In [1]:
from gatenlp import Document
from gatenlp.lib_stanza import AnnStanza
import stanza

print("Stanza version:", stanza.__version__)

Stanza version: 1.3.0


In [2]:
# The English models must be downloaded before the English pipeline can be used
stanza.download('en')

Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.3.0.json:   0%|   …

2022-06-28 00:02:18,131|INFO|stanza|Downloading default packages for language: en (English)...
2022-06-28 00:02:22,054|INFO|stanza|File exists: /data/johann/stanza_resources/en/default.zip.
2022-06-28 00:02:33,832|INFO|stanza|Finished downloading models and saved to /data/johann/stanza_resources.


In [3]:
doc = Document.load("https://gatenlp.github.io/python-gatenlp/testdocument2.txt")
doc

## Annotating the document using Stanza

To annotate one or more documents using Stanza, first create an `AnnStanza` annotator object
and then run the document(s) through this annotator:

In [4]:
stanza_annotator = AnnStanza(lang="en")

2022-06-28 00:02:34,035|INFO|stanza|Loading these models for language: en (English):
| Processor    | Package   |
----------------------------
| tokenize     | combined  |
| pos          | combined  |
| lemma        | combined  |
| depparse     | combined  |
| sentiment    | sstplus   |
| constituency | wsj       |
| ner          | ontonotes |

2022-06-28 00:02:34,039|INFO|stanza|Use device: cpu
2022-06-28 00:02:34,040|INFO|stanza|Loading: tokenize
2022-06-28 00:02:34,070|INFO|stanza|Loading: pos
2022-06-28 00:02:34,330|INFO|stanza|Loading: lemma
2022-06-28 00:02:34,418|INFO|stanza|Loading: depparse
2022-06-28 00:02:34,793|INFO|stanza|Loading: sentiment
2022-06-28 00:02:35,171|INFO|stanza|Loading: constituency
2022-06-28 00:02:35,618|INFO|stanza|Loading: ner
2022-06-28 00:02:36,211|INFO|stanza|Done loading processors!


In [5]:
doc = stanza_annotator(doc)
doc
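Since the annotator is called like a function and returns the document, the same pattern extends to several documents. The sketch below shows one way to do that and to get a quick overview of what was added by counting annotations per type. The helper names `annotate_all` and `count_types` are hypothetical, and the commented usage assumes that `AnnStanza` writes its annotations to the default annotation set, which is worth verifying for your installed version.

```python
from collections import Counter


def annotate_all(annotator, docs):
    """Run every document through the annotator and return the results as a list."""
    return [annotator(doc) for doc in docs]


def count_types(annotations):
    """Count annotations per type; each item is expected to have a `.type` attribute."""
    return Counter(ann.type for ann in annotations)


# Usage with the objects from the cells above (not executed here):
# docs = annotate_all(stanza_annotator, [doc])
# Assumption: the Stanza annotations are in the default annotation set.
# print(count_types(docs[0].annset()))
```

Both helpers rely only on the annotator being callable and on annotations exposing a `type`, so they are not specific to Stanza and should work with other `gatenlp` annotators that follow the same calling convention.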

### Notebook last updated

In [6]:
import gatenlp
print("NB last updated with gatenlp version", gatenlp.__version__)

NB last updated with gatenlp version 1.0.8.dev3
