# Project configuration

The Configuration tab in the Project management page allows you to change project settings including what annotations are captured.

Project configurations can be imported and exported as a JSON file.

The project can also be cloned (its configuration is copied to a new project). Note that cloning does not copy documents, annotations or annotators to the new project.

# Configuration fields

  • Name - The name of this annotation project.

  • Description - The description of this annotation project that will be shown to annotators. Supports markdown and HTML.

  • Annotator guideline - The annotation guideline for this project that will be shown to annotators. Supports markdown and HTML.

  • Annotations per document - The project completes when each document in this annotation project has this many valid annotations. When a project completes, all project annotators are un-recruited and allowed to annotate other projects.

  • Maximum proportion of documents annotated per annotator (between 0 and 1) - A single annotator cannot annotate more than this proportion of documents.

  • Timeout for pending annotation tasks (minutes) - Specify the number of minutes a user has to complete an annotation task (i.e. annotating a single document).

  • Reject documents - Switching this off prevents annotators on this project from rejecting documents.

  • Document ID field - The field in your uploaded documents that is used as a unique identifier. GATE's JSON format uses the name field. You can use a dot-delimited key path to access subfields, e.g. enter features.name to get the id from the object {'features':{'name':'nameValue'}}.

  • Training stage enable/disable - Enable or disable the training stage. When enabled, training documents can be uploaded to the project.

  • Test stage enable/disable - Enable or disable the testing stage. When enabled, test documents can be uploaded to the project.

  • Auto elevate to annotator - The option works in combination with the training and test stage options, see table below for the behaviour:

    | Training stage | Testing stage | Auto elevate to annotator | Description |
    | --- | --- | --- | --- |
    | Disabled | Disabled | Enabled/Disabled | User allowed to annotate without manual approval. |
    | Enabled | Disabled | Disabled | Manual approval required. |
    | Disabled | Enabled | Disabled | Manual approval required. |
    | Enabled | Disabled | Enabled | User always allowed to annotate after the training phase is completed. |
    | Disabled | Enabled | Enabled | User automatically allowed to annotate after passing the test; a user who fails the test has to be manually approved. |
    | Enabled | Enabled | Enabled | User automatically allowed to annotate after passing the test; a user who fails the test has to be manually approved. |

  • Test pass proportion - The proportion of test annotations a user must get correct to be automatically allowed to annotate documents.

  • Gold standard field - The field (JSON) or column (CSV) in the document that contains the ideal annotation values and the explanation for the annotation.

  • Pre-annotation - Pre-fill the form with the annotation provided in the specified field. See the Importing Documents with pre-annotation section for more detail.
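
The dot-delimited key path described for the Document ID field can be pictured as a walk through nested objects, one key per path segment. The following is a minimal Python sketch of that idea, not the application's own code; the function name `get_by_path` is hypothetical.

```python
from typing import Any


def get_by_path(document: dict, path: str) -> Any:
    """Follow a dot-delimited key path through nested dictionaries."""
    value: Any = document
    for key in path.split("."):
        value = value[key]  # raises KeyError if a dictionary lacks this key
    return value


# The object used in the Document ID field description above.
doc = {"features": {"name": "nameValue"}}
print(get_by_path(doc, "features.name"))  # nameValue
```

A path without dots, such as name, simply reads a top-level field.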

# Annotation configuration

The annotation configuration takes a JSON string that configures how the document is displayed to the user and which types of annotation are collected. Here's an example configuration and a preview of how it is shown to annotators:

Within the configuration, it is possible to specify how your documents will be displayed. The Document input preview box can be used to provide a sample of your document for rendering of the preview.

// Example contents for the Document input preview
{
  "text": "Sometext with <strong>html</strong>"
}

The above configuration displays the value of the text field from the document to be annotated. It then shows a set of 3 radio inputs that allow the user to select a Negative, Neutral, or Positive sentiment with the label name sentiment.

All fields require the properties name and type, which are used to name the label and to determine the type of input/display shown to the user, respectively.

Another field can be added to collect more information, e.g. a text field for opinions:

Note that for the above case, the optional field is added so that the user does not have to input a value. This optional field can be used on all components.
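
As a rough illustration of the pieces described so far, the sketch below builds a configuration with a text display, a three-option sentiment radio input and an optional free-text field, then serialises it to a JSON string. It is written in Python only so that the result can be checked. The "html" type name and "text" key of the display entry, the "text" type of the opinion field and the field name opinion are assumptions made for illustration; the name, type, options and optional properties come from the description above.

```python
import json

# Hypothetical configuration. The "html" type, the "text" key and the
# "opinion" field are assumptions; check them against your installation.
config = [
    {"name": "htmldisplay", "type": "html", "text": "{{{text}}}"},
    {
        "name": "sentiment",
        "type": "radio",
        "options": [
            {"value": "negative", "label": "Negative"},
            {"value": "neutral", "label": "Neutral"},
            {"value": "positive", "label": "Positive"},
        ],
    },
    # Extra free-text field; "optional" lets the annotator leave it empty.
    {"name": "opinion", "type": "text", "optional": True},
]


def missing_required_properties(entries: list) -> list:
    """Report entries that lack the required name or type property."""
    problems = []
    for index, entry in enumerate(entries):
        for prop in ("name", "type"):
            if prop not in entry:
                problems.append(f"entry {index} is missing '{prop}'")
    return problems


assert missing_required_properties(config) == []
print(json.dumps(config, indent=2))  # JSON string for the configuration box
```

The small check only covers the rule that every entry needs name and type; it does not validate component-specific properties.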

Some fields are specific to particular components, e.g. the options field is only available for the radio, checkbox and selector components. See details below on the usage of each specific component.

The captured annotation results in a JSON dictionary; an example can be seen in the Annotation output preview box. The annotation is linked to a Document and is converted to the GATE JSON annotation format when exported.

# Displaying text

The htmldisplay widget allows you to display the text you want annotated. It accepts almost the full range of HTML input, which gives a great deal of styling flexibility.

Any field/column from the document can be inserted by surrounding a field/column name with double or triple curly brackets. Double curly brackets render the text as-is, while triple curly brackets accept an HTML string.

The widget makes no assumption about your document structure, and any field/column name can be used, including sub-fields via dot notation, e.g. parentField.childField.
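
The difference between the two bracket styles can be sketched in a few lines of Python. This is an illustration of the behaviour described above, not the widget's actual implementation: it assumes that "as-is" means HTML-special characters are escaped so the text appears literally, and the names `render` and `lookup` are hypothetical.

```python
import html
import re
from typing import Any


def lookup(document: dict, path: str) -> Any:
    """Resolve a dot-notation path such as parentField.childField."""
    value: Any = document
    for key in path.split("."):
        value = value[key]
    return value


# Group 1 matches {{{field}}}, group 2 matches {{field}}.
PLACEHOLDER = re.compile(r"\{\{\{\s*([\w.]+)\s*\}\}\}|\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, document: dict) -> str:
    def substitute(match: re.Match) -> str:
        raw_path, literal_path = match.group(1), match.group(2)
        if raw_path is not None:
            # Triple brackets: the value is inserted as an HTML string.
            return str(lookup(document, raw_path))
        # Double brackets: the value is escaped so it is shown literally.
        return html.escape(str(lookup(document, literal_path)))

    return PLACEHOLDER.sub(substitute, template)


doc = {"text": "Some text with <strong>html</strong>", "meta": {"source": "A & B"}}
print(render("{{{text}}}", doc))  # Some text with <strong>html</strong>
print(render("{{text}}", doc))    # Some text with &lt;strong&gt;html&lt;/strong&gt;
print(render("<p>{{meta.source}}</p>", doc))  # <p>A &amp; B</p>
```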

If your documents are plain text and include line breaks that need to be preserved when rendering, this can be achieved by using a special HTML wrapper which sets the white-space CSS property.

white-space: pre-line preserves line breaks but collapses other whitespace down to a single space. white-space: pre-wrap preserves all whitespace, including indentation at the start of a line, but still wraps lines that are too long for the available space.
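
A display entry using such a wrapper might look like the sketch below, built as a Python dictionary and printed as JSON. The "html" type name and the "text" key are assumptions for illustration; only the white-space values are taken from the text above.

```python
import json

display_entry = {
    "name": "htmldisplay",
    "type": "html",  # assumed type name for the htmldisplay widget
    # pre-line keeps line breaks; use pre-wrap to keep indentation as well.
    "text": '<div style="white-space: pre-line">{{text}}</div>',
}

print(json.dumps(display_entry, indent=2))
```

Double curly brackets are used here because the documents are plain text rather than HTML.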

# Text input

# Textarea input

# Radio input

# Checkbox input

# Selector input

# Alternative way to provide options for radio, checkbox and selector

A dictionary (key-value pairs) can also be provided to the options field of the radio, checkbox and selector widgets, but note that the ordering of the options is not guaranteed, as JavaScript does not sort dictionaries by the order in which keys are added.

# Dynamic options for radio, checkbox and selector

All the examples above have a "static" list of available options for the radio, checkbox and selector widgets, where the complete options list is enumerated in the project configuration and every document offers the same set of options. However it is also possible to take some or all of the options from the document data rather than the configuration data. For example:

"fromDocument" is a dot-separated property path leading to the location within each document where the additional options can be found, for example "fromDocument":"candidates" looks for a top-level property named candidates in each document, "fromDocument": "options.custom" would look for a property named options which is itself an object with a property named custom. The target property in the document may be in any of the following forms:

  • an array of objects, each with value and label properties, exactly as in the static configuration format - this is the format used in the example above
  • an array of strings, where the same string will be used as both the value and the label for that option
  • an arbitrary "dictionary" object mapping values to labels
  • a single string, which is parsed into a list of options

The "single string" alternative is designed to be easier to use when importing documents from CSV files. It allows you to provide any number of options in a single CSV column value. Within the column the options are separated by semicolons, and each option is of the form value=label. Whitespace around the delimiters is ignored, both between options and between the value and label of a single option. For example, given CSV document data of

| text | options |
| --- | --- |
| Favourite fruit | apple=Apples; orange = Oranges; kiwi=Kiwi fruit |

a {"fromDocument": "options"} configuration would produce the equivalent of

[
  {"value": "apple", "label": "Apples"},
  {"value": "orange", "label": "Oranges"},
  {"value": "kiwi", "label": "Kiwi fruit"}
]

If your values or labels need to contain the default separator characters ; or =, you can select different separators by adding extra properties to the configuration:

{"fromDocument": "options", "separator": "~~", "valueLabelSeparator": "::"}

| text | options |
| --- | --- |
| Favourite fruit | apple::Apples ~~ orange::Oranges ~~ kiwi::Kiwi fruit |

The separators can be more than one character, and you can set "valueLabelSeparator":"" to disable label splitting altogether and just use the value as its own label.
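
The parsing rules for the single-string form can be summarised in a short Python sketch. This is an illustration under stated assumptions, not the application's implementation: the function name `parse_options_string` and its parameter names are hypothetical, and treating an option with no label part as its own label when a non-empty label separator is set is an assumption.

```python
def parse_options_string(text, separator=";", value_label_separator="="):
    """Split a single string into a list of value/label dicts."""
    options = []
    for item in text.split(separator):
        item = item.strip()  # whitespace around the option separator is ignored
        if not item:
            continue  # skip empty pieces, e.g. after a trailing separator (assumption)
        if value_label_separator and value_label_separator in item:
            value, label = item.split(value_label_separator, 1)
            value, label = value.strip(), label.strip()
        else:
            # Label splitting disabled, or no label part present (assumption):
            # the value doubles as its own label.
            value = label = item
        options.append({"value": value, "label": label})
    return options


print(parse_options_string("apple=Apples; orange = Oranges; kiwi=Kiwi fruit"))
# [{'value': 'apple', 'label': 'Apples'}, {'value': 'orange', 'label': 'Oranges'},
#  {'value': 'kiwi', 'label': 'Kiwi fruit'}]

# Custom separators, as in the "~~" and "::" example above.
print(parse_options_string(
    "apple::Apples ~~ orange::Oranges ~~ kiwi::Kiwi fruit",
    separator="~~",
    value_label_separator="::",
))
```

Passing an empty string as the label separator makes every option its own label, mirroring the "valueLabelSeparator":"" setting.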

# Mixing static and dynamic options

Static and fromDocument options may be freely interspersed in any order. You can have a fully dynamic set of options by specifying only a fromDocument entry with no static options, static options listed first followed by dynamic options, dynamic options first followed by static ones, and so on.
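
To make the ordering concrete, the Python sketch below expands an options list that mixes static entries with a fromDocument entry against one document. It is an illustration only: the function name `expand_options`, the sample option values and the candidates field contents are hypothetical, and for brevity it handles only the array forms of the document property.

```python
def expand_options(options_config, document):
    """Replace each fromDocument entry with the options found in the document."""
    expanded = []
    for entry in options_config:
        if "fromDocument" in entry:
            # Walk the dot-separated property path into the document.
            target = document
            for key in entry["fromDocument"].split("."):
                target = target[key]
            # This sketch only handles arrays of strings or of value/label objects.
            for item in target:
                if isinstance(item, str):
                    expanded.append({"value": item, "label": item})
                else:
                    expanded.append(dict(item))
        else:
            expanded.append(entry)  # static option, kept in place
    return expanded


options_config = [
    {"value": "none", "label": "None of these"},
    {"fromDocument": "candidates"},
    {"value": "other", "label": "Other"},
]
document = {
    "text": "Favourite fruit",
    "candidates": ["apple", {"value": "kiwi", "label": "Kiwi fruit"}],
}

for option in expand_options(options_config, document):
    print(option["value"], "->", option["label"])
# none -> None of these
# apple -> apple
# kiwi -> Kiwi fruit
# other -> Other
```

The dynamic options appear exactly where the fromDocument entry sits in the list, between the two static options.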