Analysis Results Schema for JSON#
Application ID: eot_analysis
Description#
The AnalysisResults schema describes the data returned from an NLP analysis by the SDK or the Portal before it is converted into a native programming object, such as the Analysis object for Python. It is the structure that contains the basic linguistic analysis results.
Definition#
Analysis#
The AnalysisResults schema is defined as:
object – The analyzed document
    id (string) – The identifier of the analyzed document
    language (string) – The language file that processed the document. If multiple languages process the document, only the first language is set
    sentences (array[Sentence]) – The list of analyzed sentences
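As a minimal sketch, the top-level fields can be read with Python's standard json module. The function name summarize_analysis and the sample payload are illustrative assumptions and are not part of the SDK.

```python
import json


def summarize_analysis(raw: str) -> dict:
    """Parse an AnalysisResults JSON string and return its top-level fields."""
    results = json.loads(raw)
    return {
        "id": results["id"],                          # identifier of the analyzed document
        "language": results["language"],              # language file that processed it
        "sentence_count": len(results["sentences"]),  # number of Sentence arrays
    }


# Hypothetical payload: one sentence spanning offsets 0..5 with no annotations.
sample = '{"id": "test.txt", "language": "dutch", "sentences": [[1, 0, 5, []]]}'
print(summarize_analysis(sample))
```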
Sentence#
The Sentence schema is an extension of the Annotation schema, which is defined as:
array – The annotation, formatted as an array for size-efficiency during serialization
    [0] (number) – The type of the annotation: 1 for Sentence, 2 for Concept, 3 for Token
    [1] (number) – The begin offset of the annotation
    [2] (number) – The end offset of the annotation
The Sentence schema is therefore defined as:
array – The sentence annotation
    [0] (number) – Always set to 1
    [1] (number) – The begin offset of the sentence within the document
    [2] (number) – The end offset of the sentence within the document
    [3] (array[Annotation]) – The list of annotations in the sentence
    [4] (dict[str:list[str]]) – Options that can be set on the sentence
Available Sentence options:
    header: ["true"] – Indicates that the sentence looks like a header.
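The positional layout above can be unpacked into named fields as sketched below. The helper unpack_sentence and the sample data are hypothetical, and the code assumes that the options element at index 4 may be missing, as it is in the example at the end of this page.

```python
SENTENCE, CONCEPT, TOKEN = 1, 2, 3  # annotation type codes from the schema


def unpack_sentence(sentence: list) -> dict:
    """Split a Sentence array into named fields, grouping annotations by type."""
    kind, begin, end, annotations = sentence[:4]
    if kind != SENTENCE:
        raise ValueError(f"expected annotation type {SENTENCE}, got {kind}")
    # Assumption: the options dict at index 4 is optional in serialized output.
    options = sentence[4] if len(sentence) > 4 else {}
    return {
        "begin": begin,
        "end": end,
        "concepts": [a for a in annotations if a[0] == CONCEPT],
        "tokens": [a for a in annotations if a[0] == TOKEN],
        "is_header": options.get("header") == ["true"],
    }


# Hypothetical sentence carrying one concept, one token and the header option.
sentence = [
    1, 0, 3,
    [
        [2, 0, 3, "Person", {}],
        [3, 0, 3, "Jan", ["init-cap"], [["Jan", "Prop-Std"]]],
    ],
    {"header": ["true"]},
]
info = unpack_sentence(sentence)
print(info["is_header"], len(info["concepts"]), len(info["tokens"]))
```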
Concept#
The Concept schema is defined as:
array – The concept annotation, formatted as an array for size-efficiency during serialization
    [0] (number) – Always set to 2
    [1] (number) – The begin offset of the concept within the document
    [2] (number) – The end offset of the concept within the document
    [3] (string) – The URI or name of the concept, e.g. "Person"
    [4] (ConceptAttributes) – The attributes of the concept
The ConceptAttributes schema is defined as:
object
    <name> (array[string]) – The attribute values of the attribute with name <name>
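Because every attribute maps to a list of strings, reading a single value takes one extra step. The following sketch shows one way to do that; concept_to_dict and first_attribute are hypothetical helper names, and the sample concept is taken from the example at the end of this page.

```python
def concept_to_dict(concept: list) -> dict:
    """Convert a Concept array into a dict with named fields."""
    kind, begin, end, name, attributes = concept
    if kind != 2:
        raise ValueError(f"expected annotation type 2 (Concept), got {kind}")
    return {"name": name, "begin": begin, "end": end, "attributes": attributes}


def first_attribute(concept: list, name: str, default=None):
    """Return the first value of the named attribute, or default if it is absent."""
    values = concept[4].get(name, [])
    return values[0] if values else default


person = [2, 0, 12, "Person", {
    "canonical": ["Jan Jansens"],
    "family": ["Jansens"],
    "gender": ["male"],
    "given": ["Jan"],
}]
print(concept_to_dict(person)["name"])       # name of the concept
print(first_attribute(person, "canonical"))  # first value of the "canonical" attribute
print(first_attribute(person, "title", "n/a"))  # attribute not present, default returned
```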
Token#
The Token schema is defined as:
array – The token annotation, formatted as an array for size-efficiency during serialization
    [0] (number) – Always set to 3
    [1] (number) – The begin offset of the token
    [2] (number) – The end offset of the token
    [3] (string) – The literal representation of the token
    [4] (array[string]) – The properties of the token
    [5] (array[MorphData]) – The morphological data of the token
The MorphData schema is defined as a recursive structure:
array
    [0] (string) – The stem of the token
    [1] (string) – The part-of-speech of the token
    [2] (string) – The properties of the token
    [3] (string) [optional] – The (child) morphological data
[
    3,                              --> object type 3 = Token
    23,                             --> begin_offset
    41,                             --> end_offset
    "verplegerassistent",           --> literal
    [],                             --> properties
    [
        [
            "verplegerassistent",   --> stem
            "Nn-Sg",                --> part of speech
            "compound",             --> properties
            [
                ["verpleger","Nn-Sg"],   --> first part: stem, pos
                ["assistent","Nn-Sg"]    --> second part: stem, pos
            ]                       --> components
        ]
    ]                               --> morphology
]
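The recursive nature of MorphData can be illustrated with a short traversal. This is a sketch under an assumption drawn from the annotated token above: when child morphological data is present, it is a list of MorphData arrays at index 3. Any other value at that position is ignored. The function name collect_stems is hypothetical.

```python
def collect_stems(morph: list) -> list:
    """Return the stem of a MorphData array followed by the stems of its children."""
    stems = [morph[0]]
    # Assumption: children, when present, are a list of MorphData arrays at index 3.
    has_children = len(morph) > 3 and isinstance(morph[3], list)
    for child in (morph[3] if has_children else []):
        stems.extend(collect_stems(child))
    return stems


token = [
    3, 23, 41, "verplegerassistent", [],
    [["verplegerassistent", "Nn-Sg", "compound",
      [["verpleger", "Nn-Sg"], ["assistent", "Nn-Sg"]]]],
]
for morph in token[5]:
    print(collect_stems(morph))  # ['verplegerassistent', 'verpleger', 'assistent']
```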
Example#
The following is an example of the JSON results for this application:
{
  "id": "test.txt",
  "language": "dutch",
  "sentences": [
    [1, 0, 62,
      [
        [2, 0, 62, "Sentence", {}],
        [2, 0, 12, "Person", {
          "canonical": ["Jan Jansens"],
          "family": ["Jansens"],
          "gender": ["male"], "given": ["Jan"]}],
        [3, 0, 3, "Jan", ["init-cap", "init-token"], [["Jan", "Prop-Std", "giv"]]],
        [3, 4, 12, "Jansens", ["init-cap", "nf", "nf-lex"], [["Jannsen", "Prop-Std", "gen", "guesser"]]],
        [3, 13, 18, "werkt", [], [["werken", "V-Pres"]]],
        [3, 19, 22, "als", [], [["als", "Prep-Std"]]],
        [2, 23, 41, "Position", {}],
        [3, 23, 41, "verplegerassistent", [], [["verplegerassistent", "Nn-Sg", "compound", [["verpleger", "Nn-Sg"], ["assistent", "Nn-Sg"]]]]],
        [3, 42, 44, "in", [], [["in", "Prep-Std", "prefix"]]],
        [3, 45, 48, "het", [], [["het", "Det-Def"]]],
        [2, 49, 61, "Organization", {}],
        [3, 49, 51, "UZ", ["all-cap", "nf"], [["UZ", "Prop-Std"]]],
        [3, 52, 61, "Midelheim", ["init-cap", "nf", "nf-lex"], [["Midelheim", "Prop-Std"]]],
        [3, 61, 62, ".", [], [[".", "Punct-Sent"]]]
      ]
    ]
  ]
}
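Since concepts and tokens share the same offset space, the tokens covered by a concept can be found by comparing offsets. The sketch below uses a subset of the annotations from the example and assumes, as the example suggests, that the begin offset is inclusive and the end offset is exclusive. The helper tokens_in_span is a hypothetical name.

```python
def tokens_in_span(annotations: list, begin: int, end: int) -> list:
    """Return the literals of Token annotations lying fully inside [begin, end)."""
    return [
        a[3]
        for a in annotations
        if a[0] == 3 and a[1] >= begin and a[2] <= end
    ]


# Subset of the annotations of the example sentence.
annotations = [
    [3, 45, 48, "het", [], [["het", "Det-Def"]]],
    [2, 49, 61, "Organization", {}],
    [3, 49, 51, "UZ", ["all-cap", "nf"], [["UZ", "Prop-Std"]]],
    [3, 52, 61, "Midelheim", ["init-cap", "nf", "nf-lex"], [["Midelheim", "Prop-Std"]]],
    [3, 61, 62, ".", [], [[".", "Punct-Sent"]]],
]

for annotation in annotations:
    if annotation[0] == 2:  # Concept
        covered = tokens_in_span(annotations, annotation[1], annotation[2])
        print(annotation[3], covered)  # Organization ['UZ', 'Midelheim']
```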