JSON Schemas

Data exchanged with both the Portal and the SDK is represented as JSON. Several schemas are defined for this format. Each schema describes what JSON data to expect from (or supply to) a given application and how to interact with it. Validating returned JSON against these schemas lets you enforce consistency and catch malformed data.

Document Schema

At the core of the NLP library is the document schema, which defines the structure that holds all results for a processed document. In short, each processed document is assigned a JSON object with:

  • a unique document identifier, either given explicitly or assigned through an input provider

  • the results of every applied application, each stored under a unique application identifier

The following diagram visualizes the schema:

[Figure: Document schema]

Note that <doc_id> is a placeholder for the actual document id. Likewise, <app_id> is a placeholder for an application id.
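To make the shape described above concrete, here is a minimal Python sketch of a document object and a helper that lists the applications that contributed to it. The key names `id`, `apps`, `results` and `diagnostics`, as well as the identifiers `doc-001`, `app_a` and `app_b`, are assumptions made for illustration; they are not taken from the actual schema, so check them against the diagram before relying on them.

```python
import json

# Hypothetical document object. The key names "id", "apps", "results" and
# "diagnostics" are assumptions, not the confirmed schema.
raw = """
{
  "id": "doc-001",
  "apps": {
    "app_a": {"results": {"tokens": 3}, "diagnostics": []},
    "app_b": {"results": ["x", "y"]}
  }
}
"""


def list_applications(document: dict) -> list[str]:
    """Return the sorted identifiers of all applications found in the document."""
    return sorted(document.get("apps", {}))


document = json.loads(raw)
doc_id = document["id"]
app_ids = list_applications(document)
print(doc_id, app_ids)
```

Keeping every application's output under its own identifier means two applications can never overwrite each other's results, and a client can look up exactly the application it cares about.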

Application Schemas

As shown in the diagram of the document schema, applications can generate results and diagnostics. Result schemas are application-specific, whereas the diagnostics schema is fixed.
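Because only the diagnostics part has a fixed shape, a generic client-side check can verify that part and leave results to application-specific validation. The sketch below uses only the standard library and assumes that an application entry is a JSON object with an optional `diagnostics` key holding a list. Both the key name and the helper `check_application_entry` are hypothetical.

```python
def check_application_entry(entry) -> list[str]:
    """Return a list of problems found in one application entry (empty if none)."""
    if not isinstance(entry, dict):
        return ["application entry must be a JSON object"]

    problems = []
    # Result schemas differ per application, so results are not inspected here.
    # "diagnostics" is an assumed key name; a missing key is treated as "no diagnostics".
    diagnostics = entry.get("diagnostics", [])
    if not isinstance(diagnostics, list):
        problems.append("diagnostics must be a list")
    return problems


print(check_application_entry({"results": {}, "diagnostics": []}))
print(check_application_entry({"diagnostics": "oops"}))
```

For full validation, a dedicated JSON Schema validator applied to the published schemas is the more robust choice; this check only illustrates the split between fixed and application-specific parts.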

Results

Each application that processes a document adds its results under a unique application identifier in the document schema, and each application has its own result schema. The most prominent is that of the core NLP analysis, referred to as the analysis schema, because it forms the basis that many subsequent applications extend or build upon.

The following list gives an overview of available application result schemas:

Diagnostics

Each application processing a document can generate diagnostics: a list of messages that notify the client. The schema of diagnostics is fixed.
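Since diagnostics share one schema across applications, a client can gather them from every application with a single loop. The following sketch treats each message as an opaque value and pairs it with the application that produced it. The key names `apps` and `diagnostics`, the `message` field and the sample identifiers are assumptions for illustration only.

```python
def collect_diagnostics(document: dict) -> list[tuple[str, object]]:
    """Return (application id, message) pairs for every diagnostic in the document."""
    collected = []
    # "apps" and "diagnostics" are assumed key names, not the confirmed schema.
    for app_id, entry in document.get("apps", {}).items():
        for message in entry.get("diagnostics", []):
            collected.append((app_id, message))
    return collected


# Hypothetical document: one application reports a diagnostic, the other none.
document = {
    "id": "doc-001",
    "apps": {
        "app_a": {"results": {}, "diagnostics": [{"message": "input truncated"}]},
        "app_b": {"results": {}},
    },
}
found = collect_diagnostics(document)
print(found)
```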

Language Schemas