Language Identification#
Application ID: eot_language_identifier
Application Aliases: language-identifier.app, lid.app, eot/lid.app, eot/language-identifier.app
Description#
The Language Identification (lid) application identifies the language of a document. It can also split the document into sections and report the language of each section.
Configuration#
The configuration is an object with the following properties:
default_language (string) – The default language code to return when the language of a section cannot be detected. Default: english
language_candidates (array[string]) – List of the languages to consider.
sections (boolean) – Analyze the full document and return the sections with their corresponding language. Default: False
section_data (boolean) – Add the text of the sections to the results. Default: False
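The options above can be mirrored in a small client-side helper. The following is only a sketch: `LidConfig` and `boolean_pipeline` are hypothetical names, not part of the application, and the rendering covers only the boolean `key=true` form that appears in the examples on this page. How string and array values are written in a pipeline expression is not shown here, so the helper leaves them out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LidConfig:
    """Hypothetical container mirroring the documented options and defaults."""
    default_language: str = "english"
    language_candidates: Optional[List[str]] = None
    sections: bool = False
    section_data: bool = False

    def to_dict(self) -> dict:
        """Return only the options that differ from the documented defaults."""
        defaults = LidConfig()
        names = ("default_language", "language_candidates", "sections", "section_data")
        return {
            name: getattr(self, name)
            for name in names
            if getattr(self, name) != getattr(defaults, name)
        }


def boolean_pipeline(config: LidConfig) -> str:
    """Render the enabled boolean options in the `lid(key=true,...).app` form.

    Assumption: only the boolean syntax seen in the examples is handled.
    """
    flags = [f"{name}=true" for name in ("sections", "section_data") if getattr(config, name)]
    return f"lid({','.join(flags)}).app" if flags else "lid.app"


if __name__ == "__main__":
    print(boolean_pipeline(LidConfig(sections=True, section_data=True)))
```

Keeping the defaults in one place makes it easy to see which options a given call actually overrides.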
Example#
Return the language of a sentence:
wow -p "lid.app" \
-i "Ik ga naar het werk met de fiets"
which yields:
{
  "language": "dutch"
}
Return the sections of a document, each with its language and text:
wow -p 'lid(sections=true,section_data=true).app' \
-i "Ik ga met de fiets naar het werk, en ik kom terug met de train.

But I'm driving to de gym with my car."
which yields:
{
  "sections": [
    {
      "begin_offset": 0,
      "end_offset": 65,
      "language": "dutch",
      "text": "Ik ga met de fiets naar het werk, en ik kom terug met de train.\n\n"
    },
    {
      "begin_offset": 65,
      "end_offset": 103,
      "language": "english",
      "text": "But I'm driving to de gym with my car."
    }
  ]
}
For an interpretation of the JSON data, refer to the application’s JSON schema.
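As a rough illustration of how the sections result can be consumed, the sketch below slices the input text with the reported offsets. It assumes that `begin_offset` and `end_offset` are character offsets into the input and that `end_offset` is exclusive. That reading is consistent with the example above but is not confirmed here; the JSON schema is the authoritative source. `section_texts` is a hypothetical helper name.

```python
# Input text from the example above (two paragraphs separated by a blank line).
document = (
    "Ik ga met de fiets naar het werk, en ik kom terug met de train.\n\n"
    "But I'm driving to de gym with my car."
)

# Result of the example above, written as a Python dict.
result = {
    "sections": [
        {
            "begin_offset": 0,
            "end_offset": 65,
            "language": "dutch",
            "text": "Ik ga met de fiets naar het werk, en ik kom terug met de train.\n\n",
        },
        {
            "begin_offset": 65,
            "end_offset": 103,
            "language": "english",
            "text": "But I'm driving to de gym with my car.",
        },
    ]
}


def section_texts(text, sections):
    """Pair each section's language with the slice text[begin_offset:end_offset].

    Assumption: offsets count characters and the end offset is exclusive.
    """
    return [
        (section["language"], text[section["begin_offset"]:section["end_offset"]])
        for section in sections
    ]


if __name__ == "__main__":
    for language, piece in section_texts(document, result["sections"]):
        print(f"{language}: {piece!r}")
```

When `section_data` is disabled, the `text` field is not returned, so slicing the original input by offsets in this way is one option for recovering each section's text.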