Token API for Python

class eot.wowool.annotation.Token

Bases: eot.wowool.annotation.annotation.Annotation

Token holds the information for a single token: its literal text, its offsets in the input, its morphology, and its properties.

__init__(begin_offset: int, end_offset: int, literal: str)

Initialize a Token instance

Parameters
  • begin_offset (int) – Begin offset of the token

  • end_offset (int) – End offset of the token

  • literal (str) – Literal string from the input document

Returns

An initialized token

Return type

Token

literal: str
morphology: List[eot.wowool.annotation.token.MorphData]
properties: MutableSet[str]
rich()
Returns

The rich string representation of the token

Return type

str

property stem: str
Returns

The first stem of the token or an empty string if absent

Type

str

property pos: str
Returns

The first part-of-speech of the token or an empty string if absent

Type

str
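
The relationship between morphology, stem and pos described above can be sketched with stand-in classes. FakeMorphData and FakeToken are hypothetical and only mirror the documented attribute names; the real classes may be implemented differently, and the tag value "Vb" is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeMorphData:
    """Hypothetical stand-in for MorphData (pos and stem only)."""
    pos: str
    stem: str


@dataclass
class FakeToken:
    """Hypothetical stand-in mirroring the documented Token attributes."""
    literal: str
    morphology: List[FakeMorphData] = field(default_factory=list)

    @property
    def stem(self) -> str:
        # First stem, or an empty string when there is no morphology.
        return self.morphology[0].stem if self.morphology else ""

    @property
    def pos(self) -> str:
        # First part-of-speech, or an empty string when there is none.
        return self.morphology[0].pos if self.morphology else ""


# "Nn" comes from the documentation; "Vb" is an illustrative tag.
walks = FakeToken("walks", [FakeMorphData("Nn", "walk"), FakeMorphData("Vb", "walk")])
unknown = FakeToken("???")

print(repr(walks.stem), repr(walks.pos))
print(repr(unknown.stem), repr(unknown.pos))
```

Because both properties fall back to an empty string, callers can test them for truthiness instead of catching an exception.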

has_property(prop: str) → bool
Parameters

prop (str) – Property name. For example "NF"

Returns

Whether a given property is set on the token

Return type

bool

has_pos(pos: str) → bool
Parameters

pos (str) – Part-of-speech. For example "Nn"

Returns

Whether a given part-of-speech is set on the token

Return type

bool
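
A minimal sketch of filtering tokens with has_property and has_pos. FakeToken is a hypothetical stand-in, not the library class. It assumes has_pos matches any morphology entry, which the description above suggests but does not spell out. The tag "Vb" is illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class FakeToken:
    """Hypothetical stand-in exposing the two documented predicates."""
    literal: str
    # (pos, stem) pairs standing in for MorphData entries.
    morphology: List[Tuple[str, str]] = field(default_factory=list)
    properties: Set[str] = field(default_factory=set)

    def has_property(self, prop: str) -> bool:
        return prop in self.properties

    def has_pos(self, pos: str) -> bool:
        # Assumption: any morphology entry may carry the part-of-speech.
        return any(p == pos for p, _ in self.morphology)


tokens = [
    FakeToken("John", [("Nn", "John")], {"NF"}),
    FakeToken("said", [("Vb", "say")]),
    FakeToken("Smith", [("Nn", "Smith")]),
]

nouns = [t.literal for t in tokens if t.has_pos("Nn")]
flagged = [t.literal for t in tokens if t.has_property("NF")]
print(nouns, flagged)
```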

get_morphology(pos: str)
Parameters

pos (str) – Part-of-speech. For example "Nn"

Returns

The morphological data matching the given part-of-speech

Return type

MorphData

static iter(object) → Generator[eot.wowool.annotation.token.Token, None, None]

Iterate over the tokens in a document, an analysis, a sentence or a concept. For example:

document = analyzer("Hello from Antwerp, said John Smith.")
for token in Token.iter(document):
    print(token)
Parameters

object (Analysis, Sentence or Concept) – Object to iterate

Returns

A generator yielding tokens

Return type

Generator[Token, None, None]

static next(sentence, object) → eot.wowool.annotation.token.Token

Returns the next token in the sentence, relative to object.

static prev(sentence, object: Union[eot.wowool.annotation.token.Concept, eot.wowool.annotation.token.Token]) → eot.wowool.annotation.token.Token

Returns the previous token in the sentence, relative to object.

property begin_offset: int
Returns

The begin offset of the annotation

Type

int

property end_offset: int
Returns

The end offset of the annotation

Type

int
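
As a sketch of how the two offsets can be used, the snippet below recovers literals by slicing the input text. It assumes the offsets are character positions with an exclusive end, which this reference does not state. If the library reports byte offsets instead, the slicing would need to run on the encoded bytes. The spans are hand-written placeholders, not library output.

```python
text = "Hello from Antwerp, said John Smith."

# Hypothetical (begin_offset, end_offset) pairs; a real run would read
# them from token.begin_offset and token.end_offset.
spans = [(0, 5), (6, 10), (11, 18)]

# Assumption: half-open character offsets into the original string.
literals = [text[begin:end] for begin, end in spans]
print(literals)
```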

property index: int
property is_concept: bool
Returns

Whether the annotation is a Concept

Return type

bool

property is_sentence: bool
Returns

Whether the annotation is a Sentence

Return type

bool

property is_token: bool
Returns

Whether the annotation is a Token

Return type

bool

class eot.wowool.annotation.token.MorphData

MorphData is a class that contains the morphological data. For example:

for md in token.morphology:
    print(md.pos, md.stem)
Parameters
  • pos (str) – Part-of-speech

  • stem (str) – Stem

rich() → str
Returns

The rich string representation of the morphological data

Return type

str
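
The pos and stem fields lend themselves to simple aggregation. The sketch below tallies parts-of-speech and collects distinct stems across several tokens. FakeMorphData is a hypothetical stand-in with only the two documented fields, and the sample entries (including the tag "Vb") are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FakeMorphData:
    """Hypothetical stand-in for MorphData."""
    pos: str
    stem: str


# One list of morphology entries per token, as token.morphology would give.
morphology_per_token = [
    [FakeMorphData("Nn", "city")],
    [FakeMorphData("Nn", "walk"), FakeMorphData("Vb", "walk")],
]

# Count how often each part-of-speech occurs across all entries.
pos_counts = Counter(
    md.pos for entries in morphology_per_token for md in entries
)

# Collect the distinct stems in sorted order.
stems = sorted({md.stem for entries in morphology_per_token for md in entries})

print(pos_counts, stems)
```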

See also the corresponding JSON schema.